Prediction of amines capacity for carbon dioxide absorption in gas
sweetening processes
Mohammadreza Momeni, Siavash Riahi
Institute of Petroleum Engineering, Faculty of Chemical Engineering, College of Engineering, University of Tehran, Tehran, Iran
Article info
Article history:
Received 23 July 2014
Received in revised form 30 August 2014
Accepted 1 September 2014
Available online 26 September 2014
Keywords:
Gas sweetening
Rich loading
Carbon dioxide
Absorption
Amines
QSPR
Abstract
Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed before natural gas is prepared for marketing. Many technologies based on regenerable amine solvents have been offered to remove carbon dioxide from natural gas. To make these technologies more efficient and economical, further experimental and modeling research is required to identify the main parameters that influence the capacity of amines for CO2 absorption. Numerous studies of amines have shown evidence that relationships exist between the structure of an amine and its capacity for carbon dioxide absorption. The Quantitative Structure Property/Activity Relationship (QSPR/QSAR) provides an effective method for predicting the capacity of amines for CO2 absorption. In this paper, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was first employed to complete molecular geometry optimization. Then, the quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), with the model variables selected by a genetic algorithm (GA). The accuracy of the model was verified by different statistical methods, and the results proved the high statistical quality of the model. Unlike other QSPR studies, the equation reported in this paper consists of simple and easily calculated descriptors, which form a robust model for predicting the capacity of amines for carbon dioxide absorption.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
Amines are molecules containing nitrogen atoms attached to a carbon-based chain structure. They can be applied in various fields of engineering and science. One of the most important applications of amines is their use as an acid gas absorption liquid for removing carbon dioxide from natural gas or from oxygen-containing systems, for instance flue gas (Singh et al., 2007, 2009). The absorption capacity of amines is an important characteristic. Moreover, different aspects of the molecules' behavior, from toxicity and environmental protection to technical issues, can be affected by this feature. The solubility and absorption rate of carbon dioxide in amine-based CO2 absorbents are important not only for technical reasons but also for environmental ones. Since experimental determination of absorption capacity (or rich loading) is very time-consuming and expensive, and the values are not always available in the literature, estimation plays an important role (Pourbasheer et al., 2011). Hence, developing capable methods for predicting the absorption capacity of different amines becomes an urgent task.
Gas sweetening, or acid gas removal (for instance of CO2 and H2S), is conventionally used in various industries (Bohloul et al., 2014). Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed before natural gas is prepared for marketing (Mokhatab and Poe, 2012). The most common absorption media for this purpose are aqueous amine solutions. Amine derivatives, including monoethanolamine (MEA), diethanolamine (DEA) and methyldiethanolamine (MDEA), are widely used in commercial and industrial applications (Kohl and Nielsen, 1997). Because of the importance of amines in acid gas removal technologies, a novel descriptive model has to be developed from which the chemical properties of amines can be predicted.
There is evidence in the literature indicating the existence of relationships between the structure of an amine and its
Corresponding author. University of Tehran, Tehran 11365-4563, Iran. Tel.: +98 21 61114714.
E-mail address: riahi@ut.ac.ir (S. Riahi).
Journal of Natural Gas Science and Engineering 21 (2014) 442–450
Journal homepage: www.elsevier.com/locate/jngse
http://dx.doi.org/10.1016/j.jngse.2014.09.002
1875-5100/© 2014 Elsevier B.V. All rights reserved.
capacity for carbon dioxide absorption (rich loading). A significant contribution to analyzing the relationships between the structure and absorption capacity of amines was made by Chakraborty et al. Their work showed that substituents at the α-carbon cause carbamate instability, which accelerates hydrolysis; as a result, the amount of bicarbonate increases, which leads to higher carbon dioxide loading (Chakraborty et al., 1986). In addition, Sartori and Savage explained that steric hindrance effects produced by the α-substituent are responsible for these instabilities (Sartori and Savage, 1983). Chakraborty also studied the electronic effects of substituents and suggested that substitution at the α-carbon atom causes an interaction of the π and π* methyl group orbitals with the lone pair of the nitrogen. Since this interaction reduces the nitrogen charge, it weakens the N–H bond, which increases hydrolysis in aqueous solution. It seems that the rate of the initial reaction can be reduced by the steric hindrance effects; however, the amount of amine available to react with CO2 grows noticeably (Chakraborty et al., 1988). Furthermore, solvent screening experiments and an investigation of the effects of variables such as chain length, the number of functional groups, and the position of side chains and functional groups were conducted by Singh et al. They performed a semi-quantitative study of these effects on the capacity of amines for CO2 absorption (Singh et al., 2007, 2009). In addition, a computational study of the reactions between functionalized amines and CO2 was performed by Lee and Kitchin. They highlighted the molecular descriptors by which reactivity trends can be obtained. Their work revealed that electron-withdrawing and electron-donating groups tend to destabilize and stabilize CO2 reaction products, respectively (Lee and Kitchin, 2012). All of the results in this paper are based on mathematical calculations and model development. To the best of the authors' knowledge, this work is the first quantitative study of the capacity of amines for CO2 absorption based on a simple and robust model.
To achieve this goal, a close observation of the relationship between the chemical structure and the activity of different amine-based solutions is required. The Quantitative Structure Property/Activity Relationship (QSPR/QSAR) provides an effective method for processing, analyzing and predicting the characteristics of different molecules (Beheshti et al., 2012, 2009; Freire et al., 2010; Liang et al., 2013; Godavarthy et al., 2006; Riahi, 2009, 2008; Riahi et al., 2008). The quantitative structure–property relationship technique relates the chemical or physical properties of compounds to their molecular structures. It is used to develop a quantitative correlation that can predict specific molecular properties, for example environmental functions or physico-chemical behavior. The QSPR approach is based on the assumption that differences in the behavior of molecules can be correlated with variations in molecular features that are technically termed descriptors. The descriptors are numerical values that characterize the shape and structure of the molecule. To use the QSPR method, knowledge of the chemical structures of the molecules is sufficient, and there is no need to conduct experiments. QSPR requires a sequence of procedures; consequently, the following steps were taken (Fini et al., 2012):
1. A data set of molecules was taken from the literature with their corresponding absorption capacities.
2. The structural properties of the molecules were extracted and calculated using computer software.
3. The best model, containing an optimum number of descriptors, was selected by means of several alternative algorithms, for example the genetic algorithm (GA) and MLR.
4. The selected model was validated using statistical tests and validation methods, for instance the leave-one-out cross-validation method.
In QSPR approaches, selecting the proper method for constructing a robust and precise model is very important. Multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are the most widely used methods in QSPR modeling (Katritzky et al., 2000; Marengo et al., 1992). Variable selection for building a well-fitted model is a further step. The genetic algorithm (GA) is one well-known method by which this task can be accomplished. This paper focuses on the development of a novel descriptive QSPR model by which the absorption capacity (or rich loading) of various amines used in industrial carbon capture units can be predicted. The quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), and the model variables were selected by a genetic algorithm (GA) (Depczynski et al., 2000; Jouan-Rimbaud et al., 1995). The accuracy of the model was verified by different statistical methods, and the results proved the high statistical quality of the model. One of the main disadvantages of the QSPR technique is that, in most of the studies conducted in this area, the final equation reported as the best model contains unfamiliar descriptors which are not only hard to calculate but also difficult or impossible to interpret. Fortunately, the equation reported in this paper consists of descriptors which are simple in terms of both calculation and interpretation. The model also demonstrates high statistical quality, which supports its predictive power and robustness.
2. Materials and methods
The absorption capacities (rich loading) of 23 amine-based solvents for carbon dioxide absorption (Table 1) were taken from the literature (Singh et al., 2007). First, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was employed to perform geometry optimization (Cramer, 2005; da Silva and Svendsen, 2004). These calculations were performed with the Gaussian software (Frisch et al., 1998). The input to Gaussian was the molecular structures pre-optimized with the semi-empirical AM1 geometry optimization method. This process yields a group of precise and applicable descriptors representing the electronic and quantum chemical properties of the molecules. Quantum chemical descriptors include properties such as dipole moment, the sum of the electronic and thermal free energies, atomic charges, HOMO energy (highest occupied molecular orbital energy), LUMO energy (lowest unoccupied molecular orbital energy), exact polarizability, etc. In total, 31 quantum chemical descriptors were calculated for each molecule.
Next, the geometrically optimized structure of each molecule was fed into the Dragon software developed by the Milano Chemometrics and QSAR Research Group (Todeschini et al., 2002). As a result, more than 1486 theoretical molecular descriptors were calculated for each molecule. These descriptors can be divided into different groups, for instance constitutional descriptors, topological descriptors, functional group counts, molecular properties, etc. Because the large amount of numerical data makes further calculation imprecise and slow, the number of calculated descriptors was reduced by the following procedure:
1. Constant and near-constant descriptors were eliminated (361 excluded).
2. From each group of collinear descriptors (R > 0.98), the one with the better correlation with absorption capacity was kept and the others were eliminated (611 excluded).
After applying the above constraints, a total of 514 descriptors remained for each molecule as the output of this stage. Finally, combined with the 31 quantum chemical descriptors, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.

Table 1
Experimental and calculated values of amine capacities for CO2 absorption (rich loading).
No.  Name                                        Exp.   Eq. (2) (Model)   Eq. (1)
1    1,2-Diaminopropane                          1.27   1.29              1.23
2    1,3-Diaminopropane                          1.30   1.29              1.25
3    1,4-Diaminobutane (T)                       1.26   1.37              1.29
4    2-Amino-1-butanol                           0.88   0.79              0.80
5    2-Methylpyridine                            0.06   0.09              0.08
6    2-Pyridylamine                              0.28   0.59              0.23
7    3-Amino-1-propanol                          0.88   0.71              0.72
8    4-Amino-1-butanol                           0.83   0.79              0.76
9    5-Amino-1-pentanol (T)                      0.84   0.87              0.85
10   Butylamine                                  0.86   0.79              0.84
11   Diethylenetriamine                          1.83   1.81              1.77
12   Ethylamine                                  0.91   0.63              0.82
13   Ethylenediamine                             1.08   1.21              1.20
14   Hexamethylenediamine                        1.48   1.53              1.46
15   Isobutylamine (T)                           0.78   0.79              0.82
16   Monoethanolamine                            0.72   0.63              0.61
17   N-(2-Hydroxyethyl)ethylenediamine           1.15   1.23              1.17
18   N,N'-Bis(2-hydroxyethyl)ethylenediamine     1.20   1.25              1.27
19   N-Pentylamine                               0.72   0.87              0.90
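The two descriptor-reduction steps described in this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the variance tolerance `var_tol`, the greedy ordering used to decide which member of a collinear group survives, and the toy descriptor values are all hypothetical. Only the |R| > 0.98 threshold comes from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prefilter(descriptors, target, var_tol=1e-8, r_max=0.98):
    """descriptors: dict mapping name -> list of values (one per molecule)."""
    # Step 1: drop constant and near-constant descriptors.
    kept = {}
    for name, col in descriptors.items():
        mean = sum(col) / len(col)
        variance = sum((v - mean) ** 2 for v in col) / len(col)
        if variance > var_tol:
            kept[name] = col
    # Step 2: visit descriptors from best to worst correlation with the
    # target and keep one only if it is not collinear (|R| > r_max) with
    # a descriptor that has already been kept.
    ranked = sorted(kept, key=lambda k: abs(pearson(kept[k], target)),
                    reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(kept[name], kept[s])) <= r_max for s in selected):
            selected.append(name)
    return selected

# Toy data (hypothetical values, four molecules).
target = [0.1, 0.5, 0.9, 1.3]
toy = {
    "a": [1, 2, 3, 4],          # perfectly correlated with the target
    "a_like": [2, 4, 6, 9],     # collinear with "a" (R is about 0.994)
    "const": [5, 5, 5, 5],      # constant, removed in step 1
    "b": [1, 0, 1, 0],          # weakly correlated, not collinear with "a"
}
print(prefilter(toy, target))
```

On the toy data the constant column is removed first, then `a_like` is discarded because it is collinear with the better-correlated `a`, leaving `a` and `b`.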
3. Model development
After the descriptors were calculated, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules of the training set:
$$\mathrm{AC} = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1}$$
AC stands for absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume. RDF035m belongs to the group of RDF descriptors and describes the radial distribution function at 3.5 weighted by mass, and nN represents the number of nitrogen atoms. As can be noticed, two of the descriptors in the above model are difficult to calculate, because the calculations have to be performed by computer. It also does not seem easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and descriptors is a necessary and important step. It was therefore decided to investigate new models with new, simple descriptors. In addition, given the chemical reaction of amines with carbon dioxide, it was expected that the number of amino groups may affect the capacity of amines for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:
$$\mathrm{AC} = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH_2} + 0.40\,\mathrm{nRNHR} \tag{2}$$
Table 2 shows some statistical factors in order to provide a better comparison between the two models. The first equation demonstrates higher statistical parameters, but the simpler descriptors of the second model, in both calculation and interpretation of results, are more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.
Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation value for each pair of descriptors is less than 0.65, which demonstrates that these descriptors are sufficiently independent of each other and can be used to develop a QSPR model.
As can be observed, the three descriptors appearing in the model are easily calculated, and thus there is no need for computer calculations. Moreover, this model demonstrates high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation reported for predicting the capacity of amines for carbon dioxide absorption under specific conditions.
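Because Eq. (2) only needs three atom and group counts, it can be evaluated by hand or with a one-line function. The sketch below is illustrative: the function name is hypothetical, the intercept is taken as −0.19 (an assumption, chosen because that sign reproduces the Eq. (2) column of Table 1), and the descriptor counts in the examples are read off the molecular formulas rather than taken from the paper.

```python
def predict_rich_loading(n_h, n_rnh2, n_rnhr):
    """Estimate rich loading (mol CO2/mol amine) from Eq. (2).

    n_h    -- number of hydrogen atoms (nH)
    n_rnh2 -- number of primary amine groups (nRNH2)
    n_rnhr -- number of secondary amine groups (nRNHR)
    """
    # Assumption: intercept of -0.19, the sign consistent with Table 1.
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# Descriptor counts derived from molecular formulas (assumed, not tabulated).
examples = {
    "Monoethanolamine": (7, 1, 0),       # C2H7NO
    "Diethylenetriamine": (13, 2, 1),    # C4H13N3
    "Triethylenetetramine": (18, 2, 2),  # C6H18N4
}
for name, counts in examples.items():
    print(f"{name}: {predict_rich_loading(*counts):.2f}")
```

For these three amines the function returns 0.63, 1.81 and 2.41, which match the Eq. (2) entries of Table 1 for items 16, 11 and 23.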
4. Results
One of the most critical factors that influence the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both of them. To take this into account, from the total of 23 amine-based carbon dioxide absorbents, 18 molecules (about 80% of the molecules) were selected to construct the training set and 5 molecules (about 20%) formed the test set. The test set was used for external cross-validation of
Table 1 (continued)
No.  Name                                        Exp.   Eq. (2) (Model)   Eq. (1)
20   Propylamine (T)                             0.77   0.71              0.80
21   Pyridine (T)                                0.05   0.01              0.12
22   sec-Butylamine                              0.84   0.79              0.87
23   Triethylenetetramine                        2.51   2.41              2.47
All absorption capacity (rich loading) values are given in mol CO2/mol amine. Names marked (T) are test set molecules.

Table 2
Some basic statistical values for the two models.
Model     Descriptors            R2      Q2      F        s
Eq. (1)   nN, Mor09v, RDF035m    0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950   0.945   121.54   0.127
All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2).
Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)
the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to classify the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated based on the descriptors in the model. The results showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.
The training set was used to build the model, while the test set was used to validate its predictive power. During the model development procedure, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q2(LOO) was calculated for each obtained equation, and the best model was then selected based on the highest value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), adjusted R2, the coefficient of leave-one-out cross-validation (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be close to zero. Furthermore, the intercept of the model should be close to zero. The Fisher function (F) is another vital statistical test; high values of the F-ratio indicate reliable models. The formulas of all statistical parameters used in this paper are given below:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where $\bar{y}^{\mathrm{exp}}$ is the mean of the experimental values, and $df_M$ and $df_E$ refer to the degrees of freedom of the model and of the error, respectively.
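The statistics defined in Eqs. (3)–(8) can be checked numerically. The sketch below is a minimal illustration with a hypothetical function name; it assumes df_M equals the number of descriptors p and df_E = n − p − 1, which the text does not state explicitly. It is applied to the five test-set molecules of Table 1 (experimental values and Eq. (2) predictions).

```python
import math

def regression_stats(y_exp, y_calc, n_descriptors):
    """Evaluate Eqs. (3)-(8) for paired experimental/calculated values."""
    n = len(y_exp)
    df_m = n_descriptors              # assumed: model degrees of freedom = p
    df_e = n - n_descriptors - 1      # assumed: error degrees of freedom
    mean_exp = sum(y_exp) / n
    sse = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))
    sst = sum((e - mean_exp) ** 2 for e in y_exp)
    ssm = sum((mean_exp - c) ** 2 for c in y_calc)
    cross = sum(e * c for e, c in zip(y_exp, y_calc))
    return {
        "R2": 1 - sse / sst,                          # Eq. (3)
        "RMSE": math.sqrt(sse / n),                   # Eq. (4)
        "F": (ssm / df_m) / (sse / df_e),             # Eq. (5)
        "k": cross / sum(c * c for c in y_calc),      # Eq. (6)
        "k_prime": cross / sum(e * e for e in y_exp), # Eq. (7)
        "s": math.sqrt(sse / df_e),                   # Eq. (8)
    }

# Test-set molecules from Table 1 (items 3, 9, 15, 20, 21).
y_exp = [1.26, 0.84, 0.78, 0.77, 0.05]
y_calc = [1.37, 0.87, 0.79, 0.71, 0.01]   # Eq. (2) column
stats = regression_stats(y_exp, y_calc, n_descriptors=3)
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions the output is approximately R2 = 0.976, RMSE = 0.060, F = 17.3, k = 0.962, k' = 1.035 and s = 0.135, in line with the test-set row of Table 5.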
The following criteria described by Golbraikh and Tropsha were also applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):
1. $$Q^2 > 0.5 \tag{9}$$
2. $$R^2 > 0.6 \tag{10}$$
3. $$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11}$$
where $R_0^2$ is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The predicted results for all molecules in the training and test sets, together with the statistical parameters, are given in Table 5.
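Criteria (9)–(11) amount to a simple boolean check. The following sketch uses a hypothetical function name and takes the through-origin coefficient of determination as an input; the value 0.94 passed for it in the example is a placeholder, since R0² is not reported in the text.

```python
def passes_golbraikh_tropsha(q2, r2, r0_2, k):
    """Return True when criteria (9), (10) and (11) all hold."""
    return (
        q2 > 0.5                        # criterion (9)
        and r2 > 0.6                    # criterion (10)
        and (r2 - r0_2) / r2 < 0.1      # criterion (11), first part
        and 0.85 <= k <= 1.15           # criterion (11), second part
    )

# Q2, R2 and k as reported for Eq. (2); r0_2 = 0.94 is a placeholder value.
print(passes_golbraikh_tropsha(q2=0.945, r2=0.950, r0_2=0.94, k=0.999))
```

A model that fails any single condition, for example one with a slope k outside the interval 0.85 to 1.15, is rejected by this check.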
Table 4
Correlation matrix of the three descriptors used in Eq. (2).
Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.
A Y-scrambling test was applied to examine the robustness of the model (Tropsha et al., 2003); the results are given in Table 6. In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines, and new QSPR modeling is performed based on the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R2 and Q2 values. If this is not the case, the reported model is not reliable for the particular data set and method of modeling.
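The Y-scrambling idea can be illustrated with a short, dependency-free sketch. Several things here are assumptions rather than the authors' procedure: the descriptor counts (nH, nRNH2, nRNHR) are derived from the molecular formulas of the 18 training molecules, the regression is an ordinary least-squares refit solved through the normal equations, and the random seed and the number of repetitions are arbitrary. The resulting numbers are therefore not expected to coincide with Table 6.

```python
import random

def fit_r2(X, y):
    """Least-squares fit with intercept; returns R^2 on the fitted data."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    coef = [M[i][p] / M[i][i] for i in range(p)]
    pred = [sum(b * v for b, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst

# ((nH, nRNH2, nRNHR), experimental rich loading) for the training molecules
# of Table 1; the descriptor counts are assumed from molecular formulas.
TRAIN = [
    ((10, 2, 0), 1.27), ((10, 2, 0), 1.30), ((11, 1, 0), 0.88),
    ((7, 0, 0), 0.06), ((6, 1, 0), 0.28), ((9, 1, 0), 0.88),
    ((11, 1, 0), 0.83), ((11, 1, 0), 0.86), ((13, 2, 1), 1.83),
    ((7, 1, 0), 0.91), ((8, 2, 0), 1.08), ((16, 2, 0), 1.48),
    ((7, 1, 0), 0.72), ((12, 1, 1), 1.15), ((16, 0, 2), 1.20),
    ((13, 1, 0), 0.72), ((11, 1, 0), 0.84), ((18, 2, 2), 2.51),
]
X = [d for d, _ in TRAIN]
y = [v for _, v in TRAIN]

r2_true = fit_r2(X, y)
rng = random.Random(0)            # fixed seed, arbitrary choice
r2_scrambled = []
for _ in range(10):
    y_perm = y[:]
    rng.shuffle(y_perm)           # break the structure-response link
    r2_scrambled.append(fit_r2(X, y_perm))

print(f"unscrambled R2: {r2_true:.3f}")
print("scrambled R2:", [round(r, 3) for r in r2_scrambled])
```

A robust model shows a clear gap between the unscrambled R2 and every scrambled value, which is the pattern reported in Table 6.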
The applicability domain of the model was studied using the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted versus the leverage (hat diagonal) values (h). Leverage indicates the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):
$$h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12}$$
where $x_i$ is the descriptor vector of the relevant compound. The warning leverage ($h^*$) is defined as (Eriksson et al., 2003):
$$h^* = \frac{3(p+1)}{n} \tag{13}$$
where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both the response outliers and the structurally influential chemicals in the model. A compound with $h_i > h^*$ influences the regression line, but it is not necessarily considered an outlier, as its standardized residual might be small. In this data set, the warning leverage is around 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (>3σ) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of the model.
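The warning leverage of Eq. (13) and the two Williams-plot tests can be written out directly. In this sketch the function names are hypothetical and the example compound (leverage 0.30, standardized residual −1.2) is invented purely for illustration; n = 18 and p = 3 are the training-set size and descriptor count given in the text.

```python
def warning_leverage(n_training, n_descriptors):
    """Warning leverage h* = 3(p + 1)/n from Eq. (13)."""
    return 3 * (n_descriptors + 1) / n_training

def williams_flags(leverage, std_residual, h_star, cutoff=3.0):
    """Flag a compound as structurally influential and/or a response outlier."""
    return {
        "influential": leverage > h_star,
        "outlier": abs(std_residual) > cutoff,
    }

h_star = warning_leverage(n_training=18, n_descriptors=3)
print(round(h_star, 2))  # 0.67, the value quoted in the text

# Hypothetical compound, for illustration only.
print(williams_flags(leverage=0.30, std_residual=-1.2, h_star=h_star))
```

With 18 training molecules and 3 descriptors the threshold is 3 × 4 / 18 ≈ 0.67, and the hypothetical compound is flagged as neither influential nor an outlier.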
Fig. 2 demonstrates that there is no chemical with a leverage higher than the warning value of 0.67. It also shows that there is no outlier in the training or test sets, and all compounds lie between the two horizontal lines.
The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.
Furthermore, the mean effect (MF) is another term that helps to interpret the results and shows the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013). The figure is used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the more important the weight of the corresponding variable in the model. This figure demonstrates the mean effect of the descriptors in the model.
From this figure it can be concluded that the numbers of primary amines (nRNH2) and secondary amines (nRNHR) play the main roles in the capacity of amines for carbon dioxide absorption, in that order, and the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption is directly related to each of these descriptors.
To close this section, it should be noted that the present work focuses exclusively on developing a simple model by which the capacities of amines for carbon dioxide absorption can be predicted. In fact, the predominant difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.
5. Discussion
Although high statistical parameters are significant in demonstrating the capability of the model, QSPR should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to determine which parameters affect the capacity of amines and which descriptors would be expected to appear in the model given the principal chemical reactions between carbon dioxide and an amine-based solvent.
The overall reaction mechanism for the chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism for this reaction, which involves the formation of a zwitterion intermediate followed by its deprotonation by a base B through reactions (1) and (2) below, was suggested by Caplow (1968):
$$\mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1}$$
$$\mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2}$$
R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in the reaction is a zwitterion. However, more recent studies showed that the zwitterion seems to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism for these reactions (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):
$$\mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3}$$
where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and deprotonation of the amine occurs. The bonding between the amine and carbon dioxide takes place simultaneously.
Table 6
R2 (train) values after several Y-scrambling tests.
Iteration   R2 (train)
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model.
          n    R2      R2adj   RMSE    F        k       k'      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128
As can be noticed, the reaction between CO2 and an amine-based solvent takes place because of the existence of the N–H bond. The NH group is thus the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N–H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and a higher mean effect. All these results demonstrate that the chemical reaction mechanism is consistent with the proposed model.
The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect, which is considerably smaller than that of the two other descriptors. The presence of the nH descriptor in the model can be explained by the experimental work of Singh et al. They showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is in line with the experimental work.
Finally, it should be noted that the simplicity of the model is attractive and the results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two
Fig. 2. Williams plot of GA-MLR model development.
Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine); line: regression line.
main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and just count the numbers of hydrogen atoms, primary amines and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 absorption (2). Therefore, from the industrial point of view, it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict the CO2 absorption capacity of linear amines than of unsaturated cyclic amines.
Fortunately, the results of the first equation (Eq. (1)) for predicting the capacity of amines for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items 5, 6 and 21). This is because of the presence of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g. about bond distances, ring types, planar and non-planar systems and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).
6. Conclusions
One of the main concerns of the natural gas industry is to have a robust and accurate model which can predict the chemical behavior of amines in the gas treatment process. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose which is not only robust and accurate but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second one is notably simpler and readily interpretable in terms of the chemistry of the reaction of amines with carbon dioxide. Consequently, the second equation was introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with the test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N–H bond active sites at which the amine reacts with CO2.
The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared to more conventional ones from the corrosivity, energy efficiency and operability points of view.
Acknowledgment
The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.
List of symbols
CO2         carbon dioxide
QSPR/QSAR   quantitative structure–property/activity relationship
DFT         density functional theory
MLR         multiple linear regression
GA          genetic algorithm
PCR         principal component regression
PLS         partial least squares
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      leave-one-out cross-validation
RMSE        root mean square error
df_M        degrees of freedom of the model
df_E        degrees of freedom of the error
References
Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.
Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.
Fig. 4. Mean effects of model descriptors (standardized coefficient values).
Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.
Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.
XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.
capacity for carbon dioxide absorption (rich loading). A significant contribution to analyzing the relationships between the structure and absorption capacity of amines has been made by Chakraborty et al. Their work showed that substituents at the α-carbon cause carbamate instability, which results in accelerated hydrolysis; as a result, the amount of bicarbonate increases, which leads to higher carbon dioxide loading (Chakraborty et al., 1986). In addition, Sartori and Savage explained that steric hindrance effects produced by the α-substituent are responsible for these instabilities (Sartori and Savage, 1983). Chakraborty also studied the electronic effects of substituents and suggested that substitution at the α-carbon atom causes an interaction of the π and π* methyl group orbitals with the lone pair of the nitrogen. Because this interaction reduces the charge on the nitrogen, it weakens the N–H bond, which increases hydrolysis in aqueous solution. It seems that steric hindrance can reduce the rate of the initial reaction; however, the number of amine molecules available to react with CO2 grows noticeably (Chakraborty et al., 1988). Furthermore, solvent screening experiments and an investigation of the effects of variables such as chain length, the number of functional groups and the position of side chains and functional groups have been conducted by Singh et al. They performed a semi-quantitative study of these effects on the capacity of amines for CO2 absorption (Singh et al., 2007, 2009). In addition, a computational study of the reactions between functionalized amines and CO2 was performed by Lee and Kitchin. They highlighted the molecular descriptors from which reactivity trends can be obtained. Their work revealed that electron-withdrawing and electron-donating groups tend to destabilize and stabilize CO2 reaction products, respectively (Lee and Kitchin, 2012). All of the results in this paper are based on mathematical calculations and model development. To the best of the authors' knowledge, this work is the first quantitative study of the capacity of amines for CO2 absorption based on a simple and robust model.
To achieve this goal, a close examination of the relationship between the chemical structure and the activity of different amine-based solutions is required. The Quantitative Structure–Property/Activity Relationship (QSPR/QSAR) approach provides an effective method for processing, analyzing and predicting the characteristics of different molecules (Beheshti et al., 2012, 2009; Freire et al., 2010; Liang et al., 2013; Godavarthy et al., 2006; Riahi, 2009, 2008; Riahi et al., 2008). The QSPR technique relates chemical or physical properties of compounds to their molecular structures. It is used to develop a quantitative correlation that can predict specific molecular properties, for example environmental functions or physico-chemical behaviors. The QSPR approach is based on the assumption that differences in the behavior of molecules can be correlated with variations in molecular features that are technically termed descriptors. The descriptors are numerical values that characterize the shape and structure of the molecule. To use the QSPR method, knowledge of the chemical structures of the molecules is sufficient, and there is no need to conduct experiments. QSPR requires a sequence of procedures; consequently, the following steps were taken (Fini et al., 2012):
1. A data set of molecules was taken from the literature with their corresponding absorption capacities.
2. The structural properties of the molecules were extracted and calculated using computer software.
3. The best model, containing an optimum number of descriptors, was selected by means of several alternative algorithms, for example genetic algorithm (GA) and MLR.
4. The selected model was validated using statistical tests and validation methods, for instance the leave-one-out cross-validation method.
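The validation in step 4 can be illustrated with a minimal sketch of leave-one-out cross-validation for an ordinary least-squares model. The function name `loo_q2`, the use of NumPy and the convention Q2 = 1 - PRESS/SS_tot (with SS_tot taken about the overall mean) are assumptions made for illustration; the paper does not state which software or exact formula was used to obtain Q2.

```python
import numpy as np


def loo_q2(X, y):
    """Leave-one-out cross-validated Q2 for a linear least-squares model.

    X: (n, p) descriptor matrix, y: (n,) response (e.g. rich loading).
    Assumed convention: Q2 = 1 - PRESS / SS_tot.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # leave molecule i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2  # squared prediction error for i
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_tot
```

Refitting the model n times is affordable for a data set of this size (23 molecules, three descriptors).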
In QSPR approaches, selecting the proper method for constructing a robust and precise model is very important. Multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are the most widely used methods in QSPR modeling (Katritzky et al., 2000; Marengo et al., 1992). Variable selection for building a well-fitted model is a further step. The genetic algorithm (GA) is one well-known method by which this task can be accomplished. This paper focuses on the development of a novel descriptive QSPR model by which the absorption capacity (or rich loading) of various amines used in industrial carbon capture units can be predicted. The quantitative relationship between the absorption capacity data and the calculated descriptors is obtained by multiple linear regression (MLR), and the model variables were selected by a genetic algorithm (GA) (Depczynski et al., 2000; Jouan-Rimbaud et al., 1995). The accuracy of the model was verified by different statistical methods, and the results confirmed the high statistical quality of the model. One of the main disadvantages of the QSPR technique is that, in most of the research conducted in this area, the final equation reported as the best model contains unfamiliar descriptors that are not only hard to calculate but also difficult or impossible to interpret. Fortunately, the equation reported in this paper consists of descriptors that are simple in terms of both calculation and interpretation. The model also demonstrates high statistical quality, which supports its predictive power and robustness.
2. Materials and methods
The absorption capacities (rich loading) of 23 amine-based solvents for carbon dioxide absorption (Table 1) were taken from the literature (Singh et al., 2007). First, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was employed to perform geometry optimization (Cramer, 2005; da Silva and Svendsen, 2004). These calculations were performed with the Gaussian software (Frisch et al., 1998). The input to Gaussian consisted of molecular structures pre-optimized with the semi-empirical AM1 geometry optimization method. This process yields a group of precise and applicable descriptors that represent the electronic and quantum chemical properties of the molecules. Quantum chemical descriptors include properties such as dipole moment, sum of the electronic and thermal free energies, atomic charges, HOMO energy (highest occupied molecular orbital energy), LUMO energy (lowest unoccupied molecular orbital energy), exact polarizability, etc. In total, 31 quantum chemical descriptors were calculated for each molecule.
Next, the geometrically optimized structure of each molecule was fed into the Dragon software developed by the Milano Chemometrics and QSAR Research Group (Todeschini et al., 2002). As a result, more than 1486 theoretical molecular descriptors were calculated for each molecule. These descriptors can be divided into different groups, for instance constitutional descriptors, topological descriptors, functional group counts, molecular properties, etc. Because such a large amount of numerical data makes further calculations imprecise and slow, the number of calculated descriptors was reduced by the following accepted procedure:
1. Constant and near-constant descriptors were eliminated (361 excluded).
2. For each pair of collinear descriptors (R > 0.98), the one with the better correlation with absorption capacity was retained and the other was eliminated (611 excluded).

Table 1
Experimental and calculated values of the capacity of amines for CO2 absorption (rich loading).
No.  Name                                       Exp.   Eq. (2) (Model)   Eq. (1)
1    1,2-Diaminopropane                         1.27   1.29              1.23
2    1,3-Diaminopropane                         1.30   1.29              1.25
3    1,4-Diaminobutane (T)                      1.26   1.37              1.29
4    2-Amino-1-butanol                          0.88   0.79              0.80
5    2-Methylpyridine                           0.06   0.09              0.08
6    2-Pyridylamine                             0.28   0.59              0.23
7    3-Amino-1-propanol                         0.88   0.71              0.72
8    4-Amino-1-butanol                          0.83   0.79              0.76
9    5-Amino-1-pentanol (T)                     0.84   0.87              0.85
10   Butylamine                                 0.86   0.79              0.84
11   Diethylenetriamine                         1.83   1.81              1.77
12   Ethylamine                                 0.91   0.63              0.82
13   Ethylenediamine                            1.08   1.21              1.20
14   Hexamethylenediamine                       1.48   1.53              1.46
15   Isobutylamine (T)                          0.78   0.79              0.82
16   Monoethanolamine                           0.72   0.63              0.61
17   N-(2-Hydroxyethyl)ethylenediamine          1.15   1.23              1.17
18   N,N'-Bis(2-hydroxyethyl)ethylenediamine    1.20   1.25              1.27
19   N-Pentylamine                              0.72   0.87              0.90
After applying the above constraints, a total of 514 Dragon descriptors remained for each molecule as the output of this stage.
Finally, the calculated descriptors (the 514 Dragon descriptors plus the 31 quantum chemical descriptors) formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.
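The two filtering steps described above can be sketched in a few lines of plain Python. This is an illustrative reading, not the authors' implementation: the function names, the range-based test for "near-constant" columns and its tolerance `var_tol` are assumptions, since the paper gives only the collinearity threshold (R > 0.98).

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def reduce_descriptors(columns, y, var_tol=1e-8, r_max=0.98):
    """Return the names of the descriptors that survive both filters.

    columns: dict mapping descriptor name -> list of values (one per molecule)
    y:       list of absorption capacities (must not be constant)
    """
    # Step 1: drop constant / near-constant descriptors (assumed range test).
    kept = {name: col for name, col in columns.items()
            if max(col) - min(col) > var_tol}
    # Step 2: visit descriptors from best to worst |correlation| with y, and
    # keep one only if it is not collinear with a descriptor already kept.
    # Within a collinear pair, the better-correlated member therefore wins.
    ranked = sorted(kept, key=lambda name: abs(pearson(kept[name], y)),
                    reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(kept[name], kept[s])) <= r_max for s in selected):
            selected.append(name)
    return selected
```

The greedy pass is one simple way to apply the pairwise rule; other orderings could keep a slightly different subset when several descriptors are mutually collinear.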
3. Model development
After descriptor calculation, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules of the training set:
$$\mathrm{AC} = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1}$$
AC denotes absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume; RDF035m belongs to the group of RDF descriptors and describes the radial distribution function at 3.5 weighted by mass; and nN represents the number of nitrogen atoms. As can be seen, two descriptors in the above model are difficult to calculate because the calculations must be performed by computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with simpler descriptors. In addition, from the chemical reaction of amines with carbon dioxide, it is concluded that the number of amino groups may affect the capacity of amines for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:
$$\mathrm{AC} = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH2} + 0.40\,\mathrm{nRNHR} \tag{2}$$
Table 2 lists some statistical parameters to allow a better comparison between the two models. The first equation has higher statistical parameters, but the simplicity of the descriptors of the second model, in both calculation and interpretation of results, is more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.
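As a small worked example, Eq. (2) can be evaluated directly from three integer counts. The sketch below reads the intercept as -0.19, which is the reading that reproduces the Eq. (2) column of Table 1. The function name is hypothetical, and the descriptor counts in the usage lines are taken from the molecular formulas of the amines (an assumption, since the paper does not tabulate them).

```python
def rich_loading_eq2(n_h, n_rnh2, n_rnhr):
    """Rich loading (mol CO2/mol amine) predicted by Eq. (2).

    n_h:    number of hydrogen atoms (nH)
    n_rnh2: number of primary aliphatic amine groups (nRNH2)
    n_rnhr: number of secondary aliphatic amine groups (nRNHR)
    """
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr


# Counts assumed from the molecular formulas:
# ethylenediamine C2H8N2 (two primary amines),
# triethylenetetramine C6H18N4 (two primary, two secondary amines),
# pyridine C5H5N (no primary or secondary amine).
examples = {
    "ethylenediamine": (8, 2, 0),
    "triethylenetetramine": (18, 2, 2),
    "pyridine": (5, 0, 0),
}
for name, counts in examples.items():
    print(name, round(rich_loading_eq2(*counts), 2))
```

For these three molecules the values agree with the Eq. (2) column of Table 1 (1.21, 2.41 and 0.01).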
Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation between each pair of descriptors is less than 0.65, which demonstrates that these descriptors are independent of each other and can be used to develop a QSPR model.
As can be observed, the three descriptors appearing in the model are easily calculated, so there is no need for computational calculation. Moreover, this model has high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.
4. Results
One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both. To take this into account, of the 23 amine-based carbon dioxide absorbents, 18 molecules (about 80% of the molecules) were selected to construct the training set and 5 molecules formed the test set (about 20%). The test set was used for external cross-validation of
Table 1 (continued)
No.  Name                                       Exp.   Eq. (2) (Model)   Eq. (1)
20   Propylamine (T)                            0.77   0.71              0.80
21   Pyridine (T)                               0.05   0.01              0.12
22   sec-Butylamine                             0.84   0.79              0.87
23   Triethylenetetramine                       2.51   2.41              2.47
All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.
Table 2
Some basic statistical values for the two models.
Model     Descriptors             R2      Q2      F        s
Eq. (1)   nN, Mor09v, RDF035m     0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR        0.950   0.945   121.54   0.127
All statistical parameters in this table were calculated before the training and test procedure.
Table 3
The three molecular descriptors used in Eq. (2).
Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)
the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. The results showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets were representative of the whole data set.
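The PCA step can be sketched with a singular value decomposition of the descriptor matrix. This is a minimal illustration under stated assumptions: the function name is hypothetical, NumPy is assumed, and autoscaling the descriptors before PCA is a common choice that the paper does not confirm. The small matrix in the usage lines holds (nH, nRNH2, nRNHR) counts taken from molecular formulas and is illustrative only, so it will not reproduce the 57.5% and 33.5% reported above.

```python
import numpy as np


def pca_explained_variance(X, standardize=True):
    """Return (explained-variance ratios, PC scores) for a descriptor matrix.

    X: (n_molecules, n_descriptors); every column must vary.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)  # center each descriptor
    if standardize:
        Z = Z / X.std(axis=0, ddof=1)  # autoscale (assumed preprocessing)
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)  # share of variance carried by each PC
    scores = Z @ vt.T  # coordinates of each molecule on the PCs
    return ratio, scores


# Illustrative (nH, nRNH2, nRNHR) rows for six amines from Table 1.
X_demo = [
    [7, 1, 0],    # ethylamine
    [8, 2, 0],    # ethylenediamine
    [13, 2, 1],   # diethylenetriamine
    [18, 2, 2],   # triethylenetetramine
    [5, 0, 0],    # pyridine
    [11, 1, 0],   # butylamine
]
ratio, scores = pca_explained_variance(X_demo)
print(ratio)          # variance fraction of PC1, PC2, PC3
print(scores[:, :2])  # PC1/PC2 coordinates, as plotted in a score plot
```

Plotting the first two score columns and picking test molecules that are spread across the plot is one way to obtain a structurally diverse split.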
The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the resulting models. Q2 (LOO) was calculated for each equation obtained, and the best model was then selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), adjusted R2, the coefficient of leave-one-out cross-validation (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be low, close to zero. Furthermore, the intercepts of the model should be close to zero. Moreover, the Fisher function (F) is another vital statistical test: high values of the F-ratio indicate reliable models. The formulas for all statistical parameters used in this paper are given below:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{calc}} - \bar{y}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where $\bar{y}$ is the mean of the experimental values, and $df_M$ and $df_E$ refer to the degrees of freedom of the model and of the error, respectively.
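The statistics defined in Eqs. (3)-(8) can be computed with a short plain-Python helper. The sketch below is illustrative: the function name is hypothetical, and taking df_M = p and df_E = n - p - 1 for a model with p descriptors and an intercept is an assumption, since the paper does not define the degrees of freedom explicitly.

```python
import math


def regression_stats(y_exp, y_calc, n_descriptors):
    """Compute R2, RMSE, F, k, k' and s from experimental and calculated values."""
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    ss_res = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))  # residual SS
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)           # total SS
    ss_model = sum((c - mean_exp) ** 2 for c in y_calc)        # model SS
    cross = sum(e * c for e, c in zip(y_exp, y_calc))
    df_m = n_descriptors          # assumed: one degree of freedom per descriptor
    df_e = n - n_descriptors - 1  # assumed: intercept included in the model
    return {
        "R2": 1.0 - ss_res / ss_tot,                      # Eq. (3)
        "RMSE": math.sqrt(ss_res / n),                    # Eq. (4)
        "F": (ss_model / df_m) / (ss_res / df_e),         # Eq. (5)
        "k": cross / sum(c * c for c in y_calc),          # Eq. (6)
        "k_prime": cross / sum(e * e for e in y_exp),     # Eq. (7)
        "s": math.sqrt(ss_res / df_e),                    # Eq. (8)
    }


# Toy values, not data from the paper.
stats = regression_stats([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], 1)
print(stats)
```

With these definitions, s and RMSE differ only by the factor sqrt(n / df_E), which is why s is always the larger of the two.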
In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):
1. $Q^2 > 0.5$ (9)
2. $R^2 > 0.6$ (10)
3. $\left(R^2 - R_0^2\right)/R^2 < 0.1$ and $0.85 \le k \le 1.15$ (11)
where $R_0^2$ is the coefficient of determination of the linear regression with the Y-intercept set to zero. The predicted results for all molecules in the training and test sets, together with the statistical parameters, are given in Table 5.
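The Golbraikh and Tropsha criteria amount to four threshold checks (Q2 > 0.5, R2 > 0.6, (R2 - R0^2)/R2 < 0.1 and 0.85 <= k <= 1.15), which can be sketched as below. The function name is hypothetical. In the usage line, Q2 and R2 are the Eq. (2) values from Table 2 and k is the overall value from Table 5, while the R0^2 value is a made-up placeholder because the paper does not report it.

```python
def passes_golbraikh_tropsha(q2, r2, r2_0, k):
    """Check the acceptance criteria of Eqs. (9)-(11).

    q2:   leave-one-out cross-validated coefficient
    r2:   coefficient of determination
    r2_0: coefficient of determination with the intercept forced to zero
    k:    slope of the regression line through the origin
    """
    return (
        q2 > 0.5
        and r2 > 0.6
        and (r2 - r2_0) / r2 < 0.1
        and 0.85 <= k <= 1.15
    )


# r2_0 = 0.949 is a hypothetical placeholder, not a value from the paper.
print(passes_golbraikh_tropsha(q2=0.945, r2=0.950, r2_0=0.949, k=0.999))
```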
Table 4
Correlation matrix of the three descriptors used in Eq. (2).
Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000
Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.
A Y-scrambling test (Table 6) was applied to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines, and a new QSPR model is built from the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R2 and Q2 values. If this is not the case, the reported model is not reliable for the particular data set and modeling method.
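A Y-scrambling run can be sketched as repeated least-squares fits on a shuffled response vector. This is a minimal illustration, not the authors' procedure: the function names, the use of NumPy, the fixed random seed and the default of ten iterations (matching the number of rows in Table 6) are assumptions.

```python
import numpy as np


def r2_fit(X, y):
    """R2 of an ordinary least-squares fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def y_scrambling(X, y, n_iter=10, seed=0):
    """Refit the model n_iter times on randomly permuted responses.

    Returns the list of R2 values; for a robust model these should be
    much lower than the R2 obtained with the true responses.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    return [r2_fit(X, rng.permutation(y)) for _ in range(n_iter)]
```

The descriptor matrix is left untouched in every iteration; only the pairing between molecules and responses is broken.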
The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage measures the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):
$$h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12}$$
where $x_i$ is the descriptor vector of the compound in question. The warning leverage ($h^{*}$) is defined as (Eriksson et al., 2003):
$$h^{*} = \frac{3(p+1)}{n} \tag{13}$$
where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with $h_i > h^{*}$ influences the regression line, but it is not necessarily an outlier, because its standardized residual may be small. In this data set, the warning leverage is around 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (>3s) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of a model.
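The leverage and warning-leverage formulas, h_i = x_i^T (X^T X)^-1 x_i and h* = 3(p + 1)/n, translate into a few lines of NumPy. The sketch below follows those formulas literally, using the training descriptor matrix without an added intercept column; the function names are hypothetical, and whether the descriptors were centered or scaled beforehand is not stated in the paper.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^-1 x_i for each row of X_query.

    X_train: (n, p) descriptor matrix of the training set
    X_query: (m, p) descriptor rows of the compounds to evaluate
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    xtx_inv = np.linalg.inv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T (X^T X)^-1 x_i
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def warning_leverage(n_train, n_descriptors):
    """Warning leverage h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


# 18 training molecules and 3 descriptors, as in this study.
print(round(warning_leverage(18, 3), 2))
```

For n = 18 and p = 3 the warning leverage is 12/18, which is the value of about 0.67 quoted in the text.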
Fig. 2 demonstrates that no chemical has a leverage higher than the warning value h* of 0.67. It also shows that there are no outliers in the training or test sets, and all compounds lie between the two horizontal lines.
The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.
Furthermore, the mean effect (MF) is another quantity that helps to interpret the results and shows the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013). The figure is used to compare the relative weights of the descriptors: the higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. This figure thus demonstrates the mean effect of the descriptors in the model.
From this figure, it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the main roles, in that order, in the capacity of amines for carbon dioxide absorption, and the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption increases with each of these descriptors.
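Standardized (beta) coefficients are commonly obtained by rescaling each regression coefficient by the ratio of the standard deviation of its descriptor to that of the response. The sketch below uses that textbook definition; the function name is hypothetical, and it is an assumption that XLSTAT computes the values shown in Fig. 4 in exactly this way.

```python
import statistics


def standardized_coefficients(coefs, columns, y):
    """Beta coefficients: b_j * sd(x_j) / sd(y) for each descriptor j.

    coefs:   dict descriptor name -> raw regression coefficient
    columns: dict descriptor name -> list of descriptor values
    y:       list of response values (e.g. rich loading)
    """
    sd_y = statistics.stdev(y)
    return {
        name: coefs[name] * statistics.stdev(columns[name]) / sd_y
        for name in coefs
    }


# Toy one-descriptor example (not data from the paper): y = 2 x exactly,
# so the standardized coefficient is 1.
print(standardized_coefficients({"x": 2.0}, {"x": [1, 2, 3, 4]}, [2, 4, 6, 8]))
```

Because the rescaling removes the units of each descriptor, the resulting values can be compared directly, which a comparison of the raw coefficients in Eq. (2) does not allow.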
Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacity of amines for carbon dioxide absorption can be predicted. In fact, the predominant difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.
5. Discussion
Although high statistical parameters are significant in demonstrating the capability of the model, a QSPR model should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to determine which parameters affect the capacity of amines and which descriptors could appear in the model on the basis of the principal chemical reactions between carbon dioxide and an amine-based solvent.
The overall reaction mechanism for chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism for this reaction that involves the formation of a zwitterion intermediate, which is then deprotonated by a base B, through reactions (1) and (2) below, was suggested by Caplow (1968):
$$\mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1}$$
$$\mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2}$$
R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in the reaction is a zwitterion. However, more recent studies showed that the zwitterion seems to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism for these reactions (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):
$$\mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3}$$
where B is again the base molecule. In this mechanism, the N–H group is attacked by the base molecule and deprotonation of the amine occurs. The bonding between the amine and carbon dioxide takes place simultaneously.
Table 6
R2 (train) values after several Y-scrambling tests

Iteration   R2 train
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model

          n    R2     R2adj  RMSE   F       k      k'     s
Train     18   0.942  0.930  0.127  76.50   1.004  0.984  0.144
Test      5    0.976  0.904  0.060  17.31   0.962  1.035  0.135
Overall   23   0.950  0.942  0.116  123.79  0.999  0.990  0.128
M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 447
As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the N-H bond. The NH group is thus the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N-H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors in the model. According to Fig. 4, these two descriptors have positive effects and the highest mean effects. These results show that the chemical reaction mechanism is consistent with the proposed model.

The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that nH has a positive effect, which is considerably smaller than that of the other two descriptors. The presence of nH in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in the capacity of the amine for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect is to be expected in light of this experimental work.
Finally, the simplicity of the model is attractive, and the results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups; they only count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity and are not promising absorbents for CO2 (2). Therefore, from an industrial point of view, linear amines are preferable for CO2 absorption, and it is more important for the model to predict the CO2 absorption capacity of linear amines than that of unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine) - regression line.
Fortunately, the results of the first equation (Eq. (1)) for predicting the CO2 absorption capacity of amines are largely acceptable for both linear and aromatic ring-type amines (items 5, 6 and 21). This is because of the presence of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information about, e.g., bond distances, ring types, planar and non-planar systems and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).
6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempts to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. The QSPR approach was therefore chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the reaction of amines with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N-H bonds, the active sites at which the reaction of the amine with CO2 takes place.

The promising results of this study might help other researchers in chemistry and natural gas engineering to design and synthesize new amine-based solvents and to investigate the feasibility of using them in gas removal processes. New, improved solvents should also be compared with more conventional ones in terms of corrosivity, energy efficiency and operability.
Acknowledgment

The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.
List of symbols

CO2         carbon dioxide
QSPR/QSAR   quantitative structure property/activity relationship
DFT         density functional theory
MLR         multiple linear regression
GA          genetic algorithms
PCR         principal component regression
PLS         partial least squares
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      leave-one-out cross-validation
RMSE        root mean square error
df_M        degrees of freedom of the model
df_E        degrees of freedom of the error
Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.
Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.
Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.
Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.
XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.
Table 1
The structure, experimental and calculated values of amine capacities for CO2 absorption (rich loading)

No.  Name                                      Exp.   Eq. (2) (Model)  Eq. (1)
1    1,2-Diaminopropane                        1.27   1.29             1.23
2    1,3-Diaminopropane                        1.30   1.29             1.25
3    1,4-Diaminobutane (T)                     1.26   1.37             1.29
4    2-Amino-1-butanol                         0.88   0.79             0.80
5    2-Methylpyridine                          0.06   0.09             0.08
6    2-Pyridylamine                            0.28   0.59             0.23
7    3-Amino-1-propanol                        0.88   0.71             0.72
8    4-Amino-1-butanol                         0.83   0.79             0.76
9    5-Amino-1-pentanol (T)                    0.84   0.87             0.85
10   Butylamine                                0.86   0.79             0.84
11   Diethylenetriamine                        1.83   1.81             1.77
12   Ethylamine                                0.91   0.63             0.82
13   Ethylenediamine                           1.08   1.21             1.20
14   Hexamethylenediamine                      1.48   1.53             1.46
15   Isobutylamine (T)                         0.78   0.79             0.82
16   Monoethanolamine                          0.72   0.63             0.61
17   N-(2-Hydroxyethyl)ethylenediamine         1.15   1.23             1.17
18   N,N'-Bis(2-hydroxyethyl)ethylenediamine   1.20   1.25             1.27
19   N-Pentylamine                             0.72   0.87             0.90
relation with absorption capacity was saved and other descriptors were eliminated (611 excluded). After applying the above constraints, a total of 514 descriptors were selected for each molecule as the output of this stage.

Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.
3. Model development

After the descriptors were calculated, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules of the training set:
$$\mathrm{AC} = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1}$$
AC stands for absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume; RDF035m belongs to the group of RDF descriptors and describes the radial distribution function 035 weighted by mass; and nN represents the number of nitrogen atoms. Two of the descriptors in the above model are difficult to calculate, because the calculations must be performed by computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with new, simple descriptors. In addition, given the chemical reaction of amines with carbon dioxide, the number of amino groups can be expected to affect the capacity of amines for carbon dioxide absorption. The chemistry of the reactions of carbon dioxide with amine-based solvents is presented in the Discussion section. After numerous simple equations had been developed and evaluated with different statistical methods, the following model was selected:
$$\mathrm{AC} = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH_2} + 0.40\,\mathrm{nRNHR} \tag{2}$$
Table 2 lists some statistical factors to allow a better comparison between the two models. The first equation shows higher statistical parameters, but the simpler descriptors of the second model, in both calculation and interpretation of results, are more important. We therefore introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.

The molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation value for each pair of descriptors is less than 0.65, which shows that these descriptors are independent of each other and can be used to develop a QSPR model.

As can be observed, the three descriptors in the model are easily calculated, so there is no need for computational calculation. Moreover, this model shows high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.
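
Because Eq. (2) needs only three atom and group counts, it can be evaluated by hand or in a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes an intercept of -0.19 (the sign that reproduces the Eq. (2) column of Table 1), and the descriptor counts in the dictionary were derived by hand from the molecular formulas, so they are illustrative assumptions rather than DRAGON output.

```python
# Coefficients of Eq. (2). The negative intercept is an assumption chosen
# because it reproduces the Eq. (2) column of Table 1.
INTERCEPT = -0.19
COEF_NH = 0.04     # per hydrogen atom
COEF_RNH2 = 0.54   # per primary amine group
COEF_RNHR = 0.40   # per secondary amine group


def rich_loading(n_h, n_rnh2, n_rnhr):
    """Predicted absorption capacity (mol CO2 / mol amine) from Eq. (2)."""
    return INTERCEPT + COEF_NH * n_h + COEF_RNH2 * n_rnh2 + COEF_RNHR * n_rnhr


# (nH, nRNH2, nRNHR) counted by hand from the molecular formulas (assumption).
amines = {
    "Ethylamine": (7, 1, 0),             # C2H7N
    "Ethylenediamine": (8, 2, 0),        # C2H8N2
    "Diethylenetriamine": (13, 2, 1),    # C4H13N3
    "Triethylenetetramine": (18, 2, 2),  # C6H18N4
}

for name, counts in amines.items():
    print(f"{name}: {rich_loading(*counts):.2f}")
```

With these counts the sketch gives 0.63, 1.21, 1.81 and 2.41, which agree with the Eq. (2) values listed for the same four amines in Table 1.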
4. Results

One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both. Accordingly, of the 23 amine-based carbon dioxide absorbents, 18 molecules (about 80%) were selected to construct the training set and 5 molecules (about 20%) formed the test set. The test set was used for external cross-validation of the model. One common technique in QSPR for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. These two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.

Table 1 (continued)

No.  Name                    Exp.   Eq. (2) (Model)  Eq. (1)
20   Propylamine (T)         0.77   0.71             0.80
21   Pyridine (T)            0.05   0.01             0.12
22   sec-Butylamine          0.84   0.79             0.87
23   Triethylenetetramine    2.51   2.41             2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Table 2
Some basic statistical values for the two models

Models    Descriptors            R2     Q2     F       s
Eq. (1)   nN, Mor09v, RDF035m    0.979  0.971  300.96  0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950  0.945  121.54  0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2)

Descriptor  Type                     Definition
nH          Constitutional indices   Number of hydrogen atoms
nRNH2       Functional group counts  Number of primary amines (aliphatic)
nRNHR       Functional group counts  Number of secondary amines (aliphatic)
The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q2LOO was calculated for each equation obtained, and the best model was then selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), adjusted R2, the coefficient of leave-one-out cross-validation (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be low, close to zero. Furthermore, the intercept of the model should be close to zero. The Fisher function (F) is another vital statistical test; high values of the F-ratio indicate reliable models. The formulas for the statistical parameters used in this paper are given below.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where $\bar{y}^{\mathrm{exp}}$ is the mean experimental value, and $df_M$ and $df_E$ refer to the degrees of freedom of the model and error, respectively.
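
The statistics in Eqs. (3)-(8) can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code: the function name is hypothetical, and the degrees of freedom are assumed to be df_M = p and df_E = n - p - 1 for a model with p descriptors. The example data are the five test-set rows of Table 1 (experimental values and Eq. (2) predictions).

```python
import math


def fit_statistics(y_exp, y_calc, n_descriptors):
    """Statistics of Eqs. (3)-(8) for paired experimental/calculated values.

    Assumes df_M = p and df_E = n - p - 1 (p = number of descriptors).
    """
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    ss_res = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))  # residual SS
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)           # total SS
    ss_reg = sum((mean_exp - c) ** 2 for c in y_calc)          # model SS
    df_m = n_descriptors
    df_e = n - n_descriptors - 1
    cross = sum(e * c for e, c in zip(y_exp, y_calc))
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "F": (ss_reg / df_m) / (ss_res / df_e),
        "k": cross / sum(c * c for c in y_calc),
        "k_prime": cross / sum(e * e for e in y_exp),
        "s": math.sqrt(ss_res / df_e),
    }


# Test-set molecules 3, 9, 15, 20 and 21 from Table 1.
y_exp = [1.26, 0.84, 0.78, 0.77, 0.05]
y_calc = [1.37, 0.87, 0.79, 0.71, 0.01]

test_stats = fit_statistics(y_exp, y_calc, n_descriptors=3)
for name, value in test_stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the printed values agree, to rounding, with the Test row of Table 5 (R2 0.976, RMSE 0.060, F 17.31, k 0.962, k' 1.035, s 0.135), which also supports the reading of the degrees of freedom used here.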
In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):

1. $$Q^2 > 0.5 \tag{9}$$

2. $$R^2 > 0.6 \tag{10}$$

3. $$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11}$$

where $R_0^2$ is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The results for all molecules, in both the training and test sets, together with the statistical parameters, are given in Table 5.
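
The three acceptance criteria (9)-(11) amount to a simple conjunction of threshold checks. The sketch below uses a hypothetical function name. In the usage line, Q2 and R2 are the Eq. (2) values from Table 2 and k is the overall slope from Table 5, while the R0^2 value is a made-up placeholder, since the paper does not report it.

```python
def passes_golbraikh_tropsha(q2, r2, r2_0, k):
    """Return True when criteria (9), (10) and (11) are all satisfied."""
    return (
        q2 > 0.5                      # criterion (9)
        and r2 > 0.6                  # criterion (10)
        and (r2 - r2_0) / r2 < 0.1    # criterion (11), first part
        and 0.85 <= k <= 1.15         # criterion (11), second part
    )


# r2_0 = 0.949 is a hypothetical placeholder, not a reported value.
model_ok = passes_golbraikh_tropsha(q2=0.945, r2=0.950, r2_0=0.949, k=0.999)
print(model_ok)
```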
Table 4
Correlation matrix of the three descriptors used in Eq. (2)

Descriptor  nH     nRNH2  nRNHR
nH          1.000  0.344  0.642
nRNH2       0.344  1.000  0.005
nRNHR       0.642  0.005  1.000

Fig. 1. The principal component analysis of the molecules in training and test sets. Some points belong to more than one molecule.
A Y-scrambling test (Table 6) was applied to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines and new QSPR modeling is performed on the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R2 and Q2 values. If they do not, the reported model is not reliable for the particular data set and method of modeling.
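
The Y-scrambling procedure can be sketched as below. This is an illustration under stated assumptions, not the authors' workflow: ordinary least squares with an intercept is solved through the normal equations with a small hand-written Gaussian elimination, the function names are hypothetical, and the data are a 13-amine subset of Table 1 whose (nH, nRNH2, nRNHR) counts were derived by hand from the molecular formulas.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_r2(X, y):
    """R2 of an ordinary least-squares fit of y on X (intercept included)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def y_scramble(X, y, n_iter=10, seed=0):
    """Refit the model n_iter times with randomly permuted responses."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        shuffled = list(y)
        rng.shuffle(shuffled)
        results.append(fit_r2(X, shuffled))
    return results


# (nH, nRNH2, nRNHR) counted by hand from molecular formulas (assumption);
# responses are the experimental rich loadings of the same amines in Table 1.
X = [(7, 1, 0), (9, 1, 0), (11, 1, 0), (7, 1, 0), (8, 2, 0), (12, 2, 0),
     (16, 2, 0), (13, 2, 1), (18, 2, 2), (12, 1, 1), (16, 0, 2), (5, 0, 0),
     (7, 0, 0)]
y = [0.91, 0.77, 0.86, 0.72, 1.08, 1.26, 1.48, 1.83, 2.51, 1.15, 1.20, 0.05,
     0.06]

true_r2 = fit_r2(X, y)
scrambled = y_scramble(X, y)
print(f"R2 with true responses: {true_r2:.3f}")
print("R2 with scrambled responses:", [round(v, 3) for v in scrambled])
```

A robust model should show a clear gap between the R2 obtained with the true responses and the R2 values of the scrambled fits, which is the pattern reported in Table 6.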
The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage indicates the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

$$h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12}$$

where $x_i$ is the descriptor vector of the compound in question. The warning leverage ($h^{*}$) is defined as (Eriksson et al., 2003):

$$h^{*} = \frac{3(p+1)}{n} \tag{13}$$

where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with $h_i > h^{*}$ influences the regression line, but it is not necessarily an outlier, as its standardized residual may be small. In this data set, the warning leverage is around 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (>3s) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of the model.
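
Equations (12) and (13) can be sketched in plain Python as follows. The function names are hypothetical, and the example makes two assumptions: a constant (intercept) column is prepended to each descriptor vector, and the descriptor counts are the same hand-derived 13-amine subset of Table 1 used for illustration, not the authors' training set. Each leverage is obtained by solving (X^T X) z = x_i rather than forming the inverse explicitly.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def leverages(X):
    """Leverage h_i = x_i^T (X^T X)^-1 x_i for every row of X (Eq. (12))."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    return [sum(a * z for a, z in zip(row, solve(xtx, list(row)))) for row in X]


def warning_leverage(n, p):
    """Warning leverage h* = 3(p + 1) / n (Eq. (13))."""
    return 3 * (p + 1) / n


# (nH, nRNH2, nRNHR) counted by hand from molecular formulas (assumption).
descriptors = [(7, 1, 0), (9, 1, 0), (11, 1, 0), (7, 1, 0), (8, 2, 0),
               (12, 2, 0), (16, 2, 0), (13, 2, 1), (18, 2, 2), (12, 1, 1),
               (16, 0, 2), (5, 0, 0), (7, 0, 0)]
# Prepending a constant column for the intercept is an assumption.
X = [[1.0] + [float(v) for v in d] for d in descriptors]

h = leverages(X)
h_star = warning_leverage(n=len(X), p=3)
influential = [i for i, hi in enumerate(h) if hi > h_star]
print("leverages:", [round(v, 3) for v in h])
print("warning leverage:", round(h_star, 3), "influential rows:", influential)
```

For the 18 training molecules and three descriptors of the paper, `warning_leverage(18, 3)` gives about 0.67, matching the value quoted in the text.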
Fig. 2 shows that no chemical has a leverage higher than the warning h* value of 0.67. It also shows that there is no outlier in the training or test sets and that all compounds lie between the two horizontal lines.

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.

Furthermore, the mean effect (MF) is another quantity that helps to interpret the results and shows the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013). The figure is used to compare the relative weights of the descriptors: the higher the standardized coefficient of a descriptor, the more important the weight of the corresponding variable in the model. This figure thus shows the mean effect of the descriptors in the model.

From this figure it can be concluded that the numbers of primary amines (nRNH2) and secondary amines (nRNHR) play the main role, in that order, in the capacity of amines for carbon dioxide absorption, and the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption is directly related to each of them.
At the last part of this section it should be noticed that the
present work focuses exclusively on developing a simple model by
which amines capacities for carbon dioxide absorption can be
predicted In fact the predominant difference between this study
and the previous ones is that this work concentrates 1047297rstly on
quantitative and then on qualitative representation of structural
effects on the capacity of amines for CO2 absorption
5 Discussion
Although high statistical parameters are signi1047297cant in demon-
strating the capability of the model QSPR should provide powerful
insight for the mechanism of carbon dioxide solubility in amine
based solvent For this reason an acceptable interpretation of de-
scriptors in the QSPR model should be provided It is better to di-
agnose which parameters affect the amines capacity and which
descriptors could appear in the model due to principal chemical
reactions between carbon dioxide and an amine-based solvent
The overall reaction mechanism for chemical absorption of CO2
in amine solvent systems is still under debate A mechanism for this
reaction which supports the formation of zwitterion intermediate
theory and by proton-remover base B through reactions (1) and (2)
below suggested by Caplow (1968)
CO2 thorn R 1R 2NH4 R 1R 2NHthorn COO (1)
R 1R 2NHthorn COO thorn B4 R 1R 2COO thorn BHthorn (2)
R 1 and R 2 demonstrate substituted group attached to amine
group B is a base molecule which can be a water molecule The
intermediate in the reaction is zwitterion But more recent studies
showed zwitterion seemed to be short-lived and may be an entirely
transient state (da Silva and Svendsen 2004) It led to the
assumption of the single-step mechanism of these reactions (re-
action (3)) A termolecular single step mechanism suggested by
Crooks and Donnellan (1989)
B thorn CO2 thorn R 1R 2NH4 R 1R 2NCOO thorn BHthorn (3)
where B is again the base molecule In this mechanism NH group is
attacked by base molecule and deprotonation of amine occurs The
bonding between amine and carbon dioxide also takes place
simultaneously
Table 6
R2 train values after several Y-scrambling tests
Iteration R2 train
1 0060
2 0074
3 0119
4 0027
5 0188
6 0102
7 0119
8 0209
9 0096
10 0039
Table 5
Validation parameters and statistical result of GA-MLR model
n R2 R2adj RMSE F k k0 s
Train 18 0942 0930 0127 7650 1004 0984 0144
Test 5 0976 0904 0060 1731 0962 1035 0135
Overall 23 0950 0942 0116 12379 0999 0990 0128
M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 447
7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas
httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 79
As can be noticed the reaction between CO2 and amine based
solvent takes place because of the existence of NH bond So NH
group is an active site of the amine molecule where base molecule
(water) undergoes a chemical termolecular reaction Consequently
the amount of NH bonds or in other words the number of primary
and secondary amine groups in the amine molecule plays an
important role in the capacity of amines for CO2 absorption
Number of primary (nRNH2) and secondary (nRNHR) amines is two
main descriptors appearing in the model According to Fig 4 these
two descriptors have a positive effect and a higher mean effect All
these results demonstrate that the chemical reaction mechanism
coordinates with the proposed model
The model also contains number of Hydrogen atoms (nH) as
another descriptor Fig 4 shows nH descriptor has a positive effect
which is considerably less than two other descriptors The reason of
nH descriptor presence in the model can be explained by the result
of experimental work performed by Singh et al They showed that
an increase in the chain length between amines and other func-
tional groups in the amine structure result in an increase in amine
capacity for CO2 absorption (Singh et al 2007) Increasing with
chain length results in increasing numbers of hydrogen atoms so
apparently it seems it should have a positive effect due to the
experimental work
At last it should be noted that the simplicity of the model is
interesting and the results are quite acceptable for predicting
amines capacity for carbon dioxide absorption Although the ac-
curacy of the model is good for linear amine compounds it is not
better for unsaturated cyclic amines This can be explained by two
Fig 2 Williams plot of GA-MLR model development
Fig 3 Experimental vs predicted rich loading values (mol CO 2mol amine) e
regression line
M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450448
7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas
httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 89
main reasons First the three descriptors in model are not sensitive
to ring type functional group and just count the number of
hydrogen atoms primary and secondary amines Second unsatu-
rated cyclic amines show poor absorption rate and capacity and
they are not potential absorbents for CO2 absorption (2) Therefore
according to the industrial point of view it is preferable to use
linear amine for CO2 absorption and it is more important for the
model to predict CO2 absorption capacity for linear amines rather
than unsaturated cyclic amines
Fortunately, the results of the first equation (Eq. (1)) for predicting amine capacity for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the presence of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g. about bond distances, ring types, planar and non-planar systems, and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).
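As a rough illustration of why an RDF-type descriptor is unaffected by translation and rotation, the sketch below evaluates a radial distribution function of the commonly used form g(r) = sum over atom pairs of w_i w_j exp(-B (r - r_ij)^2) on 3D coordinates. The function name, the smoothing parameter B, the unscaled mass weights and the toy geometry are assumptions for illustration only; they are not the DRAGON settings used in the paper.

```python
import numpy as np

def rdf_value(coords, weights, r, b=100.0):
    """Radial distribution function at distance r (illustrative sketch).

    coords  : (n_atoms, 3) Cartesian coordinates
    weights : (n_atoms,) atomic property such as mass (assumed unscaled)
    b       : smoothing parameter (assumed value, not taken from the paper)
    """
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = 0.0
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r_ij = np.linalg.norm(coords[i] - coords[j])
            total += weights[i] * weights[j] * np.exp(-b * (r - r_ij) ** 2)
    return total

# Toy three-atom geometry (hypothetical, not an optimized structure).
xyz = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 3.5, 0.0]])
mass = np.array([12.0, 14.0, 12.0])

# Rotate by 40 degrees about z, then translate: pair distances do not change.
t = np.deg2rad(40.0)
rot = np.array([[np.cos(t), -np.sin(t), 0.0],
                [np.sin(t),  np.cos(t), 0.0],
                [0.0,        0.0,       1.0]])
xyz_moved = xyz @ rot.T + np.array([2.0, -1.0, 0.5])

g_original = rdf_value(xyz, mass, r=3.5)
g_moved = rdf_value(xyz_moved, mass, r=3.5)
```

Because the function depends only on interatomic distances, `g_original` and `g_moved` agree to numerical precision, which is the invariance property the text relies on.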
6. Conclusions
One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. The QSPR approach was therefore chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the amine reaction with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of NH bonds, the active sites at which the amine reaction with CO2 takes place.
The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New, improved solvents should also be compared with more conventional ones from the corrosivity, energy efficiency and operability points of view.
Acknowledgment
The authors would like to gratefully acknowledge the support from the Institute of Petroleum Engineering (IPE), University of Tehran.
List of symbols
CO2          carbon dioxide
QSPR/QSAR    quantitative structure-property/activity relationship
DFT          density functional theory
MLR          multiple linear regression
GA           genetic algorithms
PCR          principal component regression
PLS          partial least squares
HOMO         highest occupied molecular orbital
LUMO         lowest unoccupied molecular orbital
AC           absorption capacity
PCA          principal component analysis
LOO-CV       leave-one-out cross-validation
RMSE         root mean square error
df M         degrees of freedom of the model
df E         degrees of freedom of the error
References
Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure-property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368-5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811-4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106-111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795-6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947-6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997-1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331-333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413-3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217-227.
Eriksson, Lennart, Joanna, Jaworska, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

Fini, Mojtaba Fallah, Riahi, Siavash, Alireza, Bahramian, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane-water interfacial tension. J. Surfact. Deterg. 15 (4), 477-484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234-240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC-QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39-51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269-276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158-2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295-4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101-109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609-13618.
Liang, Guijie, Jie, Xu, Li, Liu, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15-21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225-233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Romualdo, Benigni, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. ATLA 33, 155-173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585-598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13-19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853-859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027-4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575-584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239-249.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5-10.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135-144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69-77.
XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.
relation with absorption capacity was saved and other descriptors were eliminated (611 excluded).
After the above constraints, a total of 514 descriptors were selected for each molecule as the output of this stage.
Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.
3. Model development
After descriptor calculation, GA-MLR was applied as the variable selection and model development procedure to obtain the best model, with the highest predictive power, based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules in the training set:
$$AC = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \qquad (1)$$
AC stands for absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume. RDF035m belongs to the group of RDF descriptors and describes the radial distribution function 035 weighted by mass, and nN represents the number of nitrogen atoms. As can be noticed, two of the descriptors in the above model are difficult to calculate, because their calculation requires a computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with simpler descriptors. In addition, given the chemical reaction of amines with carbon dioxide, the number of amino groups may be expected to affect the capacity of amines for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:
$$AC = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH2} + 0.40\,\mathrm{nRNHR} \qquad (2)$$
Table 2 lists some statistical parameters to allow a better comparison between the two models. The first equation shows higher statistical parameters, but the simpler descriptors of the second model, in both calculation and interpretation of results, are more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.
Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation coefficient for each pair of descriptors is less than 0.65, which indicates that these descriptors are sufficiently independent of each other to be used in developing a QSPR model.
As can be observed, the three descriptors appearing in the model are easily calculated, and thus there is no need for computational calculation. Moreover, this model demonstrates high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.
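Because Eq. (2) only needs three counts, it can be evaluated by hand or in a few lines of code. The sketch below takes the intercept as -0.19, which is the sign that reproduces the Eq. (2) column of Table 1 for compounds 20 to 23; the function and dictionary names are illustrative, and the descriptor counts are read from the molecular formulas rather than taken from the paper's data files.

```python
def predict_rich_loading(n_h, n_rnh2, n_rnhr):
    """Eq. (2): predicted rich loading in mol CO2 / mol amine."""
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# (nH, nRNH2, nRNHR) counted from the molecular formulas.
amines = {
    "propylamine": (9, 1, 0),            # C3H9N, one primary amine
    "pyridine": (5, 0, 0),               # C5H5N, no aliphatic amine group
    "sec-butylamine": (11, 1, 0),        # C4H11N, one primary amine
    "triethylenetetramine": (18, 2, 2),  # C6H18N4, two primary, two secondary
}

predictions = {
    name: round(predict_rich_loading(*counts), 2)
    for name, counts in amines.items()
}
```

The resulting values (0.71, 0.01, 0.79 and 2.41) match the Eq. (2) predictions listed in Table 1 for these four compounds.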
4. Results
One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both of them. To take this into account, from the total of 23 amine-based carbon dioxide absorbents, 18 molecules (about 80% of the molecules) were selected to construct the training set and 5 molecules formed the test set (about 20%). The test set was used for external validation of the model.

Table 1 (continued)
No.  Name                   Exp.   Eq. (2) (Model)   Eq. (1)
20   Propylamine (T)        0.77   0.71              0.80
21   Pyridine (T)           0.05   0.01              0.12
22   sec-Butylamine         0.84   0.79              0.87
23   Triethylenetetramine   2.51   2.41              2.47
All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Table 2
Some basic statistical values for the two models
Model     Descriptors            R2      Q2      F        s
Eq. (1)   nN, Mor09v, RDF035m    0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950   0.945   121.54   0.127
All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2)
Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)

One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to classify the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. These two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data on PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.
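The PCA step described above can be sketched with a singular value decomposition of the descriptor matrix. The autoscaling of the columns, the function name and the four-row matrix (compounds 20 to 23 only, with counts read from their molecular formulas) are assumptions for illustration; the paper does not state its preprocessing, and the full 23-compound matrix is not reproduced here.

```python
import numpy as np

def pca_scores(x, n_components=2):
    """Return principal component scores and explained-variance fractions."""
    x = np.asarray(x, dtype=float)
    # Autoscaling (zero mean, unit variance) is an assumption, not stated in the paper.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Columns: nH, nRNH2, nRNHR for compounds 20-23 (a small subset for illustration).
descriptors = np.array([
    [9, 1, 0],
    [5, 0, 0],
    [11, 1, 0],
    [18, 2, 2],
], dtype=float)

scores, explained = pca_scores(descriptors)
```

Plotting the two columns of `scores` against each other gives the kind of PC1 vs. PC2 map shown in Fig. 1, from which training and test compounds can be chosen so that both cover the same region.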
The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the resulting models. Q2 (LOO) was calculated for each candidate equation, and the best model was selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), the adjusted R2, the leave-one-out cross-validation coefficient (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be low, close to zero. Furthermore, the intercept of the model should be close to zero. The Fisher function (F) is another vital statistical test; high values of the F-ratio indicate reliable models. The formulas for the statistical parameters used in this paper are given below.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}\right)^2} \qquad (3)$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \qquad (4)$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \qquad (5)$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \qquad (6)$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \qquad (7)$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \qquad (8)$$

where $df_M$ and $df_E$ refer to the degrees of freedom of the model and of the error, respectively, and $\bar{y}$ is the mean of the experimental values.
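Equations (3) to (8) translate directly into code. The sketch below is one possible implementation; the function name and the toy numbers are hypothetical, and the choice df_M = p and df_E = n - p - 1 (p descriptors, n compounds) is an assumption, although it agrees with the ratio of s to RMSE reported in Table 5.

```python
import numpy as np

def regression_statistics(y_exp, y_calc, n_descriptors):
    """Statistics of Eqs. (3)-(8) for experimental vs. calculated values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    n = y_exp.size
    df_m = n_descriptors              # assumed: one degree of freedom per descriptor
    df_e = n - n_descriptors - 1      # assumed: residual degrees of freedom
    y_mean = y_exp.mean()
    sse = np.sum((y_exp - y_calc) ** 2)    # residual sum of squares
    sst = np.sum((y_exp - y_mean) ** 2)    # total sum of squares
    ssr = np.sum((y_mean - y_calc) ** 2)   # regression sum of squares
    cross = np.sum(y_exp * y_calc)
    return {
        "R2": 1.0 - sse / sst,                  # Eq. (3)
        "RMSE": np.sqrt(sse / n),               # Eq. (4)
        "F": (ssr / df_m) / (sse / df_e),       # Eq. (5)
        "k": cross / np.sum(y_calc ** 2),       # Eq. (6)
        "k_prime": cross / np.sum(y_exp ** 2),  # Eq. (7)
        "s": np.sqrt(sse / df_e),               # Eq. (8)
    }

# Hypothetical values, used only to exercise the formulas.
y_exp = [0.50, 0.80, 1.00, 1.40, 1.90, 2.50]
y_calc = [0.55, 0.75, 1.05, 1.35, 1.95, 2.45]
stats = regression_statistics(y_exp, y_calc, n_descriptors=3)
```

Note that s and RMSE differ only in the divisor, so s = RMSE * sqrt(n / df_E); this relation is a quick consistency check on any reported pair of values.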
Also, the following criteria described by Golbraikh and Tropsha were applied to check the predictability of the QSPR model (Golbraikh and Tropsha, 2002):

1. $Q^2 > 0.5$ (9)
2. $R^2 > 0.6$ (10)
3. $\left(R^2 - R_0^2\right)/R^2 < 0.1$ and $0.85 \le k \le 1.15$ (11)

where $R_0^2$ is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The statistical parameters of the predictions for the training set, the test set and all molecules are given in Table 5.
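The three acceptance criteria can be checked mechanically once Q2, R2, R0^2 and k are available. In the sketch below, the function names are hypothetical, R0^2 is computed for the regression of experimental on calculated values through the origin (one common convention, assumed here), and the inputs combine Q2 from Table 2 with the four Table 1 rows for compounds 20 to 23 purely to exercise the functions.

```python
import numpy as np

def r0_squared(y_exp, y_calc):
    """R0^2 for the regression of y_exp on y_calc forced through the origin."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    k = np.sum(y_exp * y_calc) / np.sum(y_calc ** 2)
    sse0 = np.sum((y_exp - k * y_calc) ** 2)
    sst = np.sum((y_exp - y_exp.mean()) ** 2)
    return 1.0 - sse0 / sst

def passes_golbraikh_tropsha(q2, r2, r0_2, k):
    """True if criteria (9)-(11) are all satisfied."""
    return bool(
        q2 > 0.5
        and r2 > 0.6
        and (r2 - r0_2) / r2 < 0.1
        and 0.85 <= k <= 1.15
    )

# Experimental and Eq. (2) values for compounds 20-23 (Table 1).
y_exp = np.array([0.77, 0.05, 0.84, 2.51])
y_calc = np.array([0.71, 0.01, 0.79, 2.41])

r2 = np.corrcoef(y_exp, y_calc)[0, 1] ** 2
k = np.sum(y_exp * y_calc) / np.sum(y_calc ** 2)
r0_2 = r0_squared(y_exp, y_calc)
accepted = passes_golbraikh_tropsha(q2=0.945, r2=r2, r0_2=r0_2, k=k)
```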
Table 4
Correlation matrix of the three descriptors used in Eq. (2)
Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.
A Y-scrambling test was applied to examine the robustness of the model (Tropsha et al., 2003); the results are given in Table 6. In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines, and new QSPR modeling is performed with the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R2 and Q2 values. If they do not, the reported model is not reliable for the particular data set and modeling method.
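A minimal sketch of the Y-scrambling procedure is given below. It refits an ordinary least-squares model after each random permutation of the response and records R2 together with a leave-one-out Q2 obtained from the hat-matrix shortcut. The synthetic data, the seeds and the function names are assumptions for illustration; they stand in for the paper's 18-compound training set, which is not reproduced here.

```python
import numpy as np

def fit_r2_q2(x, y):
    """R2 and leave-one-out Q2 of an OLS model with intercept."""
    x1 = np.column_stack([np.ones(len(y)), x])
    hat = x1 @ np.linalg.pinv(x1)          # hat (projection) matrix
    resid = y - hat @ y
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / sst
    # LOO residual of sample i equals resid_i / (1 - h_ii) for OLS.
    press = np.sum((resid / (1.0 - np.diag(hat))) ** 2)
    q2 = 1.0 - press / sst
    return r2, q2

def y_scrambling(x, y, n_iter=10, seed=0):
    """Refit the model n_iter times with randomly permuted responses."""
    rng = np.random.default_rng(seed)
    return [fit_r2_q2(x, rng.permutation(y)) for _ in range(n_iter)]

# Synthetic stand-in data: 18 "compounds", 3 count-like descriptors.
rng = np.random.default_rng(1)
x = rng.integers(0, 10, size=(18, 3)).astype(float)
y = -0.19 + x @ np.array([0.04, 0.54, 0.40]) + rng.normal(0.0, 0.1, 18)

true_r2, true_q2 = fit_r2_q2(x, y)
scrambled = y_scrambling(x, y, n_iter=10)
```

A model that captures a real structure-property relationship should show `true_r2` and `true_q2` well above every scrambled pair, which is the pattern Table 6 reports for the R2 values.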
The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage expresses the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

$$h_i = x_i^T \left(X^T X\right)^{-1} x_i \qquad (12)$$

where $x_i$ is the descriptor vector of the compound considered. The warning leverage ($h^*$) is defined as (Eriksson et al., 2003):

$$h^* = \frac{3(p + 1)}{n} \qquad (13)$$

where n is the number of training compounds and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with $h_i > h^*$ influences the regression line, but it is not necessarily an outlier, since its standardized residual may be small. In this data set, the warning leverage is about 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (> 3s) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of a model.
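Equations (12) and (13) can be sketched as follows. With n = 18 training compounds and p = 3 descriptors, the warning leverage is 3(3 + 1)/18, about 0.67, as quoted in the text. The function names and the small descriptor matrix (compounds 20 to 23 only) are illustrative assumptions, and whether X is centered or carries an intercept column is not specified in the paper; the sketch uses the raw descriptor matrix.

```python
import numpy as np

def leverages(x):
    """Hat-diagonal values h_i = x_i^T (X^T X)^-1 x_i for every row of X."""
    x = np.asarray(x, dtype=float)
    xtx_inv = np.linalg.inv(x.T @ x)
    return np.einsum("ij,jk,ik->i", x, xtx_inv, x)

def warning_leverage(n_train, n_descriptors):
    """Warning leverage h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train

h_star = warning_leverage(n_train=18, n_descriptors=3)

# Columns: nH, nRNH2, nRNHR for compounds 20-23 (a small subset for illustration).
x = np.array([
    [9, 1, 0],
    [5, 0, 0],
    [11, 1, 0],
    [18, 2, 2],
], dtype=float)
h = leverages(x)
influential = h > h_star   # compounds that would be flagged on a Williams plot
```

The leverages of a full-rank matrix sum to its number of columns, which gives a simple check that the hat diagonal has been computed correctly.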
Fig. 2 demonstrates that no chemical has a leverage higher than the warning value $h^*$ of 0.67. It also shows that there is no outlier in the training or test sets; all compounds lie between the two horizontal lines.
The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.
Furthermore, the mean effect (MF) is another term that helps to interpret the results; it shows the effect of each descriptor individually and relative to the other descriptors. Fig. 4 shows the mean effects of the descriptors in the model as standardized coefficients (also called beta coefficients) (XLSTAT, 2013), which are used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model.
From this figure it can be concluded that the numbers of primary amines (nRNH2) and secondary amines (nRNHR), in that order, play the main roles in the capacity of amines for carbon dioxide absorption, and the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, i.e. the capacity of amines for CO2 absorption increases with each of these descriptors.
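Standardized (beta) coefficients rescale each regression slope by the ratio of the descriptor's standard deviation to that of the response, so that descriptors measured on different scales can be compared. The sketch below shows one way to compute them; the function name and the toy data are hypothetical, and it is assumed, not confirmed by the paper, that this is the definition the statistical software applies.

```python
import numpy as np

def standardized_coefficients(x, y):
    """Beta coefficients: OLS slopes scaled by sd(x_j) / sd(y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x1 = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(x1, y, rcond=None)
    slopes = coef[1:]                      # drop the intercept
    return slopes * x.std(axis=0, ddof=1) / y.std(ddof=1)

# Hypothetical data: two descriptors on very different scales.
x = np.array([
    [9.0, 1.0],
    [5.0, 0.0],
    [11.0, 1.0],
    [18.0, 2.0],
    [13.0, 0.0],
    [7.0, 1.0],
])
y = np.array([0.75, 0.03, 0.80, 1.62, 0.35, 0.66])
betas = standardized_coefficients(x, y)

# With a single descriptor, the beta coefficient equals the Pearson correlation.
x_single = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_single = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
beta_single = standardized_coefficients(x_single, y_single)[0]
pearson_r = np.corrcoef(x_single[:, 0], y_single)[0, 1]
```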
Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacity of amines for carbon dioxide absorption can be predicted. In fact, the main difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.
5. Discussion
Although high statistical parameters are significant in demonstrating the capability of the model, a QSPR should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to determine which parameters affect amine capacity and which descriptors can be expected to appear in the model on the basis of the principal chemical reactions between carbon dioxide and an amine-based solvent.
The overall reaction mechanism for chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism that proceeds through a zwitterion intermediate, which is then deprotonated by a base B, through reactions (1) and (2) below, was suggested by Caplow (1968):

$$\mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \qquad (1)$$

$$\mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \qquad (2)$$

R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in the reaction is a zwitterion. More recent studies, however, showed that the zwitterion seems to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

$$\mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \qquad (3)$$

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and deprotonation of the amine occurs. The bonding between the amine and carbon dioxide takes place simultaneously.
Table 6
R2 (train) values after several Y-scrambling tests
Iteration   R2 (train)
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model
          n    R2      R2adj   RMSE    F        k       k'      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128

As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the existence of the NH bond. The NH group is thus the active site of the amine molecule, at which the termolecular reaction involving the base molecule (water) occurs. Consequently, the number of NH bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and the highest mean effects. All these results demonstrate that the proposed model is consistent with the chemical reaction mechanism.
The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect, which is considerably smaller than that of the other two descriptors. The presence of the nH descriptor in the model can be explained by the results of experimental work performed by Singh et al. They showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). A longer chain carries more hydrogen atoms, so a positive effect of nH is consistent with that experimental work.
using genetic algorithm based multiple linear regression (GA-MLR) Fluid PhaseEquilibr 353 15e21
Marengo Emilio et al 1992 Comparative study of different structural descriptorsand variable selection approaches using partial least squares in quantitativestructure-activity relationships Chemometr Intell Lab Syst 14 (1) 225e233
Mokhatab Saeid Poe William A 2012 Handbook of Natural Gas Transmission andProcessing (access online via Elsevier)
Netzeva Tatiana I Worth Andrew P Aldenberg Tom Romualdo BenigniCronin Mark TD Gramatica Paola Jaworska Joanna S et al 2005 Currentstatus of methods for de1047297ning the applicability domain of (quantitative)structureeactivity relationships ATLA 33 155e173
OECD 2007 Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models Organisation for Economic Co-Operation and Development Paris
Pourbasheer Eslam et al 2011 Prediction of solubility of fullerene C60 in variousorganic solvents by genetic algorithm-multiple linear regression FullerenesNanotubes Carbon Nanostruct 19 (7) 585e598
Riahi Siavash Ganjali Mohammad Reza Norouzi Parviz Jafari Fatemeh 2008Application of GA-MLR GA-PLS and the DFT quantum mechanical (QM) cal-culations for the prediction of the selectivity coef 1047297cients of a histamine-selective electrode Sens Actuat B Chem 132 (1) 13e19
Riahi Siavash Pourbasheer Eslam Ganjali Mohammad Reza Norouzi Parviz 2009Investigation of different linear and nonlinear chemometric methods formodeling of retention index of essential oil components concerns to supportvector machine J Hazard Mater 166 (2) 853e859
Riahi Siavash Beheshti Abolghasem Ganjali Mohammad Reza Norouzi Parviz2008 A novel QSPR study of normalized migration time for drugs in capillaryelectrophoresis by new descriptors quantum chemical investigation Electro-phoresis 29 (19) 4027e4035
Riahi Siavash et al 2008 QSAR study of 2- (1-Propylpiperidin-4-yl) -1 H-Benz-imidazole-4-Carboxamide as PARP inhibitors for treatment of cancer ChemBiol Drug Design 72 (6) 575e584
Sartori Guido Savage David W 1983 Sterically hindered amines for carbon di-oxide removal from gases Indust Eng Chem Fundam 22 (2) 239e249
Singh Prachi Niederer John PM Versteeg Geert F 2007 Structure and activityrelationships for amine based CO2 absorbentsdI Int J Greenhouse Gas Control1 (1) 5e10
Singh Prachi Niederer John PM Versteeg Geert F 2009 Structure and activityrelationships for amine-based CO2 absorbents-II Chem Eng Res Design 87 (2)135e144
Todeschini R Consonni V Mauri A Pavan M 2002 DRAGON-Software for thecalculation of molecular descriptors version 21
Todeschini Roberto Consonni Viviana 2008 Handbook of Molecular Descriptors John Wiley amp Sons
Tropsha Alexander Gramatica Paola Gombar Vijay K 2003 The importance of being earnest validation is the absolute essential for successful application andinterpretation of QSPR models QSAR Comb Sci 22 (1) 69e77
XLSTAT 2013 software XLSTAT-CCR module Trial version
M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450450
7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas
httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 59
the model. One of the common techniques in the QSPR approach for
constructing training and test sets under the constraint of structural
diversity is principal component analysis (PCA) (Hu et al.,
2009; Riahi et al., 2008). In the current work, PCA was employed to
split the data set of molecules into training and test sets. For this
purpose, PC1 and PC2 were calculated from the descriptors in the
model. These two principal components
accounted for 57.5% and 33.5% of the variation in the data, respectively, and
played the main roles. Fig. 1 shows the distribution of the data over
PC1 and PC2; from this figure it can be concluded that
the compounds in the training and test sets are representative of
the whole data set.
The training set was used to build the model, while the test set
was used to validate its predictive power. During model
development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different
resulting models. Q²_LOO was calculated for each equation obtained,
and the best model was then selected on the basis of a high
value of this parameter. Several statistical tests and parameters
need to be considered. The coefficient of determination
(R²), the adjusted R², the leave-one-out cross-validation coefficient (Q²),
the slopes of the regression lines forced through zero (k, k'), the root mean
square error (RMSE) and the standard error of the estimate (s) are the
most important ones. The first five parameters should be close to
unity, while RMSE and s should be close to zero.
Furthermore, the intercepts of the model should be close to zero.
Moreover, the Fisher function (F) is another vital statistical test:
high values of the F-ratio indicate reliable models. The formulas for all statistical
parameters used in this paper are given below.
$$ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}} \right)^2} \tag{3} $$

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{n}} \tag{4} $$

$$ F = \frac{\sum_{i=1}^{n} \left( \bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_M}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_E} \tag{5} $$

$$ k = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{calc}} \right)^2} \tag{6} $$

$$ k' = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{exp}} \right)^2} \tag{7} $$

$$ s = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{df_E}} \tag{8} $$

where df_M and df_E refer to the degrees of freedom of the model and of the
error, respectively.
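As an illustration only, the statistics of Eqs. (3)-(8) can be sketched in a few lines of Python. The function name `regression_statistics` and the choices df_M = p and df_E = n - p - 1 (p being the number of descriptors) are assumptions made for this sketch; the paper itself does not state how the degrees of freedom were counted.

```python
import math


def regression_statistics(y_exp, y_calc, n_descriptors):
    """Return R2, RMSE, F, k, k' and s for paired experimental/calculated values.

    Assumption of this sketch: df_M = p and df_E = n - p - 1.
    """
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    df_model = n_descriptors
    df_error = n - n_descriptors - 1

    # Residual, total and regression sums of squares.
    ss_res = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    ss_reg = sum((mean_exp - c) ** 2 for c in y_calc)
    cross = sum(e * c for e, c in zip(y_exp, y_calc))

    return {
        "R2": 1.0 - ss_res / ss_tot,                      # Eq. (3)
        "RMSE": math.sqrt(ss_res / n),                    # Eq. (4)
        "F": (ss_reg / df_model) / (ss_res / df_error),   # Eq. (5)
        "k": cross / sum(c * c for c in y_calc),          # Eq. (6)
        "k_prime": cross / sum(e * e for e in y_exp),     # Eq. (7)
        "s": math.sqrt(ss_res / df_error),                # Eq. (8)
    }
```

Because RMSE and s share the same residual sum of squares, they differ only by the factor sqrt(n / df_E), which gives a quick consistency check on reported values.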
In addition, the following criteria described by Golbraikh and Tropsha
were applied to check the predictive ability of the QSPR model
(Golbraikh and Tropsha, 2002):

1. $$ Q^2 > 0.5 \tag{9} $$

2. $$ R^2 > 0.6 \tag{10} $$

3. $$ \frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11} $$

where R₀² is the coefficient of determination characterizing the linear
regression with the Y-intercept set to zero. The predicted results for all
molecules in the training and test sets, together with the statistical parameters,
are given in Table 5.
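The three criteria can be checked mechanically. The sketch below is illustrative: the function names are hypothetical, and R₀² is computed here from the regression of experimental on calculated values forced through the origin, which is one common reading of the Golbraikh and Tropsha definition rather than something spelled out in this paper.

```python
def slope_through_origin(y_exp, y_calc):
    """Slope k of the regression of y_exp on y_calc forced through zero."""
    return sum(e * c for e, c in zip(y_exp, y_calc)) / sum(c * c for c in y_calc)


def r2_through_origin(y_exp, y_calc):
    """R0^2 for the zero-intercept line y_exp = k * y_calc (assumed definition)."""
    k = slope_through_origin(y_exp, y_calc)
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - k * c) ** 2 for e, c in zip(y_exp, y_calc))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def passes_golbraikh_tropsha(q2, r2, r0_squared, k):
    """Apply criteria (9)-(11) to already computed statistics."""
    return (
        q2 > 0.5
        and r2 > 0.6
        and (r2 - r0_squared) / r2 < 0.1
        and 0.85 <= k <= 1.15
    )
```

A model is accepted only if every condition holds; a failure of any single one (for example a slope k outside 0.85 to 1.15) is enough to reject it.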
Table 4
Correlation matrix of the three descriptors used in Eq. (2).

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000
Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points correspond to more than one molecule.
The Y-scrambling test was applied to examine the
robustness of the model (Tropsha et al., 2003); the results are given in Table 6. In the Y-scrambling test,
the dependent variable (absorption capacity) is randomly reassigned
to different amines and a new QSPR model is built
on the unchanged matrix of independent variables. The
newly developed QSPR models are expected to have low
R² and Q² values. If this is not the case, the reported model is not
reliable for the particular data set and modeling method.
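The procedure described above can be sketched as follows. This is a minimal standard-library illustration, not the software used in the paper: the least-squares fit through the normal equations and the helper names `fit_mlr`, `predict`, `r_squared` and `y_scrambling` are assumptions of the sketch.

```python
import random


def fit_mlr(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    m = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(m)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] for i in range(m)]  # [intercept, b1, b2, ...]


def predict(coef, X):
    return [coef[0] + sum(c * x for c, x in zip(coef[1:], row)) for row in X]


def r_squared(y, y_hat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def y_scrambling(X, y, n_iter=10, seed=0):
    """Refit the model on randomly permuted responses; return the R2 of each refit."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        y_perm = list(y)
        rng.shuffle(y_perm)  # descriptors stay fixed, responses are reassigned
        coef = fit_mlr(X, y_perm)
        scores.append(r_squared(y_perm, predict(coef, X)))
    return scores
```

If the scrambled R² values stay well below the R² of the real model, as in Table 6, the original correlation is unlikely to be a chance result.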
The applicability domain of the model was studied with the Williams
plot in Fig. 2 (OECD, 2007). In the Williams plot, the standardized
residuals (R) are plotted against the leverage (hat diagonal) values (h).
Leverage indicates the distance of a compound from
the centroid of X, where X is the descriptor matrix. The leverage
of a compound is calculated by the following equation (Netzeva
et al., 2005):

$$ h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \tag{12} $$

where x_i is the descriptor vector of the compound considered. The
warning leverage (h*) is defined as (Eriksson et al., 2003):

$$ h^{*} = \frac{3 (p + 1)}{n} \tag{13} $$

where n is the number of training objects and p is the number of
descriptors in the model. The Williams plot is used to identify both
response outliers and structurally influential chemicals in the
model. A compound with h_i > h* influences the regression line, but it
is not necessarily an outlier, as its standardized
residual may be small. In this data set the warning leverage is about 0.67. Furthermore, a compound with a
standardized residual greater than three standard residual units (>3s) is
considered an outlier. It is common in the literature
to use 3 as the accepted cut-off value for evaluating the prediction
results of the model.
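The warning value quoted above follows directly from Eq. (13) with the 18 training compounds of Table 5 and the three descriptors of the model: h* = 3(3 + 1)/18 ≈ 0.67. A small sketch of the check, with hypothetical function names and made-up leverage and residual values used purely for illustration:

```python
def warning_leverage(n_training, n_descriptors):
    """Eq. (13): h* = 3 (p + 1) / n."""
    return 3 * (n_descriptors + 1) / n_training


def williams_flags(leverages, std_residuals, h_star, cutoff=3.0):
    """Flag structurally influential compounds (h > h*) and response outliers (|R| > cutoff)."""
    return [
        {"influential": h > h_star, "outlier": abs(r) > cutoff}
        for h, r in zip(leverages, std_residuals)
    ]


h_star = warning_leverage(18, 3)  # 12 / 18, about 0.67
# Illustrative values only, not data from the paper.
flags = williams_flags([0.20, 0.90], [0.5, -3.4], h_star)
```

The two flags are independent: a compound can have high leverage with a small residual, in which case it is influential but not an outlier.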
Fig. 2 shows that no chemical has a leverage
higher than the warning value h* of 0.67. It also shows that there are
no outliers in the training or test sets: all compounds lie between the
two horizontal lines.
The experimental absorption capacity (rich loading) values of the
amines are plotted in Fig. 3 against the corresponding values calculated
by the QSPR model.
Furthermore, the mean effect (MF) is another quantity that helps to
interpret the results; it shows the effect of each descriptor
individually or relative to the other descriptors. Fig. 4 shows the
standardized coefficients (also called beta coefficients) (XLSTAT, 2013).
The figure is used to compare the relative weights of the
descriptors: the higher the standardized coefficient of a
descriptor, the greater the weight of the corresponding
variable in the model. This figure thus shows the mean effect of
the descriptors in the model.
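For readers unfamiliar with beta coefficients, the usual textbook definition rescales each regression coefficient by the ratio of the standard deviation of its descriptor to that of the response. The sketch below assumes this definition; whether XLSTAT computes the values in Fig. 4 in exactly this way is not stated in the paper, and the function name is hypothetical.

```python
import statistics


def standardized_coefficients(coefs, X, y):
    """beta_j = b_j * s(x_j) / s(y), using sample standard deviations.

    coefs: raw regression coefficients, one per descriptor (no intercept).
    X: list of rows, one descriptor value per column.
    y: list of responses.
    """
    s_y = statistics.stdev(y)
    betas = []
    for j, b in enumerate(coefs):
        column = [row[j] for row in X]
        betas.append(b * statistics.stdev(column) / s_y)
    return betas
```

Because the result is dimensionless, descriptors measured on different scales (for example atom counts versus group counts) can be compared directly.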
From this figure it can be concluded that the number of
primary amines (nRNH2) and the number of secondary amines (nRNHR)
play the main roles, in that order, in the capacity of amines for carbon
dioxide absorption, while the number of hydrogen atoms (nH) has
the least effect. The figure also shows that all descriptors in the model have
positive effects, so the capacity of amines for CO2 absorption
increases with each of these descriptors.
Finally, it should be noted that the
present work focuses exclusively on developing a simple model by
which the capacity of amines for carbon dioxide absorption can be
predicted. In fact, the main difference between this study
and previous ones is that this work concentrates first on a
quantitative and then on a qualitative representation of structural
effects on the capacity of amines for CO2 absorption.
5. Discussion

Although high statistical parameters are significant in demonstrating
the capability of the model, a QSPR study should also provide
insight into the mechanism of carbon dioxide solubility in amine-based
solvents. For this reason, an acceptable interpretation of the
descriptors in the QSPR model should be provided. It is useful to
determine which parameters affect the capacity of amines and which
descriptors could appear in the model as a consequence of the principal chemical
reactions between carbon dioxide and an amine-based solvent.
The overall reaction mechanism for the chemical absorption of CO2
in amine solvent systems is still under debate. A mechanism for this
reaction, based on the formation of a zwitterion intermediate
followed by proton removal by a base B through reactions (1) and (2)
below, was suggested by Caplow (1968):

$$ \mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1} $$

$$ \mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2} $$

R1 and R2 denote the substituent groups attached to the amine
group, and B is a base molecule, which can be a water molecule. The
intermediate in the reaction is the zwitterion. More recent studies, however,
showed that the zwitterion appears to be short-lived and may be an entirely
transient state (da Silva and Svendsen, 2004). This led to the
assumption of a single-step mechanism for these reactions
(reaction (3)). A termolecular single-step mechanism was suggested by
Crooks and Donnellan (1989):

$$ \mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3} $$

where B is again the base molecule. In this mechanism the NH group is
attacked by the base molecule and deprotonation of the amine occurs. The
bond between the amine and carbon dioxide forms
simultaneously.
Table 6
R²train values after several Y-scrambling tests.

Iteration   R²train
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039
Table 5
Validation parameters and statistical results of the GA-MLR model.

          n    R²      R²adj   RMSE    F        k       k'      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128
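The entries of Table 5 can be cross-checked against one another. The sketch below uses the standard adjusted-R² formula, which the paper does not write out, and the relation s = RMSE · sqrt(n / df_E) that follows from Eqs. (4) and (8); taking p = 3 descriptors and df_E = n - p - 1 is an assumption of this sketch.

```python
import math


def adjusted_r2(r2, n, p):
    """Standard adjusted R2 for n observations and p descriptors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def s_from_rmse(rmse, n, p):
    """Standard error of the estimate implied by RMSE, with df_E = n - p - 1."""
    return rmse * math.sqrt(n / (n - p - 1))


# Training-set row of Table 5: n = 18, R2 = 0.942, RMSE = 0.127.
train_r2_adj = adjusted_r2(0.942, 18, 3)   # close to the tabulated 0.930
train_s = s_from_rmse(0.127, 18, 3)        # close to the tabulated 0.144
```

Under these assumptions the training, test and overall rows all reproduce the tabulated adjusted R² and s to within rounding.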
As can be seen, the reaction between CO2 and an amine-based
solvent takes place because of the N-H bond. The NH
group is thus the active site of the amine molecule, where the base molecule
(water) takes part in the termolecular reaction. Consequently,
the number of N-H bonds, or in other words the number of primary
and secondary amine groups in the amine molecule, plays an
important role in the capacity of amines for CO2 absorption.
The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two
main descriptors appearing in the model. According to Fig. 4, these
two descriptors have a positive effect and a higher mean effect than nH. All
these results show that the chemical reaction mechanism
is consistent with the proposed model.
The model also contains the number of hydrogen atoms (nH) as
another descriptor. Fig. 4 shows that the nH descriptor has a positive effect
that is considerably smaller than that of the other two descriptors. The
presence of nH in the model can be explained by the
experimental work of Singh et al., who showed that
an increase in the chain length between the amine and other
functional groups in the amine structure results in an increase in the
capacity of the amine for CO2 absorption (Singh et al., 2007). Increasing the
chain length increases the number of hydrogen atoms, so
a positive effect of nH is consistent with the
experimental work.
Finally, it should be noted that the simplicity of the model is
attractive and that the results are quite acceptable for predicting
the capacity of amines for carbon dioxide absorption. Although the
accuracy of the model is good for linear amine compounds, it is
poorer for unsaturated cyclic amines. This can be explained by two
main reasons. First, the three descriptors in the model are not sensitive
to ring-type functional groups and simply count the number of
hydrogen atoms and of primary and secondary amines. Second,
unsaturated cyclic amines show poor absorption rate and capacity, and
they are not potential absorbents for CO2 absorption (2). Therefore,
from the industrial point of view, it is preferable to use
linear amines for CO2 absorption, and it is more important for the
model to predict the CO2 absorption capacity of linear amines than
that of unsaturated cyclic amines.

Fig. 2. Williams plot of the GA-MLR model.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine), with regression line.
Fortunately, the results of the first equation (Eq. (1)) for
predicting the capacity of amines for CO2 absorption are largely acceptable
for both linear and aromatic ring-type amines (items labeled 5, 6
and 21). This is because of the presence of an RDF descriptor in this
model. RDF descriptors are based on the distance distribution in the
geometrical representation of a molecule. This function is
independent of the number of atoms and is invariant to translation
and rotation of the entire molecule. The RDF code provides valuable
information, e.g. about bond distances, ring types, planar and
non-planar systems, and atom types, so it is sensitive to aromatic rings
(Todeschini and Consonni, 2008).
6. Conclusions

One of the main concerns of the natural gas industry is to have a
robust and accurate model that can predict the chemical behavior
of amines in the gas treatment process. This study attempted to identify the effects of the chemical structure of amines on their capacity for
carbon dioxide absorption and to develop a model for this purpose
that is not only robust and accurate but also simple and
applicable. Therefore, the QSPR approach was chosen as the modeling
technique, and the model was developed with a linear method
for its simplicity. As a result, two linear equations were developed.
The first model demonstrates high predictive power, while the second is
notably simpler and readily interpretable in terms of the chemistry
of the reaction of amines with carbon dioxide. Consequently, the second
equation is introduced as the preferred model of this study. The most
important descriptors appearing in the model, in order of the weight of
the corresponding variable, are the number of primary aliphatic amines
(nRNH2), the number of secondary aliphatic amines (nRNHR) and the
number of hydrogen atoms (nH). The accuracy and
predictive performance of the model, validated with various
statistical tests and examined with a test set of five molecules,
permit using this model to estimate the rich loading of other amines
under specific conditions. According to the results, it could be
argued that a good amine solvent for carbon dioxide absorption
should have a linear structure with a high number of primary and
secondary amine groups as side chains. In other words, increasing
the number of primary and secondary amine groups
increases the number of N-H bonds, the active sites at which the
amine reacts with CO2.
The promising results of this study might aid other researchers
in the fields of chemistry and natural gas engineering to design and
synthesize new potential amine-based solvents and to investigate the
feasibility of using them in gas removal processes. New improved
solvents should also be compared with more conventional ones from the
points of view of corrosivity, energy efficiency and operability.
Acknowledgment

The authors gratefully acknowledge the support
of the Institute of Petroleum Engineering (IPE), University of Tehran.
List of symbols

CO2         carbon dioxide
QSPR/QSAR   quantitative structure-property/activity relationship
DFT         density functional theory
MLR         multiple linear regression
GA          genetic algorithms
PCR         principal component regression
PLS         partial least squares
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      leave-one-out cross-validation
RMSE        root mean square error
df_M        degrees of freedom of the model
df_E        degrees of freedom of the error
References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure-property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368-5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family: electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811-4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106-111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795-6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947-6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997-1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331-333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413-3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217-227.
Eriksson, Lennart, Joanna, Jaworska, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

Fini, Mojtaba Fallah, Riahi, Siavash, Alireza, Bahramian, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane-water interfacial tension. J. Surfact. Deterg. 15 (4), 477-484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234-240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC-QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39-51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269-276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158-2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295-4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101-109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609-13618.
Liang, Guijie, Jie, Xu, Li, Liu, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15-21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225-233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Romualdo, Benigni, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. ATLA 33, 155-173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585-598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13-19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853-859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027-4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575-584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239-249.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5-10.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135-144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69-77.
XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.
M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450450
7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas
httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 69
In Table6 Y-scrambling test was applied in order to examine the
robustness of the model (Tropsha et al 2003) In Y-scrambling test
the dependent variable (Absorption Capacity) is randomly dedi-
cated to different amines and new QPSR modeling is performed
based on the previous matrix of independent variables It is ex-
pectedthat newly developedQSPR models shouldhave lowenough
R2 and Q 2 values If it happens differently the reported model is not
accurate for the particular data set and method of modeling
The applicability domain of the model was studied by Williams
plot in Fig 2 (OECD 2007) In Williams plot the standardized re-
siduals (R) versus the leverage (hat diagonal) values (h) were
plotted Leverage demonstrates the distance of a compound from
the centroid of the X where X is the descriptor matrix The leverage
of a compound is calculated by the following equation (Netzeva
et al 2005)
hi frac14 xT i
X T X
1 xi (12)
where xi is the descriptor vector of the relevant compound The
warning leverage (h) is de1047297ned as (Eriksson et al 2003)
h frac143eth p thorn 1THORN
n (13)
n is the number of training objects and p is the number of de-
scriptors in the model Williams plot is used to identify both the
response outlier and the structurally in1047298uential chemicals in the
model A compound with hi gt h in1047298uence the regression line but it
does not consider as an outlier as its corresponding standardized
residual might be small In this data set the warning value of leverage is around 067 Furthermore compounds with standard-
ized residual rather than three standard residual unit (gt3s) is
considered as an outlier compound It is common in the literature
to use 3 as an accepted cut-off value for evaluating prediction re-
sults of the model
Fig 2 demonstrates that there is no chemical with leverage
higher than the warning h value of 067 It also shows that there is
no outlier in training or test sets and all compounds lie between the
two horizontal lines
The experimental absorption capacity (rich loading) values of
amines are plotted in Fig 3 against corresponding calculated values
for QSPR model
Furthermore mean effect (MF) is another term that helps to
interpret the result and shows the effect of each descriptor
individually or relative to other descriptors Fig 4 shows the stan-
dardized coef 1047297cients (also called beta coef 1047297cients) (XLSTAT 2013)
The 1047297gure is used to compare the relative weights of the de-
scriptors The higher the standardized coef 1047297cients value of a
descriptor the more important the weight of the corresponding
variable in the model This 1047297gure demonstrates the mean effect of
the descriptors in the model
By observing this 1047297gure it can be concluded that the number of
primary amines (nRNH2) and secondary amines (nRNHR) de-
scriptors play the main role in the amines capacity for carbon di-
oxide absorption respectively and the number of hydrogen (nH) has
the least effect This 1047297gure shows all descriptors in the model have
positive effects and the amines capacity for CO2 absorption is
directly related to each of these descriptors
At the last part of this section it should be noticed that the
present work focuses exclusively on developing a simple model by
which amines capacities for carbon dioxide absorption can be
predicted In fact the predominant difference between this study
and the previous ones is that this work concentrates 1047297rstly on
quantitative and then on qualitative representation of structural
effects on the capacity of amines for CO2 absorption
5 Discussion
Although high statistical parameters are signi1047297cant in demon-
strating the capability of the model QSPR should provide powerful
insight for the mechanism of carbon dioxide solubility in amine
based solvent For this reason an acceptable interpretation of de-
scriptors in the QSPR model should be provided It is better to di-
agnose which parameters affect the amines capacity and which
descriptors could appear in the model due to principal chemical
reactions between carbon dioxide and an amine-based solvent
The overall reaction mechanism for chemical absorption of CO2 in amine
solvent systems is still under debate. Caplow (1968) suggested a
mechanism in which a zwitterion intermediate is formed and then
deprotonated by a base B, through reactions (1) and (2) below:

\[ \mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1} \]

\[ \mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2} \]

R1 and R2 denote the substituent groups attached to the amine group, and
B is a base molecule, which can be a water molecule. The intermediate in
the reaction is a zwitterion. However, more recent studies have shown
that the zwitterion appears to be short-lived and may be an entirely
transient state (da Silva and Svendsen, 2004). This led to the assumption
of a single-step mechanism (reaction (3)), a termolecular mechanism
suggested by Crooks and Donnellan (1989):

\[ \mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3} \]

where B is again the base molecule. In this mechanism the N–H group is
attacked by the base molecule and deprotonation of the amine occurs,
while the bonding between the amine and carbon dioxide takes place
simultaneously.
Table 6
R2 train values after several Y-scrambling tests.

Iteration   R2 train
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039
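
The Y-scrambling test summarized in Table 6 can be sketched as follows: the response vector is randomly permuted, the linear model is refitted, and the resulting R2 is recorded; a sound model should give much lower R2 values on scrambled responses than on the true ones. The data below are synthetic (18 samples, 3 descriptors) and the helper names are hypothetical; this is an illustration of the procedure under those assumptions, not a reproduction of the GA-MLR workflow used in this study.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept included)."""
    rows = [[1.0] + list(row) for row in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def y_scrambling(X, y, n_iter=10, seed=0):
    """Refit the model n_iter times on randomly permuted responses."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        y_perm = list(y)
        rng.shuffle(y_perm)
        scores.append(r_squared(X, y_perm))
    return scores


# Synthetic data (18 samples, 3 descriptors), NOT the paper's amines.
X = [(i % 3, (i // 3) % 2, 5 + (7 * i) % 11) for i in range(18)]
y = [0.4 * a + 0.3 * b + 0.02 * c + 0.01 * (-1) ** i
     for i, (a, b, c) in enumerate(X)]

r2_true = r_squared(X, y)
r2_scrambled = y_scrambling(X, y)
print(round(r2_true, 3), [round(s, 3) for s in r2_scrambled])
```

A large gap between the true R2 and the scrambled values, as in Table 6, suggests that the fitted relationship is unlikely to be a chance correlation.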
Table 5
Validation parameters and statistical results of the GA-MLR model.

          n    R2     R2adj  RMSE   F       k      k'     s
Train     18   0.942  0.930  0.127  76.50   1.004  0.984  0.144
Test      5    0.976  0.904  0.060  17.31   0.962  1.035  0.135
Overall   23   0.950  0.942  0.116  123.79  0.999  0.990  0.128
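
As a rough consistency check on Table 5, the adjusted R2 and the overall F statistic can be recomputed from R2, the number of samples n and the number of descriptors p using the standard textbook formulas. The sketch below assumes p = 3 (the three descriptors nRNH2, nRNHR and nH) and reads the tabulated R2 values as decimals (0.942, 0.976, 0.950); the function names are hypothetical. Only approximate agreement with the tabulated F values should be expected, since the R2 inputs are rounded to three decimals and the software used in the study may compute F differently.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p descriptors (intercept included)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic of the regression, computed from R^2."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


P = 3  # assumed number of descriptors in the model (nRNH2, nRNHR, nH)
# (n, R2) pairs read from Table 5.
table5 = {"Train": (18, 0.942), "Test": (5, 0.976), "Overall": (23, 0.950)}

for name, (n, r2) in table5.items():
    print(name, round(adjusted_r2(r2, n, P), 3), round(f_statistic(r2, n, P), 2))
```

Under these assumptions the recomputed adjusted R2 values round to the tabulated 0.930, 0.904 and 0.942.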
As can be seen, the reaction between CO2 and an amine-based solvent takes
place because of the N–H bond. The N–H group is therefore the active site
of the amine molecule, where the base molecule (water) takes part in the
termolecular reaction. Consequently, the number of N–H bonds, or in other
words the number of primary and secondary amine groups in the amine
molecule, plays an important role in the capacity of amines for CO2
absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines
are the two main descriptors appearing in the model. According to Fig. 4,
these two descriptors have a positive effect and the highest mean
effects. These results show that the chemical reaction mechanism is
consistent with the proposed model.
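
To make the link between the descriptors and the number of N–H active sites concrete, the short sketch below tabulates nRNH2, nRNHR and nH for a few well-known amines and derives the N–H bond count from them. The molecules are chosen for illustration only and are not claimed to belong to the data set of this study; the counts follow from their molecular formulas.

```python
# Functional-group counts for a few well-known amines; chosen for
# illustration only and not taken from the paper's data set.
AMINES = {
    # name: (nRNH2, nRNHR, nH)
    "monoethanolamine (MEA)": (1, 0, 7),
    "diethanolamine (DEA)": (0, 1, 11),
    "ethylenediamine (EDA)": (2, 0, 8),
    "diethylenetriamine (DETA)": (2, 1, 13),
    "N-methyldiethanolamine (MDEA)": (0, 0, 13),
}


def nh_bonds(n_primary, n_secondary):
    """Each primary amine has two N-H bonds, each secondary amine one."""
    return 2 * n_primary + n_secondary


for name, (n_rnh2, n_rnhr, n_h) in AMINES.items():
    print(f"{name}: nRNH2={n_rnh2}, nRNHR={n_rnhr}, nH={n_h}, "
          f"N-H bonds={nh_bonds(n_rnh2, n_rnhr)}")
```

A tertiary amine such as MDEA has no N–H bond, so both amine-count descriptors are zero for it even though its nH is comparatively high.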
The model also contains the number of hydrogen atoms (nH) as another
descriptor. Fig. 4 shows that nH has a positive effect that is
considerably smaller than those of the other two descriptors. The
presence of nH in the model can be explained by the experimental work of
Singh et al., who showed that an increase in the chain length between the
amine and other functional groups in the amine structure results in an
increase in amine capacity for CO2 absorption (Singh et al., 2007).
Increasing the chain length increases the number of hydrogen atoms, so a
positive effect of nH is consistent with that experimental work.
Finally, it should be noted that the model is attractively simple and
that its results are quite acceptable for predicting the capacity of
amines for carbon dioxide absorption. Although the accuracy of the model
is good for linear amine compounds, it is poorer for unsaturated cyclic
amines. This can be explained by two main reasons. First, the three
descriptors in the model are not sensitive to ring-type functional
groups; they only count the number of hydrogen atoms and of primary and
secondary amines. Second, unsaturated cyclic amines show poor absorption
rate and capacity and are not promising absorbents for CO2 (2).
Therefore, from an industrial point of view it is preferable to use
linear amines for CO2 absorption, and it is more important for the model
to predict CO2 absorption capacity for linear amines than for unsaturated
cyclic amines.
Fig. 2. Williams plot of GA-MLR model development.
Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine) – regression line.
Fortunately, the results of the first equation (Eq. (1)) for predicting
the capacity of amines for CO2 absorption are largely acceptable for both
linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is
because of the presence of an RDF descriptor in that model. RDF
descriptors are based on the distance distribution in the geometrical
representation of a molecule. The function is independent of the number
of atoms and is invariant against translation and rotation of the entire
molecule. The RDF code provides valuable information, e.g. about bond
distances, ring types, planar and non-planar systems, and atom types, so
it is sensitive to aromatic rings (Todeschini and Consonni, 2008).
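
The invariance properties mentioned above can be illustrated with the commonly quoted form of the radial distribution function code, g(r) = f Σ_{i<j} w_i w_j exp(-β (r - r_ij)^2), where r_ij is the distance between atoms i and j and w_i is an atomic weighting property. The sketch below uses a hypothetical three-atom fragment with unit weights and an arbitrary smoothing parameter β; it is not the descriptor set-up used in this study.

```python
import math


def rdf_code(coords, weights, radii, beta=100.0, scale=1.0):
    """g(r) = scale * sum_{i<j} w_i * w_j * exp(-beta * (r - r_ij)**2)."""
    values = []
    for r in radii:
        total = 0.0
        for i in range(len(coords)):
            for j in range(i + 1, len(coords)):
                r_ij = math.dist(coords[i], coords[j])
                total += weights[i] * weights[j] * math.exp(-beta * (r - r_ij) ** 2)
        values.append(scale * total)
    return values


def rotate_z_and_shift(point, angle, shift):
    """Rotate a point about the z axis, then translate it."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])


# Hypothetical planar three-atom fragment (angstrom), unit weights.
coords = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (0.7, 1.2, 0.0)]
weights = [1.0, 1.0, 1.0]
radii = [0.5 + 0.1 * k for k in range(20)]  # fixed grid of 20 sampling points

moved = [rotate_z_and_shift(p, 0.8, (3.0, -2.0, 1.5)) for p in coords]
original_code = rdf_code(coords, weights, radii)
moved_code = rdf_code(moved, weights, radii)
print(all(math.isclose(a, b, abs_tol=1e-9)
          for a, b in zip(original_code, moved_code)))
```

The code depends only on interatomic distances, which is why rotating or translating the molecule leaves it unchanged, and its length is set by the sampling grid rather than by the number of atoms.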
6. Conclusions
One of the main concerns of the natural gas industry is to have a robust
and accurate model that can predict the chemical behavior of amines in
gas treatment processes. This study attempted to identify the effects of
the chemical structure of amines on their capacity for carbon dioxide
absorption and to develop a model for this purpose that is not only
robust and accurate but also simple and applicable. The QSPR approach was
therefore chosen as the modeling technique, and the model was developed
with a linear method for its simplicity. As a result, two linear
equations were developed. The first model demonstrates high predictive
power, while the second is notably simpler and readily interpretable in
terms of the chemistry of the reaction of amines with carbon dioxide.
Consequently, the second equation is introduced as the preferred model of
this study. The most important descriptors appearing in the model, in
order of the weight of the corresponding variable, are the number of
primary aliphatic amines (nRNH2), the number of secondary aliphatic
amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and
predictive performance of the model, validated with various statistical
tests and examined with a test set of five molecules, permit using this
model to estimate the rich loading of other amines under specific
conditions. According to the results, it could be argued that a good
amine solvent for carbon dioxide absorption should have a linear
structure with a high number of primary and secondary amine groups as
side chains. In other words, increasing the number of primary and
secondary amine groups increases the number of N–H bonds, the active
sites at which the reaction of the amine with CO2 takes place.
The promising results of this study might aid other researchers in the
fields of chemistry and natural gas engineering in designing and
synthesizing new potential amine-based solvents and investigating the
feasibility of using them in gas removal processes. New, improved
solvents should also be compared with more conventional ones from the
corrosivity, energy efficiency and operability points of view.
Acknowledgment
The authors gratefully acknowledge the support of the Institute of
Petroleum Engineering (IPE), University of Tehran.
List of symbols
CO2        carbon dioxide
QSPR/QSAR  quantitative structure–property/activity relationship
DFT        density functional theory
MLR        multiple linear regression
GA         genetic algorithms
PCR        principal component regression
PLS        partial least squares
HOMO       highest occupied molecular orbital
LUMO       lowest unoccupied molecular orbital
AC         absorption capacity
PCA        principal component analysis
LOO-CV     leave-one-out cross-validation
RMSE       root mean square error
df_M       degrees of freedom of the model
df_E       degrees of freedom of the error
Fig. 4. Mean effects of model descriptors (standardized coefficient values).
References
Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.
Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.
Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.
Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.
Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.
XLSTAT, 2013. Software, XLSTAT-CCR module, Trial version.
7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas
httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 89
main reasons First the three descriptors in model are not sensitive
to ring type functional group and just count the number of
hydrogen atoms primary and secondary amines Second unsatu-
rated cyclic amines show poor absorption rate and capacity and
they are not potential absorbents for CO2 absorption (2) Therefore
according to the industrial point of view it is preferable to use
linear amine for CO2 absorption and it is more important for the
model to predict CO2 absorption capacity for linear amines rather
than unsaturated cyclic amines
Fortunately the results of the 1047297rst equation (Eq (1)) for pre-
dicting amines capacity of CO2 absorption are largely accepted
either for linear or aromatic ring type amines (items labeled 5 6
and 21) This is because of the presence of RDF descriptor in this
model RDFdescriptorsare based on the distance distribution in the
geometrical representation of a molecule This function is inde-
pendent of the number of atoms and is invariant against translation
and rotation of the entire molecule The RDFcode provides valuable
information eg about bond distances ring types planar and non-
planar systems and atom types so it is sensitive to aromatic rings
(Todeschini and Consonni 2008)
6 Conclusions
One of the main concerns of the natural gas industry is to have a
robust and accuratemodel which canpredict the chemical behavior
of amines for gas treatment process This study is attempted toidentify the effects chemical structure of amine on their capacity for
carbon dioxide absorption and develop a model for this purpose
which is not only robust and accurate but also simple and appli-
cable Therefore QSPR approach has been chosen as a modeling
technique and model has been developed based on linear method
for its simplicity As a result two linear equations were developed
First model demonstrate high prediction powerwhile second one is
notably simpler and powerfully interpretable due to the chemistry
of amines reaction with carbon dioxide Consequently second
equation introduced as a preferred model of this study The most
important descriptors appearing in the model due to the weight of
the corresponding variable are number of primary aliphatic amines
(nRNH2) number of secondary aliphatic amines (nRNHR) and
number of hydrogen atoms (nH) respectively The accuracy and
predictive performance of the model validated with various sta-
tistical tests and examined with the test set of 1047297ve molecules
permits using this model to estimate other amines rich loading
under speci1047297c conditions According to the results it could be
argued that a good amine solvent for carbon dioxide absorption
should have a linear structure with a high number of primary and
secondary amine groups as side chains In other words increasing
the number of primary and secondary amine groups results in
increasing the number of NH bonds active sites which causes the
amine reaction with CO2 to happen
The promising results of this study might aid other researchers
in the 1047297eld of chemistry and natural gas engineering to design and
synthesis new potential amine-based solvents and investigate the
feasibility of using them in gas removal processes New improved
solvents should also be compared to more conventional ones from
corrosively energy ef 1047297ciency and operability point of view
Acknowledgment
The authors would like to gratefully acknowledge the support
from Institute of Petroleum Engineering (IPE) University of Tehran
List of symbols
CO2 carbon dioxide
QSPRQSAR quantitative structure propertyactivity relationship
DFT Density Functional Theory
MLR Multiple Linear Regression
GA Genetic Algorithms
PCR principle component regression
PLS partial least square
HOMO Highest Occupied Molecular Orbital
LUMO Lowest Unoccupied Molecular Orbital
AC absorption capacity
PCA principal component analysis
LOO-CV Leave-one-out cross-validation
RMSE root mean square errordf M degrees of freedom of the model
df E degrees of freedom of the error
Fig. 4. Mean effects of model descriptors (standardized coefficient values).
M. Momeni, S. Riahi / Journal of Natural Gas Science and Engineering 21 (2014) 442–450