Computational Modeling in Chemical Engineering
.edu/comocheng
Finding Transition States Algorithmically for Automatic Reaction Mechanism Generation
Pierre L. Bhoorasingh
Richard H. West
Can you predict TS geometries from molecular groups alone?
(this would be great)
Length of bond being broken, at TS for hydrogen abstraction (in Å with M06-2X/6-31+G(d,p))
[Table: 15 bond lengths, one row per radical and one column per molecule, with one cell left blank]
You can predict TS geometries from molecular groups alone!
But...
... you gave me a distance, not a geometry.
... I gave you 15 numbers then asked you for 1.
Automatic Transition State Theory (TST) would be a game-changer.
•Insight and predictions require detailed kinetic models.
•Error-free detailed models require automatic generation.
•Automatic generation requires reasonable estimates of millions of reaction rates.
•Current estimates are often unreasonable due to scarcity of data.
Automatic TS searches remain an important energy research goal
“An accurate description of the often intricate mechanisms of large-molecule reactions requires a characterization of all relevant transition states... Development of automatic means to search for chemically relevant configurations is the computational-kinetics equivalent of improved electronic structure methods.”
- Basic Research Needs for Clean and Efficient Combustion of 21st Century Transportation Fuels, US Dept of Energy (2006)
“...transformation from by-hand calculations of single reactions to automated calculations of millions of reactions would be a game-changer for the field of chemistry, and would be a good ‘Grand Challenge’ target...”
- Combustion Energy Frontier Research Center (2010)
First Annual Conference of the Combustion Energy Frontier Research Center (CEFRC), September 23-24, 2010, Princeton
An introduction to Reaction Mechanism Generator (RMG)
Automatically builds detailed kinetic models
facebook.com/rmg.mit
rmg.sourceforge.net
Molecules are represented as graphs
[Figure: CH3CH2. drawn as a graph with atom nodes C, C* and five H, joined by bond edges]
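A minimal sketch of the graph idea, assuming a plain adjacency-list layout. The atom indices, the `radical_electrons` dictionary, and the `adjacency` helper are names invented for this illustration and are not RMG's own data structures.

```python
from collections import Counter

# Ethyl radical CH3CH2. as an undirected graph (layout assumed for this sketch).
atoms = {0: "C", 1: "C", 2: "H", 3: "H", 4: "H", 5: "H", 6: "H"}
radical_electrons = {1: 1}  # one unpaired electron on carbon 1
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6)]


def adjacency(atoms, bonds):
    """Build an adjacency list {atom index: set of neighbour indices}."""
    adj = {i: set() for i in atoms}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


adj = adjacency(atoms, bonds)
formula = Counter(atoms.values())

print(dict(formula))                      # element counts
print(len(adj[0]), len(adj[1]))           # neighbours of each carbon
```

The radical carbon has three neighbours and the methyl carbon has four, which is the kind of local information a graph search can match against.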
Thermochemistry is often estimated by Benson group contributions
C-(C)(H)3
C-(C)2(H)2
Cb-(H)
C-(C)(Cb)(O)(H)
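As a sketch of how a group contribution estimate is assembled: each group in the molecule contributes a fixed amount and the contributions are summed. The numbers in `GROUP_VALUES` are rounded placeholders chosen for illustration, not values from these slides or from a Benson table, and `estimate_enthalpy` is a hypothetical helper.

```python
from collections import Counter

# PLACEHOLDER group values in kcal/mol, for illustration only.
GROUP_VALUES = {
    "C-(C)(H)3": -10.0,
    "C-(C)2(H)2": -5.0,
}


def estimate_enthalpy(groups):
    """Sum the contribution of every group occurrence in the molecule."""
    counts = Counter(groups)
    return sum(GROUP_VALUES[group] * n for group, n in counts.items())


# Propane: two C-(C)(H)3 groups and one C-(C)2(H)2 group.
propane = ["C-(C)(H)3", "C-(C)(H)3", "C-(C)2(H)2"]
print(estimate_enthalpy(propane))
```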
Reaction families propose all possible reactions with given species
bond breaking and hydrogen abstraction
intramolecular H-abstraction
•Template for recognizing reactive sites
•Recipe for changing the bonding at the site
•Rules for estimating the rate
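The three parts of a reaction family can be sketched as three functions bundled together. Everything here is an assumption made for illustration: the `ReactionFamily` class, the toy molecule layout (a list of atom labels, a set of bonds, a set of radical sites), and the constant returned by `placeholder_rate`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReactionFamily:
    name: str
    find_sites: Callable     # template: recognise reactive sites
    apply_recipe: Callable   # recipe: change the bonding at a site
    estimate_rate: Callable  # rules: estimate the rate


def find_h_abstraction_sites(atoms, bonds, radicals):
    """Template XH + Y.: every (X, H, Y) with an X-H bond and a radical Y."""
    sites = []
    for bond in bonds:
        a, b = tuple(bond)
        for x, h in ((a, b), (b, a)):
            if atoms[h] == "H" and atoms[x] != "H":
                sites += [(x, h, y) for y in radicals if y not in (x, h)]
    return sorted(sites)


def abstract_h(bonds, radicals, site):
    """Recipe: break X-H, form Y-H, move the radical from Y to X."""
    x, h, y = site
    new_bonds = (bonds - {frozenset((x, h))}) | {frozenset((h, y))}
    new_radicals = (radicals - {y}) | {x}
    return new_bonds, new_radicals


def placeholder_rate(atoms, site):
    """Rule: a made-up constant; a real rule would depend on X and Y."""
    return 1.0


family = ReactionFamily("H abstraction", find_h_abstraction_sites,
                        abstract_h, placeholder_rate)

# CH4 + .OH: atoms 0-4 are methane, atoms 5-6 are the hydroxyl radical.
atoms = ["C", "H", "H", "H", "H", "O", "H"]
bonds = {frozenset(p) for p in [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6)]}
radicals = {5}

sites = family.find_sites(atoms, bonds, radicals)
new_bonds, new_radicals = family.apply_recipe(bonds, radicals, sites[0])
print(sites)
print(sorted(new_radicals))
```

The template finds one site per C-H bond, and applying the recipe to the first site leaves the radical on carbon.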
Octane autoxidation has many pathways
•Some pathways go further than others.
Faster pathways are explored further
[Figure: reaction network of species A to H, shown at two stages of exploration]
Edge requires many reaction rates
100 species, 1,000 reactions
15,000 species, 180,000 reactions
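One way to read "faster pathways are explored further" is as a flux test on the edge species: the edge species being formed fastest is promoted into the model next, if its flux is large enough. The sketch below is a simplified illustration under that assumption. The function name, the `tolerance` parameter, and the flux numbers are all hypothetical.

```python
def pick_next_species(edge_fluxes, core_flux_scale, tolerance=0.1):
    """Return the edge species with the largest formation flux,
    or None if that flux does not exceed tolerance * core_flux_scale."""
    species, flux = max(edge_fluxes.items(), key=lambda item: item[1])
    if flux > tolerance * core_flux_scale:
        return species
    return None


# Made-up formation fluxes for three edge species.
edge_fluxes = {"E": 0.5, "F": 0.02, "G": 0.3}

print(pick_next_species(edge_fluxes, core_flux_scale=1.0))
print(pick_next_species(edge_fluxes, core_flux_scale=10.0))
```

Every edge species needs a rate estimate before its flux can be compared, which is why the edge drives the demand for reaction rates.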
Rate estimates are based on the local structure of the reacting sites.
•Hydrogen abstraction: XH + Y. → X. + YH
•Rate depends on X and Y.
Rate estimation rules are organized in a tree
Part of the tree for X
Part of the tree for Y
Ideal tree: lots of data
Typical tree: sparse data
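With sparse data, a tree lookup has to fall back to a more general node when the specific one has no rule. A minimal sketch, assuming a single tree stored as child-to-parent links. The node names and rate values are invented placeholders, and a real lookup would involve one tree for X and one for Y.

```python
# Hypothetical tree for X, stored as child -> parent (None marks the root).
PARENT = {
    "C_pri": "C_sp3",
    "C_sec": "C_sp3",
    "C_sp3": "X_H",
    "O_H": "X_H",
    "X_H": None,
}

# Sparse data: only some nodes carry a rule (placeholder values).
RULES = {"X_H": 1.0e6, "C_sec": 3.0e7}


def lookup(node):
    """Walk from the most specific node towards the root until a rule is found."""
    while node is not None:
        if node in RULES:
            return node, RULES[node]
        node = PARENT[node]
    raise KeyError("no rule found anywhere on the path to the root")


print(lookup("C_sec"))  # exact match
print(lookup("C_pri"))  # falls back to the root rule
```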
So that was RMG...
...but what about TS geometries?
Single method not feasible for all reaction types
Intra-H migration
Intra-OH migration
Birad recombination
Intra R addition exocyclic
Intra R addition endocyclic
1,2 birad to alkene
Beta scission
Diels-Alder
Radical recombination
Radical addition
Peroxyradical HO2 elimination
1+2/2+2 cycloaddition
Cyclic ether formation
1,2 insertion
1,3 insertion CO2/ROR
Radical addition COO radical recombination
H abstraction
Disproportionation
But a single method can apply to multiple reaction types
A ⇌ B: Intra-H migration, Intra-OH migration, Birad recombination, Intra R addition exocyclic, Intra R addition endocyclic, 1,2 birad to alkene
A ⇌ B + C: Beta scission, Diels-Alder, Radical recombination, Radical addition, 1+2/2+2 cycloaddition, Cyclic ether formation, 1,2 insertion, 1,3 insertion CO2/ROR, Radical addition COO radical recombination, Peroxyradical HO2 elimination
A + B ⇌ C + D: H abstraction, Disproportionation
Want robust and user-friendly 3D representation
•Internal coordinates
  •Alter distances and angles
•Cartesian coordinates
  •Translate, rotate atoms
•Distance geometry
  •Alter only distances

Atom  X   Y   Z
1     x1  y1  z1
2     x2  y2  z2
3     x3  y3  z3
4     x4  y4  z4
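A short sketch of why a distance-based description is convenient: the matrix of interatomic distances is unchanged when the whole molecule is translated, so only the distances themselves need editing. The coordinates and the `distance_matrix` helper are invented for this example.

```python
import math


def distance_matrix(coords):
    """Return the full matrix of pairwise Euclidean distances."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Three made-up atom positions (x, y, z).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
d = distance_matrix(coords)

# Translate every atom by the same vector: the distances do not change.
shifted = [(x + 1.0, y - 2.0, z + 5.0) for x, y, z in coords]
d_shifted = distance_matrix(shifted)

print(d[1][2])
print(all(math.isclose(a, b) for ra, rb in zip(d, d_shifted) for a, b in zip(ra, rb)))
```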
Use RDKit’s geometry editing tools for atom positioning
[Flowchart: RMG Molecule (connectivity) → Generate bounds matrix (atoms list × atoms list, holding upper limits and lower limits) → Edit bounds matrix → Embed in 3D → 3D structure]
Edit multiple distances to precisely position atoms involved in reactions

     C     H     H     H     H     O     O     H
C    0     1.12  1.12  1.12  1.12  1000  1000  1000
H    1.1   0     1.86  1.86  1.86  1000  1000  1000
H    1.1   1.78  0     1.86  1.86  1000  1000  1000
H    1.1   1.78  1.78  0     1.86  1000  1000  1000
H    1.1   1.78  1.78  1.78  0     1000  1000  1000
O    3.65  2.9   2.9   2.9   2.9   0     1.33  1.04
O    3.65  2.9   2.9   2.9   2.9   1.31  0     1.97
H    3.15  2.4   2.4   2.4   2.4   1.02  1.89  0

Edited entries shown on the slide: 2.0, 2.1, 2.5, 2.6
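A minimal sketch of editing such a matrix, assuming upper limits are stored above the diagonal and lower limits below it, as the values on the slide suggest. The `set_bounds` helper is hypothetical, and the atom pairs chosen for the 2.0/2.1 and 2.5/2.6 edits are an assumption, since the slide text does not say which entries were changed.

```python
# Bounds matrix from the slide, atom order C H H H H O O H (distances in Å).
bounds = [
    [0,    1.12, 1.12, 1.12, 1.12, 1000, 1000, 1000],
    [1.1,  0,    1.86, 1.86, 1.86, 1000, 1000, 1000],
    [1.1,  1.78, 0,    1.86, 1.86, 1000, 1000, 1000],
    [1.1,  1.78, 1.78, 0,    1.86, 1000, 1000, 1000],
    [1.1,  1.78, 1.78, 1.78, 0,    1000, 1000, 1000],
    [3.65, 2.9,  2.9,  2.9,  2.9,  0,    1.33, 1.04],
    [3.65, 2.9,  2.9,  2.9,  2.9,  1.31, 0,    1.97],
    [3.15, 2.4,  2.4,  2.4,  2.4,  1.02, 1.89, 0],
]


def set_bounds(matrix, i, j, lower, upper):
    """Constrain the i-j distance: upper limit above the diagonal, lower below."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    first, second = min(i, j), max(i, j)
    matrix[first][second] = upper
    matrix[second][first] = lower


# ASSUMED atom pairs: one methane H (index 1) and the carbon (index 0),
# each positioned relative to the terminal O (index 6).
set_bounds(bounds, 1, 6, lower=2.0, upper=2.1)
set_bounds(bounds, 0, 6, lower=2.5, upper=2.6)

print(bounds[1][6], bounds[6][1])
print(bounds[0][6], bounds[6][0])
```

Narrowing the window between the lower and upper limit is what pins the reacting atoms in place when the matrix is later embedded in 3D.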
Double-ended algorithms find transition state estimates
[Figure: Reactants (R) and Products (P) as the two ends of the search]
Position molecules for double-ended searches
Best guess: just either side of TS
Method tested with semi-empirical calculations
•Two double-ended algorithms tested
  •QST2 at PM6 in Gaussian09
  •SADDLE at PM7 in MOPAC2012
•Reaction path analysis validated the saddle points
[Flowchart: Reaction from RMG → for Reactants and for Products: Generate Bounds Matrix → Edit Bounds Matrix close to TS → Embed Matrix in 3D → Double-ended Search → Optimize TS geometry → IRC Calculation]
Path analysis algorithms descend to find the reactants and products
[Figure: paths descending from the saddle point to R and P]
A closer look at the automatic TS search process for H abstraction
338 Reactions from the NIST Database
[Flowchart: Get bounds matrix → Embed geometry either side of TS → TS search and refinement → Reaction path analysis → Compare to desired reactants & products → Succeed]
Failure modes: VdW collisions; No TS at this ES level; Bond perception
Outcomes are broken down by abstracting radical: H., .OH, other radical
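The search process above can be sketched as a chain of stages that stops at the first failure and reports where it happened. The `run_ts_search` driver and the stub stages are hypothetical stand-ins: the real stages call quantum chemistry codes, which is outside the scope of this sketch.

```python
def run_ts_search(reaction, stages):
    """Run the stages in order; report the first stage that fails."""
    data = reaction
    for name, stage in stages:
        data = stage(data)
        if data is None:  # convention assumed here: None means failure
            return ("Fail", name)
    return ("Succeed", data)


def passes(data):
    """Stub stage that always succeeds and hands its input on."""
    return data


def fails(data):
    """Stub stage that always fails, e.g. no TS at this level of theory."""
    return None


stage_names = [
    "Get bounds matrix",
    "Embed geometry either side of TS",
    "TS search and refinement",
    "Reaction path analysis",
    "Compare to desired reactants & products",
]

all_pass = [(name, passes) for name in stage_names]
one_fails = [(name, fails if name == "TS search and refinement" else passes)
             for name in stage_names]

print(run_ts_search("CH4 + OH", all_pass))
print(run_ts_search("CH4 + H", one_fails))
```

Recording the failing stage for each reaction is what makes it possible to tally failures by cause.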
Species matching returned false negatives due to incorrect bond order perception.
Connect the dots → Perceive bond order → Check valencies
[Figure: CH4 example for R and P, comparing observed with expected structures]
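The "connect the dots" and "check valencies" steps can be sketched with a simple distance criterion: two atoms are bonded if they are closer than the sum of their covalent radii plus a tolerance. The 0.45 Å tolerance, the radii table, the valence limits, and the methane coordinates are assumptions made for this illustration. Bond order perception is not attempted here.

```python
import math

COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "O": 0.66}  # Å
MAX_VALENCE = {"H": 1, "C": 4, "O": 2}


def connect_the_dots(symbols, coords, tolerance=0.45):
    """Bond every pair closer than the sum of covalent radii plus a tolerance."""
    bonds = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            cutoff = COVALENT_RADIUS[symbols[i]] + COVALENT_RADIUS[symbols[j]] + tolerance
            if math.dist(coords[i], coords[j]) <= cutoff:
                bonds.append((i, j))
    return bonds


def over_valent_atoms(symbols, bonds):
    """Return indices of atoms with more neighbours than their valence allows."""
    neighbours = [0] * len(symbols)
    for i, j in bonds:
        neighbours[i] += 1
        neighbours[j] += 1
    return [i for i, s in enumerate(symbols) if neighbours[i] > MAX_VALENCE[s]]


# Approximate methane geometry: C-H about 1.09 Å, tetrahedral.
symbols = ["C", "H", "H", "H", "H"]
a = 0.629
coords = [(0, 0, 0), (a, a, a), (-a, -a, a), (-a, a, -a), (a, -a, -a)]

bonds = connect_the_dots(symbols, coords)
print(bonds)
print(over_valent_atoms(symbols, bonds))
```

At a stretched TS-like geometry the same distance test can add or miss bonds, which is one way a species comparison ends up with a false negative.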
Most failures involve reactions with small molecules
Small radicals need to be closer to the molecule they are abstracting from
•All abstractions by H. failed
•Many with other small radicals (e.g. .OH) also failed
Learn from the successful saddle points to improve automatic searches
Semi-empirical estimates used for DFT calculations
•Check semi-empirical geometry validity
•Use geometry as input to DFT calculations
•Check DFT geometry validity
[Flowchart: Reaction from RMG → for Reactants and for Products: Generate Bounds Matrix → Edit Bounds Matrix close to TS → Embed Matrix in 3D → Double-ended Search → Optimize TS geometry → IRC Calculation → Optimize TS geometry at DFT → IRC Calculation at DFT]
Trends observed in DFT saddle point geometries
Structure method: M06-2X
Basis set: 6-31+G(d,p)
[Figure: TS geometry with atoms labelled X, H and Y•]
Estimate geometry directly via group additive distance estimates
[Flowchart: Reaction from RMG → Generate Bounds Matrix → Edit Bounds Matrix for TS → Embed Matrix in 3D → Optimize TS geometry, shown next to the double-ended route: Generate Bounds Matrix → Edit Bounds Matrix close to TS → Embed Matrix in 3D for Reactants and Products → Double-ended Search → IRC Calculation]
•Database arranged in tree structure as for kinetics
•Trained on successfully optimized transition states
•Direct guess much faster than double-ended search
•Success depends on training data
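A sketch of how a group additive distance estimate could feed the bounds matrix: a base distance is corrected by one term for the X group and one for the Y group, then wrapped in a narrow window to give lower and upper limits. The base value, corrections, group names, and window width are all placeholders invented for illustration, not trained values.

```python
# PLACEHOLDER values in Å, for illustration only.
BASE_DISTANCE = 1.30
X_CORRECTION = {"C_pri": 0.00, "C_sec": -0.02}
Y_CORRECTION = {"OH": -0.05, "CH3": 0.03}


def estimate_distance(x_group, y_group):
    """Base distance plus one additive correction for each reacting group."""
    return BASE_DISTANCE + X_CORRECTION[x_group] + Y_CORRECTION[y_group]


def bounds_for(distance, window=0.05):
    """Turn a point estimate into (lower, upper) limits for the bounds matrix."""
    return (distance - window, distance + window)


distance = estimate_distance("C_sec", "OH")
lower, upper = bounds_for(distance)
print(round(distance, 3), round(lower, 3), round(upper, 3))
```

Only one embedding is needed with this route, which is where the speed advantage over a double-ended search comes from.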
Comparison of the developed methods

                         Double-Ended Searches          Direct Estimates
Input requirements       2 rough estimates              1 good estimate
Distance specifications  One rule for all               Group based estimates
Optimization Methods     QST2, SADDLE, Surface Walking  Surface Walking
Computational Speed      Slower                         Faster
Small radical reactions  Problematic                    Better
Multiple conformers      Problematic                    Possible
Contributions
•Explained Reaction Mechanism Generator (RMG).
•Created framework to find TS geometries using RMG and RDKit for distance geometry.
•Categorized reaction families, and chose H-abstraction as first target.
•Implemented double-ended TS searches that work with no training data.
•Identified trends in functional group contributions to TS geometries.
•Implemented direct guesses based on group additive estimates, and started to train group values.
Department of Chemical Engineering