discovery and analysis of novel biochemical transformations
DESCRIPTION
Discovery and Analysis of Novel Biochemical Transformations. Linda J. Broadbelt. Department of Chemical and Biological Engineering Northwestern University Evanston, IL 60208. www.clemson.edu/edisto/ corn/corn.htm. How Can We Create Products from Natural Resources?. - PowerPoint PPT PresentationTRANSCRIPT
Discovery and Analysis of Novel Biochemical Transformations
Linda J. Broadbelt
Department of Chemical and Biological EngineeringNorthwestern University
Evanston, IL 60208
How Can We Create Products from Natural Resources?
•Biochemical processes are being explored as alternatives to traditionalchemical processes
Overall reaction
•Concern over dwindling petroleum-based resources sparks explorationof alternative feedstocks
www.clemson.edu/edisto/ corn/corn.htm
www.timberland.com/.../ tim_product_detail.jsp?OID=18298
biomass polysaccharides monosaccharides glucose ethanol
Reaction Networks of Novel Biochemical Transformations
•Reactants, intermediatesand products
•Reactions
•Thermodynamic parameters
DG1 DG2DG3
DG4
DG5DG7
DG6DG8
DG9
DG10
DG11
DG13DG12
DG14
DG15
DG16 DG17
DG18•Kinetic parameters
k1k2
k4
k6 k7
k9
k3
k8
k5
k10
k12k13
k15
k16
k14
k17
k11
k18
Challenges for Reaction Network Development
Reactive intermediates have not been detected
Pathways have not been elucidated experimentally
Thermodynamic and kinetic parameters are unknown
Reaction networks are large
Construction is tedious and prone to user’s bias and errors
Computer generation of reaction networks
Elements of Computer Generated Reaction Networks
• Graph Theory• Reaction Matrix Operations• Connectivity Scan• Uniqueness Determination• Property Calculation• Termination Criteria
ReactantsReactionTypesReactionRules
k1 k2
k4
k6 k7
k9
k3
k8k5
k10
k12k13
k15k16
k14
k17
k11
k18
DG1 DG2DG3
DG4
DG5DG7
DG6DG8
DG9
DG10
DG11
DG13DG12
DG14
DG15
DG16 DG17
DG18
Bond-Electron Representation Allows Implementation of Chemical Reaction
ij entries denote the bond order between atoms i and j ii entries designate the number of nonbonded electrons associated with atom i
methane methyl radical ethylene
C 0 1 1 1 1H 1 0 0 0 0H 1 0 0 0 0H 1 0 0 0 0H 1 0 0 0 0
C 0 2 1 0 0 1 C 2 0 0 1 1 0 H 1 0 0 0 0 0 H 0 1 0 0 0 0 H 0 1 0 0 0 0H 1 0 0 0 0 0
C 1 1 1 1H 1 0 0 0H 1 0 0 0
C 1 1 1 1H 1 0 0 0H 1 0 0 0H 1 0 0 0
Reaction OperationH 0 1 0C 1 0 0H• 0 0 1
H 0 0 1C• 0 1 0H 1 0 0
+ 0 -1 1-1 1 0 1 0 -1
ReactantMatrices Reactant
MatrixReordered
Reactant MatrixProductMatrix
C 0 1 1 1 1H 1 0 0 0 0H 1 0 0 0 0H 1 0 0 0 0H 1 0 0 0 0
H• 1
C 0 1 1 1 1 0H 1 0 0 0 0 0H 1 0 0 0 0 0H 1 0 0 0 0 0H 1 0 0 0 0 0H• 0 0 0 0 0 1
H 0 1 0 0 0 0C 1 0 0 1 1 1H• 0 0 1 0 0 0H 0 1 0 0 0 0H 0 1 0 0 0 0H 0 1 0 0 0 0
H 0 0 1 0 0 0C• 0 1 0 1 1 1H 1 0 0 0 0 0H 0 1 0 0 0 0H 0 1 0 0 0 0H 0 1 0 0 0 0
H • + CH4 •CH3 + H2
Chemical Reaction as a Matrix Addition Operation
EC i.j.k.l → unique enzyme
Tipton, S.B. and Boyce, S. Bioinformatics. 16 (2000), 34-40Kanehisa, M. and Goto, S. Nucleic Acid Research. 28 (2000), 27-30
i → main class j → functional groupk → cofactor / cosubstrate
Unique Patterns of Observed Enzyme Chemistry
4th level is specific to substrate
Enzyme commission (EC) code number provides systematic names for enzymes
EC i.j.k.l unique enzyme
i the main class j the specific functional groups k cofactors
l specific to the substrates
Formulation of Reaction Matrices Using Enzyme Classification System
Enzyme commission (EC) code number provides systematic names for enzymes
EC i.j.k.l unique enzyme
i the main class j the specific functional groups k cofactors
l specific to the substrates
Formulation of Reaction Matrices Using Enzyme Classification System
•More than 5,000 specific enzyme functions (i.j.k.l)•Fewer than 250 generalized enzyme functions (i.j.k)•Novel enzyme functions should be expected through genomic sequencing, proteomics and protein engineering
Generalized Enzyme Function Examined at the i.j.k Level
Example of a Generalized Enzyme Reaction
• EC 4.2.1.2 (fumarate hydratase)
• EC 4.2.1.3 ( aconitate hydratase)H-C-C-O-H
C=C + H2O
Generalized enzyme reaction
(EC 4.2.1)
H-C-C-O-H
C=C + H2O
- - - -
+HO2CCO2H
OH
HO2CCO2H
HO
H
+HO2CCO2H
CO2HH
OH
HO2CCO2H
HO
CO2H
Matrix Representation of Generalized Enzyme Function (i.j.k)
ProductsReaction operator
+
Reactant
++HO2CCO 2H
O H
HO2CCO 2H
O H
HO2CCO2H
HO2CCO2H
HO2CCO2H
H
O
HH
O
HH
O
H
4100O
1010C
0101C
0010H
OCCH
0-101O
-1010C
010-1C
10-1H
OCC
4001O
0020C
0200C
1000H
OCCH
Generalized enzyme reaction EC 4.2.1
C=C +H -C-C-O-H
H
HOH
=0
A + B + A + B
C C
D
C
I.J.KL.M.NQ.R.S
D
+ A + B
E
Generation1
Generation2
Generation3
A + B C D EI.J.K L.M.N Q.R.S
I.J.KL.M.NQ.R.S
I.J.KL.M.NQ.R.S
Generation0
Discovery of Novel Biosynthetic Routes
Implications for Novel Pathway Development
Given a novel reaction (reactant/product),can we identify enzymes (catalysts) thatcould be engineered (evolved) to carry thisnovel biotransformation ?
If A gives B under 2.4.1 action,
then target enzymes within the 2.4.1 class
Step 1Enumerate all enzymes in the EC system
Step 2Choose a specific pathway to explore its synthetic ability
ExampleAromatic amino acid biosynthesis
Application of Reaction Matrix Approach
Exists in higher plants and microorganisms Pathway does not exist in mammals
chorismate
prephenate
phenylalanine
tyrosine
4-hydroxyphenylpyruvate
phenylpyruvate
Aromatic Amino Acid Biosynthesis: Phenylalanine and Tyrosine
aromaticaminotransferase
prephenatedehydratase
chorismatemutase
prephenatedehydrogenase
glutamate
glutamate
chorismate
prephenate
phenylalanine
tyrosine
4-hydroxyphenylpyruvate
phenylpyruvate
aromaticaminotransferase
prephenatedehydratase
chorismatemutase
prephenatedehydrogenase
glutamate
glutamate
5.4.99.5
2.6.1.57
1.3.1.12 2.6.1.57
4.2.1.51
Aromatic Amino Acid Biosynthesis: Phenylalanine and Tyrosine
Reaction Misclassification (?)
Some reactions within classes are not generalGeneral 4.2.1 reaction (4. = lyase)Loses water (4.2.1 = hydrolyase) AND forms a double bond. However…
4.2.1.51 + H2O + CO2
It is both a carboxy-lyase (4.1.1)and a hydro-lyase (4.2.1)
prephenatedehydratase
4.2.1.51 + H2O + CO2
4.2.1.51 can be broken down into 3 general reactions:
4.1.1 will decarboxylate (4.1.1 is a carboxy-lyase)
5.3.3 will rearrange the double bond (5.3.3 transposes C=C bonds)
4.2.1 will lose H2O and form a double bond (4.2.1 is a hydro-lyase)
Reaction Decomposition
prephenatedehydratase
Mapping Results
•Although only 2500 reactions in the KEGG and 269 reactions in the iJR904 model were contained in the curated EC classes, 3267 (50%) of the KEGG reactions and 430 (46%) of the iJR904 reactions were reproduced using the 86 reaction rules
•The reproduced reactions are involved in 129 different third-level enzyme classes in the KEGG and iJR904
•100% of the reactions contained in 25 of the uncurated EC classes in the KEGG were mapped to the 86 existing reaction rules
Tryptophan Biosynthesis Pathway
Input Moleculesphosphoenolpyruvate (PEP), erythrose-4-phosphate (E4P), glutamine, serine, ribose-5-
phosphate (R5P)
CofactorsATP, NADPH
Specific Enzyme Actions12
The Evolution and Wealth of the Aromatic Amino Acid Biochemistry
Convergence
1
10
102
103
104
105
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Num
ber o
f Pro
duct
s
TYR
TRP
PHE
Generation Number
7-carboxyindole
Specialty Organic Chemicals
• 3-Hydroxypropanoate is a useful chemical with known biochemical production routes
3-Hydroxypropanoate from Pyruvate
• Generate all of the possible compounds and reactions from pyruvate using only the reaction rules involved in the known biosynthetic routes to 3HP
• Generate all of the possible compounds and reactions from pyruvate using all of the 86 current reaction rules
2 3 4 5 6 7 8 9 10
Novel Biosynthetic Pathways Discovered: Pyruvate to 3HP
Distribution of lengths of pathways
Pathway lengthN
umbe
r of p
athw
ays
1
102
103
104
10
105
106
A pathway of length two and a pathway of length three were both discovered using the additional reaction rules
pathways discovered using all reaction rules
pathways discovered using only the reaction rules involved in the known pathways to 3HP
Pyruvate
Oxaloacetate
Alanine
Lactate
Propenoate
Ethylamine
Hydroxyacetone
Homoserine
2-hydroxy-2,4-pentadienoate
Propan-2-ol
Propene
Propane-1-ol
Propanoate
Propanoate
TheonineLactaldehyde
Acrolein
Beta-alanine
3-hydroxypropanol
3-HP
Propane-1,3-diol
AspartateFumarate
Malate
Malonate semialdehyde
Ethanol
Allyl alcohol
Propane 1,2 diol
Ethylene
Acetaldehyde Propanoyl-
CoA
3HP-CoA
Lactoyl-CoAAcryloyl-CoA
Beta-alanyl-CoA
3-Oxopropionyl-CoA
Pyruvate
Oxaloacetate
Lactate
Beta-alanine
3-HP
Aspartate
Malonate semialdehyde
3HP-CoA
Lactoyl-CoAAcryloyl-CoA
Beta-alanyl-CoA
•Pathway length
•Fewest novel intermediates
•Thermodynamic feasibility
•Maximum achievable yield to 3HP from glucose during anaerobic growth
•Maximum achievable intracellular activity at which 3HP can be produced
•Protein docking calculations
•Quantum chemical investigations
What Screening Methods Can We Use to Identify the Most Attractive Pathways?
Shortest Novel Pathways to 3HP
Part of the Patented Pathway
KEGG Reaction not in Patented Pathways Not found in KEGG or Patented
Pathways
CO2
glu 2-oxoCO2
2-oxo
glu
nad
nadhH+
nad
nadhH+
H2O
CoA
H2O
1.1.
12.
3.1
4.2.
1
4.1.1 Rev 2.6.14.1.1
2.6.11.1.1
4.2.1 Rev 2.3.1 Rev
4.1.1 2.6.14.3.1
2.6.1
4.2.1 Rev 4.3.1
2.6.14.3.1
2.3.1
2.3.1
1.1.1
4.2.1 Rev
4.1.1 Rev
H2O
NH3
H2O CoA
H2O
CoA
glu
2-oxo
CoA
H2O
glu2-oxoCO2
CO2
NH3
H2O NH3
glu
2-oxo
H2O
nad
nadhH+
Pyruvate Oxaloacetate Aspartate
Lactate Β-alanine
Β-alanylCoA
Lactoyl CoA
Acryloyl CoA
3-HP-CoA 3-HP
MalonateSemialdehyde
Propenoate
Alanine Ethylamine Acetaldehyde
3-oxopropionylCoA
Pathway N1
Pathway N2
1.1.1 & 2.3.1
nadnadh H+
CoA
•Two-step pathway identified with only one novel reaction
•Maximum achievable yield to 3HP from glucose during anaerobic growth matches commercial pathway
•Slightly reduced maximum achievable intracellular activity at which 3HP can be produced
•Numerous other attractive candidates
Attractive Novel Pathways Successfully Identified
Can the enzyme that catalyzes decarboxylation of pyruvate perform catalysis of different substrates?
Decarboxylation reaction of ketoacids
PFOR (1.2.7.a) : pyruvate + CoA + Fd (ox) CO2+ acetyl-CoA + Fd (red)
Generalized enzyme operators can act on all of the above keto acids to give their corresponding products
Are These Novel Reactions Feasible?
• Substrate binding Docking analysis
• Ability to form initial enzyme-substrate bound species with no distortion to the active site of the enzyme or the cofactor
QM/MM structural studies• Follow the reaction pattern of the native substrate
Study of reaction mechanism using QM methods
Explore Novel Reactions Using Molecular Modeling
PFORSubstrate 1.2.7
pyruvic acid -10.72-ketobutyric acid -11.63
2-ketoisovaleric acid -11.562-ketovaleric acid -11.31
2-keto-3-methylvaleric acid -11.272-keto-4-methylpentanoic acid -11.01
phenylpyruvic acid X
Scored using GLIDE
Enzyme Docking Results
1 2
3
4 56
pyruvic acid
2-ketovaleric acid
2-keto-4-methylpentanoic acid
2-ketoisovaleric acid
2-ketobutyric acid
2-keto-3-methylvaleric acid
Enzyme Docking Poses
MM part : 50 Å of active site and solvent molecules ~20,000 atomsQM part : 63 atoms Geometry : B3LYP/6-31G*
Binding Using Quantum Mechanics/Molecular Mechanics
2-keto-4-methylpentanoic acid
pyruvic acid
2-keto-3-methylvaleric acid
2-ketoisovaleric acid
QM/MM structural studies suggest that the binding of the substrates does not cause distortions to the active site
Comparison of Bound Structures of Different Acids: QM/MM
N
S
H3C
N
N
NH2
CH3
N
S
H3C
N
N
NH2
CH3
CC
O
O
OH
C
N
S
H3C
N
N
NH2
CH3
CC
O
O
OH
E
N
S
H3C
N
N
NH2
CH3
C
C
O
O
OH
G
CC bond formation TSB
Proton Transfer TSD
C-C bond cleavage TS
FTS1 TS2
O
OH
A
HEThDP enamine
LThDP
ThDP ylide + KAVille et al., Nature Chemical Biology, 2(6), 2006, 324
Kinetics of Enzyme-Catalyzed Decarboxylation: Quantum Mechanics
-30
-25
-20
-15
-10
-5
0
5
10
15
Reaction coordinate
Rel
ativ
e fr
ee e
nerg
y ( K
cal/M
ol)
gas phase water dichloro ethane
TS 1
TS 2
ThDP + pyruvic acid
LThDP enamine + CO2
Free Energy Surface of Thiamine-Catalyzed Decarboxylation: Pyruvic Acid
Free energy barrier (∆Gactivation298K, DCE)
0
5
10
15
20
25
30
35
Ener
gy b
arrie
r (kc
al/m
ol)
C-C bond formation barrier C-C bond breaking barrier
O
OH
OO
OH
O
O
OH
O O
OH
OO
OH
O
O
O
OH
O
OH
O
Comparison of Thiamine-Catalyzed Decarboxylation
NOT present in KEGG NOT present in CAS REGISTRY
1,3,4,5-Tetrahydroxy-Cyclohexanecarboxylic acid
HO CO2H
HOOH
OH
3-[1-Carboxy-2-(1,4-dihydro-pyridin-3-yl)-ethoxy]-4-hydroxy-cyclohexa--1,5-dienecarboxylic acid
NH
O
CO2H
HO
CO2H
Present in KEGG(Kyoto Encyclopedia of Genes
and Genomes)
Exploring Novel Pathways and Molecules
New routes tobioavailable species New molecules
Migration to Biocatalytic Processes
HO CO2H
HOO
OH
NOT present in KEGGPresent in CAS REGISTRY
1,3,5-Trihydroxy-4-oxo-cyclohexanecarboxylic acid
New biochemical routesto existing chemicals
Acknowledgments
•Department of Energy•National Science Foundation Cyber-enabled Discoveryand Innovation
•Vassily Hatzimanikatis•Chunhui Li•Chris Henry•Goran Krilov•Raj Assary
Funding
Collaborators