Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search

Binghong Chen¹, Chengtao Li², Hanjun Dai³, Le Song¹,⁴

¹Georgia Institute of Technology, ²Galixir, ³Google Research, Brain Team, ⁴Ant Financial

ICML 2020 Presentation



Too Long; Didn't Watch

Our paper formulates the optimal retrosynthetic planning problem and proposes an A*-like algorithm that learns from experience to solve it, achieving state-of-the-art performance on a real-world benchmark dataset.

Background: Retrosynthesis Problem

Task: predict synthesis routes for target molecules.

Challenge: combinatorial search space.

Sub-problems: one-step retrosynthesis and retrosynthetic planning.

[Figure: a target molecule is traced back through intermediate compounds to building blocks.]

Background: One-step Retrosynthesis

Given a product $t$, predict candidate sets of reactants.

$S_i$: the $i$-th set of predicted reactants. $c(R_i)$: cost of the $i$-th reaction.

[Figure: a product molecule $t$ (given) and its predicted reactants.]

Background: Retrosynthetic Planning

Plan a synthesis route from the reaction candidates produced by the one-step model $B$.

Motivation: find better routes, that is, routes that are shorter, higher-yielding, more chemically sound, more economically efficient, more environmentally friendly, … (you name it).

[Figure: a route from building blocks $M$ to the target molecule $t$ through reactions $R_1$, $R_2$, $R_3$ with costs $c(R_1)$, $c(R_2)$, $c(R_3)$.]

Problem: Optimal Retrosynthetic Planning

Given: a target molecule $t$, a set of building blocks $M$, and a one-step model $B$.

Optimal planning: minimize $c(R_1) + c(R_2) + \cdots + c(R_k)$, where $R_1, R_2, \ldots, R_k$ is a series of possible reactions predicted with $B$ that start from molecules in $M$ and ultimately lead to the synthesis of $t$.

Note (practical efficiency constraint): the number of calls to the one-step model $B$ should be limited.
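The objective above can be illustrated with a minimal sketch. The `Reaction` record, the function names, and the toy molecules and costs are hypothetical and chosen only for illustration; the slides do not prescribe a data structure. A route is treated as a list of reactions ordered from building blocks towards the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Hypothetical record for one predicted reaction R with cost c(R)."""
    product: str
    reactants: tuple
    cost: float


def route_cost(route):
    """Total cost c(R_1) + ... + c(R_k) of a route."""
    return sum(r.cost for r in route)


def is_valid_route(route, target, building_blocks):
    """Check that every reaction only consumes molecules that are building
    blocks or products of earlier reactions, and that the target is reached."""
    available = set(building_blocks)
    for r in route:
        if not all(m in available for m in r.reactants):
            return False
        available.add(r.product)
    return target in available


# Toy example (made-up molecules and costs).
route = [
    Reaction("b", ("f", "k"), 1.0),
    Reaction("t", ("a", "b", "c"), 2.0),
]
```

Optimal planning then amounts to finding, among all valid routes, the one with the smallest `route_cost`, while keeping the number of one-step model calls small.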

Retro*: Contributions

First algorithm to learn from previous planning experience.
State-of-the-art performance on a real-world benchmark dataset.
Able to induce a search algorithm that guarantees finding the optimal solution.

Retro*: AND-OR Tree Representation

Each molecule is encoded as an `OR` node (like $m$), requiring at least one of its children to be ready.

Each reaction is encoded as an `AND` node (like $P$), requiring all of its children to be ready.

All building blocks are ready. A solution is found when the root is ready.

Example:
Reaction $P$: molecule $c$ + molecule $d$ → molecule $m$.
Reaction $Q$: molecule $f$ → molecule $m$.
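The readiness rules of the AND-OR tree can be sketched as a short recursive check. The function name, the dictionary layout (`reactions_for` maps a molecule to the reactant tuples of its candidate reactions), and the cycle guard are assumptions made for this illustration, not part of the slides.

```python
def molecule_ready(mol, reactions_for, building_blocks, _visiting=frozenset()):
    """OR node: a molecule is ready if it is a building block or if at
    least one reaction producing it has all of its reactants ready."""
    if mol in building_blocks:
        return True
    if mol in _visiting:  # guard against cyclic reaction definitions
        return False
    visiting = _visiting | {mol}
    return any(
        # AND node: a reaction is ready only if every reactant is ready.
        all(molecule_ready(r, reactions_for, building_blocks, visiting)
            for r in reactants)
        for reactants in reactions_for.get(mol, [])
    )


# The example from the slide: P: c + d -> m, and Q: f -> m.
reactions_for = {"m": [("c", "d"), ("f",)]}
```

With building blocks `{"f"}`, molecule `m` is ready through reaction Q alone; with only `{"c"}` it is not, because P also needs `d`.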

Retro*: Algorithm Framework

Key idea: prioritize the synthesis of the molecules in the current best plan.

Definition of $V_t(m \mid T)$: under the current search tree $T$, the cost of the current best plan containing $m$ for synthesizing target $t$.

Retro*: Computing $V_t(m \mid T)$ via Tree-DP

By decomposing $V_t(m \mid T)$ into simpler components in a recursive fashion, we can compute its value efficiently via tree-structured dynamic programming.

(1) Boundary case: $V_m \equiv V_m(m \mid \emptyset)$, the cost of synthesizing frontier node $m$.

(2) Define the reaction number $rn(\cdot \mid T)$: the minimum estimated cost needed for a molecule/reaction to happen in the current tree.

(3) Compute $V_t(m \mid T)$ with $rn(\cdot \mid T)$.
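The three steps can be sketched as a bottom-up pass for the reaction numbers followed by a top-down pass for $V_t(m \mid T)$. This is a sketch under stated assumptions: the function names, the dictionary-based tree encoding, and the numeric costs and values are made up, and the top-down rule used here (replace the best estimate of a molecule by the estimate of the chosen reaction) is one way to express the decomposition rather than a formula taken from the slides.

```python
import math


def reaction_numbers(root, mol_children, rxn_children, cost, value):
    """Bottom-up pass. Frontier molecule: V_m. Expanded molecule: min over
    its reactions. Reaction: c(R) plus the sum over its reactants."""
    rn = {}

    def visit(node):
        if node in rxn_children:        # AND node (reaction)
            rn[node] = cost[node] + sum(visit(m) for m in rxn_children[node])
        elif node in mol_children:      # expanded OR node (molecule)
            rn[node] = min((visit(r) for r in mol_children[node]),
                           default=math.inf)
        else:                           # frontier molecule
            rn[node] = value[node]
        return rn[node]

    visit(root)
    return rn


def plan_values(root, mol_children, rxn_children, rn):
    """Top-down pass: cost of the best plan forced to contain each node."""
    vt = {root: rn[root]}
    stack = [root]
    while stack:
        mol = stack.pop()
        for rxn in mol_children.get(mol, []):
            # Swap the molecule's best estimate for this reaction's estimate.
            vt[rxn] = vt[mol] - rn[mol] + rn[rxn]
            for child in rxn_children[rxn]:
                vt[child] = vt[rxn]
                stack.append(child)
    return vt


# Toy tree: t -> {P, Q}; P needs a, b, c; Q needs d, e; b -> R needs f, k.
mol_children = {"t": ["P", "Q"], "b": ["R"]}
rxn_children = {"P": ["a", "b", "c"], "Q": ["d", "e"], "R": ["f", "k"]}
cost = {"P": 1.0, "Q": 2.0, "R": 1.0}                       # made-up c(R)
value = {"a": 1.0, "c": 1.0, "d": 2.0, "e": 1.0, "f": 1.0, "k": 2.0}  # made-up V_m

rn = reaction_numbers("t", mol_children, rxn_children, cost, value)
vt = plan_values("t", mol_children, rxn_children, rn)
```

In this toy tree, `vt["f"]` equals `cost["P"] + cost["R"]` plus the values of `a`, `c`, `f`, and `k`, which is the sum of reaction costs and frontier estimates along the plan that contains `f`.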

Retro*: Example for Computing $V_t(m \mid T)$

[Figure: example search tree rooted at target $t$. Reaction $P$ has reactants $a$, $b$, $c$; reaction $Q$ has reactants $d$, $e$; reaction $R$, which produces $b$, has reactants $f$, $k$.]

We learn $V_m \equiv V_m(m \mid \emptyset)$, the cost of synthesizing $m$.

$$V_t(f \mid T) = \underbrace{c(P) + c(R)}_{g_t(f \mid T)} + \underbrace{V_a + V_c + V_f + V_k}_{h_t(f \mid T)}$$

This is the cost-so-far plus estimated-cost-to-go decomposition of the A* algorithm.

A* admissibility: finding an optimal solution is guaranteed if $V_m$ is a lower bound.

Note: 0 is a lower bound of $V_m$ for any molecule $m$ if $c(R) = -\log \mathrm{Prob}(R) \ge 0$.

$$rn(t \mid T) = \min\{rn(P \mid T),\ rn(Q \mid T)\}, \qquad rn(Q \mid T) = c(Q) + V_d + V_e$$

Retro*: Learning to Plan

[Figure: learning loop on the example search tree: collect planning data, learn $V_m$, plan with the learned model, …]

Retro*: Training Objective

Dataset: each tuple contains a target molecule $m_i$, its best synthesis cost $v_i$, the expert reaction $R_i$, and the one-step retrosynthesis candidates $B(m_i)$.

Optimize: a regression loss together with a consistency loss.

Regression loss: fits the learned value of $m_i$ to its best synthesis cost $v_i$.

Consistency loss: encourages the constraint below for candidate reactions $R_j$ with reactant sets $S_j$.

Constraint:

$$c(R_j) + \sum_{m' \in S_j} V_{m'} > v_i + \epsilon$$

Exp: Creating Benchmark Dataset

[Figure: benchmark construction pipeline. Labels: USPTO reactions split 80% / 10% / 10% (A, B, C); eMolecules building blocks; train routes, val routes, test routes; 189 routes; train one-step model $B$.]

Exp: Baselines & Evaluation

Baselines:
o Greedy: greedy depth-first search that prioritizes the reaction with the highest likelihood.
o MCTS: Monte Carlo Tree Search (Segler et al., 2018).
o DFPN-E: a variant of Proof Number Search (Kishimoto et al., 2019).
o Retro*-0: obtained by setting $V_m$ to a lower bound, 0 (ablation study).

Evaluation:
o Time: number of calls to the one-step model (≈ 0.3 s per call, accounting for > 99% of the running time).
o Solution quality: total cost of reactions / number of reactions (length).

Exp: Results

Figure: Counts of the best solutions among all algorithms in terms of length/cost.

Figure: Influence of time on performance.


Exp: Sample Solution

Expert route: requires 4 steps to synthesize the molecule.

Retro* solution

Figure: Sample solution route produced by Retro*. The expert route requires 3 more steps to synthesize one molecule in the route.

Future Work

Learning to plan in theorem proving
o Same search space

Polymer retrosynthesis
o More patterns and constraints: chain reaction
o No polymerization dataset: transfer knowledge from small molecules

Thanks for listening!

For more details, please refer to our paper, full slides, or poster.

Background: Reaction Template

[Figure: a known retrosynthesis of a product into reactant A and reactant B, with the atom-mapped reaction core (C:1, C:2, C:3, N:4, C:5) shown as a template.]

Template: subgraph rewriting rules (i.e., the reaction core, highlighted in red).

Given a product, how do we apply a template? RDKit's runReactants will produce a list of precursors.

One-step Retrosynthesis

Template-based approach: Graph Logic Network
Dai, Hanjun, et al. "Retrosynthesis Prediction with Conditional Graph Logic Network." Advances in Neural Information Processing Systems. 2019.

Template-free approach: Seq2seq models
Karpov, Pavel, Guillaume Godin, and Igor V. Tetko. "A transformer model for retrosynthesis." International Conference on Artificial Neural Networks. Springer, Cham, 2019.

Existing Planners

Monte Carlo Tree Search
Segler, Marwin HS, Mike Preuss, and Mark P. Waller. "Planning chemical syntheses with deep neural networks and symbolic AI." Nature 555.7698 (2018): 604-610.

Proof Number Search
Kishimoto, Akihiro, et al. "Depth-First Proof-Number Search with Heuristic Edge Cost and Application to Chemical Synthesis Planning." Advances in Neural Information Processing Systems. 2019.

Existing Planner – MCTS

Search Tree Representation

Algorithm framework: rollouts are time-consuming.

Existing Planner – PNS

Formulates the retrosynthesis problem as a two-player game.
Optimizes for quickly finding one route.
Requires a hand-designed criterion during search.

Search Tree Representation

Existing Planners Summary

Monte Carlo Tree Search
o Rollouts are time-consuming and come with high variance.
o Sparsity in variance estimation.
o Not optimized for total costs.

Proof Number Search
o Formulation mismatch.
o Hand-designed criterion during search, hard to tune and generalize.
o Not optimized for total costs.

Algorithm Framework

(a) Select the most promising frontier node.
(b) Expand the node with the one-step model.
(c) Update the current estimate of the $V$ function.

[Figure: the selected frontier node $m_{next}$ is expanded into candidate reactions $R_i$ with reactant sets $S_i$.]
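The select / expand / update loop can be sketched in a few dozen lines. Everything named here is an assumption made for illustration: `retro_star`, `one_step` (a callable returning `(reactants, cost)` pairs), `value_fn` (the estimate of $V_m$), and the integer node ids. For brevity the sketch recomputes all values from scratch after each expansion instead of updating them incrementally, and it stops once no frontier node has a plan estimate below the best complete route found so far.

```python
import math


def retro_star(target, building_blocks, one_step, value_fn, max_calls=100):
    """Return (best route cost or None, number of one-step calls)."""
    mol = {0: target}   # molecule-node id -> molecule name
    rxn_cost = {}       # reaction-node id -> reaction cost
    children = {}       # node id -> child ids (absent for unexpanded molecules)
    next_id, calls = 1, 0

    def rn(n, solved_only=False):
        # Reaction number. With solved_only=True, unexpanded molecules cost
        # infinity, so rn(0) becomes the cost of the best complete route.
        if n in rxn_cost:
            return rxn_cost[n] + sum(rn(c, solved_only) for c in children[n])
        if mol[n] in building_blocks:
            return 0.0
        if n in children:
            return min((rn(r, solved_only) for r in children[n]),
                       default=math.inf)
        return math.inf if solved_only else value_fn(mol[n])

    def frontier_scores():
        # Plan estimate for every frontier molecule, computed top-down.
        scores, stack = {}, [(0, 0.0)]  # (node, plan cost outside its subtree)
        while stack:
            n, rest = stack.pop()
            if mol[n] in building_blocks:
                continue
            if n not in children:
                scores[n] = rest + value_fn(mol[n])
                continue
            for r in children[n]:
                for k in children[r]:
                    siblings = sum(rn(j) for j in children[r] if j != k)
                    stack.append((k, rest + rxn_cost[r] + siblings))
        return scores

    while calls < max_calls:
        scores = frontier_scores()
        # Stop when no frontier node can beat the best route found so far.
        if not scores or rn(0, solved_only=True) <= min(scores.values()):
            break
        n = min(scores, key=scores.get)            # (a) select
        calls += 1
        children[n] = []
        for reactants, cost in one_step(mol[n]):   # (b) expand
            r, next_id = next_id, next_id + 1
            rxn_cost[r], children[r] = cost, []
            children[n].append(r)
            for m in reactants:
                mol[next_id] = m
                children[r].append(next_id)
                next_id += 1
        # (c) update: values are recomputed from scratch on the next pass.

    best = rn(0, solved_only=True)
    return (None if math.isinf(best) else best), calls


# Toy one-step model (made-up reactions and costs).
rules = {"t": [(("a", "b"), 1.0), (("d",), 5.0)], "b": [(("f",), 1.0)]}
toy_one_step = lambda m: rules.get(m, [])
```

Calling `retro_star("t", {"a", "d", "f"}, toy_one_step, lambda m: 0.0)` uses the zero lower bound as the value estimate, which corresponds to the Retro*-0 setting listed among the baselines.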

Retro*: Theoretical Guarantees

Proof: similar to the A* admissibility proof.

Remark: 0 is a lower bound of $V_m$ for any molecule $m$ if the cost is defined as the negative log-likelihood.

Retro*: Representing Value Function $V_m$

[Figure: molecule → fingerprint → value prediction $V_m$.]
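The training-objective slide names a regression loss and a consistency loss for this value function but the transcript only preserves the constraint $c(R_j) + \sum_{m' \in S_j} V_{m'} > v_i + \epsilon$. The sketch below shows one plausible pair of per-example losses. The squared-error and hinge forms, the function names, and the default margin are assumptions for illustration, not formulas recovered from the slides.

```python
def regression_loss(v_pred, v_target):
    """Assumed squared error between the predicted value of m_i and its
    best synthesis cost v_i."""
    return (v_pred - v_target) ** 2


def consistency_loss(v_target, rxn_cost, reactant_values, eps=1.0):
    """Assumed hinge penalty on the constraint
    c(R_j) + sum of V_m' over m' in S_j > v_i + eps.
    It is zero once the left side reaches v_i + eps and grows linearly
    with the size of the violation otherwise."""
    return max(0.0, v_target + eps - rxn_cost - sum(reactant_values))
```

A candidate reaction whose estimated total cost falls below the best known synthesis cost plus the margin is penalized, which pushes the learned values of its reactants upward.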

One-step Model Training

Template-based MLP model.
o There are ∼380K distinct templates.
o Multi-class classification.
o Predicts the top-50 reaction templates for each target.
o Applies the templates to obtain the corresponding reactants.

Trained on the training reaction set.

Cost is defined as the negative log-likelihood of the predicted reaction, so minimizing cost is equivalent to maximizing likelihood.
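A short sketch of the cost definition makes the equivalence concrete: summing negative log-likelihoods over a route is the same as taking the negative log of the product of the reaction probabilities. The function names and the example probabilities are made up for illustration.

```python
import math


def reaction_cost(prob):
    """Cost of a reaction as the negative log-likelihood of its predicted
    probability; non-negative for any probability in (0, 1]."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(prob)


def route_cost_from_probs(probs):
    """Total route cost, equal to -log of the product of the probabilities."""
    return sum(reaction_cost(p) for p in probs)
```

Because every reaction cost is non-negative, 0 is a valid lower bound on the cost of synthesizing any molecule, which is the property the admissibility remark relies on.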

Exp: Results (2)

Performance table: the numbers of shorter and better routes are obtained by comparing against the expert routes in terms of length and total cost, respectively.