Molecular Design Synthesis Using Stochastic Optimisation as a Tool for Scoping and Screening


Pergamon. Computers chem. Engng Vol. 22, Suppl., pp. S11-S18, 1998

© 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain

PII: S0098-1354(98)00033-7 0098-1354/98 $19.00 + 0.00


E. C. Marcoulaki and A. C. Kokossis

Dept. Process Integration, U.M.I.S.T., P.O. Box 88, Manchester M60 1QD, U.K.

Abstract

This paper presents optimisation technology for the computer-aided design of molecules. A new approach is presented that combines stochastic optimisation and group-contribution methods to select chemicals with optimised properties. Each molecule is represented as a set of functional groups. The search follows an iterative procedure, where new molecules are generated, evaluated and subjected to acceptance. The evaluation stage calls upon calculation of molecular properties using available group-contribution expressions and databases. The proposed methodology is illustrated with literature examples involving the design of refrigerants and liquid-liquid extraction solvents. The efficiency of the search and the thermodynamic models employed are validated through process simulation studies. The work reports novel molecular structures and significant improvements over conventional techniques. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: stochastic optimisation, Markov processes, simulated annealing, group-contribution methods, UNIFAC, refrigerants, extraction solvents.

Introduction

Molecular applications based on computational approaches have assumed increasing significance in recent years. Building on a strong background in statistical and quantum mechanics, molecular and dynamic simulation studies have aimed at developing rigorous methods as alternatives to time-consuming and expensive experiments. Research on group contribution-based formulations helped overcome the scarcity of experimental data, and provided fast and insightful tools for the prediction of pure component properties (Reid et al., 1987; Joback, 1989; Constantinou and Gani, 1994; Constantinou et al., 1995) and mixture properties (Fredenslund et al., 1977; Reid et al., 1987).

Group contribution methods came to the assistance of computer-aided molecular synthesis tools. These tools enabled the screening of different options and suggested molecular configurations with advanced performance. The first systematic approaches in the field of molecular design were knowledge-based systems, which exploited system knowledge to generate solutions that satisfied a variety of physical, environmental or process constraints. Significant contributions in the field of 'generate and test' algorithms were made by Joback and Stephanopoulos (1989), Gani et al. (1991), Pretel et al. (1994), Joback and Stephanopoulos (1995) and Constantinou et al. (1996). These studies have been validated through a wide variety of examples on the synthesis of solvents, refrigerants, polymers, pharmaceuticals, etc. However, this technology could lead to good rather than optimal alternatives. Additionally, the rule sets integrated in the expert models might bias the algorithms towards traditionally used materials.

The shortcomings of knowledge-based systems can be overcome with the development of optimisation methods, where novelty is better facilitated while optimality is addressed. Optimisation using mathematical programming is expected to have limited success due to the complexity of the problem. General-purpose programming has limitations and can only be applied efficiently to simplified, small-size cases. Macchietto et al. (1990) applied mathematical programming in the form of NLPs to solvent selection. Duvedi and Achenie (1996) used MINLPs for designing new refrigerants. Though stochastic optimisation appears a promising alternative to deterministic models, previous applications in the form of genetic algorithms (Venkatasubramanian et al., 1994) reported prohibitive computational times and relatively low success rates, especially as the problem size increased.


European Symposium on Computer Aided Process Engineering-8

Advances in process design techniques enable targeting towards new process alternatives. However, in process design applications the materials are fixed, and usually no attempt is made to seek better solvents, catalysts, refrigerants etc. Nevertheless, major benefits can be realised by improving or replacing an existing process chemical. These benefits might be in terms of capital investment, energy requirements, downstream processing or waste treatment. Further benefits can be realised from the use of environmentally friendly chemicals. Another interesting direction is that of searching for an optimum product. In this sense, new pharmaceuticals and polymers with improved performance can be designed. In effect, the recent developments in computer-aided molecular design tools reflect the need for new technologies, capable of suggesting novel materials to keep up with a more advanced and demanding chemical industry. This problem refers to the automated synthesis of molecular structures that should optimise various process objectives and satisfy a number of process constraints. Note that process requirements might in general include safety or environmental regulations and even customer specifications, provided that these issues can be properly formulated. The formulation of the problem and the enormous number of candidate structures increase the need for an efficient technology to search through the alternatives, set targets and generate options.

This work suggests a systematic framework for designing molecules possessing optimal properties. Initially, promising molecular features are identified; they are then analysed and reduced to a final set of optimal molecules, before proceeding to experiments. In this context, the work can be applied in three stages. In stage one, the available options are screened and a set of optimal solutions is generated through a systematic and robust stochastic search. The second stage consists of processing, validating and advancing the information produced by the optimal search. A final third stage, involving experimental studies, is expected to follow, though this part of the problem is not addressed here.

Stochastic Screening for Novel Chemicals

The paper proposes the use of stochastic optimisation from the viewpoint of targeting and scoping. The problem consists of constructing molecules that satisfy a certain process objective under a set of constraints on their thermodynamic properties. The variety of molecular configurations can only be represented using a large number of discrete variables that have to be optimised. Furthermore, the prediction of properties, required in the formulation of objectives and constraints, involves complex non-linear thermodynamic models. These features of the problem inhibit the use of deterministic methods, which are sensitive to initialisation and limited by the non-convexities in the mathematical formulation. Instead, this work applies stochastic optimisation to develop screening and targeting procedures validated with excellent performance results.

Stochastic optimisation is applied in the form of the Simulated Annealing (SA) algorithm, initially used in optimising very large-scale combinatorial systems (Kirkpatrick et al., 1983). SA has found applications in chemical engineering synthesis problems, which include heat exchanger network synthesis (Dolan et al., 1989), design of multiproduct batch plants (Patel et al., 1991), distillation column sequencing (Floquet et al., 1994), utility systems design (Maia et al., 1995) and reactor network optimisation (Marcoulaki and Kokossis, 1996; Mehta and Kokossis, 1997). In molecular design applications, there are nonlinear formulations for the prediction of properties and numerous alternative molecular structures, types and classes. Because of its stochastic nature, annealing can easily deal with highly non-linear models and large numbers of decision variables. In the same context of randomness, SA algorithms generate groups of alternatives rather than a single solution. This feature of stochastic methods is particularly important for a tool whose purpose is to suggest options. Compared to other stochastic tools, simulated annealing can provide probabilistic guarantees for the quality of the final solution(s).

Simulated annealing-based algorithms use a strong background on Markov processes and probability theory. The annealing process is described in terms of states and moves. Each state refers to an instance of the system parameters, and moves are probabilistically applied in order to transit from one state to another. A sequence of states constitutes a stationary Markov chain, since the transition probabilities for further evolution of the chain depend entirely on the last chain node. The entire optimisation proceeds as a series of stationary chains, and its convergence follows the property of ergodicity. For a stochastic process to be ergodic it is essential that there are non-zero transition probabilities linking all states to each other (Iosifescu, 1980). As a result of the property, the final state is independent of the initial state. In optimisation terms, the final states are identified as parts of an equilibrium distribution around the global optimum (Sorkin, 1991). The transition probability $T_{i,j}$ from state $i$ to state $j$ is the product of the perturbation probability $P_{i,j}$ and the acceptance probability $B_{i,j}$:

$$T_{i,j}(\beta) = B_{i,j}(\beta)\, P_{i,j} \tag{1}$$

The perturbation probabilities control the application of the move set and remain constant throughout the process. Acceptance is based on the benefits realised by the move in terms of the objective function values (Metropolis et al., 1953):

$$B_{i,j}(\beta) = \begin{cases} \exp(-\Delta F_{i,j}/\beta), & \Delta F_{i,j} > 0 \\ 1, & \Delta F_{i,j} \le 0 \end{cases} \tag{2}$$

where $B_{i,j}(\beta)$ is the acceptance probability for the move from state $i$ to state $j$ under $\beta$, $\Delta F_{i,j}$ is the difference in the objective function ($F$) that is minimised, imposed by the move, and $\beta$ is the "annealing temperature", a statistical cooling parameter (Aarts and van Laarhoven, 1985) updated by the cooling schedule:

$$\beta_{k+1} = \left[ 1 + \frac{\ln(1+\delta)\,\beta_k}{\Delta F_{k,\max}} \right]^{-1} \beta_k \tag{3}$$

where $\delta$ is a parameter that controls the speed of the annealing process, and $\Delta F_{k,\max}$ is the maximum difference between the objective function values over the old temperature $\beta_k$ and the expected optimum objective.

The value of $\beta_k$ is altered after a certain number of moves under constant annealing temperature has been completed. The process starts from a sufficiently high temperature and terminates at a sufficiently low one.
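The acceptance rule and the cooling update can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that Equation 2 is the usual Metropolis rule for a minimised objective and that Equation 3 reads β(k+1) = β(k) / (1 + ln(1+δ)·β(k)/ΔF(k,max)). The function and argument names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_f, beta):
    """Metropolis acceptance probability for a minimisation problem.

    delta_f: change in the objective caused by the move (new minus old).
    beta:    current annealing temperature (must be positive).
    """
    if delta_f <= 0.0:
        return 1.0  # improving or neutral moves are always accepted
    return math.exp(-delta_f / beta)


def accept_move(delta_f, beta, rng=random):
    """Draw a uniform random number and decide whether to accept the move."""
    return rng.random() < acceptance_probability(delta_f, beta)


def cool(beta_k, delta, delta_f_max):
    """One cooling step, assuming the reading of Equation 3 stated above.

    delta:       speed parameter (larger values cool faster).
    delta_f_max: largest objective difference seen at temperature beta_k
                 relative to the expected optimum (must be positive).
    """
    return beta_k / (1.0 + math.log(1.0 + delta) * beta_k / delta_f_max)
```

With this form the temperature always decreases, and it decreases faster when the objective values seen at the current temperature are already close to the expected optimum.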

Proposed Design Methodology

The proposed methodology consists of three stages following a decreasing level of abstraction. In the first stage, stochastic optimisation is applied to screen the various options and suggest a set of optimal designs. In the second stage, the proposed molecules are tested against more rigorous models before experiments are carried out to validate their performance. Finally, a number of experiments should be performed before a new chemical is introduced. The screening stage might be revisited with additional constraints, should the need arise from the analysis or the experimental stages (Figure 1).

[Figure 1: Proposed design methodology. Stage I, Stochastic Search: robust and systematic screening; set of optimal solutions; suggestions, insights and targets. Stage II, Rigorous Simulation: validation and advancement of screening information. Stage III, Experimental Results.]

1. Stochastic Optimisation

To apply the annealing algorithm one has to define the states of the problem and a set of possible modifications (moves), which link these states to one another. The algorithm runs as an iterative process where moves generate new states, according to a set of perturbation probabilities. The new states are tested against the old ones based on probabilistic evaluation criteria. These acceptance probabilities are constantly updated based on the implied statistical cooling schedule.

States: In the stochastic optimisation framework, each molecule defines a problem state, and is represented as a feasible set of functional groups. The optimisation variables refer to the existence and number of occurrences of each group in the molecule. The variables are subject to physical constraints on the feasibility of the group arrangement. These constraints are associated with the group valencies so that the entire configuration has zero charge. Additionally, they ensure the stability of the arrangement, control the application of the move set and facilitate the simulation stage. Details of the actual molecular structure are not necessary, since they are not employed in the calculation of objectives and constraints. Moreover, excluding this information significantly reduces the size of the problem and the computational time required for the search. Note that aromatic rings are treated as one super-group, which can assume different valencies according to the number of ACH groups involved. This representation facilitates the application of feasibility rules in the case of aromatic arrangements. The search also accommodates small-size molecules, represented as single groups of zero valence.
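As an illustration of this representation, the sketch below stores a state as a mapping from group names to occurrence counts and checks one simple valence rule. The paper does not list its feasibility rules, so the rule used here is an assumption: for a connected acyclic molecule of n groups with no free bonds, every group needs at least one bond and the valences must sum to 2(n - 1). Aromatic super-groups are not handled, and the valence table is hypothetical.

```python
from collections import Counter

# Hypothetical valence table: number of free bonds offered by each group.
# "CH3Cl" stands for a small molecule treated as a single zero-valence group.
VALENCE = {
    "CH3": 1, "CH2": 2, "CH": 3, "C": 4,
    "OH": 1, "COOH": 1, "CH2NH2": 1, "CH3Cl": 0,
}


def is_feasible_acyclic(state):
    """Check whether the groups can form one connected acyclic molecule.

    state maps group names to their number of occurrences.
    """
    groups = Counter({g: n for g, n in state.items() if n > 0})
    total = sum(groups.values())
    if total == 0:
        return False  # an empty state is not a molecule
    valences = [VALENCE[g] for g in groups.elements()]
    if total == 1:
        return valences[0] == 0  # single zero-valence group
    if min(valences) < 1:
        return False  # a zero-valence group cannot bond to anything
    # Tree condition: n groups joined by exactly n - 1 bonds.
    return sum(valences) == 2 * (total - 1)
```

For example, iso-butanoic acid as `{"CH3": 2, "CH": 1, "COOH": 1}` passes the check, whereas three CH3 groups on their own do not.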

Moves: Moves are defined as slight alterations applied to the current state and leading to a new problem state. Unlike the case of deterministic optimisation, where local information is used to evolve the process, in stochastic optimisation the changes are random. In the case of molecular design, moves involve altering the existence or number of occurrences of each group. In this sense, one can introduce, remove or replace groups. When necessary, a modification initiates a series of additional alterations, which re-establish feasibility in the new arrangement. For the search to be ergodic, each of the actions involved in the move set should have a non-zero probability of being reversed. For instance, if the action of adding a new group is included in the move set, then the action of removing groups should also be included. There are probabilities associated with each one of these modifications. The same holds for each of the subsequent alterations, which are applied to restore feasibility. The choice of the entire move can be expressed in terms of perturbation probabilities, which control and guide the evolution process. In the present application of the algorithm, these probabilities are chosen so that the search is not biased towards specific molecular configurations. A sequence of such moves is illustrated in Figure 2. Starting from iso-butanoic acid one can replace the carboxyl group with other groups of the same valence. The next move involves removing a group, and finally reveals a serial arrangement. Note that when the CH group is removed, the valence of a CH3 group is increased. Propylamine is then expanded to include a hydroxyl group and an aromatic ring. The ring can further on be modified or even removed.
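A move generator of this kind can be sketched as follows. It is a simplified illustration: the group list and the weights are placeholders rather than the paper's values, aromatic ring moves are omitted, and the follow-up alterations that restore feasibility are left out, so a caller would need to repair or reject infeasible proposals.

```python
import random
from collections import Counter

GROUPS = ["CH3", "CH2", "CH", "OH", "COOH"]  # hypothetical group set
# Illustrative perturbation probabilities (placeholders, not the paper's values).
MOVE_WEIGHTS = {"add": 0.3, "delete": 0.3, "replace": 0.4}


def propose_move(state, rng):
    """Return (move_name, new_state) without modifying the input state."""
    new_state = Counter(state)
    move = rng.choices(list(MOVE_WEIGHTS), weights=list(MOVE_WEIGHTS.values()))[0]
    present = sorted(g for g, n in new_state.items() if n > 0)
    if move == "add" or not present:
        # Add a new group or increase the occurrences of an existing one.
        new_state[rng.choice(GROUPS)] += 1
        return "add", new_state
    target = rng.choice(present)
    new_state[target] -= 1
    if new_state[target] == 0:
        del new_state[target]
    if move == "replace":
        # Swap the removed group for a different one.
        alternatives = [g for g in GROUPS if g != target]
        new_state[rng.choice(alternatives)] += 1
    return move, new_state
```

Because every addition can be undone by a deletion and every replacement by the opposite replacement, each action has a non-zero probability of being reversed, which is the condition the text gives for an ergodic search.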


[Figure 2: Illustration of the move set (REPLACE, DELETE and ADD moves applied in sequence, starting from iso-butanoic acid).]

State Simulation and Implementation: As illustrated in Table 1, the algorithm is implemented as an iterative procedure. The procedure generates states based on a predefined stochastic matrix of perturbation probabilities (Table 2). Once a move has been proposed and realised, the performance of the new state is calculated. The performance depends on various thermodynamic properties of the molecule or the process mixture. These are predicted based on the groups involved in the new configuration using well-documented thermodynamic models. Once the performance is evaluated, the new state is compared to the current one and the algorithm decides probabilistically whether to accept or reject the move. This decision is based on the Metropolis criteria, as they are implemented in SA (Equation 2). As the optimisation proceeds the algorithm becomes stricter on accepting moves that deteriorate the objective. The acceptance probability is controlled by an effective temperature parameter, and the system is statistically cooled according to the Aarts and van Laarhoven schedule (Equation 3). Finally, the search terminates if any of the following criteria is satisfied:
- the annealing temperature reaches a very small value (freezing point),
- over a large number of iterations the algorithm has been unable to propose any successful moves,
- the standard deviations over the last three stationary chains have been significantly small.

Table 1: An outline of the implementation steps.

1. Start with an initial state; set an initial annealing temperature $\beta_0$.
2. Select and impose a modification to the current state.
3. Calculate the resulting difference in the objective.
4. Apply the acceptance criteria to decide whether to replace the current state with the modified one.
5. If the annealing temperature must be decreased, apply the cooling schedule.
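The steps of Table 1 can be sketched as a compact loop. This is an illustrative outline under stated assumptions, not the authors' software: only the freezing-point termination test is implemented, the best objective found so far stands in for the "expected optimum" in the cooling step, and the toy objective and move function at the end are hypothetical stand-ins for the molecular property models.

```python
import math
import random


def simulated_annealing(initial_state, objective, propose, beta0,
                        chain_length, delta=0.1, beta_min=1e-3, seed=0):
    """Minimise objective(state) following the steps of Table 1."""
    rng = random.Random(seed)
    current, f_current = initial_state, objective(initial_state)
    best, f_best = current, f_current
    beta = beta0
    while beta > beta_min:  # stop at the freezing point
        chain_values = []
        for _ in range(chain_length):  # one chain at constant temperature
            candidate = propose(current, rng)
            f_candidate = objective(candidate)
            delta_f = f_candidate - f_current
            # Metropolis criterion: always accept improvements,
            # accept deteriorations with probability exp(-delta_f / beta).
            if delta_f <= 0 or rng.random() < math.exp(-delta_f / beta):
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = current, f_current
            chain_values.append(f_current)
        # Cooling step; best-so-far is an assumed proxy for the optimum.
        delta_f_max = max(max(chain_values) - f_best, 1e-12)
        beta = beta / (1.0 + math.log(1.0 + delta) * beta / delta_f_max)
    return best, f_best


def toy_objective(x):
    """Hypothetical objective with its minimum at x = 7."""
    return (x - 7) ** 2


def toy_propose(x, rng):
    """Hypothetical move: step one unit up or down."""
    return x + rng.choice((-1, 1))
```

A call such as `simulated_annealing(0, toy_objective, toy_propose, beta0=10.0, chain_length=50, delta=0.5)` runs the loop on the toy problem; in the molecular setting the state would be a set of groups and the objective a property-based performance measure.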

To enable the simulation step, the software is interfaced with in-house routines and databases. Phase equilibria and activity coefficient calculations are carried out using the original UNIFAC method (Fredenslund et al., 1977; Reid et al., 1987). Other group contribution models are used for the prediction of primary properties such as critical values, acentric factor, boiling point temperature, standard heat of vaporisation, or liquid heat capacity (Constantinou and Gani, 1994; Constantinou et al., 1995; Reid et al., 1987). In order to remain consistent and speed up the search, data for small molecules (groups of zero valence) are fitted into the employed group contribution expressions. Analytical correlations are applied for secondary properties such as vapour pressure and latent heat (Reid et al., 1977; 1987). In the general case, any kind of group-contribution model, routine or database can be incorporated in the software. In effect, the applicability of the methodology is only limited by the availability of group-contribution data and formulations. Indeed the method can find more applications and incorporate additional constraints, provided there are mathematically formulated links between the molecular structure and the molecular properties. Ultimately, the efficiency of the method is only limited by the simulation models involved, and their accuracy in reflecting the real performance trade-offs.

Table 2: Moves and perturbation probabilities.

Moves on groups: add new or increase occurrences of existing group; delete existing group; replace existing group.
Moves on aromatic rings: introduce new aromatic ring; remove existing aromatic ring; modify existing aromatic ring.
Listed probabilities (under the headings "Non-A" and "Aromatic"): 25 %, 35 %, 40 %; 25 %, 40 %, 35 %; 25 %, 40 %, 35 %.

2. Rigorous Simulation

Results from the screening stage can be further processed and analysed. As the solution suggests only the combination of groups in the molecule, further analysis is applied to interpret the results and justify the choices. A subsequent part of this involves employment of rigorous simulation tools that can enable a more detailed representation of the optimal molecules. In addition to this, given the limitations and assumptions behind group contribution models, one has to recognise the necessity to validate the optimisation results against a more accurate technology.

The validation stage can be performed in two different ways. One option is to carry out process simulations for the final solutions using rigorous phase equilibria models and accurate component data. However, most of the designs generated by the search are not included in the databases linked with commercial packages. This should not be seen as a drawback of the methodology. It is rather an advantage, considering the small proportion of classified compounds compared to the trillions of choices available. Indeed the scope of the screening tool is to provide insights for targets, and incentives for investigating and experimenting with novel molecular arrangements. The other option for testing these arrangements could be through molecular simulation packages. However, this technology is still in development and more research in this field is needed before one can safely apply these tools for validation purposes. In the same context of testing arbitrary molecular arrangements, dynamic simulation appears a promising alternative technology. In this paper, only process simulation is considered. This can only be applied to small-scale problems, in terms of the number of functional groups and the size of the molecules. An example of this application is discussed in the next section.

Case Studies

Two examples are presented, involving the design of refrigerants and liquid-liquid extraction solvents. Both examples are taken from the literature and are presented for the purpose of illustrating the application of the proposed tool and comparing the methodology with other available technologies.

1. Design of refrigerants for replacing freon-12

The example was first introduced by Joback and Stephanopoulos (1989) and involves finding alternative refrigerants for freon-12. This case study has been considered by Joback and Stephanopoulos (1989; 1995) and Gani et al. (1991) using knowledge-based systems, and Duvedi and Achenie (1996) using MINLPs. The example is a typical case of molecular design based on pure component properties.

Problem definition. The new refrigerant should operate between an evaporating temperature of 272.0 K and a condensing temperature of 316.5 K. The constraints for the problem are such that the new compound assumes better properties compared to the base case refrigerant (Joback and Stephanopoulos, 1995). In this sense, a higher heat of vaporisation should reduce the amount of refrigerant required. A lower liquid heat capacity is also preferable, since it reduces the amount of refrigerant that flashes upon passage through the expansion valve. Additional constraints arise from the allowable vapour pressure values at the evaporator and the condenser conditions. The lowest pressure in the cycle should be higher than the atmospheric pressure. The maximum pressure in the cycle should be low enough to limit the cost of the equipment.

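These constraints assume the form:

$$P_{vp}(T = 272.0\ \mathrm{K}) > 1.4\ \mathrm{bar}$$
$$P_{vp}(T = 316.5\ \mathrm{K}) < 14\ \mathrm{bar}$$
$$\Delta H_v(T = 272.0\ \mathrm{K}) > \Delta H_{v,\mathrm{freon\text{-}12}}(T = 272.0\ \mathrm{K}) = 18.4\ \mathrm{kJ/g\text{-}mol}$$
$$C_{pl}(T = 294.3\ \mathrm{K}) < C_{pl,\mathrm{freon\text{-}12}}(T = 294.3\ \mathrm{K}) = 32.2\ \mathrm{cal/g\text{-}mol\ K}$$

The objective function $F$ is defined such that the liquid heat capacity is minimised, while the heat of vaporisation is maximised:

$$\min F, \qquad F = C_{pl} / \Delta H_v$$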
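The constraint set and objective above can be expressed as a small screening function. The sketch below is illustrative only: the class and field names are hypothetical, the property values are assumed to come from whatever property models are in use, and the ratio is taken in the mixed units quoted in the text (cal/g-mol K over kJ/g-mol), since the paper does not state a unit convention for F.

```python
from dataclasses import dataclass


@dataclass
class RefrigerantProperties:
    p_vp_evaporator_bar: float    # vapour pressure at 272.0 K, bar
    p_vp_condenser_bar: float     # vapour pressure at 316.5 K, bar
    dh_vap_kj_per_mol: float      # heat of vaporisation at 272.0 K, kJ/g-mol
    cp_liq_cal_per_mol_k: float   # liquid heat capacity, cal/g-mol K


def is_feasible(p):
    """Check the four constraints against the freon-12 base case."""
    return (p.p_vp_evaporator_bar > 1.4
            and p.p_vp_condenser_bar < 14.0
            and p.dh_vap_kj_per_mol > 18.4
            and p.cp_liq_cal_per_mol_k < 32.2)


def objective(p):
    """F = liquid heat capacity / heat of vaporisation (to be minimised)."""
    return p.cp_liq_cal_per_mol_k / p.dh_vap_kj_per_mol
```

For instance, a hypothetical candidate with pressures of 2.0 and 10.0 bar, a heat of vaporisation of 20.0 kJ/g-mol and a heat capacity of 25.0 cal/g-mol K is feasible and has F = 1.25.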

Simulation models for refrigerants. Group contribution models are called upon for the calculation of vapour pressures, liquid heat capacities and heats of vaporisation. The liquid heat capacity is calculated using the Missenard method (Reid et al., 1987), as a sum of the contributions of the groups involved in the molecular arrangement. The method applies for different temperatures, and as the temperature increases the individual contributions slightly increase or remain constant. Based on this observation, the heat capacities are calculated at a temperature of 298.0 K instead of 294.3 K. Accordingly, the maximum limit on $C_{pl}$ can either be increased or maintained at 32.2 cal/g-mol K, which then appears as a more conservative upper bound. The primary properties involved in the calculation of vapour pressures are the critical temperature, the critical pressure and the normal boiling point temperature. The properties are predicted using the group contribution methods proposed by Constantinou and Gani (1994). These models assume the general form:

where $X$ is the property to be predicted, $n_i$ is the number of occurrences of group $i$ in the molecule, and $x_i$ is the contribution of group $i$ to property $X$.
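In code, a group-contribution estimate of this kind reduces to a weighted sum over the groups in the molecule. The sketch below assumes a simple first-order additive form, with an optional transform for models in which a function of the property, rather than the property itself, is additive. The contribution values shown in the usage note are placeholders, not published parameters.

```python
def group_contribution_estimate(counts, contributions, inverse=lambda s: s):
    """Estimate a property from group counts.

    counts:        mapping group name -> number of occurrences n_i.
    contributions: mapping group name -> contribution x_i to the property.
    inverse:       maps the additive sum back to the property value;
                   the default assumes the property itself is additive.
    """
    total = sum(n * contributions[group] for group, n in counts.items())
    return inverse(total)
```

For example, with placeholder contributions `{"CH3": 1.5, "CH2": 2.0}`, the state `{"CH3": 2, "CH2": 1}` gives a sum of 5.0.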

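Vapour pressure predictions are carried out using the Riedel-Plank-Miller equation (Reid et al., 1977):

$$\ln\!\left( \frac{P_{vp}(T)}{P_c} \right) = -\frac{G}{t} \left[ 1 - t^2 + k\,(3 + t)\,(1 - t)^3 \right]$$

$$k = \frac{h/G - (1 + t_b)}{(3 + t_b)\,(1 - t_b)^2}$$

$$G = 0.4835 + 0.4605\,h, \qquad h = \frac{t_b}{1 - t_b} \ln(P_c)$$

$$t_b = T_{BP}/T_c, \qquad t = T/T_c$$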
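A direct transcription of the Riedel-Plank-Miller correlation is sketched below, using the standard form in which the factor in k is (3 + t_b). The function name is hypothetical. The critical pressure is assumed to be in atmospheres: the definition of h makes the predicted pressure equal to exactly one pressure unit at the normal boiling point, so the unit of P_c must be the one in which the boiling pressure is 1.

```python
import math


def rpm_vapour_pressure(T, T_bp, T_c, P_c):
    """Vapour pressure at temperature T, in the same unit as P_c.

    T, T_bp, T_c: temperature, normal boiling point, critical temperature (K).
    P_c:          critical pressure (assumed in atm).
    """
    t_b = T_bp / T_c
    t = T / T_c
    h = t_b * math.log(P_c) / (1.0 - t_b)
    G = 0.4835 + 0.4605 * h
    k = (h / G - (1.0 + t_b)) / ((3.0 + t_b) * (1.0 - t_b) ** 2)
    ln_reduced = -G * (1.0 - t ** 2 + k * (3.0 + t) * (1.0 - t) ** 3) / t
    return P_c * math.exp(ln_reduced)
```

Two limiting cases make a useful sanity check: the function returns 1 at T = T_bp and P_c at T = T_c for any physically sensible inputs.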

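The latent heat of vaporisation can be estimated from the Watson relation (Reid et al., 1987):

$$\Delta H_v(T) = \Delta H_v(T_s) \left[ \frac{1 - t}{1 - t_s} \right]^{n}, \qquad t = T/T_c, \quad t_s = T_s/T_c$$

where $T_s$ is a reference temperature for which latent heat data are available, and:

$$n = \left( 0.00264\, \frac{\Delta H_v(T_s)}{R\,T_s} + 0.8794 \right)^{10}$$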

The work of Constantinou and Gani (1994) accommodates the standard heat of vaporisation (where $T_s$ = 298.0 K); however, some of the group contribution data are not available. In case a particular arrangement includes such groups, the correlation of Vetere (Reid et al., 1987) is employed:

$$\Delta H_v(T_{BP}) = R\,T_{BP}\, \frac{0.4343 \ln P_c - 0.69431 + 0.89584\,x}{0.37691 - 0.37306\,x + 0.15075\,P_c^{-1}\,x^{-2}}, \qquad x = T_{BP}/T_c$$
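The two latent heat correlations can be combined as sketched below: the Vetere correlation supplies the heat of vaporisation at the normal boiling point, and the Watson relation scales it to another temperature. This is an illustration under stated assumptions: the Watson exponent is taken as the tenth power of the bracketed term, the critical pressure is in atmospheres, and R is in J/(mol K), so results are in J/mol. Function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def vetere_dh_vap_at_boiling_point(T_bp, T_c, P_c_atm):
    """Heat of vaporisation at the normal boiling point, J/mol."""
    x = T_bp / T_c
    numerator = 0.4343 * math.log(P_c_atm) - 0.69431 + 0.89584 * x
    denominator = 0.37691 - 0.37306 * x + 0.15075 / (P_c_atm * x ** 2)
    return R * T_bp * numerator / denominator


def watson_dh_vap(T, T_ref, T_c, dh_ref):
    """Scale a known heat of vaporisation dh_ref at T_ref to temperature T."""
    n = (0.00264 * dh_ref / (R * T_ref) + 0.8794) ** 10
    return dh_ref * ((1.0 - T / T_c) / (1.0 - T_ref / T_c)) ** n
```

As a rough check with approximate literature values for propane (T_bp = 231.1 K, T_c = 369.8 K, P_c = 41.9 atm), the Vetere estimate falls between 17 and 20 kJ/mol, and the Watson relation gives a smaller value at 272.0 K, as expected for a temperature closer to the critical point.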

Results. This example is run using 97 groups (including 9 zero-valence groups) and allowing up to 40 groups in each state. The default initial guess is ethane. The annealing temperature starts from $\beta_0$ = 105 and the employed Markov chain length is N = 150. The temperature is decreased after N moves or N/2 successful moves. Several runs are performed, starting from the same state (with different initial seeds for the random number generator) as well as from randomly chosen states. Analysis of these runs provides confidence in the results. Indeed, after more than 50 runs the final solution remains formaldehyde. The average computational time required per run is 50 s on an HP 9000-C100 workstation. Unlike other examples, in this case the final solution is always the same. If the algorithm is run with additional constraints, one can generate the second and third best solutions, which are chloromethane and dimethyl ether, respectively. Chloromethane was included in the solutions provided by knowledge-based systems (Gani et al., 1991) and was suggested as optimal using MINLPs (Duvedi and Achenie, 1996). The results are shown in Table 3. All the final designs correspond to simple molecules, for which accurate property data are available.

2. Design of liquid-liquid extraction solvents

The second example problem comes from the design of separation agents. This problem requires prediction of pure component as well as mixture properties. The extraction of ethanol from aqueous solutions is a very interesting case because of the affinity between water and ethanol. This example has been treated by several authors, including Pretel et al. (1994). Their results revealed the difficulty of finding compounds that could satisfy the property constraints.

Pretel et al. (1994) suggested the use of original UNIFAC (Fredenslund et al., 1977) with VLE parameters (Reid et al., 1987). Indeed, the VLE tables are much more complete and extended compared to the LLE tables. This allows variety and increases the design options, and these issues are essential to a screening tool. The use of VLE tables was also justified based on comparisons of predictions using both tables.

Problem definition The liquid-liquid extraction scheme used in this case study assumes the extraction of a dilute component and is carried out in three steps. The first stage is an extractor column. The extract is then processed in a distillation column to eliminate the raffinate. Finally, the solvent is recovered by simple distillation in the purification column. When dealing with a dilute solution it is essential that the solute is more volatile than the solvent, and that the solubility of the solvent in the raffinate is very low. Cockrem et al. (1989) examined this class of separation problems and identified the most dominant properties for solvent selection. The solute distribution coefficient determines the size of the extractor unit and the amount of solvent recycle. Low solvent losses provide a good indication of high selectivity and determine immiscibility between the raffinate and the extract. High selectivity leads to lower cost for the solute recovery units. Also, a substantial boiling point temperature difference between the solute and the solvent is essential for reducing the size and energy consumption of the purification column. However, the boiling point of the solvent should not be extremely high, to facilitate the use of steam in the reboiler.

This example is run with the following constraints on the solvent properties:

solvent selectivity: $S_s > 10$ wt./wt.

solute distribution coefficient: $M > 2.0$ wt./wt.

solvent losses: $S_l < 0.05$ wt.%

boiling point temperature: $T_{bp,S} > T_{bp,EtOH} + 5$ K

$T_{bp,S} < 700$ K

The objective is to maximise the solute distribution coefficient. To reduce the size of the optimal molecules, the molecular weight is also included in the objective. These are formulated as:

$\min F, \quad F = MW_S / M^2$
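The constraints and the objective above can be written as a small screening check. The sketch below is illustrative only: the class and function names are hypothetical, the boiling point margin is left as an argument because its value is not clearly legible in the constraint list, the ethanol boiling point is an approximate literature value, and the molecular weight of the first design is back-calculated from the tabulated objective rather than reported.

```python
from dataclasses import dataclass

# Approximate normal boiling point of ethanol in K (literature value,
# not taken from this paper).
T_BP_ETHANOL = 351.5


@dataclass
class SolventProperties:
    selectivity: float   # Ss, wt./wt.
    distribution: float  # M, wt./wt.
    losses_pct: float    # Sl, wt.%
    t_bp: float          # normal boiling point, K
    mol_weight: float    # MW, g/mol


def is_feasible(p, t_margin, t_bp_max=700.0):
    """Check the five property constraints of the extraction example.

    `t_margin` is the required boiling point gap above ethanol in K;
    it is an input here because it is an assumption, not a known value.
    """
    return (p.selectivity > 10.0
            and p.distribution > 2.0
            and p.losses_pct < 0.05
            and T_BP_ETHANOL + t_margin < p.t_bp < t_bp_max)


def objective(p):
    """F = MW / M^2, to be minimised."""
    return p.mol_weight / p.distribution ** 2


# First design as read from Table 4; MW = 192.3 is an assumption
# back-calculated from the tabulated objective value.
design_1 = SolventProperties(62.74, 7.844, 0.04634, 552.5, 192.3)
# 1-nonanol as read from Table 4 (decimal placement of the selectivity
# assumed); MW computed from the formula C9H20O.
nonanol = SolventProperties(13.48, 0.806, 0.0902, 494.3, 144.26)

for name, p in (("design 1", design_1), ("1-nonanol", nonanol)):
    print(name, is_feasible(p, t_margin=5.0), round(objective(p), 3))
```

With these inputs 1-nonanol fails the distribution coefficient and solvent loss limits, which is consistent with the remark that the constraints imposed here are stricter than those of Pretel et al. (1994).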

Simulation models for liquid-liquid extraction solvents The formulations for the mixture properties are taken from Pretel et al. (1994):

Where S is the solvent, E is the solute (ethanol) and R is the raffinate (water). The activity coefficients ($\gamma$) are calculated at a temperature of 298 K.

For the prediction of activity coefficients at infinite dilution the original UNIFAC (VLE) is used. As expected, there is a significant difference between the values obtained with group-contribution methods and those calculated rigorously. However, what matters in optimisation is that the overall trend is represented correctly. In effect, rigorous calculations for classified components reveal that the sequence of final solutions remains more or less the same (Pretel et al., 1994).


Finally, normal boiling point temperatures can be predicted using the group contribution model developed by Constantinou and Gani (1994). The model is discussed above, in the refrigerant design example.

Results The search is launched from different random initial states, and the annealing starts from an initial temperature of lo', while the Markov chain length (N) is 150. The temperature is decreased after N moves or N/2 successful moves. A total of 83 groups are considered, and the maximum number of groups allowed per state is 40. The number of groups is lower than in the previous example, owing to the lack of UNIFAC data. The mean number of iterations is similar to that in the previous example. However, the higher complexity of the simulations involved increases the average CPU time to 117 s.

Pretel et al. (1994) reported difficulties in satisfying the requirements for this particular example, even though their constraints were much less strict than the ones imposed here. However, all the results produced by the stochastic tool are well inside the process feasibility range. Some of the final solutions are shown in Table 4. Large aromatic alcohols appear as the best option, and this choice coincides with the results and the comments made by Pretel et al. (1994). It is expected that the hydroxyl group enhances the solubility of ethanol in the solvent phase. Additionally, the hydrophobic hydrocarbon part decreases the amount of solvent in the aqueous phase, as well as the amount of raffinate in the extract. Similar behaviour is expected from other oxygenated or amine compounds. However, the performance of these groups is substantially inferior to that of the hydroxyl group. Consequently, such arrangements could only be generated if the OH group is excluded from the search.

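Table 3: Results from the design of refrigerants.

| F | ΔHv | Cpl | Pmin | Pmax | Design |
|---|---|---|---|---|---|
| 0.615 | 22.5 | 13.8 | 1.74 | 8.58 | CH2=O (formaldehyde) |
| 0.855 | 20.4 | 17.4 | 2.45 | 9.19 | CH3Cl (chloromethane) |
| 1.14 | 23.7 | 27.0 | 3.39 | 11.3 | CH3OCH3 (dimethyl ether) |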

Table 4: Results for the design of liquid-liquid extraction solvents.

| | Design 1 | Design 2 | Design 3 | Design 4 | 1-nonanol (Pretel et al., 1994) |
|---|---|---|---|---|---|
| selectivity | 62.74 | 62.75 | 63.10 | 63.11 | 13.48 |
| distr. coef. | 7.844 | 7.840 | 7.802 | 7.796 | 0.806 |
| s-losses % | 0.04634 | 0.04624 | 0.04604 | 0.04589 | 0.0902 |
| TBP | 552.5 | 549.9 | 558.7 | 554.0 | 494.3 |
| MW/M² | 3.125 | 3.129 | 3.159 | 3.184 | - |

[Structure drawings of designs 1 to 4 not reproduced.]

As expected, the methodology generates unconventional combinations of groups. These compounds are not included in databases. However, the search can be biased towards classified molecules. Binary equilibrium data are readily available for most of the primary straight-chain alcohols, acids and amines with up to around 15 carbon atoms. Some of the most widely reported solvents for the ethanol-water problem, such as 1-nonanol and n-octanoic acid (Munson and King, 1984), can be found amongst these compounds. In effect, if the search space is reduced to these arrangements, then all the runs generate the same final solution, namely 1-nonanol. Process simulation studies using PRO II indicate that 1-nonanol is indeed the best option amongst primary alcohols and acids. These solvents are evaluated in terms of total energy consumption, as well as the amount of solvent needed for make-up and the solvent fraction that can be reused.

Conclusions

A new methodology was presented for the systematic synthesis of novel chemicals. The method proposes a general and robust stochastic search that suggests novel molecular arrangements and sets targets for advanced performances. The search employs group contribution models, therefore further analysis is required to validate the results of the screening stage and justify the choices. The paper proposed a synthesis framework with the stochastic tool located at the highest level of abstraction. The employment of more rigorous simulation tools at the subsequent design levels was briefly discussed.

The stochastic search addresses acyclic-aliphatic and aromatic compounds, or any combinations of these. The method is tested against two well-known literature examples, involving the design of refrigerants and liquid-liquid extraction solvents. In both cases, at least 83 functional groups are considered and a maximum of 40 groups is allowed in the final molecules. The method is found to produce better solutions than those reported in the literature, while the computational time of the screening part is relatively low even for the most extensive searches. Other example cases can easily be addressed, provided there are mathematical models to link the process requirements to a set of functional groups.


Notation

$T$ Transition probability or temperature
$P$ Perturbation probability
$B$ Acceptance probability
$F$ Objective function
$P_{vp}$ Vapour pressure
$\Delta H_v$ Latent heat of vaporisation
$C_{pl}$ Liquid heat capacity
$P_c$ Critical pressure
$T_c$ Critical temperature
$T_{bp}$ Normal boiling point temperature
$T_r$ Reference temperature (Watson equation)
$R$ Ideal gas constant
$N$ Markov chain length
$S_s$ Solvent selectivity
$S_l$ Solvent losses
$M$ Solute distribution coefficient
$MW$ Molecular weight

Greek letters

P Simulated annealing temperature

$\gamma^\infty$ Infinite dilution activity coefficient
$\delta$ Parameter to control speed of annealing

Subscripts

$i, j$ Different problem states
$k$ Number of SA temperature iterations

References

Aarts, E.H.L. and van Laarhoven, P.G.M., 1985. Statistical cooling: A general approach to combinatorial optimisation problems. Philips J. Res., 40, 193-226.
Cockrem, M., Flatt, J., and Lightfoot, E., 1989. Solvent selection for extraction from dilute solution. Sep. Sci. and Technol., 24, 769-807.
Constantinou, L., Bagherpour, K., Gani, R., Klein, J.A., and Wu, D.T., 1996. Computer aided product design: problem formulations, methodology and applications. Computers Chem. Engng, 20, 685-702.
Constantinou, L. and Gani, R., 1994. New group contribution method for estimating properties of pure compounds. AIChE J., 40, 10, 1697-1710.
Constantinou, L., Gani, R., and O'Connell, J.P., 1995. Estimation of the acentric factor and the liquid molar volume at 298 K using a new group contribution method. Fluid Phase Equilibria, 103, 11-22.
Dolan, W.V., Cummings, P.T., and LeVan, M.D., 1989. Process optimisation via simulated annealing: Application to network design. AIChE J., 35, 725-736.
Duvedi, A.P. and Achenie, L.E.K., 1996. Designing environmentally friendly refrigerants using mathematical programming. Chem. Engng Sci., 51, 3727-3739.
Floquet, P., Pibouleau, L., and Domenech, S., 1994. Separation sequence synthesis: How to use simulated annealing procedure? Computers Chem. Engng, 18, 1141-1148.
Fredenslund, Aa., Gmehling, J., and Rasmussen, P., 1977. Vapor liquid equilibria using UNIFAC. Elsevier Scientific, Amsterdam.
Gani, R., Nielsen, B., and Fredenslund, Aa., 1991. A group contribution approach to computer-aided molecular design. AIChE J., 37, 1318-1332.
Iosifescu, M., 1980. Wiley Series in Probability and Mathematical Statistics: Finite Markov processes and their applications. John Wiley and Sons Ltd., Editura Technica, Bucuresti.
Joback, K.G. and Stephanopoulos, G., 1989. Designing molecules possessing desired physical property values. Proc. FOCAPD'89, Snowmass, CO, pp. 363-387.
Joback, K.G. and Stephanopoulos, G., 1995. Advances in chemical engineering, Vol. 21. Searching spaces of discrete solutions: The design of molecules possessing desired physical properties. Academic Press Inc.
Joback, K.G., 1989. Designing molecules possessing desired physical property values. PhD Thesis, MIT, Cambridge, MA.
Kirkpatrick, S., Gelatt, Jr, C.D., and Vecchi, M.P., 1983. Optimisation by simulated annealing. Science, 220, 671-680.
Machietto, S., Odele, O., and Omatsome, O., 1990. Design of optimal solvents for liquid-liquid extraction and gas absorption processes. Trans IChemE., 68, 429-433.
Maia, L.O.A., Vidal de Carvalho, L.A., and Qassim, R.Y., 1995. Synthesis of utility systems by simulated annealing. Computers Chem. Engng, 19, 481-488.
Marcoulaki, E.C. and Kokossis, A.C., 1996. Stochastic optimisation of complex reaction systems. Computers Chem. Engng, 20, S231-S236.
Mehta, V.L. and Kokossis, A.C., 1997. Development of novel multi-phase reactors using a systematic design framework. Computers Chem. Engng, 21, S325-S330.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E., 1953. Equation of state calculations by fast computing machines. J. Chem. Phys., 21, 6, 1087-1092.
Munson, C.L. and King, C.J., 1984. Factors influencing solvent selection for extraction of ethanol from aqueous solutions. Ind. Eng. Chem. Process Des. Dev., 23, 109.
Patel, A.N., Mah, R.S.H., and Karimi, I.A., 1991. Preliminary design of multiproduct noncontinuous plants using simulated annealing. Computers Chem. Engng, 15, 451-469.
Pretel, E.J., Lopez, P.A., Bottini, S.B., and Brignole, E.A., 1994. Computer-aided molecular design of solvents for separation processes. AIChE J., 40, 1349.
Reid, R.C., Prausnitz, J.M., and Poling, B.E., 1977. The properties of gases and liquids. 3rd ed., McGraw-Hill, New York.
Reid, R.C., Prausnitz, J.M., and Poling, B.E., 1987. The properties of gases and liquids. 4th ed., McGraw-Hill, New York.
Sorkin, G.B., 1991. Efficient simulated annealing on fractal energy landscapes. Algorithmica, 6, 367-418.
Venkatasubramanian, V., Chan, K., and Caruthers, J.M., 1994. Computer-aided molecular design using genetic algorithms. Computers Chem. Engng, 18, 833-844.