Global (Q)SARs for skin sensitisation – assessment against OECD principles

This article was downloaded by: [Monash University Library] On: 02 October 2013, At: 06:36 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK SAR and QSAR in Environmental Research Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gsar20 Global (Q)SARs for skin sensitisation–assessment against OECD principles D. W. Roberts a , A. O. Aptula b , M. T. D. Cronin a , E. Hulzebos c & G. Patlewicz d a School of Pharmacy and Chemistry, Liverpool John Moores University, England b Safety and Environmental Assurance Centre, Unilever Research, Sharnbrook, England c RIVM, Bilthoven, The Netherlands d European Chemicals Bureau, Joint Research Centre, Ispra, Italy Published online: 04 Dec 2010. To cite this article: D. W. Roberts , A. O. Aptula , M. T. D. Cronin , E. Hulzebos & G. Patlewicz (2007) Global (Q)SARs for skin sensitisation–assessment against OECD principles , SAR and QSAR in Environmental Research, 18:3-4, 343-365, DOI: 10.1080/10629360701306118 To link to this article: http://dx.doi.org/10.1080/10629360701306118 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. 
Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,


systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


SAR and QSAR in Environmental Research, Vol. 18, Nos. 3–4, May–June 2007, 343–365

Global (Q)SARs for skin sensitisation – assessment against OECD principles‖

D. W. ROBERTS*†, A. O. APTULA‡, M. T. D. CRONIN†, E. HULZEBOS§ and G. PATLEWICZ¶

†School of Pharmacy and Chemistry, Liverpool John Moores University, England
‡Safety and Environmental Assurance Centre, Unilever Research, Sharnbrook, England
§RIVM, Bilthoven, The Netherlands
¶European Chemicals Bureau, Joint Research Centre, Ispra, Italy

(Received 11 May 2006; in final form 6 August 2006)

As part of a European Chemicals Bureau contract relating to the evaluation of (Q)SARs for toxicological endpoints of regulatory importance, we have reviewed and analysed (Q)SARs for skin sensitisation. Here we consider some recently published global (Q)SAR approaches against the OECD principles and present re-analysis of the data. Our analyses indicate that “statistical” (Q)SARs which aim to be global in their applicability tend to be insufficiently robust mechanistically, leading to an unacceptably high failure rate. Our conclusions are that, for skin sensitisation, the mechanistic chemistry is very important and consequently the best non-animal approach currently applicable to predict skin sensitisation potential is with the help of an expert system. This would assign compounds into mechanistic applicability domains and apply mechanism-based (Q)SARs specific for those domains and, very importantly, recognise when a compound is outside its range of competence. In such situations, it would call for human expert input supported by experimental chemistry studies as necessary.

Keywords: Structure-activity relationship; Skin sensitisation; Mechanistic domain; Schiff base

1. Introduction

1.1 Use of predictive models in REACH

Under the European Union (EU) Registration, Evaluation and Authorisation of CHemicals (REACH) programme, all chemicals produced or imported at >1 ton per annum (tpa) in the EU will need to be assessed for human and environmental hazards. If conducted by means of the present data requirements/test strategy, this assessment will use a huge number of test animals and will be neither resource nor time effective. The REACH proposal calls for an increased use of (quantitative) structure-activity relationships ((Q)SARs) and other alternatives for the assessment of, in particular, the lower production volume chemicals, i.e., the categories 1–10 tpa and 10–100 tpa. Skin sensitisation potential needs to be assessed for either of these production volumes.

*Corresponding author. Email: [email protected]
‖Presented at the 12th International Workshop on Quantitative Structure-Activity Relationships in Environmental Toxicology (QSAR2006), 8–12 May 2006, Lyon, France.

ISSN 1062-936X print/ISSN 1029-046X online © 2007 Taylor & Francis

Skin sensitisation is an important endpoint. Workers and consumers being sensitised are a major problem for individuals, for employers and for marketing certain products. It is an effect for which no threshold can be established yet and it is a lifelong effect. In REACH the sensitising potential should therefore be assessed for chemicals below the 10 ton threshold (Annex V). No in vitro alternative is available yet, nor will it be in the near future. According to a European Chemicals Bureau (ECB) assessment of additional testing needs under REACH, the highest number of tests is required for this endpoint (EC 2003) [1].

Although the new and current EU-legislation both provide the possibility of using (Q)SARs instead of in vivo testing, there are neither accepted (Q)SARs for human toxicological endpoints nor detailed guidance on how to develop (Q)SARs for human risk assessment. As a consequence, regulators have scarcely used human toxicological (Q)SARs in decision making. However, as noted, the recently proposed EU-REACH system calls for an increased use of (Q)SARs and other non-animal methods, especially for the assessment of the low production volume chemicals. Therefore, several initiatives have recently emerged to increase acceptance of (Q)SARs. The main principles for the validity of (Q)SARs were identified in a Workshop organised by CEFIC (European Chemical Industry Council)/ICCA (International Council of Chemical Associations) in Setubal in 2002 and were subsequently evaluated by OECD (Organisation for Economic Co-operation and Development) as part of the Ad hoc Expert group for (Q)SARs.

These are now referred to as the ‘OECD principles’, which read as follows: “To facilitate the consideration of a (Q)SAR model for regulatory purposes, it should be associated with the following information:

• a defined endpoint
• an unambiguous algorithm
• a defined domain of applicability
• appropriate measures of goodness-of-fit, robustness and predictivity
• a mechanistic interpretation, if possible

(Q)SARs, which fulfil these criteria, may in principle be applicable within the regulation practice to predict mammalian endpoints”.

Further details on these principles and individual case studies are provided by the OECD (2004) [2].

In this article we evaluate and compare, with reference to the OECD principles, three approaches to (Q)SAR development for skin sensitisation, one being a non-mechanistic/information-content-based global approach, one being a semi-mechanistic global approach and one being a mechanism-based domain-specific approach.

1.2 Mechanisms of skin sensitisation

1.2.1 Biological aspects of the mechanisms of skin sensitisation. Skin sensitisation is a T-cell mediated immune response. The biological mechanism of skin sensitisation can be summarised briefly as follows. The skin sensitising chemical reacts with skin protein in the epidermis so as to make it antigenic. The antigenic protein is processed by Langerhans cells in the epidermis and these Langerhans cells are consequently stimulated to migrate to a lymph node where they present the antigen to naïve T-cells. As a result, T-cells with receptors able to specifically recognise the antigen are stimulated to proliferate and circulate throughout the body. These events take place during the induction stage of a sensitisation test.

On subsequent exposure to the same sensitiser, or a second sensitiser cross-reactive with the first, protein binding and processing of the resulting antigenic protein by Langerhans cells again occurs, after which the antigen presented by the Langerhans cells is recognised by the circulating T-cells, triggering a cascade of biochemical and cellular processes that produce the clinical sensitisation response. These events take place at the challenge stage of a sensitisation test.

1.2.2 Chemical aspects of the mechanisms of skin sensitisation. Reaction chemistry underpins all mechanistic attempts to predict skin sensitisation from structural and physical properties. In order to understand these approaches, some description of reactivity and its implications for skin immunology is required.

Skin sensitisation to chemicals, in most if not all cases, involves the compound, either as such or after metabolic or abiotic conversion, acting as an electrophile towards nucleophilic groups on skin protein, leading to formation of antigens. The biological processes downstream of this reaction will not be discussed here. Important as they clearly are to the induction and elicitation of sensitisation, they have no relevance to the question of what makes some compounds strong sensitisers, other compounds weaker sensitisers, and others non-sensitisers. Skin sensitisers fall naturally into several reaction mechanistic domains, which are summarised in scheme 1. Until recently it has usually been assumed that mechanism-based QSARs for skin sensitisation can only be developed for sets of compounds that are closely related in structure [3]. More recently, it has been argued that it is the mechanistic relationship that is important and that mechanism-based QSARs applicable across complete mechanistic domains and covering wide structural diversity should be possible [4].

1.2.3 The Relative Alkylation Index model. The Relative Alkylation Index (RAI) model [5] has proved to be a useful tool in analysing sensitisation data. It is based on the concept that the degree of sensitisation produced at induction, and the magnitude of the sensitisation response at challenge, depends on the degree of covalent binding (haptenation; alkylation) to carrier protein occurring at induction and challenge. The RAI is an index of the relative degree of carrier protein haptenation and was derived from differential equations modelling competition between the carrier haptenation reaction in a hydrophobic environment and removal of the sensitiser through partitioning into polar lymphatic fluid.

Scheme 1. Reaction mechanistic applicability domains.

In its most general form the RAI is expressed as:

\[ \mathrm{RAI} = \log D + A \log k + B \log P \tag{1} \]

From this expression the general form of a skin sensitisation QSAR can be derived:

\[ \mathrm{pEC3} = a \log k + b \log P + c \tag{2} \]

pEC3 is the negative logarithm of the concentration required to give a stimulation index (SI) of 3 in the local lymph node assay (LLNA) [6], and is now widely accepted as an index of sensitisation potency.

Thus the general mechanistic QSAR for skin sensitisation requires two parameters – a hydrophobic parameter (logP) and a reactivity parameter (log k). LogP can in most cases be easily calculated, e.g., by the Leo and Hansch method [7]. For log k, even when rate constants have not been measured experimentally, in many cases the reactivity of one compound relative to another can be estimated, using substituent constants of the type developed by Hammett and Taft [8]. Calculated molecular orbital parameters (e.g., ELUMO) can also be useful, particularly when considering a congeneric series of compounds, as surrogates for log k.
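As an illustration of equation (2), the following minimal sketch converts an EC3 value to pEC3 and evaluates the two-parameter form. It assumes the molar conversion pEC3 = log10(MW/EC3) with EC3 in weight percent, which the text does not spell out, and the function names and any coefficients a, b and c passed in are hypothetical placeholders rather than fitted values.

```python
import math


def pec3_from_ec3(ec3_wt_percent: float, mol_weight: float) -> float:
    """Convert an LLNA EC3 (wt%) to a molar-based pEC3.

    Assumption (not stated in the text): pEC3 = log10(MW / EC3), so a
    lower EC3 (higher potency) gives a higher pEC3.
    """
    if ec3_wt_percent <= 0 or mol_weight <= 0:
        raise ValueError("EC3 and molecular weight must be positive")
    return math.log10(mol_weight / ec3_wt_percent)


def predict_pec3(log_k: float, log_p: float, a: float, b: float, c: float) -> float:
    """Evaluate the general form pEC3 = a*log k + b*log P + c (equation 2).

    a, b and c are domain-specific regression coefficients; none are
    supplied here because the text gives only the general form.
    """
    return a * log_k + b * log_p + c
```

Within a single mechanistic domain, a, b and c would be obtained by regression against measured pEC3 values for compounds sharing that reaction mechanism.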

1.3 The nature of skin sensitisation data

Much of the data on skin sensitisation in humans are epidemiological or anecdotal in nature, where clinical evidence of skin sensitisation is found by patch testing to be associated with the patient’s exposure, often occupationally, to natural substances or synthetic chemicals. Such data are far from ideal for QSAR work, although from comparing exposure data with numbers of cases, some substances can clearly be identified as very strong sensitisers (e.g., 2,4-dinitrochlorobenzene, poison ivy) and others, which despite widespread exposure of the population cause very few cases, can be identified as weak sensitisers. Evidence of this nature, supported by animal studies for hypothesis testing, formed the basis for much of the understanding of the connection between reaction chemistry and skin sensitisation that had developed by the early 1980s [9]. Data of a more quantitative nature come from animal testing. Until the mid-1990s, all animal tests for skin sensitisation involved at least two separate treatments with the test chemical. Details of the various tests differ, but in outline they can be summarised as follows. In the induction stage, the animals are treated with the test chemical and then left for a period (usually about 3 weeks). During this period, the chemical has the opportunity to induce a state of sensitisation. Then there is a challenge stage, whereby the animals are treated once more with the test chemical at sub-irritant dose levels and then examined (usually after a period of 24 or 48 h) for any clinical evidence of a sensitisation response (erythema, oedema). The response is usually recorded in terms of the numbers of test animals responding to challenge and the magnitude of the response in each animal (often using a subjective 0–3 scale).

Cross-challenges can also be performed in which animals that have undergone the induction treatment with one chemical are treated with a second chemical at the challenge stage. If this challenge is positive, the two chemicals are said to be cross-reactive. Cross-reactivity studies can be very useful in elucidating chemical mechanisms of sensitisation.

The mouse local lymph node assay (LLNA) developed in the 1990s is different. Only one dose of the test chemical is given, corresponding to an induction treatment. The end point is a quantitative inferential indicator of sensitisation rather than a clinical response. Lymph node uptake of tritiated thymidine, which is an indicator of T-cell proliferation, is measured, and this is effectively the indicator for the sensitisation process. The response is recorded as a stimulation index (SI), this being the ratio of tritiated thymidine uptake in treated animals to uptake in control animals. The assay is usually carried out over a range of dosages of test chemical, and from dose-response analysis it is usually possible to derive an EC3 value, this being the dose (expressed as concentration) giving SI = 3. The LLNA is now accepted as the preferred test method according to Test Guidelines 429 for evaluating skin sensitisation hazard for regulatory use. It also provides a potential boost to skin sensitisation (Q)SAR since it gives a single-dose-based quantitative endpoint. However, cross-challenges are not possible with the LLNA.
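The text does not say how the EC3 is read off the dose-response data. One simple possibility, shown here only as a sketch, is linear interpolation between the two tested concentrations whose stimulation indices bracket SI = 3; the function name and the choice of interpolation are assumptions, not the method prescribed by the test guideline.

```python
from typing import Optional, Sequence


def ec3_by_interpolation(
    doses: Sequence[float],
    stimulation_indices: Sequence[float],
    threshold: float = 3.0,
) -> Optional[float]:
    """Estimate EC3 by linear interpolation between bracketing dose points.

    doses: tested concentrations (e.g. wt%).
    stimulation_indices: SI measured at each dose, in the same order.
    Returns None when no adjacent pair of doses brackets the threshold,
    i.e. the response never rises through SI = 3 in the tested range.
    """
    if len(doses) != len(stimulation_indices):
        raise ValueError("doses and stimulation_indices must have equal length")
    points = sorted(zip(doses, stimulation_indices))
    for (c_lo, si_lo), (c_hi, si_hi) in zip(points, points[1:]):
        if si_lo < threshold <= si_hi:
            fraction = (threshold - si_lo) / (si_hi - si_lo)
            return c_lo + fraction * (c_hi - c_lo)
    return None
```

For example, with SI values of 2.0 at 5% and 4.0 at 25%, the interpolated EC3 lies halfway between the two concentrations.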

2. Analyses of QSAR approaches

2.1 Non-mechanistic/information content-based global approach (TOPS-MODE)

2.1.1 Brief description. Estrada et al. [10] performed a linear discriminant analysis relating topological descriptors to skin sensitisation data as measured in the LLNA. The topological descriptors are so-called TOPS-MODE descriptors or topological substructural molecular design. In this article, a set of 93 diverse chemicals and their associated LLNA EC3 values were collated. The EC3 values were categorised into bands of potency. A chemical with an EC3 value less than 1% was defined as a strong sensitiser (<0.1% is given in the article, but we assume this to be a typographical error since 1% is the value given in the reference quoted), one with an EC3 between 1% and 10%, moderate, 10–30% weak, 30–50% extremely weak and greater than 50% non-sensitising. These classifications were further amalgamated into two groups. In all, two models were developed. The first discriminated strong/moderate sensitisers (EC3 < 10%) from all other chemicals and the second discriminated weak sensitisers (10% < EC3 < 30%) from extremely weak and non-sensitising chemicals (EC3 > 30%). The topological TOPS-MODE descriptors used are spectral moments of a bond matrix weighted for different physicochemical properties (six in total). The six weighting properties were lipophilicity, polarisability, polar surface area, molar refractivity, atomic charges and van der Waals radii. The bond matrix was raised to an order of 15 such that in total 15 spectral moments were calculated for each of the six weights. The unique feature of the TOPS-MODE descriptors is that, in addition to using the whole descriptors for QSAR modelling, local spectral moments for each of the bonds can also be calculated. It is argued that this can enable features that contribute positively to skin sensitisation and those that contribute negatively to be identified, leading to structural rules or alerts that can be later embedded into expert systems such as DEREK. Further examples of structural rules are discussed in Estrada et al. [11].
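The potency bands and their amalgamation into the two modelling groups can be written out as a short sketch. The band limits are those quoted above; how values falling exactly on a boundary are assigned is not stated, so the half-open intervals used here are an assumption, as are the function names and the use of None for a compound with no measurable EC3.

```python
from typing import Optional


def potency_band(ec3_wt_percent: Optional[float]) -> str:
    """Assign an LLNA EC3 value (wt%) to a potency band.

    None stands for a compound with no measurable EC3 (assumed
    non-sensitising). Boundary values go to the weaker band (assumption).
    """
    if ec3_wt_percent is None:
        return "non-sensitising"
    if ec3_wt_percent < 1:
        return "strong"
    if ec3_wt_percent < 10:
        return "moderate"
    if ec3_wt_percent < 30:
        return "weak"
    if ec3_wt_percent < 50:
        return "extremely weak"
    return "non-sensitising"


def model1_group(ec3_wt_percent: Optional[float]) -> str:
    """Model 1 target: strong/moderate sensitisers versus all others."""
    band = potency_band(ec3_wt_percent)
    return "strong/moderate" if band in ("strong", "moderate") else "other"


def model2_group(ec3_wt_percent: Optional[float]) -> str:
    """Model 2 target, applied only to compounds outside Model 1's class."""
    band = potency_band(ec3_wt_percent)
    if band in ("strong", "moderate"):
        raise ValueError("Model 2 is not defined for strong/moderate sensitisers")
    return "weak" if band == "weak" else "extremely weak/non-sensitising"
```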

2.1.2 Defined endpoint (Principle 1). The primary endpoint is the EC3 value, which is clearly defined and is accepted as an appropriate endpoint for the LLNA (Dir 2004/73/EC, method B42).

In the article the EC3 values are grouped into classification bands of potency based on ranges of EC3 values. The ranges described above are based on the early considerations surrounding the use of EC3 as a measure of potency in classification and labelling discussions. These bands are derived from comparing human sensitisation potency to that measured in the LLNA and are subject to slight changes depending on the dataset of chemicals used. Revised classification bands have recently been proposed [12]. This OECD principle is clearly met.


2.1.3 Defined algorithm (Principle 2). This principle is met. Two equations for the LDA models are provided in the article. A positive score in Model 1 classifies a chemical as a strong/moderate sensitiser. A negative score necessitates the application of Model 2. A positive score in Model 2 classifies a chemical as a weak sensitiser, a negative score as an extremely weak or non-sensitising chemical.

\[
\begin{aligned}
\text{Model 1 (Class 1)} ={}& (1.331\,\mu_1^{H}) - (0.00598\,\mu_4^{H}) - (0.0078\,\mu_2^{PS}) - (0.00021366\,\mu_3^{PS}) \\
&+ (0.0755\,\mu_1^{MR}) + (0.0319\,\mu_2^{MR}) - (0.0011133\,\mu_5^{Pol}) - (2.3797\,\mu_1^{Ch}) \\
&+ (0.1547\,\mu_3^{Ch}) + (0.00425\,\mu_6^{Ch}) + (2.0932\,\mu_1^{vDW}) - (0.8683\,\mu_2^{vDW}) + 0.7954
\end{aligned}
\]

\[
\begin{aligned}
\text{Model 2 (Class 2)} ={}& (0.946\,\mu_1^{H}) - (0.00468\,\mu_7^{H}) - (0.894\,\mu_1^{PS}) + (0.1004\,\mu_2^{PS}) \\
&- (0.0024\,\mu_3^{PS}) + (0.0057\,\mu_3^{Pol}) - (1.429\,\mu_1^{Ch}) + (0.0053\,\mu_8^{Ch}) \\
&- (0.00111\,\mu_9^{Ch}) - 5.309
\end{aligned}
\]
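The two-stage decision rule described in this section can be sketched as follows. Each model is represented as a coefficient mapping plus an intercept, and the spectral moments of a compound are assumed to be available as a dictionary keyed by descriptor name; these data structures and function names are illustrative assumptions. A score of exactly zero is treated as not positive, which the text does not specify, and the "non-classified" outcome reported by the original authors is not modelled.

```python
from typing import Dict, Tuple

# A model is (coefficients keyed by descriptor name, intercept).
LinearModel = Tuple[Dict[str, float], float]


def linear_score(moments: Dict[str, float], model: LinearModel) -> float:
    """Evaluate a linear discriminant function for one compound."""
    coefficients, intercept = model
    return intercept + sum(
        coefficient * moments[name] for name, coefficient in coefficients.items()
    )


def classify_two_stage(
    moments: Dict[str, float], model1: LinearModel, model2: LinearModel
) -> str:
    """Apply Model 1 first and fall through to Model 2 on a non-positive score."""
    if linear_score(moments, model1) > 0:
        return "strong/moderate"
    if linear_score(moments, model2) > 0:
        return "weak"
    return "extremely weak/non-sensitising"
```

To use it with the published models, the coefficient dictionaries would be filled in from the two equations above, with keys chosen to match whatever software supplies the spectral moments.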

2.1.4 Domain of applicability (Principle 3). The applicability domain is not defined by the authors although, from the diversity of structures and chemical mechanisms represented in the data set, it is clearly intended to be global. However, the stated intention was not so much to provide a QSAR capable of making predictions but more to provide a means of generating hypotheses, to derive new structural rules.

Ideally, a domain of applicability would be defined by using the descriptor values to describe a multi-dimensional descriptor space. Further work would need to be carried out to explore the best approaches of defining the domain by using the descriptor information.
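One of the simplest ways to describe such a descriptor space, offered here only as a sketch and not as the approach the authors have in mind, is a bounding box: a query compound is treated as inside the domain when each of its descriptor values lies within the range spanned by the training set. The function names and dictionary layout are assumptions.

```python
from typing import Dict, List, Tuple


def descriptor_ranges(
    training_set: List[Dict[str, float]]
) -> Dict[str, Tuple[float, float]]:
    """Return the (min, max) of every descriptor over the training compounds."""
    if not training_set:
        raise ValueError("training set must not be empty")
    names = training_set[0].keys()
    return {
        name: (
            min(row[name] for row in training_set),
            max(row[name] for row in training_set),
        )
        for name in names
    }


def in_domain(
    compound: Dict[str, float], ranges: Dict[str, Tuple[float, float]]
) -> bool:
    """True when every descriptor of the compound lies within the training range."""
    return all(low <= compound[name] <= high for name, (low, high) in ranges.items())
```

A bounding box ignores correlations between descriptors and empty regions inside the box, so it is at best a necessary condition for a reliable prediction.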

2.1.5 Goodness of fit, robustness and predictivity: internal performance (Principle 4a). The 93 compounds in the original training set were divided at random into a test set (15 compounds) and training set (78 compounds). Three compounds were found to be statistical outliers and removed from the analysis. The authors do not attempt to rationalise why these compounds should be outliers, although they do identify them: they are C4, C6 and C9 azlactones, CAS numbers 176664-99-6, 176665-02-4 and 176665-04-6 respectively. Four higher chain length azlactones – C11, C15, C17 and C19 – are included in the model. The first model was developed using 75 compounds and this was able to classify 80% of the training set compounds correctly. 31 of the 39 strong/moderate sensitisers in this training set were correctly predicted. Three compounds could not be classified and were reported as non-classified. Eight compounds were classified as false negatives.

\[ \text{Wilks' } \lambda = 0.61, \quad F = 3.39, \quad D^2 = 2.52, \quad p < 0.0007 \]

The number of compounds per variable in this model was 6.3. This meets the statistical requirements of the OECD principles.

Model 1 was able to classify correctly seven of the nine strong/moderate sensitisers in the test set. One was found to be unclassified.

For Model 2, 42 compounds were left in the data set when the strong/moderate sensitisers were removed. This was split at random into a training set of 36 compounds and a test set of six compounds. Model 2 was still statistically robust but weaker than Model 1. In all, 80.5% of the training set compounds in Model 2 were predicted correctly: 29 were classified correctly and the other seven were false positives.

\[ \text{Wilks' } \lambda = 0.38, \quad F = 4.63, \quad D^2 = 8.76, \quad p < 0.001 \]

The number of compounds per variable in this model was four. This does not meet the statistical requirements of the OECD principles. Model 2 classified five out of the six test set compounds correctly.
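The compounds-per-variable figures can be reproduced by counting the descriptor terms in the two discriminant functions of section 2.1.3 (12 for Model 1, 9 for Model 2) and dividing into the training-set sizes. The sketch below does this; the minimum ratio of 5 is an assumed rule of thumb chosen to be consistent with the verdicts above, since the text does not state the threshold it applies.

```python
def compounds_per_variable(n_compounds: int, n_variables: int) -> float:
    """Ratio of training compounds to fitted descriptor terms."""
    if n_variables <= 0:
        raise ValueError("a model needs at least one variable")
    return n_compounds / n_variables


def meets_ratio(n_compounds: int, n_variables: int, minimum: float = 5.0) -> bool:
    """Check the ratio against an assumed minimum (5 is a rule of thumb)."""
    return compounds_per_variable(n_compounds, n_variables) >= minimum


# Training-set sizes from the text; term counts from the two equations.
MODELS = {"Model 1": (75, 12), "Model 2": (36, 9)}

if __name__ == "__main__":
    for name, (n_compounds, n_variables) in MODELS.items():
        ratio = compounds_per_variable(n_compounds, n_variables)
        print(name, ratio, meets_ratio(n_compounds, n_variables))
```

Model 1 gives 75/12 = 6.25, which the text reports as 6.3, and Model 2 gives 36/9 = 4.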

2.1.6 External validation for predictivity (Principle 4b). A dataset of 15 compounds unseen by the model and with available skin sensitisation data formed the basis of the external validation set to test the robustness and predictivity of Model 1 [11]. All but one of the strong/moderate sensitisers were classified correctly. There was one false strong/moderate prediction. Whilst it is laudable that an external validation was carried out, this was only for one of the two models developed and there was no thought given to the representativeness of the chemical space when selecting this validation set. The chemicals were apparently taken at random based on available historical and publicly available data and the good performance that resulted might be on the basis of chance rather than from design.

2.1.7 Mechanistic basis (Principle 5). The descriptors used in the equations provided are quite complex and, on the face of it, not very mechanistically transparent. If the concept of spectral moments is put to one side and the weighting of the physicochemical parameters considered instead, a mechanistic interpretation may be made for each of the descriptors in turn. Factors such as lipophilicity, polar surface area and van der Waals radii may model partition-related effects in skin sensitisation, whereas parameters such as charges and polarisability may relate to the reactivity or electrophilicity of the chemicals. However, there is an element of false logic in this argument. Although it may be true that electrophilic centres in molecules are usually – perhaps always – associated with partial charge and polarisation or polarisability, this does not mean that a centre associated with partial charge and polarisation or polarisability is necessarily an electrophilic centre. The authors claim that the approach is valuable when applied to relate potency classification to local bond contributions. It is argued that this can lead to the formulation of structural rules, highlighting fragments and substructural features that can provide valuable insights into the potential mechanisms of action from a chemistry perspective. Under reanalysis below, we consider the examples given in support of this claim.

2.1.8 Reanalysis: regression. Compound numbers quoted here refer to those used by Estrada et al. in table 1 of [10]. Since this is a linear discriminant analysis rather than a linear regression analysis, there is no statistical requirement for the models to give correlations within their bins; their performance is judged statistically solely on their ability to classify compounds into their correct potency bins. However, we note that a good linear regression model can classify compounds into potency bands (bins) with any specified boundaries within the range of the model. For example the Könemann equation for general narcosis in acute fish toxicity (pEC50 = 0.87 logP + 1.13) [13] could be used to classify compounds, on the basis of their logP and molecular weight values, into bands such as EC50 < 0.1 mg L⁻¹, 0.1 < EC50 < 1.0 mg L⁻¹, 1.0 < EC50 < 10 mg L⁻¹, EC50 > 10 mg L⁻¹. To some extent, if the parameters are mechanistically relevant, the converse should be true, i.e., a function derived by linear discriminant analysis should be expected to give a reasonable correlation, although not necessarily the best possible based on the parameters used, when used as the independent variable in a linear regression analysis. This is illustrated in section 2.4 by a comparison of linear discriminant analysis and regression analysis for skin sensitisation potential of Schiff base electrophiles.

We therefore considered it relevant to explore the extent to which the TOPS-MODE model correlates the data within the bins. We chose Model 1 for this purpose. In order to give the model the best chance of success, we removed several compounds from the total data set, originally consisting of 93 compounds. The compounds removed are:

Compounds 45, 46 and 47 (C4, C6 and C9 azlactones, respectively), identified by Estrada et al. as outliers [10].

Compound 22, 2-methyl-5-hydroxyethylaminophenol, for which, on further examination of the data, we found that the wrong structure had been used to calculate the descriptors.

Compound 76, lactic acid. This is quoted as having an EC3 value of 14.3%. We found this surprising, since lactic acid appears to have no electrophilic or pro-electrophilic character, and we would have confidently predicted it to be a non-sensitiser. We therefore checked in a more recent database [14], where we found that lactic acid is listed as a non-sensitiser. Apparently a sample of lactic acid had been found to be positive at the time that [10] was written, but other studies have failed to show any sensitisation potential (G. Patlewicz, personal communication). This anecdote illustrates the value of mechanistic insights, which can alert us to the possibility that a positive test result may be misleading and is likely to be due to an impurity. In this case we hypothesise that the lactic acid sample which tested positive may have been contaminated with lactide, the dimeric lactone of lactic acid [15], which is an activated ester and which could sensitise by the acyl transfer mechanism.

[Structure: lactide]

Table 1. Semi-mechanistic global approach. Comparison of observed and calculated EC3 values.

                                                                 Residual
Name                      CAS        EC3calc   EC3obs   wt%    Factor   pEC3 (molar)
N-methyl-N-nitroso urea   684-93-5   3.27      0.05     3.22   65.0     1.8
2-Amino phenol            95-55-6    4.10      0.5      3.60   8.2      0.9
Formaldehyde              50-00-0    4.16      0.4      3.66   10.4     1.0
Glutaraldehyde            111-30-8   5.29      0.2      5.09   26.5     1.4


Compound 65, farnesal. As discussed elsewhere [16], for this compound sensitisation is likely to be penetration-dependent. Several other compounds (2, 4, 5, 11, 14, 24, 48, 49, 50, 51, 73, 81, 82, 89) in the TOPS-MODE study fall into the penetration-dependent category (logP > 4.0) and, since this effect may not be modelled by the TOPS-MODE descriptors, all these compounds were removed.

Compound 58, 1,2-dibromo-2,4-dicyanobutane. The mechanism of action is probably via de-hydrobromination to give a Michael acceptor “ultimate sensitiser”, so the TOPS-MODE descriptors for the de-hydrobromination product would have been more appropriate. This could justify removal of compound 58.

With all the above listed compounds removed, the regression equation is:

$$\mathrm{pEC3} = 0.22(\pm 0.03)\,\mathrm{pEC3}_{\mathrm{pred}} + 1.33(\pm 0.07) \tag{3}$$

$$n = 72,\quad r^2 = 0.363,\quad r^2_{\mathrm{adj}} = 0.354,\quad s = 0.55,\quad F = 39.97$$

The compounds included in this regression still include several of the compounds with EC3 > 10%, which Estrada et al. [10] argue are better discriminated by Model 2 than by Model 1. However, after removal of all of these compounds, the regression equation for the 41 strong/moderate compounds remaining is inferior to equation (3):

$$\mathrm{pEC3} = 0.19(\pm 0.06)\,\mathrm{pEC3}_{\mathrm{pred}} + 1.54(\pm 0.11) \tag{4}$$

$$n = 41,\quad r^2 = 0.231,\quad r^2_{\mathrm{adj}} = 0.212,\quad s = 0.53,\quad F = 11.75$$

The poor statistical quality of these regression equations does not give confidence in the mechanistic robustness of the TOPS-MODE Model 1.

Our conclusion from this reanalysis is that the TOPS-MODE parameters have not been demonstrated to be mechanistically relevant and consequently global QSARs for skin sensitisation based on these parameters have not been demonstrated to be a realistic prospect.
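The summary statistics quoted for equations (3) and (4) can be cross-checked against one another. The sketch below is illustrative only: it assumes an ordinary least-squares fit with one predictor and an intercept, and the function name `regression_stats` is introduced here, not taken from the original study. It recomputes the adjusted r² and F from the reported r² and n.

```python
def regression_stats(r2: float, n: int, p: int = 1) -> tuple[float, float]:
    """Return (adjusted r2, F) for a least-squares fit with p predictors plus an intercept."""
    dof_resid = n - p - 1                      # residual degrees of freedom
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / dof_resid
    f_stat = (r2 / p) / ((1.0 - r2) / dof_resid)
    return r2_adj, f_stat


# Reported (r2, n) for equations (3) and (4)
reported = {"equation (3)": (0.363, 72), "equation (4)": (0.231, 41)}

for name, (r2, n) in reported.items():
    r2_adj, f_stat = regression_stats(r2, n)
    print(f"{name}: r2_adj = {r2_adj:.3f}, F = {f_stat:.2f}")
```

The recomputed values agree with the reported ones to within the rounding of r², which supports reading both equations as single-predictor regressions on n = 72 and n = 41 compounds.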

2.1.9 Reanalysis: generation of structural alerts. Estrada et al. [10, 11] claim that by using local spectral moments calculated for individual bonds and substructures, features which contribute positively to skin sensitisation and those which contribute negatively can be identified, leading to structural rules or alerts that can be later embedded into expert systems such as DEREK. Examples of structural rules derived in this way are discussed in both papers, and are the main focus of Estrada et al. [11].

It is conceivable that TOPS-MODE might have some application for this purpose but, since a high proportion of the examples presented are based on erroneous arguments, this still remains to be demonstrated. The errors are not trivial, since they have in several cases led the authors to untenable conclusions. This is illustrated in the brief analysis of structural rules below.

Alkyl halides. It is stated that bond contributions to sensitisation decrease in the order Br > Cl > I, and that this is consistent with the alkyl iodides in the data set being weak sensitisers. However, contrary to what is implied, iodides are NOT per se weaker sensitisers than bromides or chlorides (nor are they less reactive as electrophiles). Our analysis [16] of a larger data set of alkyl halides reveals that, up to a limiting carbon number (ca C14–C17) depending on the halogen, Br and I halides fit the same positive correlation with logP, with chlorides being less potent due to their lower reactivity. Iodine is less electronegative than bromine and chlorine and hence there is less polarity associated with the C–I bond than with the C–Br or C–Cl bonds. This is probably why the TOPS-MODE approach underestimates the reactivity of alkyl iodides. We note that the descriptors used by Miller et al. [17] also underestimate the sensitisation potential of alkyl iodides.

Alkyl alkane sulphonates. The data set contains two alkyl alkane sulphonates: dodecyl methane sulphonate (EC3 = 8.8) and oleyl methane sulphonate (EC3 = 25). The authors state: “Finally it is noted that dodecyl methane sulphonate, which is a strong/moderate sensitiser, contains an isolated double bond which is recognised by TOPS-MODE as having an important positive contribution to sensitization”. But the isolated double bond is found in oleyl methane sulphonate, which is the weaker sensitiser in the LLNA, not in dodecyl methane sulphonate which is the stronger sensitiser. By the authors’ logic therefore, this should be evidence for a negative contribution of the isolated double bond to sensitisation.

Aromatic nitro compounds. The aromatic nitro group is proposed as a structural alert, on the basis that the five compounds in the data set with aromatic nitro groups are all strong or moderate sensitisers with EC3 values not greater than 2.2 and TOPS-MODE recognises the NO bonds of the nitro groups as having high contributions to skin sensitisation. The five compounds are: 2,4-dinitrochlorobenzene, 2-amino-6-chloro-4-nitrophenol, 2-nitro-p-phenylenediamine, 4-nitrobenzyl bromide and HC Red 3 (4-(2-hydroxyethyl)amino-3-nitroaniline), all strong sensitisers. However, for all of these compounds the sensitisation potential can be ascribed to the properties of other groups present, the NO2 groups acting simply as “bystanders” or at most activators (e.g., in 2,4-dinitrochlorobenzene). 4-Nitrobenzyl bromide is an SN2 electrophile, the C–Br bond being the site of reaction [18]. It is instructive to compare 2-nitro-p-phenylenediamine (compound 25, EC3 = 0.4) with p-phenylenediamine (compound 83, EC3 = 0.29): removal of the nitro group appears to marginally increase the sensitisation potential. Many aromatic nitro compounds are known to be non-sensitisers [19]. Some idea of the long-existing level of insight into the effects of aromatic nitro groups is revealed by the case of the 4-nitrobenzyl halides and pseudohalides. In work carried out in the late 1970s and published in 1983, a series of these SN2 electrophiles (iodide, bromide, chloride, fluoride, tosylate) were predicted to be sensitisers and synthesised for the purpose of the electrophilicity-sensitisation studies which led to the RAI model. The analogues with the halogen replaced by hydrogen (i.e., 4-nitrotoluene) and hydroxyl (i.e., 4-nitrobenzyl alcohol) were also made, and predicted to be non-sensitisers because of the absence of a reactive group. The sensitisation test results confirmed that, as predicted, the halides and pseudohalides were sensitisers and their H and OH analogues were non-sensitisers [18].

3-Methyl-1,2,5-thiadiazole-1,1-dioxide (MPT). This compound (79) has EC3 = 1.4. Estrada et al. [10, 11] argue that it sensitises via addition of protein nucleophiles to the C=CH2 double bond of the tautomer (figure 1). TOPS-MODE predicts the tautomer as a strong/moderate sensitiser with a 90% probability whereas the parent structure is predicted strong/moderate with 70% probability. However, the exocyclic double bond of the tautomer, lacking electron-withdrawing substituents, is not activated towards nucleophilic attack to give the observed reaction products. The proposed reaction is contrary to well established mechanistic principles of organic chemistry. In contrast, the C=N double bonds in the parent structure are activated by the electronegative SO2 group. The most likely mechanism, consistent with the mechanistic principles of organic chemistry, is that MPT sensitises via nucleophilic addition to the C=N double bond of the parent structure.

Hydroxylamines. Estrada et al. [10] discuss at length the proposal that the hydroxylamine group should be a structural alert for skin sensitisation, being capable of undergoing oxidation to an electrophilic nitroso group. They base the discussion on compound 22, 2-methyl-5-hydroxyethylaminophenol, EC3 = 0.4, saying “... this compound is not an amine ... we have chosen to discuss it on account of the presence of the hydroxylamine group ...”. However: firstly, the compound IS an amine; secondly, it DOES NOT have a hydroxylamine group; thirdly, if it did have the hydroxylamine structure which Estrada et al. erroneously ascribe to it, it would be unable to form a nitroso group owing to the absence of a hydrogen atom on the nitrogen (figure 2). Despite these errors, a good case can be made for including aromatic hydroxylamines of general formula ArNHOH as a structural alert for sensitisation.

Figure 1. MPT chemistry.

Clotrimazole. This compound (56, EC3 = 4.8) is argued to be a non-covalent binding sensitiser. The arguments, advanced in [11], are based on their computational finding that TOPS-MODE predicts most of the bonds in the clotrimazole molecule to contribute positively to sensitisation, in contrast to other compounds for which they find that only specific localised regions of the molecule contribute. The non-covalent binding possibility cannot be excluded but, since the proposal goes against the generally accepted mechanistic model relating skin sensitisation to covalent binding, it requires strong supporting evidence and exclusion of the possibility of covalent binding if it is to be accepted. It seems to us more plausible that clotrimazole acts as an SN1 electrophile, dissociating to a delocalised imidazole anion and a delocalised triarylmethyl cation (figure 3). The resonance delocalisation of these ions could explain why TOPS-MODE identifies most of the bonds as contributing to sensitisation.

2.1.10 Conclusions from the re-analysis. The TOPS-MODE approach does not appear promising as a means of generating QSARs capable of predicting sensitisation potential. It may prove useful in the generation of leads for formulation of new structural alerts for expert systems, but this has not yet been demonstrated convincingly.

Figure 3. Proposed SN1 reaction mechanism for Clotrimazole.

Figure 2. Structure of 2-Methyl-5-hydroxyethylaminophenol.


2.2 Semi-mechanistic global approach

2.2.1 Brief description. Miller et al. [17] considered a set of 87 LLNA data and, after removal of 20 outliers (the basis for removal being the poor fit between calculated and experimental EC3 values), 67 chemicals are analysed: 50 as a training set and 17 as a test set. EC3 values as w/v percent concentrations are used as the dependent variable. Hundreds of CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis (Codessa), Semichem, Inc.: Shawnee Mission, KS, USA) descriptors are calculated and a heuristic algorithm (not further described in the article) is used to derive several correlations from these descriptors. The authors select the best model based on the statistical parameters and “chemical sense of the descriptors for the understood mechanism of sensitisation”.

2.2.2 Defined endpoint (Principle 1). The endpoint, EC3, is clearly defined and is accepted as an appropriate endpoint for the LLNA (Dir 2004/73/EC, method B42).

2.2.3 Defined algorithm (Principle 2). This principle is met. The descriptors are calculated by CODESSA software which is commercially available. The QSAR reported is derived by multiple linear regression analysis.

2.2.4 Domain of applicability (Principle 3). The QSAR is claimed to be applicable across a wide range of structures, including halogenated and aromatic compounds, alcohols, aldehydes and ketones. Compounds “with iodine, sulphur and anhydride rings, adjacent ketones and alkene aldehydes which are not Michael reactants” are cited as being outside the applicability of the QSAR. However, the only reason for excluding compounds in these categories (20 compounds in all) is a poor fit between calculated and experimental values.

2.2.5 Goodness of fit, robustness and predictivity (Principle 4). The QSAR equation derived from the training set is:

$$\mathrm{EC3} = 9.16\,\mathrm{FPSA2}_{\mathrm{ESP}} + 4.29\,E_{\mathrm{HOMO-LUMO}} - 45.89 \tag{5}$$

$$n = 50,\quad r^2 = 0.773,\quad r^2_{\mathrm{adj}} = 0.763,\quad r_{\mathrm{cv}} = 0.738,\quad F = 79.9$$

FPSA2ESP is the fractional positively charged surface area descriptor based on electrostatic potential charge.

EHOMO–LUMO is the energy gap between the Highest Occupied Molecular Orbital (HOMO) and the Lowest Unoccupied Molecular Orbital (LUMO).

The performance for the test set of 17 chemicals is expressed in terms of r² (= 0.877).

2.2.6 Further observations relating to Principle 4. The authors classify the compounds “by a bin selection method” into strong, moderate and weak categories. These are defined as: strong, EC3 < 6%; moderate, 6% < EC3 < 13%; weak, EC3 > 13%. These do not correspond to the usually accepted classification [6] of strong, EC3 < 1%; moderate, 1% < EC3 < 10%; weak, EC3 > 10%, but are used, according to the authors, because approximately equal numbers of compounds of the training set fall into the strong and moderate categories. It should be noted that, unlike the case with the TOPS-MODE study discussed in section 2.1, the assignment of compounds to their potency bins is based on multiple linear regression analysis, not on linear discriminant analysis. The authors present a plot indicating that most of the training set compounds (45 out of 50) are correctly classified by equation (5) and likewise most of the test set (14 out of 17) are correctly classified. However, it is clear from inspection of the figure in their original paper that with the category boundaries defined differently (e.g., at 1% and 10%), the QSAR would perform much less well at classifying the compounds. Consequently we do not consider the QSAR robust enough to be generally useful for classifying into potency categories.
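The sensitivity of the classification to the choice of boundaries can be illustrated with the four compounds of table 1. The following is a minimal sketch, not the procedure of Miller et al. [17]: the helper names and the treatment of values lying exactly on a boundary are assumptions made here for illustration.

```python
def potency_category(ec3: float, strong_below: float, weak_above: float) -> str:
    """Assign a potency bin from an EC3 value (wt%) and two boundaries."""
    if ec3 < strong_below:
        return "strong"
    if ec3 > weak_above:
        return "weak"
    return "moderate"  # values exactly on a boundary land here (a choice made for this sketch)


RECOGNISED = (1.0, 10.0)  # usually accepted boundaries [6]
MILLER = (6.0, 13.0)      # boundaries quoted in the text for Miller et al. [17]

# (observed EC3, calculated EC3) in wt%, taken from table 1
TABLE1_PAIRS = {
    "N-methyl-N-nitroso urea": (0.05, 3.27),
    "2-Amino phenol": (0.5, 4.10),
    "Formaldehyde": (0.4, 4.16),
    "Glutaraldehyde": (0.2, 5.29),
}


def n_consistent(pairs: dict, scheme: tuple) -> int:
    """Count compounds whose observed and calculated EC3 fall in the same bin."""
    return sum(
        potency_category(obs, *scheme) == potency_category(calc, *scheme)
        for obs, calc in pairs.values()
    )


print(n_consistent(TABLE1_PAIRS, MILLER))      # 4
print(n_consistent(TABLE1_PAIRS, RECOGNISED))  # 0
```

With the 6% boundary all four compounds count as correctly classified, whereas with the recognised 1% boundary none does, because every calculated value moves from the strong to the moderate bin.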

2.2.7 Mechanistic basis (Principle 5). The authors suggest that the FPSA2ESP descriptor may be relevant to skin permeation and that the EHOMO–LUMO descriptor models reactivity. As discussed below, these assumptions are questionable.

2.2.8 Additional comments. Use of EC3 weight percent rather than pEC3 (molar basis) is difficult to justify. It disguises the fact that equation (5) is very poor at discriminating within the strong category (as defined by the authors). This can be seen from table 1.

For example, with N-methyl-N-nitroso urea the residual on a wt% basis is “only” 3.22, which would be regarded as good agreement for a weak sensitiser with EC3 ca 30%, whereas actually the prediction underestimates the potency by a factor of 65.
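The contrast between the additive and the multiplicative view of a residual can be made concrete with a few lines of code. This is a sketch using the N-methyl-N-nitroso urea and glutaraldehyde entries of table 1; the function name `residuals` is introduced here for illustration. Because pEC3 is defined as log(M/EC3), the molecular weight cancels in the difference between observed and calculated pEC3, so the log residual equals log10 of the factor.

```python
import math


def residuals(ec3_calc: float, ec3_obs: float) -> tuple[float, float, float]:
    """Return (wt% residual, over-prediction factor, residual in log units)."""
    factor = ec3_calc / ec3_obs
    return ec3_calc - ec3_obs, factor, math.log10(factor)


# (calculated EC3, observed EC3) in wt%, from table 1
examples = {
    "N-methyl-N-nitroso urea": (3.27, 0.05),
    "Glutaraldehyde": (5.29, 0.2),
}

for name, (calc, obs) in examples.items():
    diff, factor, log_resid = residuals(calc, obs)
    print(f"{name}: residual = {diff:.2f} wt%, factor = {factor:.1f}, log residual = {log_resid:.1f}")
```

A residual of about 3 wt% looks small on the scale of equation (5), yet it corresponds to nearly two log units of potency for N-methyl-N-nitroso urea.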

It is instructive to examine the list of 20 compounds excluded as outliers on the basis of poor fit between the literature values and calculated EC3 values. Primary alkyl iodides are very much underpredicted: the model gives EC3 values ranging from 93.7 to 136%, compared with literature values ranging from 13.1 to 24.2%. Xylene, an “obvious” non-sensitiser with a literature EC3 value of 95.8%, is calculated to have an EC3 value of 9.2%. Dimethyl sulphoxide, another “obvious” non-sensitiser, with a literature EC3 value of 71.9%, is calculated to have EC3 = 7.0%. One of the compounds listed as an outlier is oleyl methane sulphonate. Its calculated EC3 value is 21.6% and the literature value is given as 98%. However this value is erroneous: the correct literature value is 25%, in quite good agreement with the calculated value.

In order to make a better assessment of this approach we have reanalysed the data using EC3 values converted to logEC3 (%w/v) (equation (6)), to EC3 (molar concentration) (equation (7)) and to pEC3 (molar) (equation (8)). EC3 (molar) is calculated as EC3/M, where M is the molecular weight, and pEC3 is calculated as log (M/EC3). Logarithms are to base 10.
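The conversions just defined are simple enough to state as code. The sketch below follows the definitions in the text (EC3 (molar) = EC3/M and pEC3 = log10(M/EC3)); the function names and the molecular weight used for formaldehyde (30.03) are assumptions supplied here for illustration and do not come from the data set.

```python
import math


def ec3_molar(ec3_wt_percent: float, mol_weight: float) -> float:
    """EC3 on the molar basis used in the text: wt% value divided by molecular weight."""
    return ec3_wt_percent / mol_weight


def pec3(ec3_wt_percent: float, mol_weight: float) -> float:
    """pEC3 (molar) as defined in the text: log10(M / EC3)."""
    return math.log10(mol_weight / ec3_wt_percent)


# Formaldehyde: observed EC3 = 0.4 wt% (table 1); M = 30.03 is an assumed value
print(f"EC3 (molar) = {ec3_molar(0.4, 30.03):.4f}")
print(f"pEC3        = {pec3(0.4, 30.03):.2f}")
```

By construction pEC3 is the negative logarithm of EC3 (molar), so a compound that is ten times more potent on a molar basis has a pEC3 one unit higher.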

2.2.9 Reanalysis. First we compared equation (5) with the result from reanalysis using logEC3 (%w/v):

$$\log \mathrm{EC3}(\%\,\mathrm{w/v}) = 0.82(\pm 0.13)\,\mathrm{FPSA2}_{\mathrm{ESP}} + 0.31(\pm 0.08)\,E_{\mathrm{HOMO-LUMO}} - 3.54(\pm 0.83) \tag{6}$$

$$n = 50,\quad r^2 = 0.489,\quad r^2_{\mathrm{adj}} = 0.469,\quad s = 0.58,\quad F = 22.5$$

Clearly the statistical fit is worse than for equation (5).


Next we compared equation (5) with the result from reanalysis using EC3 (molar basis):

$$\mathrm{EC3}(\mathrm{molar}) = 0.043(\pm 0.006)\,\mathrm{FPSA2}_{\mathrm{ESP}} + 0.019(\pm 0.004)\,E_{\mathrm{HOMO-LUMO}} - 0.20(\pm 0.04) \tag{7}$$

$$n = 50,\quad r^2 = 0.566,\quad r^2_{\mathrm{adj}} = 0.547,\quad s = 0.03,\quad F = 30.6$$

Clearly the statistical fit is worse than for equation (5).

Finally we compared equation (5) with the result from reanalysis using pEC3 (molar basis):

$$\mathrm{pEC3}(\mathrm{molar}) = 5.21(\pm 0.82) - 0.70(\pm 0.13)\,\mathrm{FPSA2}_{\mathrm{ESP}} - 0.27(\pm 0.08)\,E_{\mathrm{HOMO-LUMO}} \tag{8}$$

$$n = 50,\quad r^2 = 0.419,\quad r^2_{\mathrm{adj}} = 0.394,\quad s = 0.58,\quad F = 16.9$$

Clearly the statistical fit is worse than for equation (5).

Figure 4 shows the observed pEC3 values, plotted against values calculated from equation (8). Figure 4(a) is for the 50 compounds of the training set, and figure 4(b) is for the 20 compounds of the test set.

Figure 4. Semi-mechanistic global approach. pEC3obs vs. pEC3Calc (training set (a), test set (b)).

It seems clear to us why a better QSAR is obtained with EC3 values on a wt% rather than on a molar basis. The FPSA2ESP descriptor is only a partial model for hydrophobicity, which is well recognised as an important parameter in skin sensitisation. The overall hydrophobicity can be considered to comprise a negative polarity contribution (which is modelled by FPSA2ESP) and a positive cavity contribution which is approximately correlated with molecular weight (MW), but which is not incorporated in the FPSA2ESP or EHOMO–LUMO parameters. Consequently neglect of MW in the dependent variable (i.e., not dividing EC3 by MW) compensates approximately for the lack of a cavity term in the independent variables.

Miller et al. [17] claim that the regression analysis based on EC3 makes it possible to classify sensitisers into potency categories: strong, moderate, weak (including non-sensitisers). However, examination of figure 5 reveals that this apparent ability to classify is illusory. It can be seen that the plot has the form of a “cloud” of points with an overall positive slope, but with a great deal of scatter about the trendline. The “cloud” is relatively thin at two points (one near the coordinates 7, 5 and the other near the coordinates 13, 16). Drawing horizontal and vertical lines through these coordinates creates three “bins”, which in figure 5 are designated Strong, Moderate, Weak. The large majority of the points are in one or other of those “bins” and on this basis the category classification is to a large extent successful. However:

(1) The boundaries of the “bins” do not correspond to the recognised [6] potency category boundaries.

(2) There are no other zones where the “cloud” is thin enough to separate the compounds into categories; in particular the coordinates 1, 1 and 10, 10, which would be appropriate for the recognised categories, would categorise approximately as many chemicals incorrectly as correctly.

2.2.10 Overall assessment. Because of the poor fit to the data, lack of robustness and mis-predictions for several “obvious” sensitisers and non-sensitisers, we consider this QSAR to be unsuitable for prediction or classification into potency categories.

Figure 5. Semi-mechanistic global approach. EC3lit vs. EC3Calc.

| Category | Recognised [6] | Miller et al. [17] |
|---|---|---|
| Strong/extreme | <1% | <6% |
| Moderate | >1%, <10% | >6%, <16% |
| Weak | >10% | >16% |

2.3 Mechanism-based approach: the Schiff base domain

2.3.1 Description. Three of the present authors [20] have carried out a QSAR analysis of published LLNA EC3 values [21–23] for a diverse range of aliphatic aldehydes and ketones which can react with protein amino groups so as to bind to the protein by Schiff base (SB) formation. For the Schiff base mechanism, reactivity can be modelled by Σσ*, the sum of the Taft σ* substituent constants for the two groups attached to the carbonyl group. Excluded from the analysis were:

Aryl aldehydes and ketones (CO group attached to one or two benzene rings). Such compounds are usually non-sensitisers or weak sensitisers, their Schiff base reactivity being impaired because of resonance stabilisation of the parent compound.

Those diketones which are predominantly enolised such that the non-enolised keto group is aromatic.

Reactive Michael acceptors.

The approach was:

(1) Regression analysis based on logP and Σσ* for a series of 10 SB aldehydes as an initial training set.

(2) Prediction for an initial test set consisting of one aldehyde and five ketones with an alpha carbonyl group (4 α,β-diketones and 1 α-ketoester).

(3) Further regression analysis on the combined initial training and test set data, to develop a QSAR based on 16 compounds.

(4) Exploration of the predictive performance of this QSAR across the wider SB mechanistic applicability domain, using further LLNA data for 1,3-dicarbonyl compounds.

Table 2 lists the compounds studied, with their toxicity values and physicochemical parameters.

The regression equation for the initial training set is:

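$$\mathrm{pEC3} = 1.12(\pm 0.11)\,\Sigma\sigma^{*} + 0.41(\pm 0.05)\,\log P - 0.60(\pm 0.18) \tag{9}$$

$$n = 10,\quad r^2 = 0.937,\quad r^2_{\mathrm{adj}} = 0.919,\quad s = 0.11,\quad F = 52.2$$

The regression equation for the initial test set is:

$$\mathrm{pEC3}_{\mathrm{obs}} = 0.99(\pm 0.11)\,\mathrm{pEC3}_{\mathrm{calcd}} + 0.01(\pm 0.15) \tag{10}$$

$$n = 6,\quad r^2 = 0.957,\quad r^2_{\mathrm{adj}} = 0.946,\quad s = 0.17,\quad F = 88.1$$

The regression equation for the initial training and test sets combined is:

$$\mathrm{pEC3} = 1.12(\pm 0.07)\,\Sigma\sigma^{*} + 0.42(\pm 0.04)\,\log P - 0.62(\pm 0.13) \tag{11}$$

$$n = 16,\quad r^2 = 0.952,\quad r^2_{\mathrm{adj}} = 0.945,\quad s = 0.12,\quad F = 129.6$$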

Two of the 1,3-dicarbonyl compounds of the further test set were classed as non-sensitisers (EC3 not reached at 40%). One of these is calculated to have EC3 = 59%; the other is calculated to have EC3 = 39%. For the remaining six 1,3-dicarbonyl compounds, the agreement between the pEC3 values calculated from equation (11) and the observed pEC3 values is good, as can be seen from table 2.

This QSAR (equation (11)) is derived from a mechanistic model, being based on the RAI principle. A diverse range of compounds, for which the only feature in common is the presence of an aliphatic carbonyl group capable of Schiff base formation, is well predicted by this QSAR.
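Equation (11) is straightforward to apply once Σσ* and logP are known. The following sketch evaluates it for three compounds of the initial test set, using the logP and Σσ* values of table 2 and comparing with the tabulated calculated pEC3; the function and variable names are illustrative and not part of the original analysis.

```python
def pec3_schiff_base(sum_sigma_star: float, log_p: float) -> float:
    """pEC3 from equation (11): 1.12 * sum(sigma*) + 0.42 * logP - 0.62."""
    return 1.12 * sum_sigma_star + 0.42 * log_p - 0.62


# name: (logP, sum sigma*, calculated pEC3 as tabulated in table 2)
test_set = {
    "Camphorquinone": (0.69, 1.62, 1.48),
    "Hexanal": (1.95, 0.26, 0.49),
    "Methyl pyruvate": (0.10, 2.00, 1.66),
}

for name, (log_p, sigma, tabulated) in test_set.items():
    calc = pec3_schiff_base(sigma, log_p)
    print(f"{name}: calculated pEC3 = {calc:.2f} (table 2: {tabulated:.2f})")
```

The same function applies to any compound assigned to the Schiff base domain, provided the two descriptors are obtained by the methods used for the training set.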


2.4 Application of the OECD principles to the Schiff base QSAR equation (11)

2.4.1 Defined endpoint (Principle 1). The endpoint, EC3, is clearly defined and is accepted as an appropriate endpoint for the LLNA (Dir 2004/73/EC, method B42).

2.4.2 Defined algorithm (Principle 2). This principle is met. The descriptors are calculated by methods which are well documented: Σσ* by the method of Perrin et al. [24] and logP by the method of Leo and Hansch [7].

2.4.3 Domain of applicability (Principle 3). The QSAR is applicable across the whole Schiff base former mechanistic applicability domain. To use it confidently it is of course necessary to be able to assign a compound to this applicability domain. This requires organic chemistry expertise. All aliphatic aldehydes and ketones belong to this domain, unless they have other functional groups enabling them to be stronger sensitisers via the reaction chemistry of other mechanistic applicability domains.

Table 2. Schiff base LLNA sensitisation data set.

| Compound | CAS | logP | Σσ* | pEC3obs | pEC3Calcd | LDF |
|---|---|---|---|---|---|---|
| Initial training set | | | | | | |
| Cyclamen aldehyde | 103-95-7 | 2.86 | 0.33 | 0.97 | | −0.20 |
| Glyoxal | 107-22-2 | −1.66 | 2.64 | 1.62 | | 2.43 |
| Hydroxycitronellal | 107-75-5 | 2.66* | 0.30 | 0.83 | | −1.05 |
| Undec-10-enal | 112-45-8 | 4.05 | 0.26 | 1.39 | | 3.07 |
| cis-6-Nonenal | 2277-19-2 | 2.99 | 0.24 | 0.78 | | −0.45 |
| Landolal | 31906-04-4 | 3.57* | 0.30 | 1.09 | | 1.84 |
| Lilial | 80-54-6 | 2.96 | 0.33 | 1.02 | | 0.12 |
| α-Methyl-phenyl-acetaldehyde | 93-53-8 | 1.79 | 0.86 | 1.33 | | 0.30 |
| Citral | 5392-40-5 | 2.35 | 0.68 | 1.06 | | 0.76 |
| Diethyl acetaldehyde | 97-96-1 | 1.10 | 0.30 | 0.17 | | −6.01 |
| Initial test set | | | | | | |
| Camphorquinone | 465-29-2 | 0.69 | 1.62 | 1.55 | 1.48 | 2.40 |
| Butan-2,3-dione | 431-03-8 | −1.22 | 1.81 | 0.79 | 0.89 | −2.28 |
| 1-Phenylpropane-1,2-dione | 579-07-7 | 1.02 | 2.20 | 2.30 | 2.27 | 7.72 |
| Furil | 492-94-4 | 1.42 | 0.50 | 0.80** | 0.54 | −3.52 |
| Hexanal | 66-25-1 | 1.95 | 0.26 | 0.35 | 0.49 | −3.61 |
| Methyl pyruvate | 600-22-6 | 0.10 | 2.00 | 1.63 | 1.66 | 3.32 |
| Further test set | | | | | | |
| PhCOCHMeCOMe | | 1.21 | 0.68 | 0.78 | 0.65 | −2.87 |
| PhCOCH2COBu(n) | | 3.05 | 0.63 | 1.30 | 1.37 | 2.62 |
| 2-Acetylcyclohexanone | 874-23-7 | 0.97 | 0.53 | *** | 0.38 | −4.73 |
| 1-(2′,5′-Diethylphenyl)-butane-1,3-dione | | 3.14 | 0.88 | 1.36 | 1.68 | 4.74 |
| 1-(2′,5′-Dimethylphenyl)-butane-1,3-dione | | 2.09 | 0.88 | 1.17 | 1.24 | 1.40 |
| 2,2,6,6-Tetramethyl-heptane-3,5-dione | 1118-71-4 | 2.45 | 0.32 | 0.84 | 0.77 | −1.57 |
| 1-(2′,3′,4′,5′-Tetramethyl-phenyl)-butane-1,3-dione | | 2.93 | 0.88 | 1.43 | 1.60 | 4.07 |
| 1-(3′,4′,5′-Trimethoxy-phenyl)-4-dimethyl-pentane-1,3-dione | | 2.03 | 0.58 | **** | 0.88 | −1.00 |

*The logP values are for the unsaturated aldehydes resulting from elimination of water, which is assumed to occur in vivo [21, 22].
**Furil was classified as a non-sensitiser [23], having failed to give a stimulation index of 3 or above at the concentrations tested (up to 25%). The pEC3 value given here is from a dose-response-QSAR plot for the first four compounds in the table, and corresponds to an extrapolated EC3 value of 30%.
***No EC3 recorded when tested up to 40%. Calculated EC3 = 59%.
****No EC3 recorded when tested up to 40%. Calculated EC3 = 39%.


2.4.4 Goodness of fit, robustness and predictivity (Principle 4). As stated previously, the QSAR equation is:

$$\mathrm{pEC3} = 1.12(\pm 0.07)\,\Sigma\sigma^{*} + 0.42(\pm 0.04)\,\log P - 0.62(\pm 0.13) \tag{11}$$

$$n = 16,\quad r^2 = 0.952,\quad r^2_{\mathrm{adj}} = 0.945,\quad s = 0.12,\quad F = 129.6$$

The compounds from which equation (11) was derived consist of 11 aldehydes, one α-ketoester and four α,β-diketones. In developing the QSAR, an initial regression equation for a training set of ten aldehydes was found to predict a test set consisting of the other six compounds. Equation (11) is for the combined training and test sets. The internal performance criterion (Principle 4a) is clearly met.

Predictions of equation (11) are in good agreement with experimental data for eight 1,3-diketones. One further 1,3-diketone, PhCOCH2COCF3, is significantly less potent than predicted by equation (11): this is rationalised in terms of its thermodynamically more stable keto-enol tautomer, PhCOCH=C(OH)CF3, being an aromatic ketone and hence relatively unreactive. Predictions of equation (11) are also consistent with the lack of evidence for sensitisation by simple aliphatic monoketones. Thus the external validation for predictivity criterion (Principle 4b) is met.

2.4.5 Mechanistic basis (Principle 5). The mechanistic basis for the QSAR is very clear. It is based on the RAI pharmacokinetic model relating sensitisation potential to protein binding reaction kinetics and partitioning. The protein binding reaction involves nucleophilic attack on the carbonyl group.

2.4.6 Overall assessment. The Schiff base QSAR meets all the criteria of the OECD principles and can be used to predict sensitisation potential of compounds which can be classified into the Schiff base mechanistic applicability domain. In view of the structural diversity within the sets of compounds considered here, the present findings confirm our view expressed previously [4, 22] that within the mechanistic applicability domain the differences in sensitisation potential are dependent solely on differences in chemical reactivity and partitioning.

2.4.7 Discriminant analysis. In section 2.1 we argued that if the parameters are mechanistically relevant, a function derived by linear discriminant analysis should be expected to give a reasonable correlation, although not necessarily the best possible based on the parameters used, when used as the independent variable in a linear regression analysis. To illustrate this point we applied linear discriminant analysis to the Schiff base electrophiles discussed in this section. We used MINITAB software ver. 14 (MINITAB Inc., State College, PA) to derive a linear discriminant function (LDF), using Σσ* and logP as the discriminating variables, to separate the 16 compounds of the training set (table 2, initial training set plus initial test set) into two bins: compounds with pEC3 > 1 and compounds with pEC3 < 1. We derived the following LDF:

$$\mathrm{LDF} = 7.36\,\Sigma\sigma^{*} + 3.18\,\log P - 11.72$$

Positive LDF values predict pEC3 > 1 and negative LDF values predict pEC3 < 1. All 16 compounds of the training set (table 2, initial training set plus initial test set) are correctly classified by this function (table 2, right hand column). Its robustness was further tested using the eight compounds listed under “further test set” in table 2. All of these compounds are correctly classified by LDF (table 2, right hand column).
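The discriminant rule can be sketched in a few lines. The example below applies the LDF quoted in the text to four compounds from table 2, chosen here because two of them lie close to the pEC3 = 1 boundary; the function names are illustrative and the code is not the MINITAB procedure used to derive the function.

```python
def ldf(sum_sigma_star: float, log_p: float) -> float:
    """Linear discriminant function quoted in the text."""
    return 7.36 * sum_sigma_star + 3.18 * log_p - 11.72


def predicts_pec3_above_1(sum_sigma_star: float, log_p: float) -> bool:
    """A positive LDF predicts pEC3 > 1; a negative LDF predicts pEC3 < 1."""
    return ldf(sum_sigma_star, log_p) > 0


# name: (logP, sum sigma*, observed pEC3), values from table 2
compounds = {
    "Camphorquinone": (0.69, 1.62, 1.55),
    "Hexanal": (1.95, 0.26, 0.35),
    "Lilial": (2.96, 0.33, 1.02),
    "Cyclamen aldehyde": (2.86, 0.33, 0.97),
}

for name, (log_p, sigma, pec3_obs) in compounds.items():
    predicted_high = predicts_pec3_above_1(sigma, log_p)
    correct = predicted_high == (pec3_obs > 1)
    print(f"{name}: LDF = {ldf(sigma, log_p):.2f}, correctly classified: {correct}")
```

Lilial and cyclamen aldehyde differ by only 0.05 in observed pEC3 and 0.10 in logP, yet fall on opposite sides of the boundary and are both assigned to the correct bin.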

We next carried out a linear regression analysis for pEC3 against LDF. The resulting regression equation is:

$$\mathrm{pEC3} = 0.15(\pm 0.01)\,\mathrm{LDF} + 0.16(\pm 0.04) \tag{12}$$

$$n = 16,\quad r^2 = 0.91,\quad r^2_{\mathrm{adj}} = 0.90,\quad s = 0.16,\quad F = 142.0$$

Although this equation is not as good as that obtained by conventional multiple linear regression analysis for pEC3 with Σσ* and logP as the independent variables (equation (11)), its statistical quality is high, and supports our contention that a robust discriminant function based on mechanistically relevant parameters can be expected to perform adequately when used as a regression parameter.

3. Conclusions

“Global QSARs” based on complex structure-derived parameters, covering several mechanistic domains, have become a growth area recently. Those examined in this article all use selections from the same limited set of published LLNA data. They are based ultimately on the false logic that “All A’s are B’s implies all B’s are A’s”. Although such logic is false, it can sometimes give quite a high predictive success rate (e.g., consider “all humans are tail-less mammals”: an alien observer who sees a tail-less mammal and on that basis classifies it as human will be right more often than not), and that is why success rates of the order of 80% are often reported. Because they lack a sound mechanistic basis and cross mechanistic domain boundaries, such QSARs are unlikely to give sufficiently reliable predictions for new compounds. The weaknesses of this approach are well demonstrated by the first two QSARs analysed here.

Within mechanistic applicability domains, clear chemistry-activity trends can be seen [4, 16], indicating that mechanistic model based “QSARs” (perhaps a better term would be quantitative mechanistic models, or QMMs, analogous to linear free energy relationships in organic chemistry) are achievable. The Schiff base QSAR analysed here, meeting all of the OECD criteria, is a good example. There is scope for integrating such quantitative correlations, together with mechanistic applicability domain classification facilities (guidelines for such classification have recently been published [25]), into expert systems based on the extensive literature on experimental mechanistic investigation of skin sensitisation. We see two barriers to further development:

The first barrier is that there are still some areas of mechanistic uncertainty (the large structural domain of aromatic hydroxy and amino compounds is one). These gaps can and should be filled by experimental chemistry programmes on compounds for which sensitisation data are already available, so as to arrive at rules for mechanistic applicability domain classification of currently difficult compounds and to develop quantitative principles for applying to compounds within those domains. In particular, the structural domain of aromatic hydroxy and amino compounds should receive attention.

The second barrier is the problem of adequately modelling the key chemical properties, particularly reactivity, in silico. Further work in this area is very desirable. However, it should not be forgotten that even when chemical properties cannot be modelled, they can be measured experimentally.

In the immediate short term, the best non-animal approach for predicting skin sensitisation potential is to use an expert system. This would assign compounds into mechanistic applicability domains and apply mechanism-based (Q)SARs specific for those domains, and, very importantly, recognise when a compound is outside its range of competence. In such situations, it would call for human expert input supported by experimental chemistry studies as necessary.

We do not envisage a role for global (Q)SAR approaches in such situations. Our view is that if the relevant chemistry cannot be discerned from inspection of structure, we could not put much reliance on a prediction from a global model. In such a situation, what is necessary is experimental determination of the chemistry, enabling the compound to be assigned to a mechanistic domain and its reactivity compared to other compounds, of known sensitisation potency, in its mechanistic domain.
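The expert-system workflow described in the two preceding paragraphs can be sketched as a simple dispatch: classify the compound into a mechanistic domain, apply the (Q)SAR registered for that domain, and otherwise refer the compound to a human expert rather than fall back on a global model. Everything in the sketch is an assumption made for illustration: the class and field names, the `"schiff_base"` key, the toy classifier that reads a pre-assigned domain label, and the reuse of the equation (12) coefficients as a stand-in domain model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Compound = Dict[str, object]


@dataclass
class Prediction:
    domain: Optional[str]
    pec3: Optional[float]
    needs_expert: bool
    note: str


class DomainExpertSystem:
    """Dispatch compounds to domain-specific models; never use a global fallback."""

    def __init__(
        self,
        classifier: Callable[[Compound], Optional[str]],
        models: Dict[str, Callable[[Compound], float]],
    ) -> None:
        self.classifier = classifier
        self.models = models

    def predict(self, compound: Compound) -> Prediction:
        domain = self.classifier(compound)
        if domain is None:
            # Chemistry cannot be discerned: outside the range of competence.
            return Prediction(None, None, True,
                              "domain not recognised; refer to expert and experimental chemistry")
        model = self.models.get(domain)
        if model is None:
            return Prediction(domain, None, True,
                              "no mechanism-based (Q)SAR available for this domain")
        return Prediction(domain, model(compound), False, "domain-specific (Q)SAR applied")


# Hypothetical set-up: the classifier simply reads a pre-assigned label, and the
# only registered model borrows the equation (12) coefficients as a placeholder.
system = DomainExpertSystem(
    classifier=lambda c: c.get("domain"),
    models={"schiff_base": lambda c: 0.15 * float(c["ldf"]) + 0.16},
)

in_domain = system.predict({"domain": "schiff_base", "ldf": 10.0})
unknown = system.predict({"domain": None})
print(in_domain)
print(unknown)
```

The design choice that matters is the absence of a default model: a compound that cannot be assigned to a domain yields `needs_expert=True` and no numerical prediction.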

In our experience, when we fail to correctly predict sensitisation potential (or lack of it) the failure results not from inadequacy of the chemistry-sensitisation relationship but from our failure to correctly predict the chemistry. In many cases the chemistry of a compound is obvious to an organic chemist from its structure and can be quantified to a greater or lesser extent by comparison with other known compounds (chemical read-across). In other cases it is difficult or impossible to predict the chemistry. The latter situation could be dealt with by appropriate laboratory chemistry studies, and we would argue that currently insufficient consideration is given to replacing animal laboratory testing by chemistry laboratory testing. However, some positive initiatives in this direction have recently been reported [5, 25–27].

Acknowledgement

The work presented here was carried out as part of an ECB project, JRC contract CCR.IHCP.C430412.X0: Validation of (quantitative) structure-activity relationships for toxicological endpoints of regulatory importance. Lot 3: QSARs for skin sensitisation. We also wish to acknowledge the insightful comments of two anonymous reviewers.

References

[1] EC. Off. J. Eur. Union, L66, 26 (2003).
[2] Organisation for Economic Co-operation and Development (OECD) (2004). Available online at: http://www.oecd.org/document/23/0,2340,en_2649_34365_33957015_1_1_1_1,00.html (accessed 11 May 2006).
[3] M. Divkovic, K. Pease, G.F. Gerberick, D.A. Basketter. Contact Dermatitis, 53, 189 (2005).
[4] A.O. Aptula, G.Y. Patlewicz, D.W. Roberts. Chem. Res. Toxicol., 18, 1420 (2005).
[5] D.W. Roberts, D.L. Williams. J. Theor. Biol., 99, 807 (1982).
[6] I. Kimber, D.A. Basketter, M. Butler, A. Gamer, J.L. Garrigue, G.F. Gerberick, C. Newsome, W. Steiling, H.W. Vohr. Food Chem. Toxicol., 41, 1799 (2003).
[7] C. Hansch, A.J. Leo. Substituent Constants for Correlation Analysis in Chemistry and Biology, John Wiley and Sons, New York (1979).
[8] J. Hine. Physical Organic Chemistry, 2nd Edn, p. 176, McGraw-Hill, New York (1962).
[9] G. Dupuis, C. Benezra. Allergic Contact Dermatitis to Simple Chemicals: A Molecular Approach, Dekker, New York (1982).
[10] E. Estrada, G. Patlewicz, M. Chamberlain, D. Basketter, S. Larbey. Chem. Res. Toxicol., 16, 1226 (2003).

[11] E. Estrada, G. Patlewicz, Y. Gutierrez. J. Chem. Inf. Comp. Sci., 44, 688 (2004).
[12] I. Kimber, R.J. Dearman, D.A. Basketter, C.A. Ryan, G.F. Gerberick. Contact Dermatitis, 47, 315 (2002).
[13] H. Konemann. Toxicology, 19, 209 (1981).
[14] G.F. Gerberick, C.A. Ryan, P.S. Kern, H. Schlatter, R.J. Dearman, I. Kimber, G.Y. Patlewicz, D.A. Basketter. Dermatitis, 16, 157 (2005).
[15] L.F. Fieser, M. Fieser. Advanced Organic Chemistry, Reinhold, New York (1961).
[16] D.W. Roberts, A.O. Aptula, G. Patlewicz. Chem. Res. Toxicol., 20, 44 (2007).
[17] M.D. Miller, D.M. Yourtee, A.G. Glaros, C.C. Chappelow, J.D. Eick, A.J. Holder. J. Chem. Inf. Model., 45, 924 (2005).
[18] D.W. Roberts, B.F.J. Goodwin, D.L. Williams, K. Jones, A.W. Johnson, J.C.E. Alderson. Food Chem. Toxicol., 21, 811 (1983).
[19] K. Landsteiner, J. Jacobs. J. Exp. Med., 64, 643 (1936).
[20] D.W. Roberts, A.O. Aptula, G. Patlewicz. Chem. Res. Toxicol., 19, 1228 (2006).
[21] G. Patlewicz, D.A. Basketter, C.K. Smith, S.A. Hotchkiss, D.W. Roberts. Contact Dermatitis, 44, 331 (2001).
[22] G. Patlewicz, D.W. Roberts, J.D. Walker. QSAR & Comb. Sci., 22, 196 (2003).
[23] D.W. Roberts, M. York, D.A. Basketter. Contact Dermatitis, 41, 14 (1999).
[24] D.D. Perrin, B. Dempsey, E.P. Serjeant. pKa Prediction for Organic Acids and Bases, Chapman and Hall, London (1981).
[25] A.O. Aptula, D.W. Roberts. Chem. Res. Toxicol., 19, 1097 (2006).
[26] G.F. Gerberick, C.A. Ryan, P.S. Kern, R.J. Dearman, I. Kimber, G.Y. Patlewicz, D.A. Basketter. Contact Dermatitis, 50, 274 (2004).
[27] A.O. Aptula, G. Patlewicz, D.W. Roberts, T.W. Schultz. Toxicol. in vitro, 20, 239 (2006).
