Endpoint       Alerts training set                              Global QSAR training set
               # Compounds (POS / NEG)   # Alerts (PPV range)   # Compounds (POS / NEG)   PPV / NPV*
Ames           9,697 (5,116 / 4,581)     50 (0.65-0.94)         2,825 (935 / 1,890)       0.72 / 0.78
ivtCA          1,928 (849 / 1,079)       13 (0.70-0.81)         1,389 (671 / 718)         0.67 / 0.62
ivvMN          556 (122 / 434)           7 (0.60-0.75)          540 (119 / 421)           0.60 / 0.65
Cleft palate   1,419 (83 / 1,336)        24 (0.03-0.78)         270 (81 / 189)            0.53 / 0.79
SkinSensHaz    602 (212 / 390)           5 (0.71-0.87)          602 (212 / 390)           0.67 / 0.69
Evaluation of the chemical inventories in the US FDA’s Office of Food Additive Safety for human health endpoints using a toxicity prediction system
Arvidson K1, Rathman J2,4, Volarath P1, Mostrag A2, Tarkhov A3, Bienfait B3, Vitcheva V2, Yang C2,3,4
1U.S. FDA CFSAN, College Park, MD; 2Altamira LLC, Columbus, OH, USA; 3Molecular Networks GmbH, Erlangen, Germany; 4Ohio State University, Columbus, OH, USA
Abstract #2467 / P533
U.S. FDA CFSAN INVENTORY
PREDICTION ACCURACY FOR BACTERIAL MUTAGENESIS MODEL
CONCLUSIONS AND FURTHER STEPS
Validation set: 115 common InChI keys (computational forms representing 157 CRS-IDs) with experimental data
VALIDATION OF BACTERIAL MUTAGENESIS MODEL
• Validation set: Of the 2,372 CRS-IDs, only 157 compounds (115 InChI keys) currently have data in CERES.
• The study source for the validation set was mostly U.S. FDA PAFA “C” studies, which are not used in the QSAR training set.
• QSAR training set: 33% POS; Validation set: 5% predicted POS (model not biased towards POS)
QSAR Training Set
• Subset of ToxGPS vetted for data quality and balancing structure space
• 2,825 InChI keys: 1,890 non-mutagenic and 935 mutagenic structures
U.S. FDA CFSAN inventory
• 2,372 CRS-IDs
• 2,275 InChI keys
Common InChI keys (validation set): 115
ToxGPS Knowledgebase
• Large collection of public Ames data
• Over 10,000 test substances
• 7,722 InChI keys (computational form)
Only 20 structures were common between the QSAR training set and the U.S. FDA CFSAN test list. Most of these compounds were non-mutagens.
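Counts such as the collapse of CRS-IDs into fewer InChI keys, or the number of structures shared between two sets, come down to grouping records by InChI key and intersecting the key sets. The following is a minimal sketch of that bookkeeping, not the CERES implementation; the record IDs, the keys, and the helper names `group_by_inchikey` and `common_keys` are hypothetical placeholders.

```python
from collections import defaultdict


def group_by_inchikey(records):
    """Map each InChI key to the set of record IDs that share it."""
    groups = defaultdict(set)
    for record_id, inchikey in records:
        groups[inchikey].add(record_id)
    return dict(groups)


def common_keys(records_a, records_b):
    """Return the InChI keys that occur in both record lists."""
    return set(group_by_inchikey(records_a)) & set(group_by_inchikey(records_b))


# Hypothetical placeholder data: (record ID, InChI key) pairs.
inventory = [
    ("CRS-1", "KEY-A"),
    ("CRS-2", "KEY-A"),  # two records, one computational form
    ("CRS-3", "KEY-B"),
    ("CRS-4", "KEY-C"),
]
training = [("T-1", "KEY-B"), ("T-2", "KEY-D")]

inventory_groups = group_by_inchikey(inventory)
shared = common_keys(inventory, training)

print(len(inventory), "records ->", len(inventory_groups), "unique InChI keys")
print("shared keys:", sorted(shared))
```

Several record IDs can map to one key, which is why a record count (CRS-IDs) can exceed the count of computational forms (InChI keys).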
              Exp. POS   Exp. NEG   Total
Pred. POS         6          2         8
Pred. NEG         3        145       147
Not Pred.         0          2         2
2 false positives (structure labels in figure): Quinoline; Michael acceptor
The Chemical Evaluation and Risk Estimation System (CERES) at the U.S. FDA’s Office of Food Additive Safety (OFAS) implemented the ChemTunes prediction system within the workflows for pre-market reviews and post-market monitoring of food ingredients and packaging materials. The present work demonstrates how post-market evaluation, based on a profiling analysis of the historical inventories, provides opportunities to enhance the current workflow by applying the advanced methods in the ChemTunes knowledgebase within CERES.
OFAS operates under the U.S. FDA’s Center for Food Safety and Applied Nutrition (CFSAN) to ensure the safety of all food additives and ingredients used in the U.S. The office comprises three divisions:
(1) DPR (Division of Petition Review): Premarket reviews of direct food and color additive petitions
(2) DFCN (Division of Food Contact Substance Notification Review): Reviews of food contact substances or indirect food additives (e.g., food packaging)
(3) DBGNR (Division of Biotechnology and GRAS Notice Review): Consultations with industry on bio-engineered food products and Generally Recognized as Safe substances
FDA PROGRAMS
ABSTRACT
CERES
• CERES is a substance-centric database that houses the OFAS’s food additive data and a cheminformatics platform for data analysis (Abstract #2617/P115) and toxicity prediction
• ChemTunes models are implemented in CERES for: genetic toxicity (Ames; in vitro chromosome aberrations (ivtCA); in vivo micronucleus (ivvMN)), tumorigenicity (mouse and rat), developmental toxicity (cleft palate), and skin sensitization
CERES MODELS FOR TOXICITY PREDICTION
CERES ChemTunes models are based on the ToxGPS knowledgebase of in vivo and in vitro toxicity data compiled from regulatory, primary, and literature sources. Predictions are based on chemotype alerts and mode-of-action (MoA) informed QSAR models, combined by a quantitative weight of evidence (WOE) method. Prediction accuracies were validated against experimental data. The domains of applicability of the models and chemotype alerts were systematically addressed.
Structure selection
• Total structures: 10,884
• Removed IOM(s), natural products, mixtures, polymers, and other ill-defined compounds
Structure QC
• All structures reviewed by chemists using CORINA CLEAN workflow in CORINA Symphony (Molecular Networks)
• Processing options: remove small fragments, neutralize, generate 3-D structure, flag duplicates for user decision
Final computational form used for predictions:
• 2,372 unique CRS-IDs
• 2,275 InChI keys
Predicted endpoint profile of inventories
• Concordance: 95% • Sensitivity: 67% • Specificity: 97%
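Concordance, sensitivity, and specificity are conventionally derived from the four cells of a predicted-versus-experimental table. The sketch below applies the standard definitions to the cell counts printed in this poster (6, 2, 3, 145). It assumes that compounds without a prediction are simply excluded; the poster does not state how those compounds or rounding were handled, so values computed this way need not match the reported percentages exactly. The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard 2x2 classification metrics.

    Assumption: compounds without a prediction are left out entirely.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),     # experimental positives predicted positive
        "specificity": tn / (tn + fp),     # experimental negatives predicted negative
        "concordance": (tp + tn) / total,  # overall agreement
        "ppv": tp / (tp + fp),             # predicted positives that are truly positive
        "npv": tn / (tn + fn),             # predicted negatives that are truly negative
    }


# Cell counts as printed in the poster's table.
metrics = binary_metrics(tp=6, fp=2, fn=3, tn=145)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With only nine experimental positives, a single reclassified compound shifts sensitivity by about 11 percentage points, which illustrates why the estimate is fragile at low %POS.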
INVENTORY PROFILING OF CHEMICAL SPACE – PCA projection
PCs projections based on ToxPrint Chemotypes space (www.toxprint.org). High loading ToxPrint Chemotypes include: carboxylic esters; alcohols; sulfhydrides; sulfonates; alkanes: branched, cyclic, oxy-, and linear C2-C16; alkenes: branched; aromatic rings with heteroatoms (PC1); carboxylic acids; alkanes: linear C2-C4 and aromatic; alkenes: linear; aromatic rings: benzene, phenyl (PC2); Michael acceptors; aromatic amines; alkenes: linear and aromatic (styrene); aromatic rings: benzene, phenyl, biphenyl (PC3).
Principal Components (PCs) projections based on Corina Symphony physical-chemical properties space. High loading physicochemical properties include: number of atoms, bonds, and rotatable bonds, MW, complexity, McGowan volume, polarizability (PC1), # H-donors, TPSA, XLogP (PC2), and ring complexity (PC3).
• The advanced methods in the ToxGPS knowledgebase within CERES are used to profile historical inventories.
• Fewer than 5% of the structures in the combined inventory are predicted positive in more than one mutagenicity or clastogenicity endpoint.
• Prioritize harvesting of CFSAN regulatory submissions to increase availability of toxicity data in CERES
IOM: inorganic, organometallic, metal complexes and metals
Final counts and overlap
Food Ingredient Inventory                              # REC    # Structure
EAFUS: Everything Added to Food in the U.S.            3,968    2,443
FCS: Food Contact Substances                           1,155      391
FEMA: Flavor and Extract Manufacturers Association     2,758    1,742
GRAS: Generally Recognized As Safe                       572       40
INDIRECT: Indirect Food Additives                      3,237    1,790
PAFA: Priority-based Assessment of Food Additives      7,202    4,341
SCOGS: Select Committee on GRAS Substances               373      137
TOTAL                                                 19,265   10,884
Physicochemical properties ToxPrint chemotypes
bond:C(=O)O_carboxylicAcidEster_generic
bond:C=O_aldehyde_generic
bond:CC(=O)C_ketone_generic
bond:CN_amine_aromatic_generic
bond:CN_amine_generic
bond:CN_amine_pri-NH2_aromatic
bond:COC_ether_aliphatic
bond:COH_alcohol_aromatic_phenol
bond:COH_alcohol_generic
bond:CS_halide_aliphatic
bond:CX_halide_aromatic-X_generic
bond:CS_sulfide
bond:N=N_azo_generic
bond:NC=O_urea_generic
bond:S(=O)O_sulfonate
chain:alkaneLinear_dodedyl_C12 (>=12)
chain:alkaneLinear_hexadecyl_C16 (>=C16)
chain:alkaneLinear_octyl_C8
group:carbohydrate
ring:hetero_[5]_O_furan_oxolane
ring:hetero_[5]_Z_1_3-Z
ring:hetero_[6]_N_pyridine_generic
ring:hetero_[6]_O_pyran_generic
Disclaimer: References to ChemTunes or CORINA Symphony do not constitute endorsement by the U.S. FDA.
CHEMICAL SPACE COMPARISON OF MODEL AND FDA INVENTORY
• High specificity indicates accurate classification of negatives
• Due to the low %POS in the validation set, the sensitivity cannot be reliably estimated. However, the low number of false positives shows that the model is not biased towards POS, even though the %POS in the training set (33%) was higher than that of the validation set.
Ratio of the % frequency
ToxPrint Name   QSAR set   FDA Inventory
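A ratio of % frequencies compares how often a chemotype occurs in one set relative to another. The sketch below is one way such a comparison could be computed, assuming each structure is represented by the set of ToxPrint names it matches. The toy fingerprints are hypothetical, and the direction of the ratio (QSAR set over FDA inventory) is an assumption, since the poster does not state it.

```python
from collections import Counter


def percent_frequency(fingerprints):
    """Percent of structures in which each chemotype occurs at least once."""
    counts = Counter(name for fp in fingerprints for name in fp)
    n = len(fingerprints)
    return {name: 100.0 * count / n for name, count in counts.items()}


def frequency_ratio(set_a, set_b):
    """Ratio of % frequency in set_a to % frequency in set_b, per chemotype."""
    freq_a = percent_frequency(set_a)
    freq_b = percent_frequency(set_b)
    ratios = {}
    for name in sorted(set(freq_a) | set(freq_b)):
        a = freq_a.get(name, 0.0)
        b = freq_b.get(name, 0.0)
        # The ratio is undefined when the chemotype is absent from set_b.
        ratios[name] = a / b if b > 0 else None
    return ratios


# Hypothetical toy data: one set of ToxPrint names per structure.
qsar_set = [
    {"bond:N=N_azo_generic"},
    {"bond:N=N_azo_generic", "bond:COH_alcohol_generic"},
    set(),
    {"bond:COH_alcohol_generic"},
]
fda_inventory = [
    {"bond:COH_alcohol_generic"},
    {"bond:COH_alcohol_generic"},
    {"bond:N=N_azo_generic"},
    set(),
]

ratios = frequency_ratio(qsar_set, fda_inventory)
for name, ratio in ratios.items():
    print(name, ratio)
```

Under this convention, a ratio above 1 marks a chemotype that is over-represented in the first set, and a ratio below 1 marks one that is more common in the second.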