Abstract
Some drugs and xenobiotics can disturb homeostasis, normal growth, differentiation, development or behavior during prenatal development or postnatally until puberty. Assessment of developmental toxicity is one of the important safety considerations applied by international regulatory agencies. In this investigation, seven machine learning methods (naïve Bayes, support vector machine, recursive partitioning, k-nearest neighbor, C4.5 decision tree, random forest and AdaBoost) were used to build binary classification models for developmental toxicity. Among these models, the naïve Bayes classifier showed the best predictive performance and stability, giving 91.11% overall prediction accuracy, 91.50% balanced accuracy and a Matthews correlation coefficient (MCC) of 0.818 for the training set, and 83.93% concordance, 81.85% balanced accuracy and an MCC of 0.627 for the test set. The applicability domain was analyzed, and only one chemical in the test set was identified as lying outside it. In addition, 10 important molecular descriptors related to developmental toxicity were selected by a genetic algorithm, which may help explain the mechanisms of developmental toxicants. The best naïve Bayes classification model could be employed as an alternative method for qualitative prediction of chemical-induced developmental toxicity in the early stages of drug development.
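The three statistics quoted in the abstract (overall accuracy, balanced accuracy and MCC) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch under the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study's data.

```python
import math


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, balanced accuracy and MCC from confusion-matrix counts.

    tp/fn count toxicants predicted toxic/non-toxic;
    tn/fp count non-toxicants predicted non-toxic/toxic.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total

    # Sensitivity and specificity are the per-class recall values.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (sensitivity + specificity) / 2

    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's confusion matrix).
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Balanced accuracy and MCC are less sensitive to class imbalance than overall accuracy, which may be why all three are reported side by side.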
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant nos. 81660589 and 81903543).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Zhang, H., Mao, J., Qi, HZ. et al. In silico prediction of drug-induced developmental toxicity by using machine learning approaches. Mol Divers 24, 1281–1290 (2020). https://doi.org/10.1007/s11030-019-09991-y
DOI: https://doi.org/10.1007/s11030-019-09991-y