QSPR with extended topochemical atom (ETA) indices. 4. Modeling aqueous solubility of drug like molecules and agrochemicals following OECD guidelines

  • Original Research
Structural Chemistry

Abstract

Aqueous solubility is a property of utmost interest for predicting the behavior of chemical compounds inside the body, since water is the most ubiquitous component of any living cell. Predictive quantitative structure–property relationship (QSPR) models of aqueous solubility seek to identify the essential chemical information of molecules that controls their ability to dissolve. Given the importance of solubility in controlling the absorption, distribution, metabolism, excretion, and toxicity properties of drugs and related chemicals, predictive models of aqueous solubility were developed following OECD guidelines for a large set (N = 565) of diverse drugs, drug-like compounds, and agrochemicals using extended topochemical atom (ETA) indices and suitable chemometric tools. Because hydrophobicity plays a primary role in the solubilization of structurally complex and crystalline organic compounds, the computed lipophilicity parameter ClogP was also used. Models were also developed using various non-ETA descriptors, and a further attempt was made to build models combining ETA, non-ETA, and ClogP parameters. All models were subjected to rigorous statistical validation using multiple strategies, and encouraging results were obtained for internal, external, and overall validation. A comparative analysis on the prediction set (test set) between the general solubility equation and the best model developed here with ETA and ClogP parameters demonstrated the better predictive potential of the latter.
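The comparison with the general solubility equation (GSE) can be illustrated with a minimal sketch. The abstract does not state the equation or the validation metric used, so two things here are assumptions: the commonly cited GSE form, log S = 0.5 − 0.01(MP − 25) − log Kow (with S in mol/L and MP in °C), and an external predictive R² computed against the training-set mean. The function names `gse_log_s` and `r2_pred` are hypothetical, and using ClogP in place of log Kow is an illustrative choice rather than the authors' documented procedure.

```python
def gse_log_s(mp_celsius: float, log_kow: float) -> float:
    """Estimate log S (mol/L) with the commonly cited GSE form (assumed here).

    log S = 0.5 - 0.01 * (MP - 25) - log Kow
    The melting-point term is set to zero for compounds that melt
    at or below 25 degrees C (liquids at room temperature).
    """
    mp_term = 0.01 * max(mp_celsius - 25.0, 0.0)
    return 0.5 - mp_term - log_kow


def r2_pred(observed, predicted, train_mean):
    """External predictive R^2 for a test set (assumed metric).

    1 - sum((obs - pred)^2) / sum((obs - train_mean)^2),
    where train_mean is the mean response of the training set.
    """
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - train_mean) ** 2 for o in observed)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    # Hypothetical compound: melting point 125 C, ClogP used as log Kow = 2.0
    print(gse_log_s(125.0, 2.0))
    # Hypothetical test-set values, for illustration only
    print(r2_pred([1.0, 2.0, 3.0], [1.1, 1.9, 3.2], train_mean=2.0))
```

Computing the same metric on the test set for both the GSE estimates and a fitted QSPR model gives a like-for-like basis for the kind of comparison described above.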

Acknowledgments

Financial assistance from the Council of Scientific and Industrial Research, Government of India, New Delhi in the form of a fellowship to R.N.D. is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Kunal Roy.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 222 kb)

About this article

Cite this article

Das, R.N., Roy, K. QSPR with extended topochemical atom (ETA) indices. 4. Modeling aqueous solubility of drug like molecules and agrochemicals following OECD guidelines. Struct Chem 24, 303–331 (2013). https://doi.org/10.1007/s11224-012-0080-5

  • DOI: https://doi.org/10.1007/s11224-012-0080-5
