Machine Learning and Scoring Functions (SFs) for Molecular Drug Discovery: Prediction and Characterisation of Druggable Drugs and Targets

I. L. Hudson, S. Y. Leemaqz, A. D. Abell

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review


Predicting druggability and prioritising disease-modifying targets are critical in drug discovery. In this chapter, we describe the testing of a druggability rule based on 9 molecular parameters, which uses cutpoints for each molecular parameter and targets based on mixture clustering discriminant analysis. We demonstrate that principal component constructs and score functions of violations can be used to identify the hidden pattern of druggable molecules and disease targets. Random Forest and Artificial Neural Network rules that classify the high-score targets from the low-score molecular violators, based both on molecular parameters and on the principal component constructs, confirmed the value of including logD in the scoring function. Our scoring functions of counts of violations and novel principal component analytic molecular and target-based constructs partitioned chemospace well, identifying both good and poor druggable molecules and targets. Viable molecules and targets were located in both the beyond Rule of 5 and expanded Rule of 5 regions. Random Forests and Artificial Neural Networks showed different variable importance profiles, with Artificial Neural Network models performing better than Random Forests. According to the Random Forest methods, the molecular descriptors that most influence classification were MW, NATOM, logD, and PSA. The optimal Artificial Neural Network target models indicated that PSA and logD were more important than the traditional parameter MW. Overall, our score 4 partitions using logD were optimal at classification, as shown in all Random Forest and Artificial Neural Network analyses.
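The idea of a scoring function built from counts of violations can be illustrated with a minimal sketch. The abstract gives neither the chapter's nine parameters nor their cutpoints, so everything below is an assumption made for illustration: the descriptor names, the dictionary `CUTPOINTS`, the function `violation_score`, and the two example molecules are hypothetical. The MW, HBD, and HBA limits follow the classic Rule of 5, the PSA limit is the commonly cited 140 Å², and the logD limit is a placeholder.

```python
# Illustrative upper cutpoints only; these are NOT the chapter's values.
CUTPOINTS = {
    "MW": 500.0,   # molecular weight (Rule of 5)
    "HBD": 5,      # hydrogen-bond donors (Rule of 5)
    "HBA": 10,     # hydrogen-bond acceptors (Rule of 5)
    "PSA": 140.0,  # polar surface area in square angstroms (commonly cited limit)
    "logD": 5.0,   # placeholder assumption, not taken from the chapter
}


def violation_score(descriptors, cutpoints=CUTPOINTS):
    """Count how many descriptors exceed their upper cutpoint."""
    return sum(
        1 for name, limit in cutpoints.items() if descriptors[name] > limit
    )


# Two made-up molecules, described only by their descriptor values.
small = {"MW": 320.4, "HBD": 2, "HBA": 5, "PSA": 78.0, "logD": 2.1}
large = {"MW": 812.0, "HBD": 7, "HBA": 14, "PSA": 210.0, "logD": 3.5}

print(violation_score(small))
print(violation_score(large))
```

Under these assumed cutpoints the first molecule violates none of the limits and the second violates four (MW, HBD, HBA, and PSA). A score of this kind could then serve as the class label that a classifier such as a Random Forest is trained to predict from the descriptors.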

Original language: English
Title of host publication: Machine Learning in Chemistry
Subtitle of host publication: The Impact of Artificial Intelligence
Editors: Hugh M. Cartwright
Place of Publication: London
Publisher: Royal Society of Chemistry
Number of pages: 29
ISBN (Electronic): 9781839160233, 9781839160240
ISBN (Print): 9781788017893
Publication status: Published - 2020
Externally published: Yes

Publication series

Name: RSC Theoretical and Computational Chemistry Series
ISSN (Print): 2041-3181
ISSN (Electronic): 2041-319X


  • Machine Learning
  • Chemistry
  • drug discovery
  • druggability

