Abstract

A probabilistic support vector machine (SVM) in combination with ECFP_4 (Extended Connectivity Fingerprints) was applied to establish a druglikeness filter for molecules. The World Drug Index (WDI) and the Available Chemicals Directory (ACD) were used as surrogates for druglike and nondruglike molecules, respectively. Compared with published methods using the same data sets, the classifier significantly improved the prediction accuracy, especially when using a larger data set of 341 601 compounds, which further pushed the correct classification rate up to 92.73%. In addition, the most characteristic features of drugs and nondrugs found by the current method were visualized; these might be useful as guiding fragments for de novo drug design and fragment-based drug design.
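The abstract does not state how the SVM outputs are converted into probabilities. One common approach is Platt scaling, which fits a sigmoid to the SVM decision values. The following is a minimal sketch under that assumption, not the authors' implementation: the decision values and labels are invented toy data, `fit_platt` is a hypothetical helper, and the sigmoid is written as `sigmoid(a*f + b)`, which is equivalent to Platt's form up to the sign of the parameters.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p(druglike | f) = sigmoid(a*f + b) by gradient descent on log loss.

    Hypothetical helper for illustration only.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            err = sigmoid(a * f + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * f
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Toy SVM decision values (assumed, not from the paper).
# Label 1 stands for a druglike (WDI-like) molecule, 0 for a nondruglike (ACD-like) one.
scores = [-2.1, -1.4, -0.8, -0.3, 0.2, -0.1, 0.4, 0.9, 1.5, 2.3]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = [sigmoid(a * f + b) for f in scores]

# Classify as druglike when the estimated probability is at least 0.5,
# then compute the fraction of correctly classified molecules.
predicted = [1 if p >= 0.5 else 0 for p in probs]
rate = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(f"a={a:.3f}, b={b:.3f}, correct classification rate={rate:.2f}")
```

A probability output of this kind lets a filter rank molecules by estimated druglikeness instead of only assigning a hard class label.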

Keywords

Support vector machine, Probabilistic logic, Classifier (UML), Computer science, Artificial intelligence, Probabilistic classification, Directory, Pattern recognition (psychology), Data mining, Kernel (algebra), Data set, Machine learning, Mathematics, Naive Bayes classifier

Publication Info

Year: 2007
Type: article
Volume: 47
Issue: 5
Pages: 1776-1786
Citations: 49
Access: Closed

Citation Metrics

49 (OpenAlex)

Cite This

Qingliang Li, Andreas Bender, Jianfeng Pei et al. (2007). A Large Descriptor Set and a Probabilistic Kernel-Based Classifier Significantly Improve Druglikeness Classification. Journal of Chemical Information and Modeling, 47(5), 1776-1786. https://doi.org/10.1021/ci700107y

Identifiers

DOI: 10.1021/ci700107y