Abstract
A probabilistic support vector machine (SVM) in combination with ECFP_4 (Extended Connectivity Fingerprints) was applied to establish a druglikeness filter for molecules. Here, the World Drug Index (WDI) and the Available Chemicals Directory (ACD) were used as surrogates for druglike and nondruglike molecules, respectively. Compared with published methods using the same data sets, the classifier significantly improved the prediction accuracy, especially when using a larger data set of 341 601 compounds, which further pushed the correct classification rate up to 92.73%. In addition, the most characteristic features for drugs and nondrugs found by the current method were visualized, which might be useful as guiding fragments for de novo drug design and fragment-based drug design.
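As a rough illustration of what a probabilistic SVM filter over fingerprint features could look like, the sketch below scores a molecule from the set of fingerprint features it contains and maps that score to a probability with a Platt-style sigmoid. It is not the authors' implementation: the integer feature IDs, the weights, the bias, the sigmoid parameters `a` and `b`, and the 0.5 threshold are all hypothetical, and a linear decision function is assumed only for simplicity, since the abstract does not state the kernel.

```python
import math

def decision_value(features, weights, bias):
    """Linear SVM-style score: sum the weights of the features present, plus a bias."""
    return sum(weights.get(f, 0.0) for f in features) + bias

def platt_probability(score, a, b):
    """Platt-style sigmoid that maps a raw SVM score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def is_druglike(features, weights, bias, a=-1.0, b=0.0, threshold=0.5):
    """Return (passes_filter, probability) for one molecule's feature set."""
    p = platt_probability(decision_value(features, weights, bias), a, b)
    return p >= threshold, p

# Hypothetical weights keyed by made-up integer IDs that stand in for
# ECFP_4 feature identifiers; real values would come from training.
toy_weights = {101: 1.2, 202: -0.8, 303: 0.5}

molecule_a = {101, 303}  # contains only positively weighted features
molecule_b = {202}       # contains only a negatively weighted feature

passes_a, prob_a = is_druglike(molecule_a, toy_weights, bias=0.0)
passes_b, prob_b = is_druglike(molecule_b, toy_weights, bias=0.0)
print(passes_a, round(prob_a, 3))
print(passes_b, round(prob_b, 3))
```

In a linear sketch like this one, the sign and size of each weight indicate whether a feature pushes a molecule toward the druglike or the nondruglike class, which is one simple way to think about ranking characteristic fragments. In practice the fingerprints would be generated by a cheminformatics toolkit and the model fitted on labeled data rather than written by hand.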
Publication Info
- Year: 2007
- Type: article
- Volume: 47
- Issue: 5
- Pages: 1776-1786
- Citations: 49
- Access: Closed
Identifiers
- DOI: 10.1021/ci700107y