Abstract

Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research.
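
The distinction between raw performance and efficiency (value per interaction) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the normalization below (performance per 10,000 interactions) is an assumption. The `NetworkScore` type, the network names, and all numbers are hypothetical placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkScore:
    name: str
    performance: float   # hypothetical recovery score, higher is better
    n_interactions: int  # number of edges in the network


def efficiency(score: NetworkScore, per: int = 10_000) -> float:
    """Performance per `per` interactions (an assumed normalization)."""
    if score.n_interactions <= 0:
        raise ValueError("network must have at least one interaction")
    return score.performance * per / score.n_interactions


def rank_by(scores, key):
    """Return network names sorted from best to worst under `key`."""
    return [s.name for s in sorted(scores, key=key, reverse=True)]


# Placeholder values for illustration only; not results from the paper.
toy = [
    NetworkScore("large_net", performance=8.0, n_interactions=4_000_000),
    NetworkScore("medium_net", performance=5.0, n_interactions=500_000),
    NetworkScore("small_net", performance=2.0, n_interactions=20_000),
]

by_performance = rank_by(toy, key=lambda s: s.performance)
by_efficiency = rank_by(toy, key=efficiency)
```

With these toy inputs the two rankings are reversed: the largest network leads on raw performance, while the smallest leads once performance is divided by size. This mirrors the pattern the abstract describes, where large networks perform best overall but a small network is the most efficient.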

Keywords

False positive paradox; Biological network; Computational biology; Biology; Gene regulatory network; Genome; Computer science; Gene; Selection (genetic algorithm); Genetics; Machine learning; Gene expression

MeSH Terms

Algorithms; Computational Biology; Gene Regulatory Networks; Genetic Predisposition to Disease; Genome, Human; Humans

Publication Info

Year: 2018
Type: article
Volume: 6
Issue: 4
Pages: 484-495.e5
Citations: 302
Access: Closed

Citation Metrics

OpenAlex: 302
Influential: 14
CrossRef: 251

Cite This

Justin K. Huang, Daniel E. Carlin, Michael Yu et al. (2018). Systematic Evaluation of Molecular Networks for Discovery of Disease Genes. Cell Systems, 6(4), 484-495.e5. https://doi.org/10.1016/j.cels.2018.03.001

Identifiers

DOI
10.1016/j.cels.2018.03.001
PMID
29605183
PMCID
PMC5920724
