Abstract

The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)- and Continuous Automated Model Evaluation (CAMEO)-derived sets, the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the “ideality” of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
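The abstract does not define the interresidue geometry the network predicts, so the following is only a minimal sketch under stated assumptions. It supposes each residue pair is described by a Cβ-Cβ distance d, two dihedrals (ω about the Cβ-Cβ axis, θ about the Cα-Cβ bond of the first residue), and one planar angle φ. The function names and the toy coordinates are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) about the p1-p2 axis."""
    b0 = p0 - p1
    b1 = (p2 - p1) / np.linalg.norm(p2 - p1)
    b2 = p3 - p2
    # Project b0 and b2 onto the plane perpendicular to the axis b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    return np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w))

def planar_angle(a, b, c):
    """Angle (radians) at vertex b between the rays b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def pair_geometry(n1, ca1, cb1, ca2, cb2):
    """Assumed parameterization of residue 1 relative to residue 2."""
    d = np.linalg.norm(cb2 - cb1)            # Cb-Cb distance
    omega = dihedral(ca1, cb1, cb2, ca2)     # Ca1-Cb1-Cb2-Ca2
    theta = dihedral(n1, ca1, cb1, cb2)      # N1-Ca1-Cb1-Cb2
    phi = planar_angle(ca1, cb1, cb2)        # Ca1-Cb1-Cb2
    return d, omega, theta, phi

# Toy coordinates (not from any real structure), for illustration only.
n1 = np.array([1.0, 1.0, 0.0])
ca1 = np.array([1.0, 0.0, 0.0])
cb1 = np.array([0.0, 0.0, 0.0])
cb2 = np.array([0.0, 0.0, 1.0])
ca2 = np.array([0.0, 1.0, 1.0])
d, omega, theta, phi = pair_geometry(n1, ca1, cb1, ca2, cb2)
```

Under this assumed parameterization, d and ω are the same when the two residues are swapped, while θ and φ depend on which residue is taken as the reference, so a full description of a pair would evaluate `pair_geometry` in both directions.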

Keywords

Benchmark (surveying); Protein structure prediction; Computer science; Residual; Artificial intelligence; Network structure; Range (aeronautics); Measure (data warehouse); Deep learning; Protein structure; Minification; Data mining; Machine learning; Algorithm; Engineering; Chemistry

MeSH Terms

Animals; Deep Learning; Humans; Protein Conformation; Sequence Analysis, Protein; Software

Publication Info

Year: 2020
Type: article
Volume: 117
Issue: 3
Pages: 1496-1503
Citations: 1491
Access: Closed

Citation Metrics

OpenAlex: 1491
Influential: 89

Cite This

Jianyi Yang, Ivan Anishchenko, Hahnbeom Park et al. (2020). Improved protein structure prediction using predicted interresidue orientations. Proceedings of the National Academy of Sciences, 117(3), 1496-1503. https://doi.org/10.1073/pnas.1914677117

Identifiers

DOI: 10.1073/pnas.1914677117
PMID: 31896580
PMCID: PMC6983395