Abstract

Accurate prediction of RNA three-dimensional (3D) structure remains an unsolved challenge. Determining RNA 3D structures is crucial for understanding their functions and informing RNA-targeting drug development and synthetic biology design. The structural flexibility of RNA, which leads to scarcity of experimentally determined data, complicates computational prediction efforts. Here, we present RhoFold+, an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences. By integrating an RNA language model pre-trained on ~23.7 million RNA sequences and leveraging techniques to address data scarcity, RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction. Retrospective evaluations on RNA-Puzzles and CASP15 natural RNA targets demonstrate RhoFold+'s superiority over existing methods, including human expert groups. Its efficacy and generalizability are further validated through cross-family and cross-type assessments, as well as time-censored benchmarks. Additionally, RhoFold+ predicts RNA secondary structures and inter-helical angles, providing empirically verifiable features that broaden its applicability to RNA structure and function studies.

Keywords

RNA, Deep learning, Computer science, Nucleic acid structure, Artificial intelligence, Pipeline (software), Nucleic acid secondary structure, Algorithm, Protein secondary structure, Computational biology, Biology, Genetics

Publication Info

Year: 2022
Type: preprint
Citations: 57
Access: Closed

Citation Metrics

57 (OpenAlex)

Cite This

Tao Shen, Zhigang Hu, Zhangzhi Peng et al. (2022). Accurate RNA 3D structure prediction using a language model-based deep learning approach. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2207.01586

Identifiers

DOI: 10.48550/arxiv.2207.01586