AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences

2023 Nucleic Acids Research 1,411 citations

Abstract

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.
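
The programmatic access mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the endpoints, so the base path and the /prediction/{accession} route below are assumptions based on the public AlphaFold DB API and should be checked against the current API documentation. The helper names prediction_url and fetch_prediction are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: base path of the AlphaFold DB REST API (not stated in the abstract).
API_BASE = "https://alphafold.ebi.ac.uk/api"


def prediction_url(accession: str) -> str:
    """Build the assumed prediction-metadata URL for a UniProt accession."""
    acc = accession.strip().upper()
    # Simplification: accept plain alphanumeric accessions only (e.g. P69905).
    if not acc.isalnum():
        raise ValueError(f"not a plain UniProt accession: {accession!r}")
    return f"{API_BASE}/prediction/{acc}"


def fetch_prediction(accession: str, timeout: float = 30.0):
    """Download and decode the JSON metadata for one accession (needs network)."""
    with urllib.request.urlopen(prediction_url(accession), timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_prediction("P69905") requires network access and should return the decoded JSON for that entry, provided the assumed route is correct. For bulk downloads, the FTP area and the Google Cloud Public Datasets named in the abstract are the more appropriate channels.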

Keywords

Biology, Protein structure, Computational biology, Database, Genetics, Biochemistry, Computer science

Publication Info

Year: 2023
Type: article
Volume: 52
Issue: D1
Pages: D368-D375
Citations: 1411
Access: Closed

Citation Metrics

1411 (OpenAlex)

Cite This

Mihály Váradi, Damian Bertoni, Paulyna Magaña et al. (2023). AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Research, 52(D1), D368-D375. https://doi.org/10.1093/nar/gkad1011

Identifiers

DOI: 10.1093/nar/gkad1011