UniProt archive

Abstract

Summary: The UniProt Archive (UniParc) is the most comprehensive non-redundant protein sequence database available. Its protein sequences are retrieved from the predominant publicly accessible resources. All new and updated protein sequences are collected and loaded into UniParc daily to ensure full coverage. To avoid redundancy, each unique sequence is stored only once and assigned a stable protein identifier, which can later be used in UniParc to identify the same protein in all source databases. When proteins are loaded into the database, cross-references are created that link each sequence to the source records it came from. As a result, a sequence search against UniParc is equivalent to the same search against all databases cross-referenced by UniParc. UniParc contains only protein sequences and database cross-references; all other information must be retrieved from the source databases.

Availability: http://www.ebi.ac.uk/uniparc/
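The storage model described in the abstract (one record per unique sequence, a stable identifier, and cross-references back to each source) can be illustrated with a minimal sketch. This is not UniParc's implementation: the class names, the `SEQ` identifier scheme, the normalization rule, and the source names and accessions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    """One unique sequence plus links back to every source record."""
    identifier: str
    sequence: str
    cross_references: set = field(default_factory=set)


class SequenceArchive:
    """Toy non-redundant archive: one entry per distinct sequence."""

    def __init__(self):
        self._by_sequence = {}    # normalized sequence -> ArchiveEntry
        self._by_identifier = {}  # identifier -> ArchiveEntry

    def load(self, sequence, source_db, accession):
        """Record a source entry and return the archive identifier."""
        # Assumption: two sequences are "the same" if they are equal
        # after removing whitespace and ignoring case.
        normalized = "".join(sequence.split()).upper()
        entry = self._by_sequence.get(normalized)
        if entry is None:
            # Hypothetical identifier scheme: a zero-padded counter.
            # Entries are never deleted, so identifiers are never reused.
            identifier = f"SEQ{len(self._by_sequence) + 1:010d}"
            entry = ArchiveEntry(identifier, normalized)
            self._by_sequence[normalized] = entry
            self._by_identifier[identifier] = entry
        # Every load adds a cross-reference, even for a known sequence.
        entry.cross_references.add((source_db, accession))
        return entry.identifier

    def sources(self, identifier):
        """List the (source database, accession) pairs for an entry."""
        return sorted(self._by_identifier[identifier].cross_references)

    def __len__(self):
        return len(self._by_sequence)


if __name__ == "__main__":
    archive = SequenceArchive()
    first = archive.load("MKTAYIAK", "SourceA", "A0001")
    second = archive.load("mktayiak\n", "SourceB", "B0417")
    print(first == second)         # same sequence, same identifier
    print(archive.sources(first))  # both source records are linked
```

In this sketch, loading the same sequence from two sources yields a single entry with two cross-references, which is why a search over the archive's sequences can reach every source record that carries a matching sequence. Annotation is deliberately absent, mirroring the abstract's statement that everything other than sequences and cross-references stays in the source databases.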

Keywords

UniProt, Identifier, Sequence database, Computer science, Database, Protein sequencing, Redundancy (engineering), Sequence (biology), Information retrieval, Data mining, Peptide sequence, Biology, Programming language

Related Publications

The Jpred 3 secondary structure prediction server

Christian Cole, Jonathan D. Barber, Geoffrey J. Barton

Jpred (http://www.compbio.dundee.ac.uk/jpred) is a secondary structure prediction server powered by the Jnet algorithm. Jpred performs over 1000 predictions per week for users i...

2008 Nucleic Acids Research 1439 citations

TarO: a target optimisation system for structural biology

Ian M. Overton, C. A. Johannes van Niekerk, Lester G. Carter +8 more

TarO (http://www.compbio.dundee.ac.uk/taro) offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural ...

2008 Nucleic Acids Research 114 citations

Similarity Search in High Dimensions via Hashing

Aristides Gionis, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasin...

1999 3096 citations

VSEARCH: a versatile open source tool for metagenomics

Torbjørn Rognes, Tomáš Flouri, Ben Nichols +2 more

Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence...

2016 PeerJ 10017 citations

Generating consensus sequences from partial order multiple sequence alignment graphs

Christopher J. Lee

Motivation: Consensus sequence generation is important in many kinds of sequence analysis ranging from sequence assembly to profile-based iterative search methods. Howe...

2003 Bioinformatics 99 citations

Publication Info

Year: 2004
Type: article
Volume: 20
Issue: 17
Pages: 3236-3237
Citations: 209
Access: Closed

Cite This

APA Style

Rasko Leinonen, Federico Garcia Diez, David Binns et al. (2004). UniProt archive. Bioinformatics, 20(17), 3236-3237. https://doi.org/10.1093/bioinformatics/bth191

Identifiers

DOI: 10.1093/bioinformatics/bth191