UniProt: the Universal Protein Knowledgebase in 2023

Overview
Specialty Biochemistry
Date 2022 Nov 21
PMID 36408920
Abstract

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and to our website to adapt to ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations provided by automated systems using a variety of machine-learning techniques. In addition, the scientific community continues to contribute publications and annotations to UniProt entries of interest to them. Finally, we describe our new website (https://www.uniprot.org/), designed to enhance our users' experience and make our data easily accessible to the research community. This interface includes access to AlphaFold structures for more than 85% of all entries as well as improved visualisations for subcellular localisation of proteins.

Citing Articles

Seq2Topt: a sequence-based deep learning predictor of enzyme optimal temperature.

Qiu S, Hu B, Zhao J, Xu W, Yang A. Brief Bioinform. 2025; 26(2).

PMID: 40079266 PMC: 11904407. DOI: 10.1093/bib/bbaf114.


The genome sequence of the Coppice Mining Bee, (Linnaeus, 1758).

Falk S, Monks J. Wellcome Open Res. 2025; 10:102.

PMID: 40078958 PMC: 11897692. DOI: 10.12688/wellcomeopenres.23746.1.


The genome sequence of the Dotted Footman moth, (Hufnagel, 1767).

Fletcher C, Lees D. Wellcome Open Res. 2025; 10:106.

PMID: 40078957 PMC: 11897693. DOI: 10.12688/wellcomeopenres.23766.1.


Foundation models in bioinformatics.

Guo F, Guan R, Li Y, Liu Q, Wang X, Yang C. Natl Sci Rev. 2025; 12(4):nwaf028.

PMID: 40078374 PMC: 11900445. DOI: 10.1093/nsr/nwaf028.


Structural Dynamics of OATP1A2 in Mediating Paclitaxel Transport Mechanism in Breast Cancer.

Kumar R, Singh G, Akhter Y, Kaithwas G, Agrawal A, Singh S. Nanotheranostics. 2025; 9(1):52-62.

PMID: 40078312 PMC: 11898720. DOI: 10.7150/ntno.103095.

