OLGA: Fast Computation of Generation Probabilities of B- and T-cell Receptor Amino Acid Sequences and Motifs
Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability that V(D)J recombination generates a T- or B-cell receptor with any specific nucleotide sequence. These generation probabilities are highly heterogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. This paper presents a solution to this problem.
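To give a rough sense of why brute-force summation scales badly, the sketch below counts how many nucleotide sequences translate to a given amino acid sequence under the standard genetic code. The function names and the example CDR3 are illustrative assumptions, not part of OLGA. The count is only a lower bound on the work involved, since a full calculation would also have to sum over the recombination scenarios that can produce each nucleotide sequence.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(aa: str) -> int:
    """Number of codons that encode the amino acid `aa`."""
    return sum(1 for encoded in CODON_TABLE.values() if encoded == aa)


def count_codings(aa_seq: str) -> int:
    """Number of distinct nucleotide sequences that translate to `aa_seq`."""
    return prod(degeneracy(aa) for aa in aa_seq)


if __name__ == "__main__":
    # Hypothetical CDR3 amino acid sequence, used only for illustration.
    cdr3 = "CASSLGETQYF"
    print(cdr3, "has", count_codings(cdr3), "synonymous nucleotide sequences")
```

Because the count is a product of per-residue codon degeneracies, it grows exponentially with sequence length, and it grows faster still for motifs that allow several amino acids at one position.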
Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model predictions show excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design.
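The general idea of replacing a sum over synonymous nucleotide sequences with dynamic programming can be illustrated on a toy model. The sketch below is not OLGA's algorithm and does not model V(D)J recombination: it assumes a hypothetical first-order Markov chain over nucleotides (`INIT` and `TRANS` are made-up parameters) and computes the probability that the chain emits any nucleotide sequence translating to a given amino acid sequence. A forward vector indexed by the last nucleotide is updated one codon at a time, so the cost is linear in sequence length. A brute-force enumeration is included only to check the result on short inputs.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for _codon, _aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(_aa, []).append("".join(_codon))

# Hypothetical toy model (an assumption for illustration only):
# a first-order Markov chain over nucleotides.
INIT = {"T": 0.2, "C": 0.3, "A": 0.2, "G": 0.3}
TRANS = {p: {n: (0.4 if n == p else 0.2) for n in BASES} for p in BASES}


def seq_prob(nt: str) -> float:
    """Probability of one nucleotide sequence under the toy Markov model."""
    if not nt:
        return 1.0
    p = INIT[nt[0]]
    for prev, nxt in zip(nt, nt[1:]):
        p *= TRANS[prev][nxt]
    return p


def pgen_aa_dp(aa_seq: str) -> float:
    """Sum of seq_prob over all nucleotide sequences translating to aa_seq.

    forward[b] holds the total probability of all nucleotide prefixes that
    translate to the amino acids processed so far and end in base b.
    """
    forward = None
    for aa in aa_seq:
        updated = dict.fromkeys(BASES, 0.0)
        for codon in CODONS[aa]:
            # Probability of entering the codon at its first base.
            if forward is None:
                entry = INIT[codon[0]]
            else:
                entry = sum(forward[b] * TRANS[b][codon[0]] for b in BASES)
            # Probability of the two transitions inside the codon.
            inner = TRANS[codon[0]][codon[1]] * TRANS[codon[1]][codon[2]]
            updated[codon[2]] += entry * inner
        forward = updated
    return 1.0 if forward is None else sum(forward.values())


def pgen_aa_brute(aa_seq: str) -> float:
    """Reference implementation: enumerate every synonymous nucleotide sequence."""
    choices = [CODONS[aa] for aa in aa_seq]
    return sum(seq_prob("".join(codons)) for codons in product(*choices))


if __name__ == "__main__":
    aa = "CASS"  # short illustrative sequence so brute force stays cheap
    print(pgen_aa_dp(aa), pgen_aa_brute(aa))
```

The dynamic-programming version visits each codon of each residue once, whereas the enumeration visits every combination of codons. Supporting a motif would amount to letting each position accept the codons of several amino acids, which changes only the inner loop.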
Availability and Implementation: Source code is available at https://github.com/zsethna/OLGA.
Supplementary Information: Supplementary data are available at Bioinformatics online.