-
How different are self and nonself?
Authors:
Andreas Mayer,
Jonathan A. Levine,
Christopher J. Russo,
Quentin Marcou,
William Bialek,
Benjamin D. Greenbaum
Abstract:
Biological and artificial networks routinely make reliable distinctions between similar inputs, and the rules for making these distinctions are learned. In some ways, self/nonself discrimination in the immune system is similar, being both reliable and (partly) learned through thymic selection. In contrast to other examples, we show that the distributions of self and nonself peptides are nearly identical but strongly inhomogeneous. Reliable discrimination is possible only because self-peptides are a particular finite sample drawn out of this distribution, and T cells can target the spaces in between these samples. In conventional learning problems, this would constitute overfitting and lead to disaster. Here, the strong inhomogeneities imply instead that the immune system gains by targeting peptides which are similar to self, with maximum sensitivity for sequences just one or two substitutions away. This prediction from the structure of the underlying distribution in sequence space agrees, for example, with the observed responses to mutation derived cancer neoantigens.
Submitted 25 April, 2025; v1 submitted 22 December, 2022;
originally announced December 2022.
-
Genesis of the alpha beta T-cell receptor
Authors:
Thomas Dupic,
Quentin Marcou,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
The T-cell receptor (TCR) repertoire relies on the diversity of receptors composed of two chains, called $α$ and $β$, to recognize pathogens. Using results of high throughput sequencing and computational chain-pairing experiments of human TCR repertoires, we quantitatively characterize the $αβ$ generation process. We estimate the probabilities of a rescue recombination of the $β$ chain on the second chromosome upon failure or success on the first chromosome. Unlike $β$ chains, $α$ chains recombine simultaneously on both chromosomes, resulting in correlated statistics of the two genes which we predict using a mechanistic model. We find that $\sim 28 \%$ of cells express both $α$ chains. We report that clones sharing the same $β$ chain but different $α$ chains are overrepresented, suggesting that they respond to common immune challenges. Altogether, our statistical analysis gives a complete quantitative mechanistic picture that results in the observed correlations in the generative process. We learn that the probability to generate any TCR$αβ$ is lower than $10^{-12}$ and estimate the generation diversity and sharing properties of the $αβ$ TCR repertoire.
Submitted 11 December, 2018; v1 submitted 28 June, 2018;
originally announced June 2018.
-
IGoR: a tool for high-throughput immune repertoire analysis
Authors:
Quentin Marcou,
Thierry Mora,
Aleksandra M Walczak
Abstract:
High throughput immune repertoire sequencing promises to lead to new statistical diagnostic tools for medicine and biology. Successful implementations of these methods require a correct characterization, analysis and interpretation of these datasets. We present IGoR -- a new comprehensive tool that takes B- or T-cell receptor sequence reads and quantitatively characterizes the statistics of receptor generation from both cDNA and gDNA. It probabilistically annotates sequences and its modular structure can investigate models of increasing biological complexity for different organisms. For B-cells IGoR returns the hypermutation statistics, which we use to reveal co-localization of hypermutations along the sequence. We demonstrate that IGoR outperforms existing tools in accuracy and estimate the sample sizes needed for reliable repertoire characterization.
Submitted 23 May, 2017;
originally announced May 2017.
-
A model for the integration of conflicting exogenous and endogenous signals by dendritic cells
Authors:
Quentin Marcou,
Irit Carmi-Levy,
Coline Trichot,
Vassili Soumelis,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Cells of the immune system are confronted with opposing pro- and anti-inflammatory signals. Dendritic cells (DC) integrate these cues to make informed decisions whether to initiate an immune response. Confronted with exogenous microbial stimuli, DC endogenously produce both anti- (IL-10) and pro-inflammatory (TNF$α$) cues whose joint integration controls the cell's final decision. We combine experimental measurements with theoretical modeling to quantitatively describe the integration mode of these opposing signals. We propose a two-step integration model that modulates the effect of the two types of signals: an initial bottleneck integrates both signals (IL-10 and TNF$α$), the output of which is later modulated by the anti-inflammatory signal. We show that the anti-inflammatory IL-10 signaling is long-ranged, as opposed to the short-ranged pro-inflammatory TNF$α$ signaling. The model suggests that the population averaging and modulation of the pro-inflammatory response by the anti-inflammatory signal is a safeguard against excessive immune responses.
Submitted 25 July, 2016;
originally announced July 2016.
-
Persisting fetal clonotypes influence the structure and overlap of adult human T cell receptor repertoires
Authors:
Mikhail V. Pogorelyy,
Yuval Elhanati,
Quentin Marcou,
Anastasia L. Sycheva,
Ekaterina A. Komech,
Vadim I. Nazarov,
Olga V. Britanova,
Dmitriy M. Chudakov,
Ilgar Z. Mamedov,
Yuri B. Lebedev,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
The diversity of T-cell receptors recognizing foreign pathogens is generated through a highly stochastic recombination process, making the independent production of the same sequence rare. Yet unrelated individuals do share receptors, which together constitute a "public" repertoire of abundant clonotypes. The TCR repertoire is initially formed prenatally, when the enzyme inserting random nucleotides is downregulated, producing a limited diversity subset. By statistically analyzing deep sequencing T-cell repertoire data from twins, unrelated individuals of various ages, and cord blood, we show that T-cell clones generated before birth persist and maintain high abundances in adult organisms for decades, slowly decaying with age. Our results suggest that large, low-diversity public clones are created during pregnancy, and survive over long periods, providing the basis of the public repertoire.
Submitted 9 February, 2016;
originally announced February 2016.
-
repgenHMM: a dynamic programming tool to infer the rules of immune receptor generation from sequence data
Authors:
Yuval Elhanati,
Quentin Marcou,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
The diversity of the immune repertoire is initially generated by random rearrangements of the receptor gene during early T and B cell development. Rearrangement scenarios are composed of random events -- choices of gene templates, base pair deletions and insertions -- described by probability distributions. Not all scenarios are equally likely, and the same receptor sequence may be obtained in several different ways. Quantifying the distribution of these rearrangements is an essential baseline for studying the immune system diversity. Inferring the properties of the distributions from receptor sequences is a computationally hard problem, requiring enumerating every possible scenario for every sampled receptor sequence. We present a Hidden Markov model, which accounts for all plausible scenarios that can generate the receptor sequences. We developed and implemented a method based on the Baum-Welch algorithm that can efficiently infer the parameters for the different events of the rearrangement process. We tested our software tool on sequence data for both the alpha and beta chains of the T cell receptor. To test the validity of our algorithm, we also generated synthetic sequences produced by a known model, and confirmed that its parameters could be accurately inferred back from the sequences. The inferred model can be used to generate synthetic sequences, to calculate the probability of generation of any receptor sequence, as well as the theoretical diversity of the repertoire. We estimate this diversity to be $\approx 10^{23}$ for human T cells. The model gives a baseline to investigate the selection and dynamics of immune repertoires.
Submitted 31 October, 2015;
originally announced November 2015.
-
Inferring processes underlying B-cell repertoire diversity
Authors:
Yuval Elhanati,
Zachary Sethna,
Quentin Marcou,
Curtis G. Callan Jr.,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
We quantify the VDJ recombination and somatic hypermutation processes in human B-cells using probabilistic inference methods on high-throughput DNA sequence repertoires of human B-cell receptor heavy chains. Our analysis captures the statistical properties of the naive repertoire, first after its initial generation via VDJ recombination and then after selection for functionality. We also infer statistical properties of the somatic hypermutation machinery (exclusive of subsequent effects of selection). Our main results are the following: the B-cell repertoire is substantially more diverse than T-cell repertoires, due to longer junctional insertions; sequences that pass initial selection are distinguished by having a higher probability of being generated in a VDJ recombination event; somatic hypermutations have a non-uniform distribution along the V gene that is well explained by an independent site model for the sequence context around the hypermutation site.
Submitted 26 February, 2015; v1 submitted 10 February, 2015;
originally announced February 2015.