Interpreting BERT architecture predictions for peptide presentation by MHC class I proteins

Gasser, Hans-Christof; Bedran, Georges; Ren, Bo; Goodlett, David; Alfaro, Javier; Rajan, Ajitha

arXiv:2111.07137 (q-bio)

[Submitted on 13 Nov 2021]

Abstract: The major histocompatibility complex (MHC) class I pathway supports the detection of cancer and viruses by the immune system. It presents parts of proteins (peptides) from inside a cell on the cell's membrane surface, enabling visiting immune cells that detect non-self peptides to terminate the cell. The ability to predict whether a peptide will be presented on MHC class I molecules helps in designing vaccines that activate the immune system to destroy the invading disease protein. We designed a prediction model using a BERT-based architecture (ImmunoBERT) that takes as input a peptide and its surrounding regions (N- and C-terminals) along with a set of MHC class I (MHC-I) molecules. We present a novel application of well-known interpretability techniques, SHAP and LIME, to this domain, and we use these results along with 3D structure visualizations and amino acid frequencies to identify the parts of the input amino acid sequences that most influence the output. In particular, we find that amino acids close to the peptides' N- and C-terminals are highly relevant. Additionally, some positions within the MHC proteins (in particular in the A, B and F pockets) are often assigned a high importance ranking, which agrees with biological studies and with the distances in the structure visualizations.
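
To illustrate what a per-position importance ranking over an amino acid sequence looks like, the following is a minimal Python sketch of occlusion (masking) attribution. Occlusion is a much simpler perturbation method than the SHAP and LIME techniques named in the abstract and is used here only because it fits in a few lines. The names `toy_score`, `occlusion_importance` and `MASK`, the example sequences, and the scoring rules are all hypothetical; `toy_score` is a hand-built stand-in and is not ImmunoBERT.

```python
from typing import Callable, List

MASK = "X"  # placeholder residue; an assumption, not the paper's mask token


def toy_score(n_flank: str, peptide: str, c_flank: str) -> float:
    """Hypothetical stand-in for a presentation model (NOT ImmunoBERT).

    The rules are invented so that residues near the peptide ends matter.
    """
    score = 0.0
    if peptide[1:2] == "L":   # toy preference at peptide position 2
        score += 2.0
    if peptide[-1:] == "V":   # toy preference at the last peptide position
        score += 2.0
    if c_flank[:1] == "K":    # toy preference at the first C-flank residue
        score += 1.0
    return score


def occlusion_importance(
    n_flank: str,
    peptide: str,
    c_flank: str,
    score_fn: Callable[[str, str, str], float],
) -> List[float]:
    """Score drop when each position of flank + peptide + flank is masked."""
    base = score_fn(n_flank, peptide, c_flank)
    full = n_flank + peptide + c_flank
    a = len(n_flank)            # start of the peptide in the joined string
    b = a + len(peptide)        # end of the peptide in the joined string
    importances = []
    for i in range(len(full)):
        masked = full[:i] + MASK + full[i + 1:]
        perturbed = score_fn(masked[:a], masked[a:b], masked[b:])
        importances.append(base - perturbed)
    return importances


# Example input: 3-residue flanks around a 9-residue peptide (made up).
imp = occlusion_importance("AAA", "GLFGAIAGV", "KRR", toy_score)
ranking = sorted(range(len(imp)), key=lambda i: imp[i], reverse=True)
print(imp)
print("most influential positions (0-based):", ranking[:3])
```

Because the toy scoring rules were written to reward residues near the peptide ends, the ranking here reflects that construction and says nothing about the paper's results. With a real model, `score_fn` would wrap the model's prediction call, and the MHC sequences could be perturbed in the same way.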

Comments:	10 pages
Subjects:	Quantitative Methods (q-bio.QM)
Cite as:	arXiv:2111.07137 [q-bio.QM]
	(or arXiv:2111.07137v1 [q-bio.QM] for this version)
	https://doi.org/10.48550/arXiv.2111.07137

Submission history

From: Hans-Christof Gasser
[v1] Sat, 13 Nov 2021 16:01:36 UTC (2,582 KB)
