Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation

Hofstätter, Sebastian; Althammer, Sophia; Schröder, Michael; Sertkan, Mete; Hanbury, Allan

arXiv:2010.02666v1 (cs)

[Submitted on 6 Oct 2020 (this version), latest version 22 Jan 2021 (v2)]

Abstract: The latency of neural ranking models at query time depends largely on the architecture and on deliberate choices by their designers to trade off effectiveness for higher efficiency. This focus on low query latency makes a rising number of efficient ranking architectures feasible for production deployment. In machine learning, an increasingly common approach to closing the effectiveness gap of more efficient models is to apply knowledge distillation from a large teacher model to a smaller student model. We find that different ranking architectures tend to produce output scores of different magnitudes. Based on this finding, we propose a cross-architecture training procedure with a margin-focused loss (Margin-MSE) that adapts knowledge distillation to the varying score output distributions of different BERT and non-BERT ranking architectures. We apply the teachable information as additional fine-grained labels to existing training triples of the MSMARCO-Passage collection. We evaluate our procedure of distilling knowledge from state-of-the-art concatenated BERT models to four different efficient architectures (TK, ColBERT, PreTT, and a BERT CLS dot product model). We show that across our evaluated architectures our Margin-MSE knowledge distillation significantly improves their effectiveness without compromising their efficiency. To benefit the community, we publish the costly teacher-score training files in a ready-to-use package.
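
The margin-focused idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the formula, so this assumes Margin-MSE is the mean squared error between the teacher's score margin (relevant passage minus non-relevant passage) and the student's margin on the same training triple. The function name `margin_mse` and the example scores are hypothetical and chosen only for illustration.

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student and teacher score margins.

    Each argument is a list of scores, one entry per training triple
    (query, relevant passage, non-relevant passage).
    Assumed formulation; not taken verbatim from the abstract.
    """
    n = len(student_pos)
    if n == 0 or not (n == len(student_neg) == len(teacher_pos) == len(teacher_neg)):
        raise ValueError("all score lists must be non-empty and equally long")

    total = 0.0
    for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg):
        student_margin = sp - sn  # how far the student separates the pair
        teacher_margin = tp - tn  # how far the teacher separates the pair
        total += (student_margin - teacher_margin) ** 2
    return total / n


# Hypothetical scores: the student scores sit roughly 100 points above the
# teacher scores, but the student separates each pair by the same margin.
teacher_pos, teacher_neg = [10.0, 8.0], [4.0, 7.0]
student_pos, student_neg = [106.0, 101.0], [100.0, 100.0]
loss = margin_mse(student_pos, student_neg, teacher_pos, teacher_neg)
```

In this example the loss is zero even though the absolute scores differ, because only the difference between the two passages of a triple is compared. A pointwise MSE on the raw scores would instead penalize the student for using a different score range, which is the mismatch between architectures that the abstract describes. Note that this sketch is invariant to a constant offset in the scores, not to a rescaling of them.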

Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2010.02666 [cs.IR]
	(or arXiv:2010.02666v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2010.02666

Submission history

From: Sebastian Hofstätter
[v1] Tue, 6 Oct 2020 12:35:53 UTC (2,862 KB)
[v2] Fri, 22 Jan 2021 16:24:52 UTC (3,547 KB)
