Rethinking Large-scale Pre-ranking System: Entire-chain Cross-domain Models

Song, Jinbo; Huang, Ruoran; Wang, Xinyang; Huang, Wei; Yu, Qian; Chen, Mingming; Yao, Yafei; Fan, Chaosheng; Peng, Changping; Lin, Zhangang; Hu, Jinghe; Shao, Jingping

doi:10.1145/3511808.3557683


arXiv:2310.08039 (cs)

[Submitted on 12 Oct 2023]


Authors: Jinbo Song (1), Ruoran Huang (1), Xinyang Wang (1), Wei Huang (1), Qian Yu (1), Mingming Chen (1), Yafei Yao (1), Chaosheng Fan (1), Changping Peng (1), Zhangang Lin (1), Jinghe Hu (1), Jingping Shao (1) ((1) Marketing and Commercialization Center, JD.com)


Abstract: Industrial systems such as recommender systems and online advertising have been widely equipped with multi-stage architectures, which are divided into several cascaded modules: matching, pre-ranking, ranking and re-ranking. Pre-ranking is a critical bridge between matching and ranking, yet existing pre-ranking approaches mainly suffer from the sample selection bias (SSB) problem because they ignore the entire-chain data dependence, resulting in sub-optimal performance. In this paper, we rethink the pre-ranking system from the perspective of the entire sample space and propose Entire-chain Cross-domain Models (ECM), which leverage samples from all cascaded stages to effectively alleviate the SSB problem. In addition, we design a fine-grained neural structure named ECMM to further improve pre-ranking accuracy. Specifically, we propose a cross-domain multi-tower neural network to comprehensively predict the result of each stage, and introduce a sub-network routing strategy with $L_0$ regularization to reduce computational costs. Evaluations on real-world large-scale traffic logs demonstrate that our pre-ranking models outperform SOTA methods while time consumption stays within an acceptable level, achieving a better trade-off between efficiency and effectiveness.
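To make the idea of sub-network routing with $L_0$ regularization more concrete, the following is a minimal sketch of one common way to obtain differentiable on/off gates: the hard-concrete relaxation. The abstract does not state which relaxation ECMM uses, so this choice is an assumption. The constants `GAMMA`, `ZETA`, `BETA` and the functions `sample_gate`, `deterministic_gate`, `expected_l0` and `route` are hypothetical names chosen for illustration and are not taken from the paper.

```python
import math
import random

# Stretch limits and temperature of the hard-concrete distribution.
# These are commonly used defaults, assumed here for illustration.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sample_gate(log_alpha: float, rng: random.Random) -> float:
    """Training-time gate: a noisy sample in [0, 1] that can be exactly 0 or 1."""
    u = rng.uniform(1e-6, 1.0 - 1e-6)
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    return min(1.0, max(0.0, s * (ZETA - GAMMA) + GAMMA))


def deterministic_gate(log_alpha: float) -> float:
    """Inference-time gate: the same stretch-and-clip step without noise."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))


def expected_l0(log_alphas: list[float]) -> float:
    """Differentiable penalty: expected number of gates that are non-zero."""
    shift = BETA * math.log(-GAMMA / ZETA)
    return sum(sigmoid(a - shift) for a in log_alphas)


def route(sub_network_outputs: list[float], log_alphas: list[float]) -> float:
    """Sum gated sub-network outputs, skipping sub-networks whose gate is 0."""
    total = 0.0
    for out, log_alpha in zip(sub_network_outputs, log_alphas):
        z = deterministic_gate(log_alpha)
        if z > 0.0:  # a closed gate means this sub-network need not be evaluated
            total += z * out
    return total
```

In this sketch, adding `expected_l0` to the training loss pushes gate parameters toward closed gates, and any sub-network whose gate is exactly zero can be skipped at serving time, which is where the reduction in computational cost would come from.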

Comments:	5 pages, 2 figures
Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2310.08039 [cs.IR]
	(or arXiv:2310.08039v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2310.08039
Journal reference:	Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 2022: 4495-4499
Related DOI:	https://doi.org/10.1145/3511808.3557683

Submission history

From: Jinbo Song
[v1] Thu, 12 Oct 2023 05:14:42 UTC (261 KB)
