Enhanced vectors for top-k document retrieval in Question Answering

Hammad, Mohammed

arXiv:2210.10584 (cs)

[Submitted on 8 Oct 2022]

Abstract: Modern applications, especially information retrieval web apps that involve "search" as their use case, are gradually moving towards "answering" modules. Conversational chatbots, which have proven more engaging to users, use Question Answering as their core. Since precise answering is computationally expensive, several approaches have been developed to prefetch the most relevant documents or passages from the database that contain the answer. We propose a different approach that retrieves the evidence documents efficiently and accurately, making sure that the relevant document for a given user query is not missed. We do so by assigning each document (or passage, in our case) a unique identifier and using these identifiers to create dense vectors that can be efficiently indexed. More precisely, we use the identifier to predict randomly sampled context-window words of the question corresponding to the passage, along with the words of the passage itself. This naturally embeds the passage identifier into the vector space in such a way that the embedding is closer to the question without compromising the information content. This approach enables efficient creation of real-time query vectors in ~4 milliseconds.
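
The training idea in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not state the training objective or how query vectors are built, so the sketch assumes a doc2vec-style setup with negative sampling, and it assumes a query vector is the average of the learned word vectors. All names (`train`, `query_vector`, `top_k`), the toy passages, and the hyperparameters are hypothetical.

```python
import math
import random


def tokenize(text):
    return text.lower().split()


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sgd_pair(doc_vec, word_vec, label, lr):
    """One logistic-regression step on a (passage identifier, word) pair."""
    score = max(-30.0, min(30.0, dot(doc_vec, word_vec)))
    grad = lr * (label - 1.0 / (1.0 + math.exp(-score)))
    for i in range(len(doc_vec)):
        d, w = doc_vec[i], word_vec[i]
        doc_vec[i] += grad * w
        word_vec[i] += grad * d


def train(passages, questions, dim=16, epochs=200, window=3,
          n_passage=3, n_neg=3, lr=0.05, seed=0):
    """Learn one vector per passage identifier and one vector per word."""
    rng = random.Random(seed)
    p_tok = {pid: tokenize(text) for pid, text in passages.items()}
    q_tok = {pid: tokenize(questions[pid]) for pid in passages}
    vocab = sorted({w for toks in list(p_tok.values()) + list(q_tok.values())
                    for w in toks})
    doc_vecs = {pid: [rng.uniform(-0.5, 0.5) for _ in range(dim)]
                for pid in passages}
    word_vecs = {w: [0.0] * dim for w in vocab}
    # Assumption: negatives are words that occur in neither the passage
    # nor its question (a simplification that suits a tiny vocabulary).
    negatives = {pid: [w for w in vocab
                       if w not in p_tok[pid] and w not in q_tok[pid]]
                 for pid in passages}
    for _ in range(epochs):
        for pid in passages:
            q = q_tok[pid]
            # Randomly placed context window over the question words...
            start = rng.randrange(max(1, len(q) - window + 1))
            targets = q[start:start + window]
            # ...plus randomly sampled words of the passage itself.
            targets += rng.sample(p_tok[pid], min(n_passage, len(p_tok[pid])))
            for word in targets:
                sgd_pair(doc_vecs[pid], word_vecs[word], 1.0, lr)
                pool = negatives[pid]
                for neg in rng.sample(pool, min(n_neg, len(pool))):
                    sgd_pair(doc_vecs[pid], word_vecs[neg], 0.0, lr)
    return doc_vecs, word_vecs


def query_vector(query, word_vecs):
    """Assumed query encoder: mean of the word vectors of known words."""
    known = [word_vecs[w] for w in tokenize(query) if w in word_vecs]
    if not known:
        return None
    return [sum(v[i] for v in known) / len(known)
            for i in range(len(known[0]))]


def top_k(query, doc_vecs, word_vecs, k=1):
    """Rank passage identifiers by cosine similarity to the query vector."""
    q = query_vector(query, word_vecs)
    if q is None:
        return []

    def cosine(d):
        norm = math.sqrt(dot(d, d) * dot(q, q))
        return dot(d, q) / norm if norm else 0.0

    ranked = sorted(doc_vecs, key=lambda pid: cosine(doc_vecs[pid]),
                    reverse=True)
    return ranked[:k]


# Toy data, invented for illustration only.
passages = {
    "p0": "paris capital city france seine river",
    "p1": "python programming language guido created interpreter",
    "p2": "mitochondria cell organelle energy atp produces",
}
questions = {
    "p0": "what capital city france",
    "p1": "who created python language",
    "p2": "which organelle produces energy cell",
}
doc_vecs, word_vecs = train(passages, questions)
print(top_k("who created python", doc_vecs, word_vecs, k=1))
```

Because the query vector here is only an average of precomputed word vectors, building it needs no model inference at query time, which is one plausible reading of why query vectors can be created quickly. In a real system the passage vectors would be stored in an approximate nearest-neighbour index rather than ranked by a full scan.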

Comments:	Year-2019
Subjects:	Information Retrieval (cs.IR); Computation and Language (cs.CL); Machine Learning (cs.LG)
MSC classes:	68T50
ACM classes:	I.2.7
Cite as:	arXiv:2210.10584 [cs.IR]
	(or arXiv:2210.10584v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2210.10584

Submission history

From: Mohammed Hammad
[v1] Sat, 8 Oct 2022 07:44:24 UTC (2,643 KB)
