An IR-based Evaluation Framework for Web Search Query Segmentation

Roy, Rishiraj Saha; Ganguly, Niloy; Choudhury, Monojit; Laxman, Srivatsan

Computer Science > Information Retrieval

arXiv:1111.1497v2 (cs)

A newer version of this paper has been withdrawn by Rishiraj Saha Roy

[Submitted on 7 Nov 2011 (v1), revised 18 Dec 2011 (this version, v2), latest version 18 Sep 2012 (v4)]

Title: An IR-based Evaluation Framework for Web Search Query Segmentation

Authors: Rishiraj Saha Roy, Niloy Ganguly, Monojit Choudhury, Srivatsan Laxman

Abstract: In this paper, we present a comparative evaluation scheme for Web search query segmentation, based directly on IR performance. We evaluate six segmentation strategies, including four state-of-the-art techniques, vis-a-vis segmentations from three human experts. In the past, segmentation strategies were mainly validated against manual annotations, which suffer from guideline inadequacies and human idiosyncrasy. This work shows that there need not be a perfect correlation between the goodness of a scheme as judged against a handful of human segmentations and its effectiveness from an IR perspective. Moreover, algorithms are shown to perform as well as, and sometimes even better than, human markups, a fact masked by previous validations. A test set of relatively rare queries (more suitable for discriminating between algorithms) was used for evaluation. Our results also prove the usefulness of query segmentation in a Web search scenario.

Comments:	All the authors do not wish that the copy of the paper be available online before the paper is accepted
Subjects:	Information Retrieval (cs.IR)
ACM classes:	H.3.3
Cite as:	arXiv:1111.1497 [cs.IR]
	(or arXiv:1111.1497v2 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1111.1497

Submission history

From: Rishiraj Saha Roy
[v1] Mon, 7 Nov 2011 07:26:27 UTC (91 KB)
[v2] Sun, 18 Dec 2011 17:33:28 UTC (1 KB) (withdrawn)
[v3] Tue, 20 Dec 2011 11:22:38 UTC (1 KB) (withdrawn)
[v4] Tue, 18 Sep 2012 03:26:22 UTC (64 KB)
