The Role of Schwartz Measures in Human Tri-Color Vision

Abstract: The human tri-color vision process may be characterized as follows: 1. A requirement of three scalar quantities to fully define a color (for example, intensity, hue, and purity), with 2. These scalar measures linear in the intensity of the incident light, allowing in general any specific color to be duplicated by an additive mixture of light from three standardized (basis) colors, 3. The exception being that the spectral colors are unique, in that they cannot be duplicated by any positive mixture of other colors. These characteristics strongly suggest that human color vision makes use of Schwartz measures in processing color data. This hypothesis is subject to test. In this brief paper, the results of this hypothesis are shown to be in good agreement with measured data.

Submitted 22 June, 2023; originally announced July 2023.

arXiv:1810.09164 [pdf, other]

doi 10.1007/978-3-030-15719-7_10

Named Entity Disambiguation using Deep Learning on Graphs

Authors: Alberto Cetoli, Mohammad Akbari, Stefano Bragaglia, Andrew D. O'Harney, Marc Sloan

Abstract: We tackle named entity disambiguation (NED) by comparing entities in short sentences with Wikidata graphs. Creating a context vector from graphs through deep learning is a challenging problem that has never been applied to NED. Our main contribution is to present an experimental study of recent neural techniques, as well as a discussion about which graph features are most important for the disambiguation task. In addition, a new dataset (Wikidata-Disamb) is created to allow a clean and scalable evaluation of NED with Wikidata entries, and to be used as a reference in future research. In the end our results show that a Bi-LSTM encoding of the graph triplets performs best, improving upon the baseline models and scoring an F1 value of 91.6% on the Wikidata-Disamb test set.

Submitted 22 October, 2018; originally announced October 2018.

arXiv:1709.10053 [pdf, ps, other]

Graph Convolutional Networks for Named Entity Recognition

Authors: A. Cetoli, S. Bragaglia, A. D. O'Harney, M. Sloan

Abstract: In this paper we investigate the role of the dependency tree in a named entity recognizer upon using a set of GCN. We perform a comparison among different NER architectures and show that the grammar of a sentence positively influences the results. Experiments on the OntoNotes dataset demonstrate consistent performance improvements, without requiring heavy feature engineering or additional language-specific knowledge.

Submitted 14 February, 2018; v1 submitted 28 September, 2017; originally announced September 2017.

Comments: Accepted at the 16th International Workshop on Treebanks and Linguistic Theories

arXiv:1601.04615 [pdf, ps, other]

doi 10.1007/s10791-015-9251-5

A Term-Based Methodology for Query Reformulation Understanding

Authors: Marc Sloan, Hui Yang, Jun Wang

Abstract: Key to any research involving session search is the understanding of how a user's queries evolve throughout the session. When a user creates a query reformulation, he or she is consciously retaining terms from their original query, removing others and adding new terms. By measuring the similarity between queries we can make inferences on the user's information need and how successful their new query is likely to be. By identifying the origins of added terms we can infer the user's motivations and gain an understanding of their interactions. In this paper we present a novel term-based methodology for understanding and interpreting query reformulation actions. We use TREC Session Track data to demonstrate how our technique is able to learn from query logs and we make use of click data to test user interaction behavior when reformulating queries. We identify and evaluate a range of term-based query reformulation strategies and show that our methods provide valuable insight into understanding query reformulation in session search.

Submitted 19 January, 2016; v1 submitted 18 January, 2016; originally announced January 2016.

Comments: Information Retrieval Journal, 23 pages, 10 tables, 6 figures

ACM Class: H.3.3

Journal ref: Information Retrieval Journal, April 2015, Volume 18, Issue 2, pp 145-165

arXiv:1601.04605 [pdf, other]

doi 10.1145/2808194.2809457

Dynamic Information Retrieval: Theoretical Framework and Application

Authors: Marc Sloan, Jun Wang

Abstract: Theoretical frameworks like the Probability Ranking Principle and its more recent Interactive Information Retrieval variant have guided the development of ranking and retrieval algorithms for decades, yet they are not capable of helping us model problems in Dynamic Information Retrieval which exhibit the following three properties: an observable user signal, retrieval over multiple stages and an overall search intent. In this paper a new theoretical framework for retrieval in these scenarios is proposed. We derive a general dynamic utility function for optimizing over these types of tasks that takes into account the utility of each stage and the probability of observing user feedback. We apply our framework to experiments over TREC data in the dynamic multi-page search scenario as a practical demonstration of its effectiveness, to frame the discussion of its use and its limitations, and to compare it against the existing frameworks.

Submitted 18 January, 2016; originally announced January 2016.

Comments: ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR), 10 pages, 4 figures, 2 algorithms, 3 tables. in Proceedings of the 2015 International Conference on The Theory of Information Retrieval

ACM Class: H.3.3

arXiv:1303.5250 [pdf, ps, other]

Iterative Expectation for Multi Period Information Retrieval

Authors: Marc Sloan, Jun Wang

Abstract: Many Information Retrieval (IR) models make use of offline statistical techniques to score documents for ranking over a single period, rather than use an online, dynamic system that is responsive to users over time. In this paper, we explicitly formulate a general Multi Period Information Retrieval problem, where we consider retrieval as a stochastic yet controllable process. The ranking action during the process continuously controls the retrieval system's dynamics, and an optimal ranking policy is found in order to maximise the overall users' satisfaction over the multiple periods as much as possible. Our derivations show interesting properties about how the posterior probability of the documents' relevancy evolves from users' feedback through clicks, and provide a plug-in framework for incorporating different click models. Based on the Multi-Armed Bandit theory, we propose a simple implementation of our framework using a dynamic ranking rule that takes rank bias and exploration of documents into account. We use TREC data to learn a suitable exploration parameter for our model, and then analyse its performance and a number of variants using a search log data set; the experiments suggest an ability to explore document relevance dynamically over time using user feedback in a way that can handle rank bias.

Submitted 21 March, 2013; originally announced March 2013.

Comments: 8 pages, 3 tables, published at the Workshop on Web Search Click Data 2013

arXiv:1206.1754 [pdf, other]

Internet Advertising: An Interplay among Advertisers, Online Publishers, Ad Exchanges and Web Users

Authors: Shuai Yuan, Ahmad Zainal Abidin, Marc Sloan, Jun Wang

Abstract: Internet advertising is a fast growing business which has proved to be significantly important in digital economics. It is vitally important for both web search engines and online content providers and publishers because web advertising provides them with major sources of revenue. Its presence is increasingly important for the whole media industry due to the influence of the Web. For advertisers, it is a smarter alternative to traditional marketing media such as TVs and newspapers. As the web evolves and data collection continues, the design of methods for more targeted, interactive, and friendly advertising may have a major impact on the way our digital economy evolves, and to aid societal development. Towards this goal mathematically well-grounded Computational Advertising methods are becoming necessary and will continue to develop as a fundamental tool towards the Web. As a vibrant new discipline, Internet advertising requires effort from different research domains including Information Retrieval, Machine Learning, Data Mining and Analytic, Statistics, Economics, and even Psychology to predict and understand user behaviours. In this paper, we provide a comprehensive survey on Internet advertising, discussing and classifying the research issues, identifying the recent technologies, and suggesting its future directions. To have a comprehensive picture, we first start with a brief history, introduction, and classification of the industry and present a schematic view of the new advertising ecosystem. We then introduce four major participants, namely advertisers, online publishers, ad exchanges and web users; and through analysing and discussing the major research problems and existing solutions from their perspectives respectively, we discover and aggregate the fundamental problems that characterise the newly-formed research field and capture its potential future prospects.

Submitted 2 July, 2012; v1 submitted 8 June, 2012; originally announced June 2012.

Comments: 44 pages, 7 figures, 6 tables. Submitted to Information Processing and Management

ACM Class: H.3.3; H.3.5

Showing 1–7 of 7 results for author: Sloan, M