-
The Role of Schwartz Measures in Human Tri-Color Vision
Authors:
M. L. Sloan
Abstract:
The human tri-color vision process may be characterized as follows:
1. A requirement of three scalar quantities to fully define a color (for example, intensity, hue, and purity), with
2. These scalar measures linear in the intensity of the incident light, allowing in general any specific color to be duplicated by an additive mixture of light from three standardized (basis) colors,
3. The exception being that the spectral colors are unique, in that they cannot be duplicated by any positive mixture of other colors.
These characteristics strongly suggest that human color vision makes use of Schwartz measures in processing color data. This hypothesis is testable. In this brief paper, the predictions of the hypothesis are shown to be in good agreement with measured data.
Submitted 22 June, 2023;
originally announced July 2023.
-
Named Entity Disambiguation using Deep Learning on Graphs
Authors:
Alberto Cetoli,
Mohammad Akbari,
Stefano Bragaglia,
Andrew D. O'Harney,
Marc Sloan
Abstract:
We tackle Named Entity Disambiguation (NED) by comparing entities in short sentences with Wikidata graphs. Creating a context vector from graphs through deep learning is a challenging problem that has never been applied to NED. Our main contribution is to present an experimental study of recent neural techniques, as well as a discussion about which graph features are most important for the disambiguation task. In addition, a new dataset (Wikidata-Disamb) is created to allow a clean and scalable evaluation of NED with Wikidata entries, and to be used as a reference in future research. In the end, our results show that a Bi-LSTM encoding of the graph triplets performs best, improving upon the baseline models and scoring an F1 value of 91.6% on the Wikidata-Disamb test set.
Submitted 22 October, 2018;
originally announced October 2018.
-
Graph Convolutional Networks for Named Entity Recognition
Authors:
A. Cetoli,
S. Bragaglia,
A. D. O'Harney,
M. Sloan
Abstract:
In this paper we investigate the role of the dependency tree in a named entity recognizer that uses a set of Graph Convolutional Networks (GCNs). We compare different NER architectures and show that the grammar of a sentence positively influences the results. Experiments on the OntoNotes dataset demonstrate consistent performance improvements, without requiring heavy feature engineering or additional language-specific knowledge.
Submitted 14 February, 2018; v1 submitted 28 September, 2017;
originally announced September 2017.
-
A Term-Based Methodology for Query Reformulation Understanding
Authors:
Marc Sloan,
Hui Yang,
Jun Wang
Abstract:
Key to any research involving session search is the understanding of how a user's queries evolve throughout the session. When a user creates a query reformulation, he or she is consciously retaining terms from their original query, removing others and adding new terms. By measuring the similarity between queries we can make inferences on the user's information need and how successful their new query is likely to be. By identifying the origins of added terms we can infer the user's motivations and gain an understanding of their interactions.
In this paper we present a novel term-based methodology for understanding and interpreting query reformulation actions. We use TREC Session Track data to demonstrate how our technique is able to learn from query logs and we make use of click data to test user interaction behavior when reformulating queries. We identify and evaluate a range of term-based query reformulation strategies and show that our methods provide valuable insight into understanding query reformulation in session search.
Submitted 19 January, 2016; v1 submitted 18 January, 2016;
originally announced January 2016.
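The term-level view of query reformulation described in the abstract above (terms retained, removed, and added between consecutive queries, plus a similarity between the two queries) can be illustrated with a minimal sketch. This is not the authors' method: the function names `reformulation_terms` and `jaccard_similarity` are hypothetical, whitespace tokenization is an assumption, and Jaccard overlap is swapped in here as one simple choice of query similarity.

```python
def tokenize(query):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return set(query.lower().split())


def reformulation_terms(previous_query, new_query):
    """Split a reformulation into retained, removed, and added terms."""
    prev_terms = tokenize(previous_query)
    new_terms = tokenize(new_query)
    return {
        "retained": sorted(prev_terms & new_terms),
        "removed": sorted(prev_terms - new_terms),
        "added": sorted(new_terms - prev_terms),
    }


def jaccard_similarity(previous_query, new_query):
    """Jaccard overlap of the two term sets; 1.0 when both queries are empty."""
    prev_terms = tokenize(previous_query)
    new_terms = tokenize(new_query)
    union = prev_terms | new_terms
    if not union:
        return 1.0
    return len(prev_terms & new_terms) / len(union)


changes = reformulation_terms("cheap flights london", "cheap hotels london")
similarity = jaccard_similarity("cheap flights london", "cheap hotels london")
```

In this toy session the user keeps `cheap` and `london`, drops `flights`, and adds `hotels`. Tracing where an added term came from (for example, a previously clicked document) would need click data, which this sketch does not model.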
-
Dynamic Information Retrieval: Theoretical Framework and Application
Authors:
Marc Sloan,
Jun Wang
Abstract:
Theoretical frameworks like the Probability Ranking Principle and its more recent Interactive Information Retrieval variant have guided the development of ranking and retrieval algorithms for decades, yet they cannot help us model problems in Dynamic Information Retrieval, which exhibit the following three properties: an observable user signal, retrieval over multiple stages, and an overall search intent. In this paper, a new theoretical framework for retrieval in these scenarios is proposed. We derive a general dynamic utility function for optimizing over these types of tasks that takes into account the utility of each stage and the probability of observing user feedback. We apply our framework to experiments over TREC data in the dynamic multi-page search scenario as a practical demonstration of its effectiveness, to frame the discussion of its use and limitations, and to compare it against the existing frameworks.
Submitted 18 January, 2016;
originally announced January 2016.
-
Iterative Expectation for Multi Period Information Retrieval
Authors:
Marc Sloan,
Jun Wang
Abstract:
Many Information Retrieval (IR) models make use of offline statistical techniques to score documents for ranking over a single period, rather than use an online, dynamic system that is responsive to users over time. In this paper, we explicitly formulate a general Multi Period Information Retrieval problem, where we consider retrieval as a stochastic yet controllable process. The ranking action during the process continuously controls the retrieval system's dynamics, and an optimal ranking policy is found in order to maximise users' overall satisfaction over the multiple periods. Our derivations show interesting properties of how the posterior probability of document relevance evolves from user feedback through clicks, and provide a plug-in framework for incorporating different click models. Based on Multi-Armed Bandit theory, we propose a simple implementation of our framework using a dynamic ranking rule that takes rank bias and exploration of documents into account. We use TREC data to learn a suitable exploration parameter for our model, and then analyse its performance and a number of variants using a search log data set; the experiments suggest an ability to explore document relevance dynamically over time using user feedback in a way that can handle rank bias.
Submitted 21 March, 2013;
originally announced March 2013.
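The abstract above mentions a bandit-based dynamic ranking rule that balances estimated relevance against exploration while accounting for rank bias. The sketch below is not the paper's rule, which the abstract does not specify. It is a generic UCB1-style illustration under stated assumptions: the names `ucb_rank` and `update_stats`, the `exploration` parameter, and the per-position `rank_bias` weights are all hypothetical.

```python
import math


def ucb_rank(stats, period, exploration=1.0):
    """Rank documents by a UCB1-style score.

    stats maps doc_id -> (clicks, exposure), where exposure is the sum of
    rank-bias weights of the positions at which the document was shown.
    period is the current period number (>= 1).
    """
    def score(doc_id):
        clicks, exposure = stats[doc_id]
        if exposure == 0:
            return math.inf  # never-shown documents are explored first
        estimate = clicks / exposure
        bonus = exploration * math.sqrt(2 * math.log(period) / exposure)
        return estimate + bonus

    # Highest score first; ties broken by doc_id for determinism.
    return sorted(stats, key=lambda doc_id: (-score(doc_id), doc_id))


def update_stats(stats, ranking, clicked, rank_bias):
    """Record one period of feedback, discounting exposure by position."""
    for position, doc_id in enumerate(ranking):
        clicks, exposure = stats[doc_id]
        stats[doc_id] = (
            clicks + (1 if doc_id in clicked else 0),
            exposure + rank_bias[position],
        )


stats = {"a": (0, 0.0), "b": (5, 10.0), "c": (1, 10.0)}
ranking = ucb_rank(stats, period=3)
update_stats(stats, ranking, clicked={"a"}, rank_bias=[1.0, 0.5, 0.25])
```

Weighting exposure by position is one simple way to avoid penalising documents that went unclicked only because they were shown low in the list. A real click model would replace the fixed `rank_bias` weights.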
-
Internet Advertising: An Interplay among Advertisers, Online Publishers, Ad Exchanges and Web Users
Authors:
Shuai Yuan,
Ahmad Zainal Abidin,
Marc Sloan,
Jun Wang
Abstract:
Internet advertising is a fast-growing business which has proved to be significantly important in the digital economy. It is vitally important for both web search engines and online content providers and publishers because web advertising provides them with major sources of revenue. Its presence is increasingly important for the whole media industry due to the influence of the Web. For advertisers, it is a smarter alternative to traditional marketing media such as TV and newspapers. As the web evolves and data collection continues, the design of methods for more targeted, interactive, and friendly advertising may have a major impact on the way our digital economy evolves and may aid societal development.
Towards this goal, mathematically well-grounded Computational Advertising methods are becoming necessary and will continue to develop as a fundamental tool for the Web. As a vibrant new discipline, Internet advertising requires effort from different research domains including Information Retrieval, Machine Learning, Data Mining and Analytics, Statistics, Economics, and even Psychology to predict and understand user behaviours. In this paper, we provide a comprehensive survey of Internet advertising, discussing and classifying the research issues, identifying the recent technologies, and suggesting future directions. To give a comprehensive picture, we start with a brief history, introduction, and classification of the industry and present a schematic view of the new advertising ecosystem. We then introduce four major participants, namely advertisers, online publishers, ad exchanges and web users. By analysing and discussing the major research problems and existing solutions from each of their perspectives, we discover and aggregate the fundamental problems that characterise the newly formed research field and capture its potential future prospects.
Submitted 2 July, 2012; v1 submitted 8 June, 2012;
originally announced June 2012.