-
Crowd-Labeling Fashion Reviews with Quality Control
Authors:
Iurii Chernushenko,
Felix A. Gers,
Alexander Löser,
Alessandro Checco
Abstract:
We present a new methodology for high-quality labeling in the fashion domain with crowd workers instead of experts. We focus on the Aspect-Based Sentiment Analysis task. Our methods filter out inaccurate input from crowd workers while preserving differing worker labels, in order to capture the inherent high variability of opinions. We demonstrate the quality of the labeled data using Facebook's FastText framework as a baseline.
Submitted 5 April, 2018;
originally announced May 2018.
-
FashionBrain Project: A Vision for Understanding Europe's Fashion Data Universe
Authors:
Alessandro Checco,
Gianluca Demartini,
Alexander Loeser,
Ines Arous,
Mourad Khayati,
Matthias Dantone,
Richard Koopmanschap,
Svetlin Stalinov,
Martin Kersten,
Ying Zhang
Abstract:
A core business in the fashion industry is the understanding and prediction of customer needs and trends. Search engines and social networks are at the same time a fundamental bridge and a costly middleman between the customer's purchase intention and the retailer. To better exploit Europe's distinctive characteristics (e.g., multiple languages, fashion and cultural differences), it is pivotal to reduce retailers' dependence on search engines. This goal can be achieved by harnessing various data channels (manufacturers and distribution networks, online shops, large retailers, social media, market observers, call centers, press/magazines, etc.) that retailers can leverage to gain more insight into potential buyers and into industry trends as a whole. This can enable the creation of novel online shopping experiences, the detection of influencers, and the prediction of upcoming fashion trends.
In this paper, we provide an overview of the main research challenges and an analysis of the most promising technological solutions that we are investigating in the FashionBrain project.
Submitted 26 October, 2017;
originally announced October 2017.
-
The Effect of Class Imbalance and Order on Crowdsourced Relevance Judgments
Authors:
Rehab K. Qarout,
Alessandro Checco,
Gianluca Demartini
Abstract:
In this paper we study the effect on crowd worker efficiency and effectiveness of the dominance of one class in the data they process. We aim at understanding if there is any positive or negative bias in workers seeing many negative examples in the identification of positive labels. To test our hypothesis, we design an experiment where crowd workers are asked to judge the relevance of documents presented in different orders. Our findings indicate that there is a significant improvement in the quality of relevance judgements when presenting relevant results before the non-relevant ones.
Submitted 4 September, 2016;
originally announced September 2016.
-
Pairwise, Magnitude, or Stars: What's the Best Way for Crowds to Rate?
Authors:
Alessandro Checco,
Gianluca Demartini
Abstract:
We compare three popular techniques of rating content: the ubiquitous five star rating, the less used pairwise comparison, and the recently introduced (in crowdsourcing) magnitude estimation approach. Each system has specific advantages and disadvantages, in terms of required user effort, achievable user preference prediction accuracy and number of ratings required.
We design an experiment where the three techniques are compared in an unbiased way. We collected 39,000 ratings on a popular crowdsourcing platform, allowing us to release a dataset that will be useful for many related studies on user rating techniques.
Submitted 2 September, 2016;
originally announced September 2016.
-
BLC: Private Matrix Factorization Recommenders via Automatic Group Learning
Authors:
Alessandro Checco,
Giuseppe Bianchi,
Doug Leith
Abstract:
We propose a privacy-enhanced matrix factorization recommender that exploits the fact that users can often be grouped together by interest. This allows a form of "hiding in the crowd" privacy. We introduce a novel matrix factorization approach suited to making recommendations in a shared group (or nym) setting and the BLC algorithm for carrying out this matrix factorization in a privacy-enhanced manner. We demonstrate that the increased privacy does not come at the cost of reduced recommendation accuracy.
Submitted 27 February, 2017; v1 submitted 18 September, 2015;
originally announced September 2015.
-
Analysis of Dynamic Channel Bonding in Dense Networks of WLANs
Authors:
Azadeh Faridi,
Boris Bellalta,
Alessandro Checco
Abstract:
Dynamic Channel Bonding (DCB) allows for the dynamic selection and use of multiple contiguous basic channels in Wireless Local Area Networks (WLANs). A WLAN operating under DCB can enjoy a larger bandwidth, when available, and therefore achieve a higher throughput. However, the use of larger bandwidths also increases the contention with adjacent WLANs, which can result in longer delays in accessing the channel and, consequently, a lower throughput. In this paper, a scenario consisting of multiple WLANs using DCB and operating within carrier-sensing range of one another is considered. An analytical framework for evaluating the performance of such networks is presented. The analysis is carried out using a Markov chain model that characterizes the interactions between adjacent WLANs with overlapping channels. An algorithm is proposed for systematically constructing the Markov chain corresponding to any given scenario. The analytical model is then used to highlight and explain the key properties that differentiate DCB networks of WLANs from those operating on a single shared channel. Furthermore, the analysis is applied to networks of IEEE 802.11ac WLANs operating under DCB, which do not fully comply with some of the simplifying assumptions in our analysis, to show that the analytical model can give accurate results in more realistic scenarios.
Submitted 1 September, 2015;
originally announced September 2015.
-
On the Interactions between Multiple Overlapping WLANs using Channel Bonding
Authors:
B. Bellalta,
A. Checco,
A. Zocca,
J. Barcelo
Abstract:
Next-generation WLANs will support the use of wider channels, which is known as channel bonding, to achieve higher throughput. However, because both the channel center frequency and the channel width are autonomously selected by each WLAN, the use of wider channels may also increase the competition with other WLANs operating in the same area for the limited available spectrum, thus causing the opposite effect. In this paper, we analyse the interactions between a group of neighboring WLANs that use channel bonding and evaluate the impact of those interactions on the achievable throughput. A Continuous Time Markov Network (CTMN) model that is able to capture the coupled operation of a group of overlapping WLANs is introduced and validated. The results show that the use of channel bonding can provide significant performance gains even in scenarios with high densities of WLANs, though it may also cause unfair situations in which some WLANs cannot access the channel, while others receive most of the transmission opportunities.
Submitted 4 February, 2015; v1 submitted 2 December, 2014;
originally announced December 2014.
-
Fast, Responsive Decentralised Graph Colouring
Authors:
Alessandro Checco,
Douglas J. Leith
Abstract:
We solve, in a fully decentralised way (i.e., with no message passing), the classic problem of colouring a graph. We propose a novel algorithm that is automatically responsive to topology changes, and we prove that it converges quickly to a proper colouring in $O(N\log{N})$ time with high probability for generic graphs (and in $O(\log{N})$ time if $Δ=O(1)$) when the number of available colours is greater than $Δ$, the maximum degree of the graph.
We believe the proof techniques used in this work are of independent interest and provide new insight into the properties required to ensure fast convergence of decentralised algorithms.
Submitted 2 September, 2017; v1 submitted 27 May, 2014;
originally announced May 2014.
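To make the "no message passing" setting in the abstract above concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: it uses a simple illustrative rule, assumed here, in which a node that senses a colour clash with some neighbour re-draws its colour uniformly at random while satisfied nodes keep theirs. The function names, the ring topology and the seed are all hypothetical choices made for the example.

```python
import random


def decentralised_colouring(adj, num_colours, max_rounds=10_000, seed=0):
    """Illustrative conflict-driven recolouring (an assumed rule, not the paper's).

    adj maps each node to the list of its neighbours. Nodes never exchange
    messages; each one only senses whether a neighbour shares its colour.
    """
    rng = random.Random(seed)
    colour = {v: rng.randrange(num_colours) for v in adj}
    for round_no in range(max_rounds):
        # A node is conflicted if at least one neighbour has the same colour.
        conflicted = [
            v for v in adj if any(colour[u] == colour[v] for u in adj[v])
        ]
        if not conflicted:
            return colour, round_no
        # Conflicted nodes re-draw simultaneously; satisfied nodes stay put.
        redrawn = {v: rng.randrange(num_colours) for v in conflicted}
        colour.update(redrawn)
    raise RuntimeError("no proper colouring found within max_rounds")


def is_proper(adj, colour):
    """Check that no edge joins two nodes of the same colour."""
    return all(colour[u] != colour[v] for v in adj for u in adj[v])


# Ring of six nodes: maximum degree 2, so three colours exceed the degree.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
colouring, rounds = decentralised_colouring(ring, num_colours=3)
print(colouring, "found after", rounds, "rounds")
```

The sketch only shows the information each node is allowed to use (its own colour and a local clash signal); it makes no claim about the convergence rates stated in the abstract.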
-
Throughput Analysis in CSMA/CA Networks using Continuous Time Markov Networks: A Tutorial
Authors:
B. Bellalta,
A. Zocca,
C. Cano,
A. Checco,
J. Barcelo,
A. Vinel
Abstract:
This book chapter introduces the use of Continuous Time Markov Networks (CTMN) to analytically capture the operation of Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) networks. It is tutorial in nature and aims to introduce the topic with a clear and easy-to-follow description. To illustrate how CTMN can be used, we introduce a set of representative and cutting-edge scenarios, such as Vehicular Ad-hoc Networks (VANETs), Power Line Communication networks and multiple overlapping Wireless Local Area Networks (WLANs). For each scenario, we describe the specific CTMN, obtain its stationary distribution and compute the throughput achieved by each node in the network. Taking the per-node throughput as reference, we discuss how the complex interactions between nodes using CSMA/CA have an impact on system performance.
Submitted 14 May, 2014; v1 submitted 1 April, 2014;
originally announced April 2014.
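The workflow summarised in the abstract above (build the chain, obtain its stationary distribution, compute per-node throughput) can be sketched in Python under assumptions that are not taken from the chapter itself: an idealised CSMA model with no collisions, in which the feasible states are the sets of nodes that can transmit simultaneously and the stationary probability of a state is proportional to the product of the per-node activity ratios `theta[i]` (attempt rate times mean transmission duration). The function names and the three-node line topology are hypothetical.

```python
from itertools import combinations


def feasible_states(n, conflicts):
    """Enumerate sets of nodes that can transmit at the same time.

    conflicts lists pairs of nodes within carrier-sensing range of each
    other; a state is feasible if it contains no conflicting pair.
    """
    conflict_set = {frozenset(pair) for pair in conflicts}
    states = []
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            pairs = combinations(subset, 2)
            if all(frozenset(p) not in conflict_set for p in pairs):
                states.append(frozenset(subset))
    return states


def airtime_shares(n, conflicts, theta):
    """Fraction of time each node transmits under the assumed product form."""
    weight = {}
    for state in feasible_states(n, conflicts):
        w = 1.0
        for i in state:
            w *= theta[i]
        weight[state] = w
    total = sum(weight.values())  # normalising constant
    return [
        sum(w for state, w in weight.items() if i in state) / total
        for i in range(n)
    ]


# Three nodes in a line: node 1 senses both neighbours, nodes 0 and 2 do not
# sense each other, so they can transmit together.
shares = airtime_shares(3, conflicts=[(0, 1), (1, 2)], theta=[1.0, 1.0, 1.0])
print(shares)
```

With all activity ratios equal to 1, the five feasible states are equally likely, so the outer nodes are each active 2/5 of the time and the middle node 1/5. This small case already shows the kind of unfairness between nodes that per-node throughput analysis is meant to expose.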
-
Updating Neighbour Cell List via Crowdsourced User Reports: a Framework for Measuring Time Performance
Authors:
Alessandro Checco,
Carlo Lancia,
Douglas J. Leith
Abstract:
In this paper we introduce the idea of estimating local topology in wireless networks by means of crowdsourced user reports. In this approach each user periodically reports to the serving basestation information about the set of neighbouring basestations observed by the user. We show that, by mapping the local topological structure of the network onto states of increasing knowledge, a crisp mathematical framework can be obtained, which in turn allows for the use of a variety of user mobility models. Using a simplified mobility model we show how to obtain useful upper bounds on the expected time for a basestation to gain full knowledge of its local neighbourhood, answering the fundamental question of which classes of network deployments can effectively benefit from a crowdsourcing approach.
Submitted 20 November, 2019; v1 submitted 7 January, 2014;
originally announced January 2014.
-
Learning-Based Constraint Satisfaction With Sensing Restrictions
Authors:
Alessandro Checco,
Douglas Leith
Abstract:
In this paper we consider graph-coloring problems, an important subset of general constraint satisfaction problems that arise in wireless resource allocation. We constructively establish the existence of fully decentralized learning-based algorithms that are able to find a proper coloring even in the presence of strong sensing restrictions, in particular sensing asymmetry of the type encountered when hidden terminals are present. Our main analytic contribution is to establish sufficient conditions on the sensing behaviour to ensure that the solvers find satisfying assignments with probability one. These conditions take the form of connectivity requirements on the induced sensing graph. These requirements are mild, and we demonstrate that they are commonly satisfied in wireless allocation tasks. We argue that our results are of considerable practical importance in view of the prevalence of both communication and sensing restrictions in wireless resource allocation problems. The class of algorithms analysed here requires no message-passing whatsoever between wireless devices, and we show that they continue to perform well even when devices are only able to carry out constrained sensing of the surrounding radio environment.
Submitted 13 March, 2013; v1 submitted 26 October, 2012;
originally announced October 2012.