Showing 1–10 of 10 results for author: Tran-Thanh, L

  1. arXiv:2405.13899  [pdf, other]

    stat.ML cs.LG

    Symmetric Linear Bandits with Hidden Symmetry

    Authors: Nam Phuong Tran, The Anh Ta, Debmalya Mandal, Long Tran-Thanh

    Abstract: High-dimensional linear bandits with low-dimensional structure have received considerable attention in recent studies due to their practical significance. The most common structure in the literature is sparsity. However, it may not be available in practice. Symmetry, where the reward is invariant under certain groups of transformations on the set of arms, is another important inductive bias in the…

    Submitted 30 October, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2212.07524  [pdf, ps, other]

    cs.LG stat.ML

    Invariant Lipschitz Bandits: A Side Observation Approach

    Authors: Nam Phuong Tran, Long Tran-Thanh

    Abstract: Symmetry arises in many optimization and decision-making problems, and has attracted considerable attention from the optimization community: By utilizing the existence of such symmetries, the process of searching for optimal solutions can be improved significantly. Despite its success in (offline) optimization, the utilization of symmetries has not been well examined within the online optimization…

    Submitted 28 August, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

  3. arXiv:2211.07817  [pdf, other]

    cs.LG stat.ML

    Multi-Player Bandits Robust to Adversarial Collisions

    Authors: Shivakumar Mahesh, Anshuka Rangi, Haifeng Xu, Long Tran-Thanh

    Abstract: Motivated by cognitive radios, stochastic Multi-Player Multi-Armed Bandits has been extensively studied in recent years. In this setting, each player pulls an arm, and receives a reward corresponding to the arm if there is no collision, namely the arm was selected by one single player. Otherwise, the player receives no reward if collision occurs. In this paper, we consider the presence of maliciou…

    Submitted 14 November, 2022; originally announced November 2022.

  4. arXiv:2208.13663  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning

    Authors: Anshuka Rangi, Haifeng Xu, Long Tran-Thanh, Massimo Franceschetti

    Abstract: To understand the security threats to reinforcement learning (RL) algorithms, this paper studies poisoning attacks to manipulate \emph{any} order-optimal learning algorithm towards a targeted policy in episodic RL and examines the potential damage of two natural types of poisoning attacks, i.e., the manipulation of \emph{reward} and \emph{action}. We discover that the effect of attacks crucially d…

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: Accepted at International Joint Conferences on Artificial Intelligence (IJCAI) 2022

  5. arXiv:2102.07711  [pdf, ps, other]

    cs.LG cs.AI cs.CR stat.ML

    Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

    Authors: Anshuka Rangi, Long Tran-Thanh, Haifeng Xu, Massimo Franceschetti

    Abstract: We study bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret $O(\log T)$ can be forced to suffer a regret $\Omega(T)$ with an expected amount of contamina…

    Submitted 3 May, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Accepted to AAAI 2022

  6. arXiv:2101.01572  [pdf, other]

    stat.ML cs.LG cs.MA

    Sequential Choice Bandits with Feedback for Personalizing users' experience

    Authors: Anshuka Rangi, Massimo Franceschetti, Long Tran-Thanh

    Abstract: In this work, we study sequential choice bandits with feedback. We propose bandit algorithms for a platform that personalizes users' experience to maximize its rewards. For each action directed to a given user, the platform is given a positive reward, which is a non-decreasing function of the action, if this action is below the user's threshold. Users are equipped with a patience budget, and actio…

    Submitted 5 January, 2021; originally announced January 2021.

  7. arXiv:2006.02796  [pdf, other]

    cs.LG stat.ML

    Fuzzy c-Means Clustering for Persistence Diagrams

    Authors: Thomas Davies, Jack Aspinall, Bryan Wilder, Long Tran-Thanh

    Abstract: Persistence diagrams concisely represent the topology of a point cloud whilst having strong theoretical guarantees, but the question of how to best integrate this information into machine learning workflows remains open. In this paper we extend the ubiquitous Fuzzy c-Means (FCM) clustering algorithm to the space of persistence diagrams, enabling unsupervised learning that automatically captures th…

    Submitted 15 February, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: Version 4

  8. arXiv:1911.05712  [pdf, ps, other]

    cs.LG stat.ML

    Streaming Bayesian Inference for Crowdsourced Classification

    Authors: Edoardo Manino, Long Tran-Thanh, Nicholas R. Jennings

    Abstract: A key challenge in crowdsourcing is inferring the ground truth from noisy and unreliable data. To do so, existing approaches rely on collecting redundant information from the crowd, and aggregating it with some probabilistic method. However, oftentimes such methods are computationally inefficient, are restricted to some specific settings, or lack theoretical guarantees. In this paper, we revisit t…

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: Accepted at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  9. arXiv:1811.12253  [pdf, ps, other]

    cs.LG cs.GT cs.MA stat.ML

    Unifying the stochastic and the adversarial Bandits with Knapsack

    Authors: Anshuka Rangi, Massimo Franceschetti, Long Tran-Thanh

    Abstract: This paper investigates the adversarial Bandits with Knapsack (BwK) online learning problem, where a player repeatedly chooses to perform an action, pays the corresponding cost, and receives a reward associated with the action. The player is constrained by the maximum budget $B$ that can be spent to perform actions, and the rewards and the costs of the actions are assigned by an adversary. This pr…

    Submitted 23 October, 2018; originally announced November 2018.

  10. arXiv:1405.2432  [pdf, ps, other]

    stat.ML cs.LG

    Functional Bandits

    Authors: Long Tran-Thanh, Jia Yuan Yu

    Abstract: We introduce the functional bandit problem, where the objective is to find an arm that optimises a known functional of the unknown arm-reward distributions. These problems arise in many settings such as maximum entropy methods in natural language processing, and risk-averse decision-making, but current best-arm identification techniques fail in these domains. We propose a new approach, that combin…

    Submitted 10 May, 2014; originally announced May 2014.