Showing 1–40 of 40 results for author: Harutyunyan, A

Searching in archive cs.
  1. arXiv:2509.26426  [pdf, ps, other]

    cs.DS

    Improved Approximation for Broadcasting in k-cycle Graphs

    Authors: Jeffrey Bringolf, Anne-Laure Ehresmann, Hovhannes A. Harutyunyan

    Abstract: Broadcasting is an information dissemination primitive where a message originates at a node (called the originator) and is passed to all other nodes in the network. Broadcasting research is motivated by efficient network design and determining the broadcast times of standard network topologies. Verifying the broadcast time of a node $v$ in an arbitrary network $G$ is known to be NP-hard. Additiona…

    Submitted 30 September, 2025; originally announced September 2025.

    Comments: 16 pages, 5 figures

    ACM Class: F.2

  2. arXiv:2507.10266  [pdf, ps, other]

    math.CO cs.DM

    $(Δ-1)$-dicolouring of digraphs

    Authors: Ararat Harutyunyan, Ken-ichi Kawarabayashi, Lucas Picasarri-Arrieta, Gil Puig i Surroca

    Abstract: In 1977, Borodin and Kostochka conjectured that every graph with maximum degree $Δ\geq 9$ is $(Δ-1)$-colourable, unless it contains a clique of size $Δ$. In 1999, Reed confirmed the conjecture when $Δ\geq 10^{14}$. We propose different generalisations of this conjecture for digraphs, and prove the analogue of Reed's result for each of them. The chromatic number and clique number are replaced res…

    Submitted 14 July, 2025; originally announced July 2025.

  3. arXiv:2506.11052  [pdf, ps, other]

    cs.LG cs.AI

    ACCORD: Autoregressive Constraint-satisfying Generation for COmbinatorial Optimization with Routing and Dynamic attention

    Authors: Henrik Abgaryan, Tristan Cazenave, Ararat Harutyunyan

    Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet their direct application to NP-hard combinatorial problems (CPs) remains underexplored. In this work, we systematically investigate the reasoning abilities of LLMs on a variety of NP-hard combinatorial optimization tasks and introduce ACCORD: Autoregressive Constraint-satisfying generation for COmbinatorial optim…

    Submitted 22 May, 2025; originally announced June 2025.

  4. arXiv:2505.10361  [pdf, other]

    cs.AI cs.LG

    Plasticity as the Mirror of Empowerment

    Authors: David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh

    Abstract: Agents are minimally entities that are influenced by their past observations and act to influence future observations. This latter capacity is captured by empowerment, which has served as a vital framing concept across artificial intelligence and cognitive science. This former capacity, however, is equally foundational: In what ways, and to what extent, can an agent be influenced by what it observ…

    Submitted 15 May, 2025; originally announced May 2025.

  5. arXiv:2504.12682  [pdf, other]

    cs.AI cs.CL

    WebLists: Extracting Structured Information From Complex Interactive Websites Using Executable LLM Agents

    Authors: Arth Bohra, Manvel Saroyan, Danil Melkozerov, Vahe Karufanyan, Gabriel Maher, Pascal Weinberger, Artem Harutyunyan, Giovanni Campagna

    Abstract: Most recent web agent research has focused on navigation and transaction tasks, with little emphasis on extracting structured data at scale. We present WebLists, a benchmark of 200 data-extraction tasks across four common business and enterprise use-cases. Each task requires an agent to navigate to a webpage, configure it appropriately, and extract complete datasets with well-defined schemas. We s…

    Submitted 17 April, 2025; originally announced April 2025.

  6. arXiv:2503.04511  [pdf, other]

    cs.DS cs.DM

    Source-Oblivious Broadcast

    Authors: Pierre Fraigniaud, Hovhannes A. Harutyunyan

    Abstract: This paper revisits the study of (minimum) broadcast graphs, i.e., graphs enabling fast information dissemination from every source node to all the other nodes (and having minimum number of edges for this property). This study is performed in the framework of compact distributed data structures, that is, when the broadcast protocols are bounded to be encoded at each node as an ordered list of neig…

    Submitted 6 March, 2025; originally announced March 2025.

  7. arXiv:2503.01877  [pdf, other]

    cs.LG cs.AI

    Starjob: Dataset for LLM-Driven Job Shop Scheduling

    Authors: Henrik Abgaryan, Tristan Cazenave, Ararat Harutyunyan

    Abstract: Large Language Models (LLMs) have shown remarkable capabilities across various domains, but their potential for solving combinatorial optimization problems remains largely unexplored. In this paper, we investigate the applicability of LLMs to the Job Shop Scheduling Problem (JSSP), a classic challenge in combinatorial optimization that requires efficient job allocation to machines to minimize make…

    Submitted 27 March, 2025; v1 submitted 26 February, 2025; originally announced March 2025.

    Comments: arXiv admin note: substantial text overlap with arXiv:2408.06993

  8. arXiv:2502.04403  [pdf, other]

    cs.AI

    Agency Is Frame-Dependent

    Authors: David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh

    Abstract: Agency is a system's capacity to steer outcomes toward a goal, and is a central topic of study across biology, philosophy, cognitive science, and artificial intelligence. Determining if a system exhibits agency is a notoriously difficult question: Dennett (1989), for instance, highlights the puzzle of determining which principles can decide whether a rock, a thermostat, or a robot each possess age…

    Submitted 6 February, 2025; originally announced February 2025.

  9. arXiv:2408.06993  [pdf, other]

    cs.AI

    LLMs can Schedule

    Authors: Henrik Abgaryan, Ararat Harutyunyan, Tristan Cazenave

    Abstract: The job shop scheduling problem (JSSP) remains a significant hurdle in optimizing production processes. This challenge involves efficiently allocating jobs to a limited number of machines while minimizing factors like total processing time or job delays. While recent advancements in artificial intelligence have yielded promising solutions, such as reinforcement learning and graph neural networks,…

    Submitted 13 August, 2024; originally announced August 2024.

  10. arXiv:2407.10583  [pdf, other]

    cs.AI cs.LG

    Three Dogmas of Reinforcement Learning

    Authors: David Abel, Mark K. Ho, Anna Harutyunyan

    Abstract: Modern reinforcement learning has been conditioned by at least three dogmas. The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents. The second is our treatment of learning as finding the solution to a task, rather than adaptation. The third is the reward hypothesis, which states that all goals and purposes can be well thought of a…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: RLC 2024

  11. arXiv:2405.12229  [pdf, other]

    physics.chem-ph cond-mat.mtrl-sci cs.AI cs.CE physics.comp-ph

    Multi-task learning for molecular electronic structure approaching coupled-cluster accuracy

    Authors: Hao Tang, Brian Xiao, Wenhao He, Pero Subasic, Avetik R. Harutyunyan, Yao Wang, Fang Liu, Haowei Xu, Ju Li

    Abstract: Machine learning (ML) plays an important role in quantum chemistry, providing fast-to-evaluate predictive models for various properties of molecules. However, most existing ML models for molecular electronic properties use density functional theory (DFT) databases as ground truth in training, and their prediction accuracy cannot surpass that of DFT. In this work, we developed a unified ML method f…

    Submitted 24 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  12. arXiv:2404.05902  [pdf, other]

    cs.CL cs.AI

    WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents

    Authors: Michael Lutz, Arth Bohra, Manvel Saroyan, Artem Harutyunyan, Giovanni Campagna

    Abstract: In the realm of web agent research, achieving both generalization and accuracy remains a challenging problem. Due to high variance in website structure, existing approaches often fail. Moreover, existing fine-tuning and in-context learning techniques fail to generalize across multiple websites. We introduce Wilbur, an approach that uses a differentiable ranking model and a novel instruction synthe…

    Submitted 8 April, 2024; originally announced April 2024.

  13. arXiv:2310.06111  [pdf, other]

    cs.CL cs.LG

    BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions

    Authors: Arth Bohra, Govert Verkes, Artem Harutyunyan, Pascal Weinberger, Giovanni Campagna

    Abstract: Text classification is a well-studied and versatile building block for many NLP applications. Yet, existing approaches require either large annotated corpora to train a model with or, when using large language models as a base, require carefully crafting the prompt as well as using a long context that can fit many examples. As a result, it is not possible for end-users to build classifiers for the…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 (Findings)

  14. arXiv:2309.14185  [pdf, other]

    cs.DS

    Temporal Separators with Deadlines

    Authors: Hovhannes A. Harutyunyan, Kamran Koupayi, Denis Pankratov

    Abstract: We study temporal analogues of the Unrestricted Vertex Separator problem from the static world. An $(s,z)$-temporal separator is a set of vertices whose removal disconnects vertex $s$ from vertex $z$ for every time step in a temporal graph. The $(s,z)$-Temporal Separator problem asks to find the minimum size of an $(s,z)$-temporal separator for the given temporal graph. We introduce a generalizati…

    Submitted 25 September, 2023; originally announced September 2023.

  15. arXiv:2306.10171  [pdf, other]

    cs.LG cs.AI stat.ML

    Bootstrapped Representations in Reinforcement Learning

    Authors: Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

    Abstract: In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces. While one of the promises of deep learning algorithms is to automatically construct features well-tuned for the task they try to solve, such a representation might not emerge from end-to-end training of deep RL agents. To mitigate this issue, auxiliary objectives are often incorporated i…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  16. arXiv:2305.18501  [pdf, other]

    cs.LG

    DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

    Authors: Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko

    Abstract: Multi-step learning applies lookahead over multiple time steps and has proved valuable in policy evaluation settings. However, in the optimal control case, the impact of multi-step learning has been relatively limited despite a number of prior efforts. Fundamentally, this might be because multi-step policy improvements require operations that cannot be approximated by stochastic samples, hence hin…

    Submitted 29 May, 2023; originally announced May 2023.

  17. arXiv:2303.03225  [pdf, other]

    cs.DM math.CO

    Odd Chromatic Number of Graph Classes

    Authors: Rémy Belmonte, Ararat Harutyunyan, Noleen Köhler, Nikolaos Melissinos

    Abstract: A graph is called odd (respectively, even) if every vertex has odd (respectively, even) degree. Gallai proved that every graph can be partitioned into two even induced subgraphs, or into an odd and an even induced subgraph. We refer to a partition into odd subgraphs as an odd colouring of G. Scott [Graphs and Combinatorics, 2001] proved that a graph admits an odd colouring if and only if it has an…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: 21 pages, 3 figures

  18. arXiv:2301.04462  [pdf, other]

    cs.LG stat.ML

    An Analysis of Quantile Temporal-Difference Learning

    Authors: Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

    Abstract: We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning. Despite these empirical successes, a theoretical understanding of QTD has proven elusive until now. Unlike classical TD learning, which can be analysed with standard stochastic appro…

    Submitted 20 May, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

    Comments: Accepted to JMLR

  19. arXiv:2111.00876  [pdf, other]

    cs.LG cs.AI

    On the Expressivity of Markov Reward

    Authors: David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

    Abstract: Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajector…

    Submitted 18 January, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to NeurIPS 2021

  20. arXiv:2109.11203  [pdf, other]

    cs.CC cs.DS

    Filling Crosswords is Very Hard

    Authors: Laurent Gourvès, Ararat Harutyunyan, Michael Lampis, Nikolaos Melissinos

    Abstract: We revisit a classical crossword filling puzzle which already appeared in Garey & Johnson's book. We are given a grid with $n$ vertical and horizontal slots and a dictionary with $m$ words and are asked to place words from the dictionary in the slots so that shared cells are consistent. We attempt to pinpoint the source of intractability of this problem by taking into account the structure of the g…

    Submitted 23 September, 2021; originally announced September 2021.

  21. arXiv:2011.09464  [pdf, other]

    cs.LG

    Counterfactual Credit Assignment in Model-Free Reinforcement Learning

    Authors: Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Éric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos

    Abstract: Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards. In particular, this requires separating skill from luck, i.e. disentangling the effect of an action on rewards from that of external factors and subsequent actions. To achieve this, we adapt the notion of counterfactuals from causality theory to a model-free RL setup. The key idea is to…

    Submitted 14 December, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

  22. arXiv:2011.01297  [pdf, other]

    cs.LG cs.AI

    Useful Policy Invariant Shaping from Arbitrary Advice

    Authors: Paniz Behboudian, Yash Satsangi, Matthew E. Taylor, Anna Harutyunyan, Michael Bowling

    Abstract: Reinforcement learning is a powerful learning paradigm in which agents can learn to maximize sparse and delayed reward signals. Although RL has had many impressive successes in complex domains, learning can take hours, days, or even years of training data. A major challenge of contemporary RL research is to discover how to learn with less data. Previous work has shown that domain information can b…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: 9 pages, 6 figures, Adaptive and Learning Agents (ALA) 2020 Workshop

  23. arXiv:2010.06317  [pdf, ps, other]

    cs.CC

    Digraph Coloring and Distance to Acyclicity

    Authors: Ararat Harutyunyan, Michael Lampis, Nikolaos Melissinos

    Abstract: In $k$-Digraph Coloring we are given a digraph and are asked to partition its vertices into at most $k$ sets, so that each set induces a DAG. This well-known problem is NP-hard, as it generalizes (undirected) $k$-Coloring, but becomes trivial if the input digraph is acyclic. This poses the natural parameterized complexity question of what happens when the input is "almost" acyclic. In this paper we s…

    Submitted 3 January, 2022; v1 submitted 13 October, 2020; originally announced October 2020.

  24. arXiv:1912.02503  [pdf, other]

    cs.LG stat.ML

    Hindsight Credit Assignment

    Authors: Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Remi Munos

    Abstract: We consider the problem of efficient credit assignment in reinforcement learning. In order to efficiently and meaningfully utilize new data, we propose to explicitly assign credit to past decisions based on the likelihood of them having led to the observed outcome. This approach uses new information in hindsight, rather than employing foresight. Somewhat surprisingly, we show that value functions…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: NeurIPS 2019

  25. arXiv:1910.07479  [pdf, other]

    cs.LG stat.ML

    Conditional Importance Sampling for Off-Policy Learning

    Authors: Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney

    Abstract: The principal contribution of this paper is a conceptual framework for off-policy reinforcement learning, based on conditional expectations of importance sampling ratios. This framework yields new perspectives and understanding of existing off-policy algorithms, and reveals a broad space of unexplored algorithms. We theoretically analyse this space, and concretely investigate several algorithms th…

    Submitted 30 July, 2020; v1 submitted 16 October, 2019; originally announced October 2019.

    Comments: AISTATS 2020 camera-ready version

  26. arXiv:1902.09996  [pdf, other]

    cs.AI cs.LG stat.ML

    The Termination Critic

    Authors: Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Remi Munos, Doina Precup

    Abstract: In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination condition, as opposed to -- as is common -- the policy. The termination condition is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a dif…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: AISTATS 2019

  27. arXiv:1902.01874  [pdf, other]

    cs.DS

    Average-case complexity of a branch-and-bound algorithm for min dominating set

    Authors: Tom Denat, Ararat Harutyunyan, Vangelis Th. Paschos

    Abstract: The average-case complexity of a branch-and-bound algorithm for the Minimum Dominating Set problem in random graphs in the G(n,p) model is studied. We identify phase transitions between subexponential and exponential average-case complexities, depending on the growth of the probability p with respect to the number n of nodes.

    Submitted 5 February, 2019; originally announced February 2019.

  28. arXiv:1810.10940  [pdf, ps, other]

    cs.DS cs.DM

    Maximum Independent Sets in Subcubic Graphs: New Results

    Authors: Ararat Harutyunyan, Michael Lampis, Vadim Lozin, Jérôme Monnot

    Abstract: The maximum independent set problem is known to be NP-hard in the class of subcubic graphs, i.e. graphs of vertex degree at most 3. We present a polynomial-time solution in a subclass of subcubic graphs generalizing several previously known results.

    Submitted 25 October, 2018; originally announced October 2018.

  29. arXiv:1711.03817  [pdf, other]

    cs.AI

    Learning with Options that Terminate Off-Policy

    Authors: Anna Harutyunyan, Peter Vrancx, Pierre-Luc Bacon, Doina Precup, Ann Nowe

    Abstract: A temporally abstract action, or an option, is specified by a policy and a termination condition: the policy guides option behavior, and the termination condition roughly determines its length. Generally, learning with longer options (like learning with multi-step returns) is known to be more efficient. However, if the option set for the task is not ideal, and cannot express the primitive optimal…

    Submitted 2 December, 2017; v1 submitted 10 November, 2017; originally announced November 2017.

    Comments: AAAI 2018

  30. arXiv:1708.06551  [pdf, other]

    cs.AI cs.LG

    Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets

    Authors: Denis Steckelmacher, Diederik M. Roijers, Anna Harutyunyan, Peter Vrancx, Hélène Plisnier, Ann Nowé

    Abstract: Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the…

    Submitted 12 September, 2017; v1 submitted 22 August, 2017; originally announced August 2017.

  31. arXiv:1707.08567  [pdf]

    cs.IT

    Proceedings of Workshop AEW10: Concepts in Information Theory and Communications

    Authors: Kees A. Schouhamer Immink, Stan Baggen, Ferdaous Chaabane, Yanling Chen, Peter H. N. de With, Hela Gassara, Hamed Gharbi, Adel Ghazel, Khaled Grati, Naira M. Grigoryan, Ashot Harutyunyan, Masayuki Imanishi, Mitsugu Iwamoto, Ken-ichi Iwata, Hiroshi Kamabe, Brian M. Kurkoski, Shigeaki Kuzuoka, Patrick Langenhuizen, Jan Lewandowsky, Akiko Manada, Shigeki Miyake, Hiroyoshi Morita, Jun Muramatsu, Safa Najjar, Arnak V. Poghosyan, et al. (9 additional authors not shown)

    Abstract: The 10th Asia-Europe workshop in "Concepts in Information Theory and Communications" AEW10 was held in Boppard, Germany on June 21-23, 2017. It is based on a longstanding cooperation between Asian and European scientists. The first workshop was held in Eindhoven, the Netherlands in 1989. The idea of the workshop is threefold: 1) to improve the communication between the scientists in the different p…

    Submitted 27 July, 2017; originally announced July 2017.

    Comments: 44 pages, editors for the proceedings: Yanling Chen and A. J. Han Vinck

    MSC Class: 68P30; 94A05

  32. arXiv:1607.04777  [pdf, other]

    cs.DS cs.DM math.CO

    The complexity of tropical graph homomorphisms

    Authors: Florent Foucaud, Ararat Harutyunyan, Pavol Hell, Sylvain Legay, Yannis Manoussakis, Reza Naserasr

    Abstract: A tropical graph $(H,c)$ consists of a graph $H$ and a (not necessarily proper) vertex-colouring $c$ of $H$. Given two tropical graphs $(G,c_1)$ and $(H,c)$, a homomorphism of $(G,c_1)$ to $(H,c)$ is a standard graph homomorphism of $G$ to $H$ that also preserves the vertex-colours. We initiate the study of the computational complexity of tropical graph homomorphism problems. We consider two setti…

    Submitted 30 January, 2018; v1 submitted 16 July, 2016; originally announced July 2016.

    Comments: 27 pages, 13 figures, 1 table. Compared to the published version, this version includes all proofs and some additional figures

    Journal ref: Discrete Applied Mathematics 229:64-81, 2017

  33. arXiv:1606.02647  [pdf, other]

    cs.LG cs.AI stat.ML

    Safe and Efficient Off-Policy Reinforcement Learning

    Authors: Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare

    Abstract: In this work, we take a fresh look at some old and new algorithms for off-policy, return-based reinforcement learning. Expressing these in a common form, we derive a novel algorithm, Retrace($λ$), with three desired properties: (1) it has low variance; (2) it safely uses samples collected from any behaviour policy, whatever its degree of "off-policyness"; and (3) it is efficient as it makes the be…

    Submitted 7 November, 2016; v1 submitted 8 June, 2016; originally announced June 2016.

  34. arXiv:1602.04951  [pdf, other]

    cs.AI cs.LG stat.ML

    Q($λ$) with Off-Policy Corrections

    Authors: Anna Harutyunyan, Marc G. Bellemare, Tom Stepleton, Remi Munos

    Abstract: We propose and analyze an alternate approach to off-policy multi-step temporal difference learning, in which off-policy returns are corrected with the current Q-function in terms of rewards, rather than with the target policy in terms of transition probabilities. We prove that such approximate corrections are sufficient for off-policy convergence both in policy evaluation and control, provided cer…

    Submitted 11 August, 2016; v1 submitted 16 February, 2016; originally announced February 2016.

  35. arXiv:1502.03248  [pdf, other]

    cs.AI

    Off-Policy Reward Shaping with Ensembles

    Authors: Anna Harutyunyan, Tim Brys, Peter Vrancx, Ann Nowe

    Abstract: Potential-based reward shaping (PBRS) is an effective and popular technique to speed up reinforcement learning by leveraging domain knowledge. While PBRS is proven to always preserve optimal policies, its effect on learning speed is determined by the quality of its potential function, which, in turn, depends on both the underlying heuristic and the scale. Knowing which heuristic will prove effecti…

    Submitted 23 March, 2015; v1 submitted 11 February, 2015; originally announced February 2015.

    Comments: To be presented at ALA-15. Short version to appear at AAMAS-15

  36. arXiv:1405.5358  [pdf, other]

    cs.AI cs.LG

    Off-Policy Shaping Ensembles in Reinforcement Learning

    Authors: Anna Harutyunyan, Tim Brys, Peter Vrancx, Ann Nowe

    Abstract: Recent advances in gradient temporal-difference methods allow multiple value functions to be learned off-policy in parallel without sacrificing convergence guarantees or computational efficiency. This opens up new possibilities for sound ensemble techniques in reinforcement learning. In this work we propose learning an ensemble of policies related through potential-based shaping rewards. The ensemble…

    Submitted 21 May, 2014; originally announced May 2014.

    Comments: Full version of the paper to appear in Proc. ECAI 2014

  37. arXiv:1401.4568  [pdf, ps, other]

    cs.DM math.CO

    Strong edge-colouring of sparse planar graphs

    Authors: Julien Bensmail, Ararat Harutyunyan, Hervé Hocquard, Petru Valicov

    Abstract: A strong edge-colouring of a graph is a proper edge-colouring where each colour class induces a matching. It is known that every planar graph with maximum degree $Δ$ has a strong edge-colouring with at most $4Δ+4$ colours. We show that $3Δ+1$ colours suffice if the graph has girth 6, and $4Δ$ colours suffice if $Δ\geq 7$ or the girth is at least 5. In the last part of the paper, we raise some ques…

    Submitted 21 July, 2014; v1 submitted 18 January, 2014; originally announced January 2014.

  38. arXiv:1306.5391  [pdf, other]

    cs.DS

    Boundary-to-boundary flows in planar graphs

    Authors: Glencora Borradaile, Anna Harutyunyan

    Abstract: We give an iterative algorithm for finding the maximum flow between a set of sources and sinks that lie on the boundary of a planar graph. Our algorithm uses only O(n) queries to simple data structures, achieving an O(n log n) running time that we expect to be practical given the use of simple primitives. The only existing algorithm for this problem uses divide and conquer and, in order to achieve…

    Submitted 23 June, 2013; originally announced June 2013.

    Comments: In Proc. IWOCA, 2013

  39. arXiv:1305.5823  [pdf, other]

    cs.DS cs.DM math.CO

    Maximum st-flow in directed planar graphs via shortest paths

    Authors: Glencora Borradaile, Anna Harutyunyan

    Abstract: Minimum cuts have been closely related to shortest paths in planar graphs via planar duality - so long as the graphs are undirected. Even maximum flows are closely related to shortest paths for the same reason - so long as the source and the sink are on a common face. In this paper, we give a correspondence between maximum flows and shortest paths via duality in directed planar graphs with no cons…

    Submitted 24 May, 2013; originally announced May 2013.

    Comments: 20 pages, 4 figures. Short version to be published in proceedings of IWOCA'13

  40. On Multiple Hypothesis Testing with Rejection Option

    Authors: Naira Grigoryan, Ashot Harutyunyan, Svyatoslav Voloshynovskiy, Oleksiy Koval

    Abstract: We study the problem of multiple hypothesis testing (HT) in view of a rejection option. That model of HT has many different applications. Errors in testing of M hypotheses regarding the source distribution with an option of rejecting all those hypotheses are considered. The source is discrete and arbitrarily varying (AVS). The tradeoffs among error probability exponents/reliabilities associated wi…

    Submitted 25 May, 2011; v1 submitted 17 February, 2011; originally announced February 2011.

    Comments: 5 pages, 3 figures, submitted to IEEE Information Theory Workshop 2011