
Showing 1–4 of 4 results for author: Virochsiri, K

  1. arXiv:2102.05612

    cs.LG cs.HC cs.SE

    Personalization for Web-based Services using Offline Reinforcement Learning

    Authors: Pavlos Athanasios Apostolopoulos, Zehui Wang, Hanson Wang, Chad Zhou, Kittipat Virochsiri, Norm Zhou, Igor L. Markov

    Abstract: Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address challenges of learning such policies through model-free offline Reinforcement Learning (RL) with off-policy training. Deployed in a production system for user authentication in a major social network, it significantly improves long-term objectives. We articulate practical…

    Submitted 10 February, 2021; originally announced February 2021.

    Comments: 9 pages, 8 figures, 3 tables

    Journal ref: 2nd Offline Reinforcement Learning Workshop at NeurIPS 2021

  2. arXiv:2012.10858

    cs.LG cs.AI

    Reinforcement Learning-based Product Delivery Frequency Control

    Authors: Yang Liu, Zhengxing Chen, Kittipat Virochsiri, Juan Wang, Jiahao Wu, Feng Liang

    Abstract: Frequency control is an important problem in modern recommender systems. It dictates the delivery frequency of recommendations to maintain product quality and efficiency. For example, the frequency of delivering promotional notifications impacts daily metrics as well as the infrastructure resource consumption (e.g. CPU and memory usage). There remain open questions on what objective we should opti…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: In 35th AAAI Conference on Artificial Intelligence, February 2-9, 2021

  3. arXiv:2006.11431

    cs.LG cs.AI stat.ML

    Band-limited Soft Actor Critic Model

    Authors: Miguel Campo, Zhengxing Chen, Luke Kung, Kittipat Virochsiri, Jianyu Wang

    Abstract: Soft Actor Critic (SAC) algorithms show remarkable performance in complex simulated environments. A key element of SAC networks is entropy regularization, which prevents the SAC actor from optimizing against fine grained features, oftentimes transient, of the state-action value function. This results in better sample efficiency during early training. We take this idea one step further by artificia…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: 8 pages plus additional material

    MSC Class: 68T07; ACM Class: I.2.8

  4. arXiv:1811.00260

    cs.LG cs.AI stat.ML

    Horizon: Facebook's Open Source Applied Reinforcement Learning Platform

    Authors: Jason Gauci, Edoardo Conti, Yitao Liang, Kittipat Virochsiri, Yuchen He, Zachary Kaden, Vivek Narayanan, Xiaohui Ye, Zhengxing Chen, Scott Fujimoto

    Abstract: In this paper we present Horizon, Facebook's open source applied reinforcement learning (RL) platform. Horizon is an end-to-end platform designed to solve industry applied RL problems where datasets are large (millions to billions of observations), the feedback loop is slow (vs. a simulator), and experiments must be done with care because they don't run in a simulator. Unlike other RL platforms, w…

    Submitted 4 September, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: 10 pages