Showing 1–12 of 12 results for author: Sinclair, S R

  1. arXiv:2412.01763  [pdf, other]

    math.OC cs.LG stat.ML

    The Data-Driven Censored Newsvendor Problem

    Authors: Chamsi Hssaine, Sean R. Sinclair

    Abstract: We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs based only on offline censored sales data, rather than historical demand realizations. Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this proble…

    Submitted 18 December, 2024; v1 submitted 2 December, 2024; originally announced December 2024.

    Comments: 72 pages, 9 tables, 7 figures

  2. arXiv:2409.14557  [pdf, other]

    stat.ML cs.LG math.OC

    Exploiting Exogenous Structure for Sample-Efficient Reinforcement Learning

    Authors: Jia Wan, Sean R. Sinclair, Devavrat Shah, Martin J. Wainwright

    Abstract: We study Exo-MDPs, a structured class of Markov Decision Processes (MDPs) where the state space is partitioned into exogenous and endogenous components. Exogenous states evolve stochastically, independent of the agent's actions, while endogenous states evolve deterministically based on both state components and actions. Exo-MDPs are useful for applications including inventory control, portfolio ma…

    Submitted 5 February, 2025; v1 submitted 22 September, 2024; originally announced September 2024.

    Comments: 43 pages

  3. arXiv:2408.04488  [pdf, other]

    math.OC

    Multi-Objective LQR with Linear Scalarization

    Authors: Ali Jadbabaie, Devavrat Shah, Sean R. Sinclair

    Abstract: The framework of decision-making, modeled as a Markov Decision Process (MDP), typically assumes a single objective. However, practical scenarios often involve tradeoffs between multiple objectives. We address this in the Linear Quadratic Regulator (LQR), a canonical continuous, infinite horizon MDP. First, we establish that the Pareto front for LQR is characterized by linear scalarization: a conve…

    Submitted 15 January, 2025; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: 38 pages, 2 figures

  4. arXiv:2406.02402  [pdf, other]

    math.OC cs.GT stat.ML

    Online Fair Allocation of Perishable Resources

    Authors: Siddhartha Banerjee, Chamsi Hssaine, Sean R. Sinclair

    Abstract: We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of perishable resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. The goal is to construct a sequence of allocations that…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 51 pages, 8 figures

    MSC Class: 91B32

  5. arXiv:2210.00025  [pdf, other]

    cs.LG stat.ML

    Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits

    Authors: Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu

    Abstract: Most real-world deployments of bandit algorithms exist somewhere in between the offline and online set-up, where some historical data is available upfront and additional data is collected dynamically online. How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data…

    Submitted 19 March, 2025; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: 55 pages (30 pages main paper), 9 figures

  6. arXiv:2207.06272  [pdf, other]

    cs.LG stat.ML

    Hindsight Learning for MDPs with Exogenous Inputs

    Authors: Sean R. Sinclair, Felipe Frujeri, Ching-An Cheng, Luke Marshall, Hugo Barbalho, Jingling Li, Jennifer Neville, Ishai Menache, Adith Swaminathan

    Abstract: Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes are exogenous variables outside the control of the decision-maker. We model these problems as Exo-MDPs (Markov Decision Processes with Exogenous Inputs) and design a class of data-efficient algorithms for them termed Hindsight Learning (HL). Our HL algo…

    Submitted 23 October, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: 52 pages, 6 figures

    MSC Class: 68Q32

    ACM Class: I.2.6

  7. Adaptive Discretization in Online Reinforcement Learning

    Authors: Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu

    Abstract: Discretization based approaches to solving online reinforcement learning problems have been studied extensively in practice on applications ranging from resource allocation to cache management. Two major questions in designing discretization-based algorithms are how to create the discretization and when to refine it. While there have been several experimental results investigating heuristic soluti…

    Submitted 10 October, 2022; v1 submitted 29 October, 2021; originally announced October 2021.

    Comments: 77 pages, 7 figures. arXiv admin note: text overlap with arXiv:2007.00717

    MSC Class: 68Q32

    ACM Class: I.2.6

  8. arXiv:2105.05308  [pdf, other]

    cs.GT eess.SY math.OC

    Sequential Fair Allocation: Achieving the Optimal Envy-Efficiency Tradeoff Curve

    Authors: Sean R. Sinclair, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu

    Abstract: We consider the problem of dividing limited resources to individuals arriving over $T$ rounds. Each round has a random number of individuals arrive, and individuals can be characterized by their type (i.e. preferences over the different resources). A standard notion of 'fairness' in this setting is that an allocation simultaneously satisfy envy-freeness and efficiency. The former is an individual…

    Submitted 29 September, 2022; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: 42 pages, 5 figures

    MSC Class: 91B32

  9. arXiv:2011.14382  [pdf, other]

    cs.GT eess.SY math.OC

    Sequential Fair Allocation of Limited Resources under Stochastic Demands

    Authors: Sean R. Sinclair, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu

    Abstract: We consider the problem of dividing limited resources between a set of agents arriving sequentially with unknown (stochastic) utilities. Our goal is to find a fair allocation - one that is simultaneously Pareto-efficient and envy-free. When all utilities are known upfront, the above desiderata are simultaneously achievable (and efficiently computable) for a large class of utility functions. In a s…

    Submitted 9 July, 2022; v1 submitted 29 November, 2020; originally announced November 2020.

    Comments: See arXiv:2105.05308 for an updated version. 36 pages, 6 figures

    MSC Class: 91B32

  10. arXiv:2007.00717  [pdf, other]

    cs.LG stat.ML

    Adaptive Discretization for Model-Based Reinforcement Learning

    Authors: Sean R. Sinclair, Tianyu Wang, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu

    Abstract: We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space. From a theoretical perspective we provide worst-case regret bounds for our algorithm which…

    Submitted 23 October, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: 50 pages, 7 figures

    MSC Class: 68Q32

    ACM Class: I.2.6

  11. arXiv:1910.08151  [pdf, other]

    cs.LG stat.ML

    Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

    Authors: Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu

    Abstract: We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel $Q$-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions which are frequently visited in historical trajectories, and have higher payoff e…

    Submitted 31 October, 2019; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: 46 pages, 15 figures

    MSC Class: 68Q32

    ACM Class: I.2.6

  12. arXiv:1608.02806  [pdf, ps, other]

    q-bio.TO q-bio.CB

    Normal and pathological dynamics of platelets in humans

    Authors: Gabriel P. Langlois, Morgan Craig, Antony R. Humphries, Michael C. Mackey, Joseph M. Mahaffy, Jacques Bélair, Thibault Moulin, Sean R. Sinclair, Liangliang Wang

    Abstract: We develop a comprehensive mathematical model of platelet, megakaryocyte, and thrombopoietin dynamics in humans. We show that there is a single stationary solution that can undergo a Hopf bifurcation, and use this information to investigate both normal and pathological platelet production, specifically cyclic thrombocytopenia. Carefully estimating model parameters from laboratory and clinical data…

    Submitted 26 January, 2017; v1 submitted 29 July, 2016; originally announced August 2016.

    MSC Class: 37N25; 92B99; 92C30; 37G15

    Journal ref: Journal of Mathematical Biology volume 75, pages 1411-1462 (2017)