
Showing 1–12 of 12 results for author: Gattami, A

Searching in archive cs.
  1. arXiv:2402.19212  [pdf, ps, other]

    math.OC cs.LG

    Deep Reinforcement Learning: A Convex Optimization Approach

    Authors: Ather Gattami

    Abstract: In this paper, we consider reinforcement learning of nonlinear systems with continuous state and action spaces. We present an episodic learning algorithm, where we for each episode use convex optimization to find a two-layer neural network approximation of the optimal $Q$-function. The convex optimization approach guarantees that the weights calculated at each episode are optimal, with respect to…

    Submitted 24 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  2. arXiv:2301.11802  [pdf, ps, other]

    cs.LG cs.GT

    Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds

    Authors: Johan Östman, Ather Gattami, Daniel Gillblad

    Abstract: We consider a decentralized multiplayer game, played over $T$ rounds, with a leader-follower hierarchy described by a directed acyclic graph. For each round, the graph structure dictates the order of the players and how players observe the actions of one another. By the end of each round, all players receive a joint bandit-reward based on their joint action that is used to update the player strate…

    Submitted 27 January, 2023; originally announced January 2023.

  3. arXiv:2006.05961   

    cs.LG cs.NI eess.SY math.OC stat.ML

    Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints

    Authors: Qinbo Bai, Vaneet Aggarwal, Ather Gattami

    Abstract: In the optimization of dynamical systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (CMDP). This paper considers a model-free approach to the problem, where the transition probabilities are not known. In the presence of long-term (or average) constraints, the agent has to choose a policy that maximizes the long-term average rewa…

    Submitted 30 January, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: The result has error

  4. arXiv:2003.05555  [pdf, other]

    math.OC cs.LG eess.SY stat.ML

    Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints

    Authors: Qinbo Bai, Vaneet Aggarwal, Ather Gattami

    Abstract: In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent chooses the policy to maximize total reward in the finite horizon as well as satisfy constraints at each epoch with probability 1. We propose a model…

    Submitted 13 June, 2022; v1 submitted 11 March, 2020; originally announced March 2020.

  5. arXiv:2002.07638  [pdf, other]

    cs.LG stat.ML

    Conditional Mutual information-based Contrastive Loss for Financial Time Series Forecasting

    Authors: Hanwei Wu, Ather Gattami, Markus Flierl

    Abstract: We present a representation learning framework for financial time series forecasting. One challenge of using deep learning models for finance forecasting is the shortage of available training data when using small datasets. Direct trend classification using deep neural networks trained on small datasets is susceptible to the overfitting problem. In this paper, we propose to first learn compact rep…

    Submitted 7 May, 2021; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: Published in ICAIF 2020 : ACM International Conference on AI in Finance

  6. arXiv:1901.07839  [pdf, ps, other]

    math.OC cs.LG

    Reinforcement Learning of Markov Decision Processes with Peak Constraints

    Authors: Ather Gattami

    Abstract: In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take actions based on the observed states, reward outputs, and constraint-outputs, without any knowledge about the dynamics, reward functions, and/or the knowledge o…

    Submitted 6 December, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

  7. arXiv:1605.04579  [pdf, other]

    cs.IT

    Communicating One Bit over a Delay Constrained Gaussian MIMO Channel with Feedback

    Authors: Bo Bernhardsson, Ather Gattami

    Abstract: The energy-optimal scheme is found for communicating one bit over a memoryless Gaussian channel with an ideal feedback channel. It is assumed that the channel is allowed to be used at most N times before decoding. The optimal coding/decoding strategy is derived by dynamic programming. It is found that feedback gives a significant performance gain and that the optimal strategies are discontinuous.…

    Submitted 15 May, 2016; originally announced May 2016.

    Comments: Submitted for publication

  8. arXiv:1511.06866  [pdf, other]

    cs.IT math.OC

    Feedback Capacity of Gaussian Channels Revisited

    Authors: Ather Gattami

    Abstract: In this paper, we revisit the problem of finding the average capacity of the Gaussian feedback channel. First, we consider the problem of finding the average capacity of the analog Gaussian noise channel where the noise has an arbitrary spectral density. We introduce a new approach to the problem where we solve the problem over a finite number of transmissions and then consider the limit of an inf…

    Submitted 23 January, 2019; v1 submitted 21 November, 2015; originally announced November 2015.

  9. arXiv:1506.00484  [pdf, other]

    cs.IT

    Optimal Communication of States of Dynamical Systems over Gaussian Channels with Noisy Feedback: The Scalar Case

    Authors: Ather Gattami

    Abstract: We consider the problem of communicating the state of a dynamical system via a Shannon Gaussian channel. The receiver, which acts as both a decoder and estimator, observes the noisy measurement of the channel output and makes an optimal estimate of the state of the dynamical system in the minimum mean square sense. Noisy feedback from the receiver to the transmitter is present. The transmitter obs…

    Submitted 1 June, 2015; originally announced June 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1404.4350

  10. arXiv:1505.03309  [pdf, other]

    cs.IT

    Time Localization and Capacity of Faster-Than-Nyquist Signaling

    Authors: Ather Gattami, Emil Ringh, Johan Karlsson

    Abstract: In this paper, we consider communication over the bandwidth limited analog white Gaussian noise channel using non-orthogonal pulses. In particular, we consider non-orthogonal transmission by signaling samples at a rate higher than the Nyquist rate. Using the faster-than-Nyquist (FTN) framework, Mazo showed that one may transmit symbols carried by sinc pulses at a higher rate than that dictated by…

    Submitted 7 December, 2015; v1 submitted 13 May, 2015; originally announced May 2015.

  11. arXiv:1505.02997  [pdf, other]

    cs.IT

    Optimal Data and Training Symbol Ratio for Communication over Uncertain Channels

    Authors: Ather Gattami

    Abstract: We consider the problem of determining the power ratio between the training symbols and data symbols in order to maximize the channel capacity for transmission over uncertain channels with a channel estimate available at both the transmitter and receiver. The receiver makes an estimate of the channel by using a known sequence of training symbols. This channel estimate is then transmitted back to t…

    Submitted 12 May, 2015; originally announced May 2015.

  12. arXiv:1404.4350  [pdf, other]

    cs.IT math.OC

    Kalman meets Shannon

    Authors: Ather Gattami

    Abstract: We consider the problem of communicating the state of a dynamical system via a Shannon Gaussian channel. The receiver, which acts as both a decoder and estimator, observes the noisy measurement of the channel output and makes an optimal estimate of the state of the dynamical system in the minimum mean square sense. The transmitter observes a possibly noisy measurement of the state of the dynamical…

    Submitted 12 May, 2015; v1 submitted 16 April, 2014; originally announced April 2014.