Showing 1–3 of 3 results for author: Nassif, H

Searching in archive math.
  1. arXiv:2401.04265

    math.ST

    Estimation of subsidiary performance metrics under optimal policies

    Authors: Zhaoqi Li, Houssam Nassif, Alex Luedtke

    Abstract: In policy learning, the goal is typically to optimize a primary performance metric, but other subsidiary metrics often also warrant attention. This paper presents two strategies for evaluating these subsidiary metrics under a policy that is optimal for the primary one. The first relies on a novel margin condition that facilitates Wald-type inference. Under this and other regularity conditions, we…

    Submitted 8 January, 2024; originally announced January 2024.

  2. arXiv:2310.04390

    math.ST

    Experimental Designs for Heteroskedastic Variance

    Authors: Justin Weltz, Tanner Fiez, Alexander Volfovsky, Eric Laber, Blake Mason, Houssam Nassif, Lalit Jain

    Abstract: Most linear experimental design problems assume homogeneous variance although heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$ that can be probed to receive noisy linear responses of the form $y=x^{\top}\theta^{\ast}+\eta$. Here $\theta^{\ast}\in \mathbb{R}^d$ is an unknown parameter vector, and $\eta$ i…

    Submitted 6 October, 2023; originally announced October 2023.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS'23), New Orleans, pp. 65967-66005, 2023

  3. arXiv:2007.07443

    cs.LG math.OC stat.ML

    Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions

    Authors: Sinong Geng, Houssam Nassif, Carlos A. Manzanares, A. Max Reppen, Ronnie Sircar

    Abstract: We propose a reward function estimation framework for inverse reinforcement learning with deep energy-based policies. We name our method PQR, as it sequentially estimates the Policy, the $Q$-function, and the Reward function by deep learning. PQR does not assume that the reward solely depends on the state, instead it allows for a dependency on the choice of action. Moreover, PQR allows for stochas…

    Submitted 14 August, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Journal ref: In Proceedings of the 37th ICML, Vienna, Austria, PMLR 119, pp. 3431-3441, 2020