Active Measuring in Reinforcement Learning With Delayed Negative Effects

Gao, Daiqi; Xu, Ziping; Rawashdeh, Aseel; Klasnja, Predrag; Murphy, Susan A.

arXiv:2510.14315 (cs)

[Submitted on 16 Oct 2025]

Abstract: Measuring states in reinforcement learning (RL) can be costly in real-world settings and may negatively influence future outcomes. We introduce the Actively Observable Markov Decision Process (AOMDP), where an agent not only selects control actions but also decides whether to measure the latent state. The measurement action reveals the true latent state but may have a negative delayed effect on the environment. We show that this reduced uncertainty may provably improve sample efficiency and increase the value of the optimal policy despite these costs. We formulate an AOMDP as a periodic partially observable MDP and propose an online RL algorithm based on belief states. To approximate the belief states, we further propose a sequential Monte Carlo method to jointly approximate the posterior of unknown static environment parameters and unobserved latent states. We evaluate the proposed algorithm in a digital health application, where the agent decides when to deliver digital interventions and when to assess users' health status through surveys.
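
To make the role of the belief state concrete, here is a minimal sketch of one belief update for a discrete latent state, written under several simplifying assumptions. It is not the paper's algorithm: the abstract describes a sequential Monte Carlo method that also tracks unknown environment parameters, whereas this sketch assumes the model is known and uses an exact update. The function name `belief_update` and the `transition`, `emission`, and `measured_state` arguments are hypothetical, and the delayed negative effect of measuring is not modeled.

```python
from typing import List, Optional


def belief_update(
    belief: List[float],
    transition: List[List[float]],
    emission: List[float],
    measured_state: Optional[int] = None,
) -> List[float]:
    """One belief-state step for a discrete latent state (illustrative only).

    belief[s]          : current probability of latent state s
    transition[s][s2]  : assumed known P(s2 | s) under the chosen control action
    emission[s2]       : assumed likelihood of the noisy observation given s2
    measured_state     : the revealed state if the agent chose to measure, else None
    """
    n = len(belief)
    if measured_state is not None:
        # Measuring reveals the true latent state, so the belief collapses
        # to a point mass on that state.
        return [1.0 if s == measured_state else 0.0 for s in range(n)]

    # Predict: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correct: reweight by the likelihood of the noisy observation.
    unnormalized = [predicted[s2] * emission[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under the current belief")
    return [u / total for u in unnormalized]


# Hypothetical two-state example (all numbers are made up for illustration).
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [0.7, 0.3]
no_measure = belief_update([0.5, 0.5], transition, emission)
with_measure = belief_update([0.5, 0.5], transition, emission, measured_state=1)
```

Without a measurement the belief stays spread across states, while a measurement removes the uncertainty entirely. That contrast is the trade-off the abstract describes: the agent weighs the reduced uncertainty against the cost and possible delayed harm of measuring.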

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2510.14315 [cs.LG]
	(or arXiv:2510.14315v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2510.14315

Submission history

From: Daiqi Gao
[v1] Thu, 16 Oct 2025 05:21:36 UTC (701 KB)
