-
Sufficient Markov Decision Processes with Alternating Deep Neural Networks
Authors:
Longshaokan Wang,
Eric B. Laber,
Katie Witkiewitz
Abstract:
Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. We propose a method for constructing a low-dimensional representation of the original decision process for which: 1. the MDP model holds; 2. a decision strategy that maximizes mean utility when applied to the low-dimensional representation also maximizes mean utility when applied to the original process. We use a deep neural network to define a class of potential process representations and estimate the process of lowest dimension within this class. The method is illustrated using data from a mobile study on heavy drinking and smoking among college students.
Submitted 17 March, 2018; v1 submitted 25 April, 2017;
originally announced April 2017.
-
A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward
Authors:
S. A. Murphy,
Y. Deng,
E. B. Laber,
H. R. Maei,
R. S. Sutton,
K. Witkiewitz
Abstract:
We develop an off-policy actor-critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. This algorithm is developed with a view towards its use in mobile health.
Submitted 18 July, 2016;
originally announced July 2016.
-
Assessing Time-Varying Causal Effect Moderation in Mobile Health
Authors:
Audrey Boruvka,
Daniel Almirall,
Katie Witkiewitz,
Susan A. Murphy
Abstract:
In mobile health interventions aimed at behavior change and maintenance, treatments are provided in real time to manage current or impending high-risk situations or promote healthy behaviors in near real time. Currently there is great scientific interest in developing data analysis approaches to guide the development of mobile interventions. In particular, data from mobile health studies might be used to examine effect moderators, i.e., individual characteristics, time-varying context, or past treatment response that moderate the effect of current treatment on a subsequent response. This paper introduces a formal definition for moderated effects in terms of potential outcomes, a definition that is particularly suited to mobile interventions, where treatment occasions are numerous, individuals are not always available for treatment, and potential moderators might be influenced by past treatment. Methods for estimating moderated effects are developed and compared. The proposed approach is illustrated using BASICS-Mobile, a smartphone-based intervention designed to curb heavy drinking and smoking among college students.
Submitted 16 August, 2016; v1 submitted 2 January, 2016;
originally announced January 2016.