Stat234: Sequential Decision Making



1.   Course Description


This graduate course will focus on reinforcement learning algorithms and sequential decision making methods, with special attention to how these methods can be used in mobile health.  Reinforcement learning is the area of machine learning concerned with sequential decision making.  We will focus on the areas of sequential decision making that concern both how to select optimal actions and how to evaluate the impact of these actions.  The choice of action is operationalized via a policy.  A policy is a deterministic (or stochastic) mapping from the data available at each time t into the set of actions (or, in the stochastic case, into the probability distributions over that set).  We will consider both off-line and on-line methods for learning good policies.
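To make the definition of a policy concrete, here is a minimal sketch in Python.  It is only an illustration under assumptions that are not part of the course material: the softmax form, the function names and the weight vectors are all hypothetical.  A stochastic policy maps the available data (here a numeric context vector) to a probability distribution over actions; a deterministic policy maps the same data to a single action.

```python
import math
import random


def softmax_policy(context, weights):
    """Stochastic policy: map the data at time t to action probabilities.

    `context` is a list of numbers summarizing the available data.
    `weights` holds one (hypothetical) weight vector per action.
    """
    scores = [sum(w * x for w, x in zip(action_weights, context))
              for action_weights in weights]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Draw one action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def greedy_policy(context, weights):
    """Deterministic policy: always return the highest-probability action."""
    probs = softmax_policy(context, weights)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[1.0, 0.0], [0.0, 1.0]]  # two actions, two context features
    context = [2.0, 0.5]
    probs = softmax_policy(context, weights)
    print(probs, sample_action(probs, rng), greedy_policy(context, weights))
```

The stochastic version returns probabilities rather than an action, which is what allows the randomization probabilities to be recorded alongside the data.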


Mobile health is an area that lies within multiple scientific disciplines, including statistical science, computer science, behavioral science and cognitive neuroscience. This makes for very exciting interdisciplinary science! Smartphones and wearable devices have remarkable sensing capabilities, allowing us to understand the context a person is in at a given moment. These devices can also deliver treatment actions tailored to the specific needs of users in a given location at a given time. Figuring out when, and in which context, to deliver which treatment actions can help people achieve their longer-term health goals.  In the last 15-20 minutes of many of the classes we will brainstorm about how the methods discussed during that class might be useful in mobile health.


This course will cover the following topics:  Markov Decision Processes, on-policy and off-policy RL, least squares methods in RL and Bayesian RL, namely posterior sampling.  Most of the course will focus on Bayesian RL via posterior sampling.  This is particularly useful in mobile health as posterior sampling facilitates off-policy and continual learning.  The Bayesian paradigm also facilitates the use of prior data in initializing an RL algorithm.  Other topics from statistics, machine learning and RL that I think are potentially important in mobile health but that we won’t cover (you could consider them for your class project) include: 1) transfer learning (using data on other similar users to enable faster learning); 2) non-stationarity (dealing with slow or abrupt changes in user behavior); 3) interpretability of policies (enabling communication with behavioral scientists by making connections to behavioral theories); 4) using approximate system dynamic models to speed up learning; 5) hierarchical RL; 6) experience replay; and 7) multi-task learning.
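As a rough illustration of posterior sampling, the sketch below runs Thompson sampling on a Bernoulli bandit in Python.  This is a simplified setting chosen for brevity, not the course's own algorithm: the Beta(1, 1) prior, the function names and the simulated success rates are all assumptions made for the example.  At each step the algorithm draws one sample per action from the current posterior over that action's success rate and takes the action with the largest draw.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample each action's success rate from its Beta posterior
    (assumed Beta(1, 1) prior) and return the action with the largest draw."""
    draws = [rng.betavariate(1 + s, 1 + f)
             for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_bandit(true_rates, n_steps, seed=0):
    """Simulate posterior sampling against hypothetical true success rates."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_steps):
        action = thompson_step(successes, failures, rng)
        if rng.random() < true_rates[action]:  # simulated binary reward
            successes[action] += 1
        else:
            failures[action] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.1, 0.9], n_steps=2000)
    pulls = [a + b for a, b in zip(s, f)]
    print("times each action was chosen:", pulls)
```

Starting the success and failure counts from values derived from earlier data, rather than from zero, is one simple way that prior data could initialize such an algorithm.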


Harvard College/Graduate School of Arts and Sciences: 205213

Term: Spring 2018-2019

Location: Science Ctr 309A

Meeting Time: Tuesday 10:30 AM - 11:45 AM; Thursday 10:30 AM - 11:45 AM


Our course website is

See Canvas for scribing templates




2.   Draft Course Outline


3.   Grading



4.   Teaching


Susan Murphy is Professor of Statistics and Computer Science and Radcliffe Alumnae Professor at the Radcliffe Institute, Harvard University.  She earned her Ph.D. at the University of North Carolina, Chapel Hill.  She is a MacArthur Fellow, a Fellow of the National Academy of Sciences and a Fellow of the National Academy of Medicine.  Her work is focused on sequential decision making, causal inference and new ways of carrying out sequential experimentation, with a particular focus on mobile health.  Her office hours are by appointment, 5-6pm on Mondays and 2:45-3:30pm on Tuesdays, in Science Center 316.05.  Her email address is







5.   Statistical Reinforcement Learning Lab


Members of the Statistical Reinforcement Learning Lab will be giving or participating in some of the lectures.  They include:



Walter Dempsey.  Walter received his Ph.D. in Statistics at the University of Chicago.  His email address is







Peng Liao.  Peng is a fifth-year Ph.D. student at the University of Michigan and an Associate in the Department of Statistics at Harvard University.  His email address is







Celine Liang is a junior at Harvard concentrating in Statistics and Math.  Her email address is









Marianne Menictas.  Marianne received her Ph.D. in Statistics at the University of Technology, Sydney.  Marianne has worked as a data scientist at a number of companies.  Her email address is






Tianchen Qian.  Tianchen received his Ph.D. in Biostatistics from Johns Hopkins University.  His email address is








Mashfiqui (Mash) Rabbi.  Mash received his Ph.D. in Information Science at Cornell University.  His email address is







Sabina Tomkins.  Sabina received her Ph.D. in Computer Science at the University of California, Santa Cruz.  Her email address is






Serena Yeung.   Serena received her Ph.D. in Computer Science at Stanford University.   She is an Assistant Professor of Biomedical Data Science and Electrical Engineering at Stanford University and is visiting us this year.  Her email address is