This page begins with a section for beginners; below that are more in-depth questions for researchers planning an MRT.

Introduction to MRTs

What is the purpose of an MRT?

The purpose of an MRT is to provide data that can be used to construct a multi-component intervention. The MRT helps researchers answer questions including whether to include a time-varying component as part of an intervention package, and in which contexts each intervention component option is most effective. For examples of the kinds of questions an MRT can be used to answer, see the HeartSteps example below. Importantly, MRTs are not confirmatory studies designed to evaluate an intervention; rather, they focus on selecting and optimizing the just-in-time adaptive intervention (JITAI) components to be delivered as part of an intervention package.

What are the elements of an MRT?

What is the role of the distal outcome in an MRT?

The distal outcome in an MRT usually corresponds to a long-term, clinical outcome such as time to relapse or average level of symptoms. One doesn’t have to measure the clinical outcome in every MRT. However, the distal outcome guides the choice of proximal outcomes to be targeted by the intervention components. The goal is that by impacting the proximal outcomes, the intervention components impact the distal outcome.

What is the relationship between the proximal and distal outcomes in an MRT?

In an MRT, intervention components target proximal outcomes that are usually either a short-term version of a distal outcome (e.g., number of steps taken the current day when the distal outcome is the average daily steps over a longer period of time) or hypothesized mediators of the distal outcome (e.g., for someone attempting to quit smoking, a relaxation exercise to reduce the proximal outcome of stress over the next 2 hours will hopefully impact the distal outcome of time to relapse).

Why are there repeated randomizations in an MRT?

Another way to ask this question is “What can you learn from the repeated randomizations that are part of an MRT?” The primary rationale for randomization is that it enhances balance in the distribution of unobserved factors across groups assigned to different treatments. This enhances the ability to assess causal effects; that is, randomization reduces alternative explanations for why the group assigned one treatment has improved outcomes as compared to a group assigned an alternate treatment. The repeated randomizations in an MRT enhance balance in the distribution of unobserved factors between participants/decision points assigned to different intervention options. Thus MRTs can provide data to help answer questions including whether or not delivering an intervention component has the desired effect on the targeted proximal outcome, and whether this effect varies with time, prior dose and the current context of the individual.

What types of intervention components might be investigated via an MRT?

The repeated randomizations in an MRT are appropriate for investigating the effects of time-varying intervention components for which the proximal effect might vary by time or current context of the individual. For example, instead of sending a reminder instructing individuals to self-monitor what they eat every day, it might be more effective and less burdensome to remind a person to self-monitor only when they have not recently self-monitored. Another example would be the delivery of a relaxation exercise via a mobile phone. A researcher might want to know in what contexts delivering the relaxation exercise is most effective, for example, whether delivering a relaxation exercise versus no delivery is more effective at times when the individual is stressed.

What types of intervention components would not be investigated with an MRT?

Some intervention components considered for inclusion in an intervention package will not require further investigation. For example, it may not be worth trial resources to investigate a component because it is known to be effective in comparison to other components and the component is not burdensome. Similarly, a component that requires negligible resources and is not burdensome to individuals might be included without being investigated.

What types of intervention components would only be randomized at baseline and not repeatedly?

Some intervention components should not be altered once provided. This might be for either scientific or ethical reasons. In this case these components would only be randomized at baseline. An example of such a component would be a health coach avatar, where it might not make sense to take away a health coach avatar once it is provided to an individual.

What is the role of the observations of an individual’s current context? In particular, what is the role of these observations in an MRT?

A first use of the current context is to inform the content of an intervention option. For example, the language in an activity message might be tailored to the participant’s current location and weather; this would be done to increase the chance that the message is useful for the participant in that context (e.g. location, weather). Thus one role of observations of an individual’s context is to tailor the content of an intervention component message. In many research settings we don’t have access to a large number of participants for our MRT. It is difficult, with small sample sizes, to detect small differences such as whether the contextually tailored activity message should be tailored to both current weather and current location versus only tailored to current weather. Thus the contextual tailoring of messages/suggestions is frequently informed by current behavioral theory, clinical experience and prior studies.

A second use of the context is to learn whether some intervention components are more effective in some contexts, i.e., moderation. An MRT can be used to provide empirical data on whether a contextual variable moderates the effectiveness of delivering an intervention component. For example, we may find that delivering a contextually tailored activity suggestion is more effective at encouraging activity than no suggestion when the weather is good. On the other hand, when the current weather is bad, it may make no difference whether we deliver a contextually tailored message or not. In this example the intervention component is the tailored activity message component and there are two intervention options: deliver versus do not deliver. Consider another intervention component: planning of physical activity for tomorrow. This component might have three options: unstructured planning, structured planning, and no planning. Here we might use an MRT to learn whether the context, such as the participant’s mood at the time of the planning, moderates the effect of the unstructured versus the structured planning in terms of the next day’s physical activity.

How are MRTs related to N-of-1 trials?

There are three key differences between MRTs and N-of-1 trials. The first is their inferential goals. MRTs are designed to provide data to test marginal causal effects. Marginal causal effects are effects that are averaged over the population (e.g., all individuals who are in recovery support), over a subset of the population (e.g., all young adults in recovery support) or over a subset of the population in a particular context (e.g., young adults in the morning on school days). The associated primary analyses, like most primary analyses in clinical trials, involve minimal assumptions. N-of-1 trials, on the other hand, are most often conducted to provide data to ascertain the most effective treatment for a particular individual. Here nuanced assumptions based on behavioral theory are used to conduct the primary analyses.

The second difference has to do with the types of interventions the trials were developed to optimize. MRTs are designed to help decide which of multiple intervention components should be included in a multi-component intervention, whereas N-of-1 trials were developed for settings in which scientists wish to compare the effect of one treatment to that of another (treatment package A versus treatment package B). Thus in the N-of-1 setting, repeated trials within an individual are usually scheduled at time points sufficiently far apart so that the assumption of no carry-over effects is valid. For example, when the individual is provided treatment A, it is taken away, and they are then provided treatment B, it is assumed that their previous exposure to treatment A does not affect their response to treatment B. If such a delayed effect might occur, the associated data analyses adjust for the carry-over effect. This makes eminent sense if the goal, as stated above, is to decide whether, for this individual, it is better to provide treatment A or treatment B.

Third, MRTs provide data to inform the sequencing of treatments (intervention options); that is, to assist in constructing decision rules that indicate in which context, how soon after prior treatment, and in what order different intervention options should be sequenced to be most effective. In N-of-1 trials the usual goal is to compare one stand-alone treatment or treatment package versus another.

What is the role of carry-over effects in an MRT?

Carry-over effects of intervention components present as moderation or delayed effects. That is, the dose of prior intervention might, due to burden/habituation, reduce the effect of an intervention component at a future decision point. A delayed effect may also simply lead to poorer future proximal outcomes at later decision points. For example, individuals may experience burden due to the intervention and thus delete the mobile application. MRTs provide data to assess such effects.

Can you provide an example of an MRT to illustrate how one works?

Example: HeartSteps version 1 MRT.

Overview: Physical activity is known to decrease the risk of several health complications, yet only one in five adults in the U.S. meet the guidelines for the number of minutes of physical activity recommended per week. Individuals can still experience health benefits if the required minutes are spread out across several days, and broken into more frequent but smaller amounts of time. The goal of HeartSteps is to develop an intervention to increase overall levels of physical activity in sedentary adults by supporting opportunistic physical activity, in which brief periods of movement or exercise are incorporated into individuals’ daily routines. HeartSteps Version 1 (v1) was a six-week MRT in which the intervention development team aimed to investigate whether contextually tailored activity suggestions, as well as support for planning how to be active, increased participants’ overall physical activity. Below we describe one of the intervention components: the contextually tailored activity suggestion component.

  1. Intervention component: Contextually tailored activity suggestion. Push notifications sent to participants’ smartphones providing a suggestion for how to be active in the current moment, with each notification tailored to the participant’s current location, weather conditions, time of day, and day of the week.
  2. Intervention options: The intervention options were: (A) a suggestion of a walking activity that took 2-5 minutes to complete, (B) a suggestion of an anti-sedentary activity (brief movements) that took 1-2 minutes to complete, or (C) no suggestion.
  3. Distal outcome: The distal outcome is the total step count during the 42-day study.
  4. Proximal outcome: Total number of steps taken in the 30 minutes following a decision point.
  5. Decision Points: There were 5 individual-specific decision points every day: before morning commute, at lunch time, mid-afternoon, after evening commute, and after dinner.
  6. Observations of context: Location, weather, time of day, day of the week, prior day’s step count, prior 30-minute step count, variation in prior 30-minute step count over past 7 days, movement, usefulness of prompt, self-reports of physical activity from prior evening.
  7. Availability: Participants were unavailable when sensors on the phone indicated that they might be operating a vehicle or were currently physically active. Participants were also unavailable if they had turned off the activity notifications.
  8. Randomization probabilities: Participants who were available at a decision point were randomized with probability 0.3 to receive (A) a contextually tailored walking activity suggestion, probability 0.3 to receive (B) an anti-sedentary activity suggestion, and probability 0.4 to receive (C) no suggestion.
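
The randomization scheme in items 5, 7 and 8 can be sketched in a few lines of Python. This is only an illustrative simulation, not the HeartSteps software: the function names, the option labels, and the availability rate `p_available` are hypothetical, and in the actual trial availability was determined by phone sensors and notification settings rather than drawn at random.

```python
import random

# Options and probabilities taken from the HeartSteps v1 description above.
OPTIONS = ["walking", "anti_sedentary", "none"]
PROBS = [0.3, 0.3, 0.4]


def randomize(available, rng):
    """Assign an intervention option at one decision point.

    Returns None when the participant is unavailable, because no
    randomization takes place at that decision point.
    """
    if not available:
        return None
    return rng.choices(OPTIONS, weights=PROBS, k=1)[0]


def simulate_participant(days=42, points_per_day=5, p_available=0.8, seed=0):
    """Simulate one participant's sequence of micro-randomizations.

    p_available is a made-up rate used only to generate example data.
    """
    rng = random.Random(seed)
    log = []
    for day in range(days):
        for point in range(points_per_day):
            available = rng.random() < p_available  # stand-in for sensor data
            log.append((day, point, available, randomize(available, rng)))
    return log


if __name__ == "__main__":
    log = simulate_participant()
    print(len(log), "decision points")
    for option in OPTIONS:
        print(option, sum(1 for *_, assigned in log if assigned == option))
```

Each participant contributes up to 42 × 5 = 210 randomizations, which is what allows effects on the proximal outcome to be estimated from a modest number of participants.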


Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., & Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology, 34(S), 1220.

Klasnja, P., Smith, S., Seewald, N. J., Lee, A., Hall, K., Luers, B., Hekler, E. B., & Murphy, S. A. (in press). Efficacy of contextually-tailored suggestions for physical activity: A micro-randomized optimization trial of HeartSteps. Annals of Behavioral Medicine.


How can MRTs answer scientific questions about the delivery of contextually tailored activity suggestions?

The HeartSteps v1 MRT focused on whether delivering these suggestions had the intended effect on the proximal outcome. In addition, because mHealth components that are delivered multiple times as individuals go about their daily lives can be burdensome, it was necessary to understand whether the effectiveness of the activity suggestions dissipated over time. The MRT was designed to address questions including:

On average across participants, does pushing the contextually tailored activity suggestion increase physical activity in the 30 minutes after the suggestion is delivered, compared to no suggestion?
If so, does the effect of the contextually tailored activity suggestion deteriorate with time (day in study)?
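
As a rough illustration of how data from such a trial relate to these two questions, the sketch below compares the mean 30-minute step count after a suggestion with the mean after no suggestion, among decision points at which the participant was available, and then repeats the comparison for early versus late study days. It is a simplified difference in means, not the analysis used by the HeartSteps team; the record fields (`day`, `available`, `suggestion`, `steps_30min`) and the toy numbers are hypothetical. A plain difference in means is reasonable here only under the assumption that the randomization probabilities are the same at every available decision point, as in the description above.

```python
from statistics import mean


def mean_difference(records):
    """Mean proximal outcome with a suggestion minus mean without one.

    Only decision points at which the participant was available are used,
    since those are the only ones that were randomized.
    """
    treated = [r["steps_30min"] for r in records
               if r["available"] and r["suggestion"]]
    control = [r["steps_30min"] for r in records
               if r["available"] and not r["suggestion"]]
    return mean(treated) - mean(control)


def effect_by_period(records, split_day):
    """Estimate the effect separately before and from split_day onward."""
    early = [r for r in records if r["day"] < split_day]
    late = [r for r in records if r["day"] >= split_day]
    return mean_difference(early), mean_difference(late)


if __name__ == "__main__":
    # Toy records invented for illustration only.
    records = [
        {"day": 0, "available": True, "suggestion": True, "steps_30min": 300},
        {"day": 0, "available": True, "suggestion": False, "steps_30min": 200},
        {"day": 1, "available": False, "suggestion": False, "steps_30min": 500},
        {"day": 30, "available": True, "suggestion": True, "steps_30min": 220},
        {"day": 30, "available": True, "suggestion": False, "steps_30min": 200},
    ]
    print(mean_difference(records))
    print(effect_by_period(records, split_day=21))
```

A smaller difference in the later period than in the earlier one would be consistent with the effect deteriorating over time, although a real analysis would also need to account for repeated measurements within participants.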


Planning an MRT

The mobile application that is being used in an MRT can include intervention components that are not being randomized. Why do this, and what are the implications?

Some components are not randomized in an MRT because previous scientific evidence has already demonstrated their effectiveness or efficiency, and/or because the cost or participant burden of including them as part of the intervention is negligible. If some components in an mHealth intervention are not randomized, and thus not experimented on as part of an MRT, the resulting data cannot provide evidence regarding whether different options of these components (e.g. on/off, high/low) impact the effectiveness of randomized components. If there are scientific questions regarding whether the inclusion of non-randomized components affects the effectiveness of the randomized components, then further study is needed to address these questions.

What are some guidelines for choosing the decision points?

Decision points are selected so they occur at times when it makes sense to potentially provide treatment. When defining decision points, a researcher should consider the following questions:


Sense2Stop: Mobile Sensor Data to Knowledge. (2014). Retrieved from (Identification No. NCT03184389)

What are some guidelines for choosing randomization probabilities?

How does one decide the length of time over which to observe the proximal outcome?

The dominant consideration is the “signal-to-noise ratio.” For each particular intervention component, a researcher needs to determine how long after delivering that component it is necessary to wait in order for a person to respond (that is, to be able to detect the “signal”, the component’s impact on the proximal outcome). If this time interval is too short, then the measure of the proximal outcome will not capture the effects of the intervention component. If this interval is too long, then the measure of the proximal outcome may include too much noise due to other things happening in the individual’s life. Determining “just the right duration” over which the proximal outcome should be measured can be based on prior data and domain expertise. For example, in HeartSteps the activity suggestions were tailored based on current location and weather, and the proximal outcome was measured in terms of step count. A five-minute duration for observing step count following a decision point would be too short, as the individual would not have enough time to respond. However, a 60-minute duration for observing step count following a decision point was thought to be too long, as the individual’s context (location, weather) may change significantly over an hour. Therefore, the research team selected 30 minutes as the duration over which the proximal outcome was to be measured.


Portions of this content and the related scientific research were funded by National Institute on Drug Abuse awards P50 DA039838 and P50 DA010075.