Causal Inference in a Longitudinal Study

Butch Tsiatis

By its very nature, the notion of "cause and effect," even in the most basic of settings, involves consideration of time, as a cause must precede its effect. In the study of many diseases, consideration of time takes on an added dimension: treatments themselves may be used over time in different ways, and data, both responses and related variables (covariates), may be collected longitudinally with the goal of understanding differences among such time-varying exposures. For example, a study may be undertaken to assess the relative merits of bypass surgery versus standard medical treatment (drug therapy) in patients with heart disease, focused on the question "which patients (if any) should be treated surgically, and when during the course of their disease should this occur in order to improve their chances of survival?" Controlled experiments investigating time-varying treatments are difficult to implement, as ethical and logistical considerations generally prevent investigators from dictating how a patient's care should progress. Consequently, longitudinal observational studies are the main source of information for investigating time-varying treatment strategies. The data consist of measurements on the response of interest, the time-varying treatment, and additional time-varying covariates.

From previous talks, we know that causal inference from observational data is complicated by the fact that the relationship between treatment and response may be distorted by other variables, known as "confounders," and that adjusting for confounders is a standard approach to establishing causal effects. However, such adjustments require careful consideration. Strongly ignorable treatment assignment must be thought to hold, yet adjusting for variables that are themselves affected by treatment, and so may lie on the causal pathway between treatment and response, may compromise such inference. In longitudinal studies, this issue is critical: a time-varying covariate may be both a confounder and part of the causal pathway, so appropriate adjustment becomes complicated. Standard "time-dependent covariate" adjustments do not address the issue of causal effects appropriately.

In this lecture, we extend the idea of counterfactuals, presented previously, to the case of time-varying treatments and define the notion of a generalized treatment regime. We discuss the extension of the strong ignorability assumption to the longitudinal setting, the so-called "sequential randomization" assumption. Under this assumption, we show that the causal effect of a time-varying treatment may be identified from longitudinal observational data. We demonstrate the use of causal graphs, introduced by James Robins, which aid in visualizing the complex considerations involved, and describe Robins's method of G-estimation for estimating the causal effect of a generalized treatment regime.
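
For orientation, the sequential randomization assumption and the identification result it supports can be written out as below. This is a sketch in notation that is assumed here and does not come from the lecture: treatment A_k and covariates L_k at times k = 0, ..., K, overbars for histories up to the indicated time, and Y^{\bar a} for the counterfactual response under the fixed treatment sequence \bar a. The second display is the g-computation formula for a fixed (non-dynamic) regime with discrete covariates. It is shown only to illustrate identification and is not the G-estimation procedure itself. It also relies on consistency and positivity assumptions that are not spelled out above.

```latex
% Sequential randomization (assumed notation): at each time k, treatment is
% independent of the counterfactual response given the observed past.
Y^{\bar a} \;\perp\; A_k \;\Big|\; \bar L_k,\; \bar A_{k-1} = \bar a_{k-1},
\qquad k = 0, \dots, K .

% Identification of the mean counterfactual response from observed data
% (g-computation formula; fixed regime \bar a, discrete covariates).
E\!\left[ Y^{\bar a} \right]
  = \sum_{\bar l}
    E\!\left[ Y \,\middle|\, \bar A_K = \bar a,\; \bar L_K = \bar l \right]
    \prod_{k=0}^{K}
    P\!\left( L_k = l_k \,\middle|\, \bar L_{k-1} = \bar l_{k-1},\;
              \bar A_{k-1} = \bar a_{k-1} \right) .
```

The formula averages over covariate histories using their distribution under the regime \bar a, rather than conditioning on the covariates in a single regression model. This is one way to see why standard time-dependent covariate adjustment can fail when the covariates are themselves affected by earlier treatment.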

The methods described in this lecture may be very difficult to implement in practice. The main message of the presentation is therefore the conceptual framework needed to pose causal questions in the context of complex longitudinal data.
