Abstract: Epidemiological studies frequently involve an important risk factor, say Z, that is difficult or expensive to measure. When a continuous response variable Y and a collection of covariates X are easy to obtain on a large sample of the population, researchers may be interested in using (Y, X) to identify an informative subsample on which to collect Z. A natural framework for such a study is a two-stage outcome-auxiliary-dependent sampling procedure in which the sampling scheme for the second-stage data depends on all of the data collected in the first stage. In this talk, we propose a residual-dependent sampling procedure in which residuals from a regression model involving Y and X are used as the basis for identifying subjects that are "informative" about Z, the covariate to be collected in the second stage. The talk focuses on the motivation for this procedure and on the associated estimation and inference procedures. Time permitting, implications for study design will also be summarized. This is joint work with Lynn Johnson.
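
The second-stage selection step described above might be sketched as follows. This is only an illustration under stated assumptions, not the procedure presented in the talk: the abstract does not specify the regression model or the selection rule, so the sketch assumes an ordinary least squares fit of Y on X and treats subjects whose residuals fall in either tail as "informative". The function name `select_by_residual` and the cutoffs `lower_q` and `upper_q` are hypothetical.

```python
import numpy as np

def select_by_residual(y, X, lower_q=0.10, upper_q=0.90):
    """Pick first-stage subjects whose residual lies in either tail.

    Assumptions (not from the abstract): OLS of Y on X with an intercept,
    and a tail-based rule with hypothetical quantile cutoffs.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # First-stage regression of Y on X using only the cheap measurements.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    # Subjects with extreme residuals are flagged for second-stage collection of Z.
    lo, hi = np.quantile(resid, [lower_q, upper_q])
    selected = np.flatnonzero((resid <= lo) | (resid >= hi))
    return selected, resid

# Simulated first-stage data, for illustration only.
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)
idx, resid = select_by_residual(y, X)
```

Here `idx` holds the row indices of the subjects on whom Z would then be measured. Because selection depends on the first-stage data, the second-stage analysis would need estimation methods that account for the sampling scheme, which is the topic of the talk.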