Department of Statistics Seminar
North Carolina State University


presents

Ji Zhu
University of Michigan


Variable Selection with Structure Constraints

ABSTRACT
Variable selection is an essential component of modern data analysis. Starting with a large number of variables, possibly larger than the number of observations, the aim is to determine a smaller subset that exhibits the strongest effects. Variable selection has been studied extensively in the literature, but classical methods treat the predictor variables "flatly," supposing all subsets of variables to be equally suitable for use in a multivariate model. In most science and engineering applications, however, measurements are structured in one or more ways. Incorporating such structural information into the modeling procedure poses interesting and challenging questions.

In this talk, I plan to consider the following two types of model structure:

1. Heredity structure. Interpreting regression models with interaction terms often requires that the corresponding main effects also be considered. Through the use of such heredity constraints, we develop new variable selection methods for fitting a predictive model while simultaneously identifying important interaction terms. Such techniques are likely to be important in, for example, the study of complex diseases such as cancer, which involve multiple genetic and environmental risk factors and where scientists are particularly interested in the interactions among those factors.

2. Grouping structure. In many engineering and scientific applications, input variables are grouped; for example, in biological applications, assayed genes or proteins can be grouped by biological role. Common statistical analysis methods such as ANOVA, factor analysis, and functional modeling with partially ordered basis sets also exhibit natural variable groupings. We develop variable selection techniques that respect these group constraints. Our new methods enjoy benefits that existing successful methods do not have, while offering the potential for achieving a theoretical "oracle" property.

Friday, March 30, 2007
3:35 - 4:35 pm
206 Cox Hall

Refreshments will be served on the second floor of Dabney Hall (left of Room 222) at 3:00 pm.