Department of Statistics Seminar
North Carolina State University
presents
Drs. Helen Zhang, Dahlia Nielsen and Daowen Zhang
North Carolina State University
A Sampler of NCSU Faculty
Statistical Machine Learning
Abstract: Machine learning (ML) is an emerging discipline concerned with
discovering patterns and relationships in (usually) large data sets. Though
ML originated outside Statistics, many problems addressed
in this field can be formulated and solved by statistical methodologies.
In addition, the unique way of statistical thinking and analysis
has endowed some successful ML paradigms with a theoretical
justification, and greatly improved their performance in practice.
Support vector machines (SVM) will be used to show how and why
statisticians can make fundamental contributions to the field
of data mining.
Finding genes by examining correlated markers
Abstract: In my 15 minutes of fame, I will give an introduction to the topic of
finding genes involved in complex human diseases by examining unrelated
individuals. A complex disease is one that has complex causes, including
combinations of genetic and environmental factors. It is assumed in
these diseases that there might be a number of different genes that
contribute to the overall risk of the disease (rather than being
individually causative). To locate these genes among the many
genes in the genome, we rely on the correlation pattern that we
expect to find along individuals' chromosomes within a population. I will
explain the basic principle behind this approach, and how it can help us
identify genes of interest.
Assessing the effects of reproductive hormone profiles on bone mineral
density using functional two-stage mixed models*
Abstract: In the Study of Women's Health Across the Nation (SWAN),
total hip bone mineral density (BMD) was measured
together with repeated measures of the levels of creatinine-adjusted
follicle stimulating hormone (FSH) collected daily in urine over
one menstrual cycle on more than 600 pre- and perimenopausal women.
It was of interest to investigate the effect of the FSH time profile
in a menstrual cycle on the total hip BMD, adjusting for age and body
mass index. The statistical analysis is challenged by several
features of the data. (1) The covariate FSH is measured longitudinally
and its effect on the scalar outcome BMD is complex. (2) Due to
varying menstrual cycle lengths, women have unbalanced longitudinal
measures of FSH. (3) The longitudinal measures of FSH are subject to
considerable among- and within-woman variations and measurement errors.
We propose a measurement error partial functional linear model, where
repeated measures of FSH are modeled using functional mixed effects models
and the effect of the FSH time profile on BMD is modeled using a partial
functional linear model by treating the unobserved true woman-specific FSH
time profile as a functional covariate. We develop a two-stage estimation
procedure using periodic smoothing splines. A key feature of our approach
is that, by using the connection between smoothing splines and mixed
models, estimation at both stages can be conveniently cast into
a unified mixed model framework. A simple test for a constant functional
covariate effect is also proposed. The proposed method is evaluated using
simulation studies and applied to the SWAN data.
*This is joint work with Xihong Lin and Mary Fran Sowers of The University of Michigan.
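For readers who prefer symbols, the two modeling stages described in the abstract above could be written roughly as follows. This is only an illustrative sketch: the notation (Y_i, Z_i, W_ij, X_i, mu, b_i, beta) and the rescaling of each menstrual cycle to the interval [0, 1] are assumptions made here, not details taken from the abstract, and the speakers' exact formulation may differ.

```latex
% Illustrative notation only (assumed, not from the abstract).
% Stage 1: observed FSH for woman i at rescaled cycle time t_{ij} in [0,1],
% modeled as a smooth periodic true profile plus measurement error.
\begin{align*}
  W_{ij} &= X_i(t_{ij}) + U_{ij}, &
  X_i(t) &= \mu(t) + b_i(t), \\
% Stage 2: partial functional linear model for total hip BMD Y_i,
% with scalar covariates Z_i (age, body mass index).
  Y_i &= Z_i^{\top}\alpha + \int_0^1 X_i(t)\,\beta(t)\,dt + \varepsilon_i .
\end{align*}
% Test for a constant functional covariate effect:
% H_0 : \beta(t) \equiv c \text{ for all } t \in [0,1].
```

In this sketch, mu(t) is a population mean FSH curve, b_i(t) a woman-specific random deviation, U_ij the measurement error, and beta(t) the time-varying effect of FSH on BMD. Under the constant-effect hypothesis, the integral reduces to c times the woman's average FSH level over the cycle.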
Friday, August 29, 2003
3:35 - 4:35 pm
206 Cox Hall
Refreshments will be served on the second floor of Dabney Hall (left of Room 222) at 3:00 pm.