Department of Statistics Seminar
North Carolina State University
presents
Dr. Mayetri Gupta
University of North Carolina
"Statistical Challenges in Discovering Gene Regulatory Modules"
ABSTRACT
Using statistical methods to understand gene regulation is one of the major scientific challenges in the post-genome era. Short segments of DNA, known as transcription factor binding sites (also called motifs), are believed to be instrumental in initiating the process of gene regulation. Hence, their identification is important for elucidating interactions within the gene regulatory network. In lower organisms, motifs are typically short, repetitive patterns of about 8-20 nucleotides occurring close to the start of a gene. In eukaryotic genomes, motif detection is a challenging problem because motifs tend to be sparsely spread over the genome, are weakly conserved, and occur in multi-pattern clusters called regulatory modules. In this talk I will describe some of our recent modeling and methodological efforts in the discovery of regulatory modules. First, we introduce a Bayesian hidden Markov model framework for sequences containing binding site clusters, with a Markovian dependence structure for both motif ordering and motif site occurrences. A fast state-space model selection algorithm, based on evolutionary Monte Carlo (Liang and Wong, 2000), is formulated that enables us to detect the likely clusters comprising the module. Simultaneously, to estimate the parameters of the model, we develop a data augmentation strategy using dynamic programming-like recursions. The performance of this methodology is illustrated by applications to bacterial, fly, and human genomes.
(Slides available)
Friday, October 1, 2004
3:35 - 4:35 pm
206 Cox Hall
Refreshments will be served on the second floor of Dabney Hall (left of Room 222) at 3:00 pm.