Department of Statistics Seminar
North Carolina State University

presents

Charles (Chip) Lawrence

Wadsworth Center, Albany, NY

"A Bayesian MCMC Algorithm for Mining Protein Databases: Structural Prediction of Human Glutamate Decarboxylase"

ABSTRACT

The elucidation of a protein's structure and function from its sequence is widely recognized as one of the grand challenges of biology. The human genome project and other high-throughput sequencing projects have accelerated the rapid growth of protein and DNA databases. We describe a fully automated method, PROBE, for mining the protein database in pursuit of this challenge. This algorithm constructs a multiple sequence model, identifies a protein superfamily, and delineates conserved motifs and conserved residues. The core Markov chain Monte Carlo (MCMC) algorithm is based on Bayesian statistics and combines the favorable features of two recently described multiple sequence alignment methods, the Gibbs sampler and hidden Markov models (HMMs). Bayesian model selection procedures focus the alignment on those patterns that the sequence data indicate are conserved across the protein superfamily. We describe in some detail the application of PROBE to the prediction of the structure of human glutamate decarboxylase (GAD). GAD (EC 4.1.1.15) is the pyridoxal-5'-phosphate (pyridoxal-P)-dependent enzyme that synthesizes gamma-aminobutyric acid (GABA), the major inhibitory neurotransmitter in the vertebrate brain. Its structure is unknown, and BLAST results show that its sequence is unrelated to any others. PROBE identified six motifs shared by GAD and a superfamily of 512 proteins, including four proteins of known structure, which serve as the basis for predicting the structure of GAD. Since the sequences of these four proteins are as distant from one another as they are from GAD, they serve as a set of positive controls. The correspondence between the alignment produced by PROBE and independently derived structural alignments for these positive controls supports PROBE's prediction of the structure of the functional core of GAD.

Monday, October 19, 1998

8:00 am

Williams 2223 (McKimmon Room)

Refreshments will be served at 7:45 am.