/****************************************************************** Fit a loglinear regression model to the epileptic seizure data first reported in a paper by Thall and Vail (1990). These are count data, so we use the Poisson mean/variance assumptions. This model is fitted with different working correlation matrices. We use PROC GLIMMIX. ******************************************************************/ options ls=80 ps=59 nodate; run; /****************************************************************** The data look like (first 8 records on first 2 subjects) 104 5 1 0 11 31 104 3 2 0 11 31 104 3 3 0 11 31 104 3 4 0 11 31 106 3 1 0 11 30 106 5 2 0 11 30 106 3 3 0 11 30 106 3 4 0 11 30 . . . column 1 subject column 2 number of seizures column 3 visit (1--4 biweekly visits) column 4 =0 if placebo, = 1 if progabide column 5 baseline number of seizures in 8 weeks prior to study column 6 age ******************************************************************/ data seizure; infile 'seize.dat'; input subject seize visit trt base age; /***************************************************************** Fit the loglinear regression model using PROC GLIMMIX and three different working correlation matrix assumptions: - unstructured - compound symmetry (exchangeable) - AR(1) Subject 207 has what appear to be very unusual data -- for this subject, both baseline and study-period numbers of seizures are huge, much larger than any other subject. We leave this subject in for our analysis; in some published analyses, this subject is deleted. See Diggle, Heagerty, Liang, and Zeger (2002) and Thall and Vail (1990) for more on this subject. We fit a modified model exactly like the one in Thall and Vail (1990). Define - logbase = log (base/4) - logage = log(age) - trt Here, visit4 is an indicator of the last visit and basetrt is the "interaction" term for logbase and trt, allowing the possibility that how baseline seizures affect the response may be different for treated and placebo subjects. Visit4 allows that the mean response changed over the study period, but not gradually; rather, the change only showed up toward the end of the study; The DIST=POISSON option in the model statement specifies that the Poisson requirement that mean = variance, be used. The LINK=LOG option asks for the loglinear model. Other LINK= choices are available. PROC GLIMMIX is really meant for fitting generalized linear mixed effects models, discussed in Chapter 15. But it can also fit marginal models with no random effects, which we are demonstrating here. To carry this out, the syntax for specifying the "working" correlation matrix is different that in from PROC GENMOD. Instead of using a REPEATED statement, one uses a RANDOM statement with either the RESIDUAL option or a CLASS variable that identifies ordered time points, as demonstrated below. If no ordered variable is needed, as in the compound symmetric case, one can use the keyword _RESIDUAL_ instead. The EMPIRICAL option asks for the "robust sandwich" standard errors to be computed. ******************************************************************/ data seizure; set seizure; * if subject=207 then delete; logbase=log(base/4); logage=log(age); basetrt=logbase*trt; if visit<4 then visit4=0; if visit=4 then visit4=1; run; title "UNSTRUCTURED CORRELATION"; proc glimmix data=seizure empirical; class subject visit; model seize = logbase logage trt visit4 basetrt / solution dist = poisson link = log; random visit / subject=subject type=un residual; run; title "EXCHANGEABLE (COMPOUND SYMMETRY) CORRELATION"; proc glimmix data=seizure empirical; class subject visit; model seize = logbase logage trt visit4 basetrt / solution dist = poisson link = log; random _residual_ / subject=subject type=cs; run; title "AR(1) CORRELATION"; proc glimmix data=seizure empirical; class subject visit; model seize = logbase logage trt visit4 basetrt / solution dist = poisson link = log; random visit / subject=subject type=ar(1) residual; run;