/****************************************************************** Fit a loglinear regression model to the epileptic seizure data first reported in a paper by Thall and Vail (1990). These are count data, so we use the Poisson mean/variance assumptions. This model is fitted with different working correlation matrices. ******************************************************************/ options ls=80 ps=59 nodate; run; /****************************************************************** The data look like (first 8 records on first 2 subjects) 104 5 1 0 11 31 104 3 2 0 11 31 104 3 3 0 11 31 104 3 4 0 11 31 106 3 1 0 11 30 106 5 2 0 11 30 106 3 3 0 11 30 106 3 4 0 11 30 . . . column 1 subject column 2 number of seizures column 3 visit (1--4 biweekly visits) column 4 =0 if placebo, = 1 if progabide column 5 baseline number of seizures in 8 weeks prior to study column 6 age ******************************************************************/ data seizure; infile 'seize.dat'; input subject seize visit trt base age; run; /***************************************************************** Fit the loglinear regression model using PROC GENMOD and three different working correlation matrix assumptions: - unstructured - compound symmetry (exchangeable) - AR(1) Subject 207 has what appear to be very unusual data -- for this subject, both baseline and study-period numbers of seizures are huge, much larger than any other subject. We leave this subject in for our analysis; in some published analyses, this subject is deleted. See Diggle, Heagerty, Liang, and Zeger (2002) and Thall and Vail (1990) for more on this subject. We fit a modified model exactly like the one in Thall and Vail (1990). Define - logbase = log (base/4) - logage = log(age) - trt Here, visit4 is an indicator of the last visit and basetrt is the "interaction" term for logbase and trt, allowing the possibility that how baseline seizures affect the response may be different for treated and placebo subjects. Visit4 allows that the mean response changed over the study period, but not gradually; rather, the change only showed up toward the end of the study; The DIST=POISSON option in the model statement specifies that the Poisson requirement that mean = variance, be used. The LINK=LOG option asks for the loglinear model. Other LINK= choices are available. The REPEATED statement specifies the "working" correlation structure to be assumed. The CORRW option in the REPEATED statement prints out the estimated working correlation matrix under the assumption given in the TYPE= option. The MODELSE option specifies that the standard error estimates printed for the elements of betahat are based on assuming the correlation matrix is correctly specified. By default, the ones based on the "robust" version of the sampling covariance matrix are printed, too. The dispersion parameter is estimated rather then being held fixed at 1 -- this allows for the possibility of "overdispersion" The V6CORR option asks that the estimator be computed using the method discussed in the notes. ******************************************************************/ data seizure; set seizure; * if subject=207 then delete; logbase=log(base/4); logage=log(age); basetrt=logbase*trt; if visit<4 then visit4=0; if visit=4 then visit4=1; run; title "UNSTRUCTURED CORRELATION"; proc genmod data=seizure; class subject; model seize = logbase logage trt visit4 basetrt / dist = poisson link = log; repeated subject=subject / type=un corrw modelse v6corr; run; title "EXCHANGEABLE (COMPOUND SYMMETRY) CORRELATION"; proc genmod data=seizure; class subject; model seize = logbase logage trt visit4 basetrt / dist = poisson link = log; repeated subject=subject / type=cs corrw modelse v6corr; run; title "AR(1) CORRELATION"; proc genmod data=seizure; class subject; model seize = logbase logage trt visit4 basetrt / dist = poisson link = log; repeated subject=subject / type=ar(1) corrw modelse v6corr; run;