NC STATE University
SAS Consulting, Department of Statistics 

A clever use of the MAX statistic in PROC MEANS with binary data

This program was reprinted from SUG-L with the permission of Dave Scocca.

A user had a dataset with multiple observations (visits) for each patient ID number. Each observation included dummy variables for several conditions: 0 if the condition was not diagnosed on that visit and 1 if it was. She wanted a dataset with one observation per patient ID, with each condition variable set to 0 if the condition was never diagnosed for that patient and 1 if it was diagnosed on any visit. The solution was to use the MAX statistic of the MEANS procedure with a BY statement on the patient ID. Because the variables take only the values 0 and 1, the maximum within each BY group is 1 exactly when the condition was diagnosed on at least one visit, which yields the desired diagnosis codes for each patient. Here's a working sample:
 
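* Sample data: one observation per visit, dx1-dx3 are 0/1 flags ;
data test ;
  infile cards truncover ;
  input id $ 1-5  dx1 dx2 dx3 ;
cards ;
00319 1 0 0
00319 1 0 1
00319 0 0 1
00452 0 1 0
00452 0 1 0
00472 1 0 0
00472 0 1 0
;

* BY-group processing requires the data to be sorted by id ;
proc sort data=test ;
  by id ;
run ;

* MAX= with no names keeps the original variable names in the output ;
proc means data=test noprint ;
  by id ;
  var dx1 dx2 dx3 ;
  output out=results max= ;
run ;

* One observation per id; _TYPE_ and _FREQ_ are added automatically ;
proc print data=results ;
run ;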

(The CLASS statement would produce a similar output dataset without requiring a sort; however, the first observation would summarize the entire dataset and would need to be dropped to produce the desired dataset of patients.) The moral of the story? Statistics like MAX make PROC MEANS useful in circumstances where you might otherwise be tempted to write strange and tangled loops through DATA steps.
Dave

* Dave Scocca              SAS for the Macintosh Resource Page *
* dave_scocca@unc.edu           http://metalab.unc.edu/sasmac/ *
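
The CLASS alternative mentioned in the note above can be sketched as follows. This is not part of the original post: it assumes the TEST dataset from the sample, uses the NWAY option of PROC MEANS so that the overall (whole-dataset) observation is never written, and the output name RESULTS_NWAY is a made-up name for illustration.

```sas
/* Sketch only: CLASS instead of BY, so no PROC SORT is needed.    */
/* NWAY keeps just the per-id rows and omits the overall summary.  */
proc means data=test noprint nway ;
  class id ;
  var dx1 dx2 dx3 ;
  /* results_nway is a hypothetical name; the DROP= option removes */
  /* the automatic _TYPE_ and _FREQ_ variables from the output.    */
  output out=results_nway (drop=_type_ _freq_) max= ;
run ;

proc print data=results_nway ;
run ;
```

With the sample data, this should give one row per patient: 00319 with dx1-dx3 of 1 0 1, 00452 with 0 1 0, and 00472 with 1 1 0.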

Maintained by: Sandy Donaghy and Joy Smith
Last Modified: Friday, 24-Sep-1999 12:58:09 EDT
Filename: /working_groups/sas/samples/base/mean1.html