A clever use of the MAX statistic in PROC MEANS with binary data
This program was reprinted from SUG-L with the permission of Dave Scocca.
A user had a dataset with multiple observations (visits) for each patient ID
number. An observation included dummy variables for several different
conditions--0 if the condition was not diagnosed on that visit and 1 if it was
diagnosed.
She wanted a dataset with one observation per patient ID, with each condition
variable set to 0 if the condition was never diagnosed for that patient and to
1 if it was diagnosed on any of the multiple visits.
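The solution was to request the MAX statistic from the MEANS procedure with a
BY statement on the patient ID. Because each dummy variable takes only the
values 0 and 1, its maximum across a patient's visits is 1 if the condition was
diagnosed on any visit and 0 otherwise, which is exactly the desired diagnosis
code for each patient.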
Here's a working sample:
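/* One observation per visit: patient ID plus three 0/1 diagnosis flags. */
/* The data lines stay in column 1 because id is read from columns 1-5.  */
data test ;
   infile cards truncover ;
   input id $ 1-5 dx1 dx2 dx3 ;
   cards ;
00319 1 0 0
00319 1 0 1
00319 0 0 1
00452 0 1 0
00452 0 1 0
00472 1 0 0
00472 0 1 0
;

/* BY-group processing requires the data to be sorted by id. */
proc sort data=test ;
   by id ;
run ;

/* MAX= with no names stores each maximum under the original variable name. */
proc means data=test noprint ;
   by id ;
   var dx1 dx2 dx3 ;
   output out=results max= ;
run ;

proc print data=results ;
run ;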
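The output dataset from PROC MEANS also carries the automatic variables _TYPE_
and _FREQ_. If only the ID and the diagnosis flags are wanted, a DROP= dataset
option on OUT= is one way to remove them. This is a minimal sketch, assuming
the test dataset from the sample above has already been created and sorted; the
name results_clean is made up for illustration.

```sas
/* Assumes work.test exists and is sorted by id (see the sample above). */
/* results_clean is a hypothetical dataset name.                        */
proc means data=test noprint ;
   by id ;
   var dx1 dx2 dx3 ;
   /* Drop the automatic _TYPE_ and _FREQ_ variables from the output. */
   output out=results_clean (drop=_type_ _freq_) max= ;
run ;

proc print data=results_clean ;
run ;
```

With the sample data, this should leave three observations: 00319 with flags
1 0 1, 00452 with 0 1 0, and 00472 with 1 1 0.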
(The CLASS statement would produce a similar output dataset without requiring a
sort; however, the first observation would summarize the entire dataset and
would need to be dropped to produce the desired dataset of patients.)
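One way to avoid that extra observation, not part of the original post, is the
NWAY option of PROC MEANS, which keeps only the output rows for the full
combination of CLASS variables. The sketch below assumes the test dataset from
the sample above; results_class is a made-up name.

```sas
/* Assumes work.test exists (see the sample above); no sort is needed. */
/* results_class is a hypothetical dataset name.                       */
proc means data=test noprint nway ;
   class id ;
   var dx1 dx2 dx3 ;
   /* NWAY suppresses the overall (_TYPE_=0) summary observation. */
   output out=results_class max= ;
run ;

proc print data=results_class ;
run ;
```

Without NWAY, the same effect could be had by filtering the output dataset on
_TYPE_, for example with a WHERE= dataset option.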
The moral of the story? Statistics like MAX make PROC MEANS useful in
circumstances where you might otherwise be tempted to write strange and tangled
loops through data steps.
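For comparison, here is a rough sketch of what a hand-written data step version
might look like, using BY-group processing with FIRST. and LAST. variables. It
is an illustration only, not code from the original post; it assumes the test
dataset from the sample above is already sorted by id, and the names results_ds
and any1-any3 are invented for the example.

```sas
/* Assumes work.test exists and is sorted by id (see the sample above). */
/* results_ds and any1-any3 are hypothetical names.                     */
data results_ds (drop=dx1-dx3 rename=(any1=dx1 any2=dx2 any3=dx3)) ;
   set test ;
   by id ;
   retain any1 any2 any3 ;
   /* Reset the running flags at the start of each patient. */
   if first.id then do ;
      any1 = 0 ;
      any2 = 0 ;
      any3 = 0 ;
   end ;
   /* Carry forward a 1 once any visit has recorded the condition. */
   any1 = max(any1, dx1) ;
   any2 = max(any2, dx2) ;
   any3 = max(any3, dx3) ;
   /* Write one observation per patient, after the last visit. */
   if last.id then output ;
run ;
```

The PROC MEANS version expresses the same result in a single OUTPUT statement,
which is the point of the moral above.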
Dave
* Dave Scocca SAS for the Macintosh Resource Page *
* dave_scocca@unc.edu http://metalab.unc.edu/sasmac/ *
Maintained by: Sandy Donaghy and Joy Smith
Last Modified: Friday, 24-Sep-1999 12:58:09 EDT
Filename: /working_groups/sas/samples/base/mean1.html