Quiz 2    ST 445    29 March 2012    CLOSED BOOK AND NOTES

NAME ____________________________

For some of the questions on this quiz, I am asking what the output will be from the SAS code.

*** For each dataset created, be sure to give the number of observations and the number of variables. ***

Note that the line numbers are given with the code, and remember that there's a blank column between the line numbers' field and any code or data.

1. a) How many observations, variables?

00001 data a ;
00002    retain z 1 ; * initialize to 1 ;
00003    input x y ;
00004    if( x > 3 ) then do ;
00005       z = x + _n_ ;
00006       put 'problem ' z ;
00007    end ;
00008    output ;
00009    z = 1 ; * reset ;
00010 cards ;
00011 3 y
00012 8 5
00013 1 4
00014 10 3
00015 ;
00016 run ;
00017 proc print data=a ;
00018    title 'first dataset' ;
00019 run ;

b) What is the output from this SAS program?

c) What would the "put" statement produce and where would you see its results?


Quiz 2    ST 445    29 March 2012    CLOSED BOOK AND NOTES

NAME ____________________________

1. a) How many observations, variables?

00001 data a ;
00002    retain z 0 ; * initialize to 0 ;
00003    input x y ;
00004    if( x > 5 ) then do ;
00005       z = x + _n_ ;
00006       put 'problem ' z ;
00007    end ;
00008    output ;
00009    z = 0 ; * reset ;
00010 cards ;
00011 6 y
00012 1 4
00013 10 3
00014 ;
00015 run ;
00016 proc print data=a ;
00017    title 'first dataset' ;
00018 run ;

b) What is the output from this SAS program?

c) What would the "put" statement produce and where would you see its results?

2. Consider the following code:

00001 data feet big ;
00002    input iq gender $ shoesize ;
00003    if( shoesize > 11 ) then output big ;
00004    if( gender = 'x' ) then delete ;
00005    output feet ;
00006 datalines ;
00007 105 f 8.5
00008 98 f 7.0
00009 114 x 13.5
00010 83 m 6.0
00011 92 m 10.5
00012 128 m 11.5
00013 ;
00014 run ;
00015 proc means data=feet max ;
00016    var shoesize ;
00017    title 'Shoe size and Intelligence' ;
00018 run ;

a) How many observations and how many variables are in FEET?

b) How many observations and how many variables are in BIG?

c) What is the output from PROC MEANS?

3. a) How many observations, variables?

00001 data one ;
00002    input c ;
00003    f = 9*c/5 + 32 ;
00004    drop c ;
00005 cards ;
00006 15 cool
00007 30 hot
00008 10 cold
00009 ;
00010 proc print data=one ;
00011    title 'Fahrenheit' ;
00012 run;

b) What is the output from this SAS program?

2. Consider the following code:

00001 data feet small ;
00002    input iq gender $ shoesize ;
00003    if( shoesize < 7 ) then output small ;
00004    if( gender = 'x' ) then delete ;
00005    output feet ;
00006 datalines ;
00007 105 f 8.5
00008 98 f 6.5
00009 114 x 13.5
00010 83 m 6.0
00011 92 m 10.5
00012 128 m 11.5
00013 ;
00014 run ;
00015 proc means data=feet max ;
00016    var shoesize ;
00017    title 'Shoe size and Intelligence' ;
00018 run ;

a) How many observations and how many variables are in FEET?

b) How many observations and how many variables are in SMALL?

c) What is the output from PROC MEANS?

3. a) How many observations, variables?

00001 data one ;
00002    input f ;
00003    c = 5*(f-32)/9 ;
00004    drop f ;
00005 cards ;
00006 59 cool
00007 41 cold
00008 86 hot
00009 50 cool
00010 ;
00011 proc print data=one ;
00012    title 'Celsius' ;
00013 run;

b) What is the output from this SAS program?

2. a) How many observations, variables?

b) What is the output from this SAS program?
00001 data dowj ;
00002    array indust(4) open high low close ;
00003    infile 'djave.dat' firstobs=2 ;
00004    input day open high low close change type $ ;
00005    if( type ne 'INDUST' ) then delete ;
00006    do k = 1 to 4 by 1 ; * open,high,low,close ;
00007       djave = indust(k) ;
00008       output ;
00009    end ;
00010    keep day djave k ; * only these ;
00011 run ;
00012 proc print data=dowj (obs=5) ;
00013    title 'Data courtesy of Brad Lagle' ;
00014    title2 'Dow-Jones averages' ;
00015 run ;

The file 'djave.dat' looks like:

day open    high    low     close   change index
21  6839.26 6855.80 6808.08 6843.87 +4.47  INDUST
21  2314.54 2323.79 2303.14 2317.63 +5.40  TRANSP
21  239.17  239.25  237.98  238.26  -0.98  UTILIT
21  2122.09 2126.67 2113.25 2122.75 +2.32  STOCKS
22  6801.16 6894.29 6800.00 6883.90 +40.03 INDUST
22  2313.00 2344.13 2304.68 2337.35 +19.27 TRANSP
22  237.49  239.59  237.28  239.52  +1.26  UTILIT
22  2112.63 2137.61 2111.15 2136.51 +13.76 STOCKS
23  6881.20 6888.13 6814.24 6850.03 -33.87 INDUST
23  2345.37 2360.47 2337.35 2355.23 +17.88 TRANSP
23  239.31  240.85  238.89  240.85  +1.33  UTILIT
23  2137.69 2137.91 2124.82 2135.69 -0.82  STOCKS
24  6880.05 6906.60 6742.28 6755.75 -94.28 INDUST
24  2362.01 2378.86 2345.68 2348.76 -6.47  TRANSP
24  241.75  243.29  239.94  240.43  -0.42  UTILIT
24  2144.05 2154.40 2111.89 2115.58 -20.11 STOCKS

4. Here's some code that I had to write for a regression example. Note my opinion of the organization of the data. This is only slightly modified from the original.

00001 /* Insulation Data -- Whiteside -- From Small Data Sets */
00002 /* gas consumption with temperature, insulation dummy */
00003 /* note -- data stored in stupid, stupid format */
00004 /* b -- before insulation; a -- after insulation */
00005 data a ;
00006    keep house temp gas insul ;
00007    infile 'insulate.dat' ;
00008    input tempb gasb tempa gasa ;
00009    house = _n_ ; * house number ;
00010    temp = tempb ; * before insulation ;
00011    gas = gasb ;
00012    insul = 'no' ;
00013    if( temp ne . ) then output ; * ne = not equal ;
00014    temp = tempa ; * after insulation ;
00015    gas = gasa ;
00016    insul = 'yes' ;
00017    if( temp ne . ) then output ; * not missing ;
00018 run ;
00019 proc print data=a ;
00020    label temp='mean temp(C)' insul='insulation' ;
00021    title 'Insulation data (Whiteside)' ;
00022    footnote 'from Small Data Sets' ;
00023 run ;

Part of the file 'insulate.dat' is given below -- enough for these purposes -- the first number starts in the first column.

38.5 3.6 17.1 3.0
29.1 3.1 37.2 2.8
. . 28.0 2.7
. . 38.7 2.8

a) How many observations, variables?

b) Write a DROP statement equivalent to the following:

00006    keep house temp gas insul ;

c) What is the output from this SAS program?

5. Recall the following code, slightly modified, where we were trying to read from a file that was originally a spreadsheet from the NC Center for Health Statistics.

/* mar15.ex3 */
/* read the asthmanc.txt file -- second try */
/* */
data a ;
   length county $12 ; * AAAA ;
   infile 'asthmanc.txt' dlm='09'x firstobs=6 ;
/*                       BBBBBBBBB CCCCCCCCCC */
   if( _n_ > 50 ) then stop ; * DDDD ;
   input county $ number rate ynum yrate @@ ; * EEEE ;
   output ;
   input county $ number rate ynum yrate ; * second col ;
   output ;
run ;
/* sort to put them in order */
proc sort data=a ;
   by county ;
run ;
/* print out results */
proc print data=a ;
   title 'NC Asthma rates from SCHS' ;
   title2 'second try' ;
   footnote 'NC is weird -- Martin & McDowell' ;
run ;

a) What does the following statement do?

   length county $12 ; *AAAA ;

b) What does the following piece tell SAS?

   dlm='09'x

c) What does 'firstobs=6' do?

d) If the 'stop' is executed, will the remaining two proc's (proc sort and proc print) still run?

e) What does the trailing '@@' do in this case?

f) Almost forgot! How many observations, variables?

6. Below are the attendance numbers for three church activities: a service, and classes for both youths and adults for 10 weeks.
WRITE THE CODE so that the resultant SAS dataset 'attend' has just 10 observations with just week, service, youth, adult as the four variables, taking values as below (displayed as the result of a proc print):

OBS WEEK SERVICE YOUTH ADULT
  1    1      85    25    46
  2    2      92    23    52
...  ...     ...   ...   ...
 10   10      91    26    45

data attend ;
/* insert your code here */
datalines ;
week 1 service 85
week 1 youth 25
week 1 adult 46
week 2 service 92
week 2 youth 23
week 2 adult 52
... ...
week 10 service 91
week 10 youth 26
week 10 adult 45