Here is a full listing of the program as it appears in the file /pub/st512/md/rcbd.sas, fully annotated with comment statements. You should include the program and submit (run) it to see what the output looks like. We will learn how to interpret the output in class.

/*----------------------------------------------------------
|                                                          |
|  An example of using PROC GLM to construct the           |
|  analysis of variance for a randomized complete          |
|  block design with one observation per treatment block   |
|  combination. There were 5 blocks (I,II,III,IV,V).       |
|  Each of 7 varieties (a-g) appeared exactly once in      |
|  each block.                                             |
|                                                          |
----------------------------------------------------------*/

/*----------------------------------------------------------
|                                                          |
|  Note that here we have constructed boxes in which       |
|  comments are contained. All of the text between the     |
|  slash-asterisk combinations is ignored in the           |
|  execution of the program.                               |
|                                                          |
|  Thus, it is possible to make the documentation of       |
|  your program "stand out."                               |
|                                                          |
----------------------------------------------------------*/

/*---------------------------------------------------------|
|                                                          |
|  The following statement is always recommended. Its      |
|  purpose is to use 55 lines per page in printing the     |
|  program output. If this statement is not included,      |
|  the pages will be much shorter, with annoyingly         |
|  frequent page breaks.                                   |
|                                                          |
|---------------------------------------------------------*/

options ps=55;

/*----------------------------------------------------------
|                                                          |
|  The DATA step: Enter the data manually into a data      |
|  set called "bushels." See the file                      |
|                                                          |
|      /pub/st512/md/sicl.txt                              |
|                                                          |
|  for a detailed description. The input statement         |
|  tells SAS that there are 3 variables: block and         |
|  variety, which are character variables ($) and          |
|  provide the block-treatment classification              |
|  information, and yield, which contains the actual       |
|  responses. The @@ allows the (block,variety,yield)      |
|  triplets to be entered several to a line rather than    |
|  requiring them to be entered line by line. The cards    |
|  statement indicates that the data follow. The           |
|  trailing lone ; indicates the end of the data step.     |
|                                                          |
----------------------------------------------------------*/

data bushels;
   input block $ variety $ yield @@;
   cards;
I a 10  I b 9  I c 11  I d 15  I e 10  I f 12  I g 11
II a 11  II b 10  II c 12  II d 12  II e 10  II f 11  II g 12
III a 12  III b 13  III c 10  III d 14  III e 15  III f 13  III g 13
IV a 14  IV b 15  IV c 13  IV d 17  IV e 14  IV f 16  IV g 15
V a 13  V b 14  V c 16  V d 19  V e 17  V f 15  V g 18
;

/*----------------------------------------------------------
|                                                          |
|  The data are now printed out using PROC PRINT. The      |
|  "title" statement places the title in single quotes     |
|  at the top of each page of output. The title will       |
|  appear on all subsequent pages until a new title        |
|  statement is invoked. See the discussion in the file    |
|                                                          |
|      /pub/st512/md/sicl.txt                              |
|                                                          |
|  for more on the "data=bushels" specification.           |
|                                                          |
----------------------------------------------------------*/

proc print data=bushels;
run;

/*----------------------------------------------------------
|                                                          |
|  The analysis of variance is now constructed using       |
|  PROC GLM. The class statement tells the PROC which      |
|  variables in the data set bushels provide the           |
|  classification information. The model statement         |
|  tells SAS to construct the analysis of variance for     |
|  the additive linear model corresponding to the          |
|  RCBD. See                                               |
|                                                          |
|      /pub/st512/md/sicl.txt                              |
|                                                          |
|  for more discussion.                                    |
|                                                          |
----------------------------------------------------------*/

proc glm data=bushels;
   class block variety;
   model yield = block variety;
run;

We have seen several features of SAS by considering the analysis of variance example. Now we will consider a second example to illustrate how one might perform least squares regression in SAS using PROC REG.
The full program appears in the file reg.sas.

Rate of oxygen consumption (Y) was measured for a bird at a predetermined temperature (X). At each temperature, a different bird was used. This example is treated in greater detail in chapter 10 of this instructor's ST 511 notes. Here are the data:

      X      Y
    -18    5.2
    -15    4.7
    -10    4.5
     -5    3.6
      0    3.4
      5    3.1
     10    2.7
     19    1.8

Here is a SAS program to create a data set with 2 variables, X and Y, print out the data, plot Y versus X, and then fit a simple linear regression model to the data:

     Y(i) = B0 + B1 X(i) + e(i),

where Y(i) and X(i) are the rate of oxygen consumption and temperature for the ith bird, and e(i) is a random error.

   1  data oxygen; input x y; cards;
   2  -18 5.2
   3  -15 4.7
   4  -10 4.5
   5  -5 3.6
   6  0 3.4
   7  5 3.1
   8  10 2.7
   9  19 1.8
  10  ;
  11  *;
  12  * print out the data;
  13  *;
  14  proc print; title 'Oxygen Consumption Data'; run;
  15  *;
  16  * plot the data;
  17  *;
  18  proc plot; plot y*x; run;
  19  *;
  20  * run the simple linear regression;
  21  *;
  22  proc reg; model y = x; run;

In the program, a data set called "oxygen" is created. It contains 2 variables, x and y. Since both are numeric, no "$" was used. Also, note that here we did not use the "@@" in the input statement, so the data for each bird had to go on a separate line. If we had put a "@@" in the input statement as above, we could have strung the observations all out:

      data oxygen; input x y @@; cards;
      -18 5.2 -15 4.7 -10 4.5 -5 3.6 0 3.4 5 3.1 10 2.7 19 1.8
      ;

The data step is again followed by some comments, and the data are then printed out using PROC PRINT in line 14. Note that here we did not bother with adding a "data=oxygen" specification in the PROC statement, since the data set "oxygen" was the last data set created. Note that before the run statement in line 14, there is a title statement with a string in single quotes. The effect of a title statement is that each page of output has the specified title at the top.
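
As a minimal sketch of how numbered title statements behave (this is not part of reg.sas, and the wording of the second-line titles is made up for illustration), title1 sets the top line of each page and title2 the line beneath it. The sketch assumes the data set "oxygen" created above already exists:

```sas
* title1 is the top line of each page and title2 is the line below it;
proc print data=oxygen;
   title1 'Oxygen Consumption Data';
   * the text of the second title line is hypothetical;
   title2 'Listing of the raw data';
run;

* a new title2 replaces only the second line and title1 stays in effect;
proc reg data=oxygen;
   title2 'Simple linear regression of y on x';
   model y = x;
run;

* a title statement with no text cancels all titles;
title;
```

With these statements, the PROC PRINT pages should carry both title lines, and the PROC REG pages should keep the first line while showing the new second line.
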
See the SAS manuals for more on title statements -- you can have multiple titles and change the titles on different pages of the output. We will see this later in the course.

After some further comments, line 18 invokes another PROC. PROC PLOT, as you might suspect, creates a plot with the first variable in the PLOT statement (before the asterisk) on the vertical axis, and the second (after the asterisk) on the horizontal axis. It automatically scales the axes. It is possible to make much fancier plots, both with PROC PLOT and other PROCs.

Finally, after some further comments, line 22 contains a call to PROC REG. The syntax is very simple -- PROC REG tells SAS you want to do regression, and the model statement specifies the model. In this case, the model is the simple straight line given above. The specification in line 22 is how you would communicate this to SAS. We will see how to fit more complicated multiple regression models in ST 512.

An annotated version of this program appears in the file /pub/st512/md/reg.sas. You should run it to see what the output looks like. We will learn how to interpret such output later in the course.

Now that you have some familiarity with the SAS language, you will want to look at the files means.sas, ttest.sas, and paired.sas for examples of other analyses. These programs are fully annotated and self-explanatory.
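
As a small illustration of the "fancier" plots mentioned above, here is a sketch (not part of reg.sas) that dresses up the PROC PLOT step for the oxygen data. It assumes the data set "oxygen" has already been created. The plotting symbol, the axis ranges, the label text, and the title are all choices made here for illustration, not anything prescribed by the course files:

```sas
* plot y against x using an asterisk as the plotting symbol;
* the axis ranges below are hypothetical and were chosen to cover the data;
proc plot data=oxygen;
   plot y*x='*' / vaxis=0 to 6 by 1 haxis=-20 to 20 by 5;
   label x = 'Temperature'
         y = 'Rate of oxygen consumption';
   title 'Oxygen Consumption versus Temperature';
run;
```

The vaxis= and haxis= options replace the automatic axis scaling with the tick marks listed, and the label statement puts descriptive text on the axes in place of the variable names.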