Statements Affecting Your Output

Fall 1992 SAS Primer

You can control the line size and page size of output destined for the line printer and add informative titles to tables created by SAS procedures. This section describes the two statements used to control these items. These statements can be used anywhere in your SAS program. SAS executes them when they are encountered, and they remain in effect until you change them or exit your SAS session.

Changing the Output Page Setup

You can use the OPTIONS statement to temporarily change one or more of the page setup options from the default values chosen by the manager of your computing center. These changes stay in effect for the duration of your SAS session or until they are changed with another OPTIONS statement. The syntax for the OPTIONS statement is:

   OPTIONS option1 option2 ...;

where any number of options are chosen from the following list:

   pagesize=nn    each page of output will contain nn lines.
   linesize=nn    each line of output can be up to nn characters in length.
   [no]date       prints or suppresses printing of the date at the top of
                  each page of output.
   [no]number     prints or suppresses printing of the page number at the
                  top of each page.
   firstobs=nn    selects nn as the first observation to be processed.
   obs=nn         selects nn as the last observation to be processed.

FIRSTOBS and OBS allow you to select a limited number of observations to process. They are useful for checking that your SAS program works properly before you process large amounts of data.

The default pagesize chosen by the manager of your computing center will be about 24 lines, which is a good size for viewing graphics on a computer screen. If you plan to print your output on paper, increase the pagesize to no more than 58 lines per page by using the SAS statement:

   options pagesize=58;

anywhere in your program.

Adding Titles to Tables in Your Output

You can use the TITLE statement to add titles to the tables and plots in your output file.
Up to ten title lines can be printed on the output using this statement. The form of the TITLE statement is

   TITLE[n] ['title'];

where n immediately follows the keyword TITLE and specifies on which of the ten available lines this title should be printed. For the first title line, either TITLE or TITLE1 may be used. 'title' is the title you want printed on line n. Each title can be up to 132 characters long and should be enclosed in apostrophes. For example:

   TITLE 'Soybean Yield';
   TITLE3 '1986 through 1988';

These titles will appear on lines one and three of the output file.

Once you specify a title for a line, it is used for all subsequent output until you cancel the title or define another title for that line or for a line above it. To cancel all existing titles, specify:

   TITLE;

To suppress the nth and later titles, specify:

   TITLEn;

To associate a title with a particular PROC step, include the title in the PROC step. If you want to change titles for each of the pages produced by a statement within a PROC, be sure to put a RUN; statement after each TITLE statement and the statement it labels. Since SAS usually starts a new page with each statement, you can use this to find out which tables are produced by each statement or to give unique names to each plot or model analyzed. For example:

   PROC PLOT;
      TITLE 'PLOT 1';
      PLOT X*Y;
   RUN;
      TITLE 'PLOT 2';
      PLOT Z*Y;
   RUN;

will put a different title on each plot. The example "Polynomial Regression with PROC REG" also shows you how to do this.

Viewing DATA: Use PROC PRINT or PROC PLOT

The first step in analyzing your data is often to plot the data. PROC PLOT can be used to create a scatter plot of a data set. Graphical descriptive displays, such as histograms and stem and leaf plots, can be created with PROC UNIVARIATE, which is covered in the next chapter on describing your data. To verify that your data was input correctly, you can use PROC PRINT to list any data set. PROC SORT can be used to sort the observations in a data set by one or more of the variables.
You may need to use PROC SORT to order the data so that other SAS procedures can process the data in subsets using the BY statement. Remember that you can use the TITLE statement with any of these procedures.

PROC PRINT

The syntax of the PROC PRINT statement and of the other statements used with PROC PRINT to alter the table of data printed is shown below:

   PROC PRINT [DATA=setname];
      /* Initiates the PROC PRINT procedure and specifies which data set
         to print. */
   [VAR variables];
      /* Specifies which variables in the data set are to be printed. */
   [SUM variables];
      /* Specifies variables whose values you want totaled. */
   [BY variables];
      /* Prints the observations in groups having the same values of the
         BY variables. Use PROC SORT to put the data in ascending order
         before using PROC PRINT with the BY statement. */

PROC SORT

Use PROC SORT to sort a data set before it is processed by other PROC steps that include a BY statement. PROC SORT produces no printed output. Below is a description of this procedure and associated statements:

   PROC SORT [DATA=setname OUT=newset];
      /* Initiates PROC SORT and optionally specifies the input data set
         (setname) and a name for the output data set (newset). */
   BY [DESCENDING] variable;
      /* Specifies which variables are used to sort the data set. Any
         number of variables can be used in the BY statement. */

For example, PROC SORT can be used to rearrange your data as follows:

   /*ORIGINAL DATA SET*/     PROC SORT; BY X Y;     PROC SORT; BY Y X;

          X   Y                    X   Y                  X   Y
          1   3                    1   2                  3   1
          2   3                    1   3                  1   2
          3   1                    1   5                  1   3
          2   5                    2   3                  2   3
          1   2                    2   5                  1   5
          1   5                    3   1                  2   5

PROC PLOT

PROC PLOT creates scatter plots from two variables that you specify. You can choose the plot symbol or use the value of a third variable as the plot symbol. PROC PLOT and associated statements follow:

   PROC PLOT [DATA=setname];
      /* Initiates PROC PLOT and optionally specifies which data set to
         use. If setname is omitted, the most recently created data set
         is used. */
   PLOT Y*X | Y*X='symbol' | Y*X=Z . . .
        [/OVERLAY];
      /* Specifies that the plot should have variable Y on the vertical
         axis and variable X on the horizontal axis. If ='symbol' is
         present, the plot symbol is the character inside the apostrophes;
         if =Z is present, the plot symbol is the value of variable Z.
         If you specify several plots in the same statement, the /OVERLAY
         option will put all plots on the same graph. */
   [BY variables;]
      /* Used to get separate plots for the groups of observations defined
         by the variables in the BY statement. Use PROC SORT with the same
         BY statement to make sure the data is in ascending order. */

Example: Printing and Plotting data

You are asked to plot the yield for each fertilizer and find the total yield per fertilizer in the data set from the example on creating indicator variables. You can start with the data step used in that example.

SAS SOLUTION:

   options pagesize=30;   /* Set default pagesize to be 30 lines */

   data fertlzr;
      input fert $ @;
      do rep=1 to 5;
         input yield @;
         /* Note there are two SAS statements per line */
         x1=(fert='A'); x2=(fert='B');
         x3=(fert='C'); x4=(fert='D');
         output;
      end;
   cards;
   A 60 61 59 60 60
   B 62 61 60 62 60
   C 63 61 61 64 66
   D 62 61 63 60 64
   ;
   proc sort out=order;
      by fert;
   proc print data=order;
      title1 'Example of PROC PRINT with the SUM option';
      title2 'Fertilizer data set';
      sum yield;
      by fert;
   proc plot data=fertlzr;
      title1 'Example PROC PLOT using an * for the plot symbol';
      plot yield*fert='*';
   run;

We used PROC SORT to sort the input data set by fertilizer before using the BY statement in PROC PRINT. Because of the way the data step read the data, the data set was already ordered by fertilizer type, so we could have omitted the PROC SORT step. In PROC PRINT we used two TITLE statements to annotate the printout; giving a new TITLE1 statement in the PROC PLOT step canceled the previous TITLE2. Note that fert is not a numeric variable; PROC PLOT places the values of non-numeric variables at equal increments on the axis.
The output follows:

SAS OUTPUT:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            Example of PROC PRINT with the SUM option
                       Fertilizer data set
                                      10:04 Friday, June 26, 1992

----------------------------------- FERT=A ----------------------

  OBS    REP    YIELD    X1    X2    X3    X4

    1      1       60     1     0     0     0
    2      2       61     1     0     0     0
    3      3       59     1     0     0     0
    4      4       60     1     0     0     0
    5      5       60     1     0     0     0
                -----
  FERT            300

----------------------------------- FERT=B ----------------------

  OBS    REP    YIELD    X1    X2    X3    X4

    6      1       62     0     1     0     0
    7      2       61     0     1     0     0
    8      3       60     0     1     0     0
    9      4       62     0     1     0     0
   10      5       60     0     1     0     0
                -----
  FERT            305

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            Example of PROC PRINT with the SUM option
                       Fertilizer data set
                                      10:04 Friday, June 26, 1992

----------------------------------- FERT=C ----------------------

  OBS    REP    YIELD    X1    X2    X3    X4

   11      1       63     0     0     1     0
   12      2       61     0     0     1     0
   13      3       61     0     0     1     0
   14      4       64     0     0     1     0
   15      5       66     0     0     1     0
                -----
  FERT            315

----------------------------------- FERT=D ----------------------

  OBS    REP    YIELD    X1    X2    X3    X4

   16      1       62     0     0     0     1
   17      2       61     0     0     0     1
   18      3       63     0     0     0     1
   19      4       60     0     0     0     1
   20      5       64     0     0     0     1
                -----
  FERT            310
                =====
                 1230

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
         Example PROC PLOT using an * for the plot symbol
                                      10:04 Friday, June 26, 1992

         Plot of YIELD*FERT.  Symbol used is '*'.

   YIELD |
         |
      66 +                                      *
         |
      65 +
         |
      64 +                                      *
         |
      63 +                                      *
         |
      62 +                    *
         |
      61 +  *                 *                 *
         |
      60 +  *                 *
         |
      59 +  *
         |
         ---+-----------------+-----------------+-------------
            A                 B                 C

                              FERT

NOTE: 5 obs hidden.
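
A variation on the example above can be sketched as follows, assuming the fertlzr data set still exists in the current SAS session. It combines the DESCENDING keyword of PROC SORT with the FIRSTOBS and OBS options described under "Changing the Output Page Setup" to list only observations 6 through 10. The data set name ranked and the title text are hypothetical choices made for illustration; treat this as a sketch rather than a definitive program.

```sas
/* Sketch only: assumes the fertlzr data set from the example above
   still exists in the current SAS session. */

/* Sort by fertilizer, then by yield from highest to lowest within
   each fertilizer. The output name "ranked" is a hypothetical choice. */
proc sort data=fertlzr out=ranked;
   by fert descending yield;
run;

/* Limit later steps to observations 6 through 10 as a quick check.
   This is set after the sort so that the sort reads every observation. */
options firstobs=6 obs=10;

proc print data=ranked;
   title1 'Check run: observations 6 through 10';
   var fert rep yield;
run;

/* Restore processing of all observations and cancel all titles. */
options firstobs=1 obs=max;
title;
```

Because the data are sorted by fert first and each fertilizer has five observations, observations 6 through 10 are the fertilizer B rows. FIRSTOBS and OBS remain in effect until changed, so the sketch resets them before any later step runs.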