Structure of SAS
Fall 1992 SAS Primer

Structure of a SAS Program

This section of the manual describes the structure of a SAS program and the syntax of the language. Many of the most common mistakes made when programming SAS are listed in Appendix 1.

SAS Program Blocks: The DATA step and the PROC step

Most SAS programs consist of at least two blocks of statements: a DATA step and a PROC step. A step is just a group of program statements that provide instructions to SAS for entering and modifying data (the DATA step) or analyzing data (the PROC step).

Use the DATA step to read in and modify data. A DATA step begins with the statement:

    DATA setname;

and ends when SAS encounters one of the following:

    RUN;
    PROC procedure_name;
    the start of another DATA step

Between the beginning and ending statements will be statements for reading data, labeling data, and performing calculations. The collection of data values you read in or create is called a data set and has the name, setname, that you choose. If you need to modify your data (for example, to transform the data or create new data elements), you must do so in a DATA step, not in a PROC step.

Use the PROC step to analyze and view your data. In a PROC step you can select the type of analysis (regression, analysis of variance, computation of means) for your data set. A PROC step begins with the statement:

    PROC procedure_name;

and ends when SAS encounters one of the following:

    DATA setname;
    the start of another PROC step

Other statements used in the PROC step specify the results you want computed and displayed by the SAS procedure named procedure_name.

DATA and PROC steps can appear in any order (except, of course, that a PROC cannot operate on a data set that has not yet been created), and any number of DATA steps or PROC steps can be used in a program.
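As a rough illustration of how the steps described above fit together, here is a minimal sketch of a program with two DATA steps and two PROC steps. The data set names (SCORES, SCORES2), the variable names, and the data values are all made up for this example. INPUT, DATALINES, and SET are standard SAS statements that this section does not cover; older releases of SAS spell DATALINES as CARDS.

```sas
* First DATA step: read three values per line and compute a new variable;
data scores;
   input name $ test1 test2;
   total = test1 + test2;
   datalines;
Ann 80 90
Bob 70 85
;
run;

* First PROC step: analyze the data set created above;
proc means data=scores;
   var test1 test2 total;
run;

* Second DATA step: further changes to the data need a new DATA step;
data scores2;
   set scores;
   average = total / 2;
run;

* Second PROC step: view the modified data;
proc print data=scores2;
run;
```

Note that the new variables TOTAL and AVERAGE are created inside DATA steps, while the PROC steps only analyze or display data sets that already exist.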
It is important to understand the concept of steps because, although some SAS statements can be used anywhere, many SAS statements are used exclusively in a DATA step and others are used exclusively in a PROC step. It is not uncommon to have to begin a new DATA step in order to perform necessary data manipulations.

Syntax for SAS Statements

A SAS statement is a string of keywords, names, and symbols ending in a semicolon. You may use upper or lower case when typing SAS statements. Most SAS statements are specified with the following form:

    keyword parameter [options];

Such statements begin with a keyword identifying the kind of statement. Some of the identifying keywords are DATA, PROC, OUTPUT, INFILE, FILE, and VAR. Parameters are often the names of your variables or data sets. Options are keywords specific to a particular SAS statement.

Assignment statements, used to create new variables or modify the values of existing variables, have the familiar algebraic form you use when calculating these variables by hand.

SAS statements end with a semicolon (;).

COMMON ERROR: A missing semicolon is a very common syntax error and one that is hard for SAS error-checking procedures to identify. For this reason the error messages you receive are often unclear.

SAS statements are free-format. This means a statement can begin and end anywhere on a line, one statement can continue over several lines, several statements can be on one line, and as many blanks as you like can be used to separate fields.

PROGRAMMING TIP: You can make your SAS code easier to read and debug by beginning PROC statements and DATA statements in the first column of the line and indenting all other statements. Adding a blank line and a comment between steps is also helpful.

Rules for SAS Names

SAS uses names to identify variables, data sets, formats, arrays, libraries, and files.
SAS names must conform to the following rules:

    (1) Be no longer than eight characters
    (2) Begin with a letter or an underscore
    (3) Contain only letters, numbers, or underscores

Comment Statements

Use comment statements to make your program easier to read and edit. SAS does not execute comment statements, but they are printed in your log file. There are two ways to make a comment statement:

(1) Put an asterisk, *, at the beginning of a line, write the comment, and terminate the comment with a semicolon. For example,

    *This is a one-line comment; A Statement following here is not ignored;

(2) Start a block of comment lines with /*. Write as many comment lines as you want. Terminate the comment with */. For example,

    /* This is a multi-line comment. You may want to use
    this technique to "turn off" sections of your program.
    SAS will ignore all three of these lines*/

PROGRAMMING TIP: You can use comment statements to "turn off" sections of your code; this way you can avoid looking through unwanted output.
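The naming rules and both comment styles can be seen together in a short program. This is only a sketch: the data set name WT_1992, the variable names, and the data values are hypothetical, chosen so that every name has at most eight characters, starts with a letter, and uses only letters, numbers, and underscores. INPUT and DATALINES are standard SAS statements not covered in this section.

```sas
* An asterisk comment ends at the first semicolon, even when it
  runs over more than one line;
data wt_1992;
   input subj_id weight1 weight2;
   wt_chg = weight2 - weight1;   /* a block comment can follow a statement */
   datalines;
1 150 148
2 182 185
;
run;

proc means data=wt_1992;
   var wt_chg;
run;

/* The PROC PRINT step below is "turned off" by this block comment,
   so SAS skips it and produces no listing.
proc print data=wt_1992;
run;
*/
```

To turn the PROC PRINT step back on, move the closing */ up so that it sits before the PROC PRINT statement. Asterisk comments are best kept free of semicolons and unmatched quotation marks, since the first semicolon ends the comment.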