Visit the HelpDesk on the Lower Level of Carothers Library, call (401) 874-HELP (4357), or email HelpDesk@uri.edu.

Handout 5

This handout covers some of the most commonly used SAS procedures. Before describing the procedures, it is helpful to review some frequently used SAS programming statements.

Part 1: Common Procedural Statements

The statements discussed in this section are:

The VAR statement

The BY statement

The TITLE statement

The LABEL statement

The FORMAT statement

The OUTPUT statement

A brief explanation of the statements follows.

The VAR Statement

This optional statement lists the variables to be used in the statistical procedure. The procedure ignores all other variables in the data set. If the VAR statement is omitted, the procedure uses all numeric variables in the analysis.

Form: VAR variable list;
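
As a minimal sketch of the VAR statement, the step below restricts an analysis to two of three numeric variables. The data set SCORES, its variables, and its values are hypothetical and exist only so the example runs on its own.

```sas
/* Hypothetical data set, created only for illustration */
DATA scores;
   INPUT name $ test1 test2 test3;
   DATALINES;
Ann 85 90 78
Bob 72 88 91
Cal 95 67 80
;
RUN;

/* Only TEST1 and TEST2 are analyzed; TEST3 is ignored */
PROC MEANS DATA=scores;
   VAR test1 test2;
RUN;
```

Removing the VAR statement from this step would make the procedure summarize all three numeric variables.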

The BY Statement

The BY statement generates a separate analysis for every combination of values of the variables specified in the BY statement. That is, the BY statement allows the programmer to perform subgroup analyses on the data. The only restriction is that the data set must first be sorted by the variables used in the BY statement. The order of the variables in the BY statement must match the order of the variables in the BY statement of the SORT procedure.

Form: BY variable list;
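
The sort-then-analyze pattern described above can be sketched as follows. The data set SCORES and its values are hypothetical; the point is only that the same BY variable appears in the SORT step and in the analysis step.

```sas
/* Hypothetical data set, created only for illustration */
DATA scores;
   INPUT sex $ test1;
   DATALINES;
M 85
F 72
F 95
M 60
;
RUN;

/* The data must be sorted by the BY variable first */
PROC SORT DATA=scores;
   BY sex;
RUN;

/* One analysis is produced for each value of SEX */
PROC MEANS DATA=scores;
   BY sex;
   VAR test1;
RUN;
```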

The TITLE Statement

This statement defines a heading for each output page. The programmer can define up to 10 separate titles (TITLE1 through TITLE10). However, because of page-length considerations, no more than three titles usually fit comfortably on a page. A TITLE statement with no description removes that title and any higher-numbered titles.

Form: TITLEn 'description'; (n is the title number).
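
A short sketch of defining and then removing titles is shown below. The title text and the data set SCORES are hypothetical.

```sas
/* Hypothetical data set, created only for illustration */
DATA scores;
   INPUT name $ test1;
   DATALINES;
Ann 85
Bob 72
;
RUN;

TITLE1 'Test Score Summary';
TITLE2 'Spring Term';

/* Both titles appear at the top of this listing */
PROC PRINT DATA=scores;
RUN;

/* A TITLE2 statement with no text removes the second title */
TITLE2;

PROC PRINT DATA=scores;
RUN;

/* A TITLE statement with no text removes all titles */
TITLE;
```

Titles stay in effect for later steps until they are changed or removed, which is why the example clears them at the end.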

The LABEL Statement

The LABEL statement allows the programmer to attach a description to a variable. Labels can be up to 256 characters long (40 characters in SAS releases before Version 7). This statement can also be used in a DATA step. If it is used in the DATA step, the labels are permanently assigned to the variables. If it is used in a procedure step, the labels are valid only for the duration of that procedure.

Form: LABEL variable1 = 'description'
variable2 = 'description'
…;
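
The difference between a permanent and a temporary label can be sketched as follows. The data set, variable names, and label text are hypothetical.

```sas
/* Label assigned in the DATA step: stored with the data set */
DATA scores;
   INPUT name $ test1;
   LABEL test1 = 'First Test Score';
   DATALINES;
Ann 85
Bob 72
;
RUN;

/* Label assigned in the procedure step: applies to this step only */
PROC MEANS DATA=scores;
   VAR test1;
   LABEL test1 = 'Test 1 (this report only)';
RUN;

/* This step shows the stored label, First Test Score */
PROC MEANS DATA=scores;
   VAR test1;
RUN;
```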

The FORMAT Statement

The FORMAT statement allows the programmer to assign special formats to the values of the variables. This statement can be used both in the data step and the procedure step. When formats are assigned to the variables in the data step the variables are permanently formatted in that way. However, the procedure step allows for temporary formatting of the variables.

Form: FORMAT variable1 format1. variable2 format2. … ;
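
As an illustration, the sketch below assigns a format permanently in the DATA step and then overrides it temporarily in a procedure step. The data set SALES and its values are hypothetical; DOLLAR and COMMA are standard SAS formats.

```sas
/* Format assigned in the DATA step: stored with the data set */
DATA sales;
   INPUT region $ amount;
   FORMAT amount DOLLAR10.2;
   DATALINES;
East 1250.5
West 980
;
RUN;

/* AMOUNT prints with the stored DOLLAR10.2 format */
PROC PRINT DATA=sales;
RUN;

/* Temporary override: COMMA10. applies to this step only */
PROC PRINT DATA=sales;
   FORMAT amount COMMA10.;
RUN;
```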

The OUTPUT Statement

This is an optional statement used in several procedures. This statement requests SAS to output statistics to a new SAS data set. Multiple OUTPUT statements can be used in the same procedure as long as different output data sets are created.

Form: OUTPUT OUT=sasdataset keyword=names;

Where: keyword is any of the statistic keywords specific to the procedure,
and names lists the new variable(s) that will contain the statistics.
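
A minimal sketch of the OUTPUT statement, using the MEANS procedure as one procedure that supports it, is shown below. The data set names and the new variable names (AVG1, AVG2, SD1, SD2) are hypothetical.

```sas
/* Hypothetical data set, created only for illustration */
DATA scores;
   INPUT name $ test1 test2;
   DATALINES;
Ann 85 90
Bob 72 88
Cal 95 67
;
RUN;

/* NOPRINT suppresses the printed report; statistics go to STAT */
PROC MEANS DATA=scores NOPRINT;
   VAR test1 test2;
   OUTPUT OUT=stat MEAN=avg1 avg2 STD=sd1 sd2;
RUN;

/* Inspect the new data set */
PROC PRINT DATA=stat;
RUN;
```

The names after each keyword are matched to the VAR variables in order, so AVG1 holds the mean of TEST1 and AVG2 the mean of TEST2.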

Part 2: Common Procedures

The SORT Procedure
The SORT procedure is used to:
* sort the observations in a SAS data set by the values of one or more variables.
* create a data set containing rearranged observations.

By default the variables are sorted in ascending order. It is possible to specify a descending sort for any variable by placing the keyword DESCENDING before the name of the variable.

The BY statement must be used with the SORT procedure.

It is possible to specify the names of the input and output data sets. If the output data set name is not specified, the input data set is overwritten with the sorted version.

Form: PROC SORT DATA=sasdataset OUT=sasdataset;
BY variable list;

Example: PROC SORT DATA=Census;
BY Region State;
RUN;

Result: The CENSUS data will be sorted by the values of REGION and then by STATE within each REGION.
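
The DESCENDING keyword and the OUT= option can be combined as in the sketch below. The contents of CENSUS, the variable POP, and the output name CENSUS_SORTED are assumptions made for illustration.

```sas
/* Hypothetical version of the CENSUS data set */
DATA census;
   INPUT region $ state $ pop;
   DATALINES;
East RI 1050
West CA 37000
East MA 6500
West OR 3800
;
RUN;

/* Ascending by REGION, then largest POP first within each REGION. */
/* OUT= writes the result to a new data set, so CENSUS is unchanged. */
PROC SORT DATA=census OUT=census_sorted;
   BY region DESCENDING pop;
RUN;

PROC PRINT DATA=census_sorted;
RUN;
```

Note that DESCENDING applies only to the variable that immediately follows it.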

The PRINT Procedure
The PRINT procedure is used to print observations in a SAS data set. The features of the PRINT procedure are as follows:

The FORMAT Procedure
The FORMAT procedure allows you to create formats to your own specifications. These formats can then be used with a FORMAT statement in either the DATA or PROC steps. Three optional keywords, LOW, HIGH, and OTHER, are available for specifying ranges of values. Further, the FORMAT procedure requires either the VALUE or the PICTURE statement to be used.

Form: PROC FORMAT;

The VALUE Statement

The statement is used to generate formats that associate labels with values.
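
A sketch of the VALUE statement, including the LOW, HIGH, and OTHER keywords, might look like the following. The format names (SEXFMT, SIZEFMT), the numeric coding of SEX, and the size cutoffs are all hypothetical.

```sas
PROC FORMAT;
   /* Labels for individual values; OTHER catches everything else */
   VALUE sexfmt 1 = 'Male'
                2 = 'Female'
                OTHER = 'Unknown';
   /* Labels for ranges; -< excludes the upper endpoint */
   VALUE sizefmt LOW -< 500 = 'Small'
                 500 -< 1500 = 'Medium'
                 1500 - HIGH = 'Large';
RUN;

/* Hypothetical data set that uses the new formats */
DATA schools;
   INPUT sex schsize;
   DATALINES;
1 350
2 1200
2 2400
9 800
;
RUN;

/* A format name is followed by a period when it is used */
PROC PRINT DATA=schools;
   FORMAT sex sexfmt. schsize sizefmt.;
RUN;
```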

Result: Two separate analyses, one for males and the other for females, are performed on the variables TEST1, TEST2, and TEST3. The statistics presented are the mean, the sum, and the number of valid responses. An output data set STAT is created. STAT will contain the mean scores for the three variables. The output from the MEANS procedure will label the three test variables appropriately.
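
The program that produced this result is not reproduced here. A sketch consistent with the description might look like the following, assuming a data set named SCORES, a character variable SEX, and the output variable names MTEST1 through MTEST3; the label text is also hypothetical.

```sas
/* Hypothetical data set, created only for illustration */
DATA scores;
   INPUT sex $ test1 test2 test3;
   DATALINES;
M 85 90 78
F 72 88 91
F 95 67 80
M 60 75 82
;
RUN;

/* BY-group processing requires sorted data */
PROC SORT DATA=scores;
   BY sex;
RUN;

/* MEAN, SUM, and N request the three statistics described */
PROC MEANS DATA=scores MEAN SUM N;
   BY sex;
   VAR test1 test2 test3;
   LABEL test1 = 'First Test'
         test2 = 'Second Test'
         test3 = 'Third Test';
   OUTPUT OUT=stat MEAN=mtest1 mtest2 mtest3;
RUN;
```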

Result: The two variables, SEX and SCHSIZE, will be cross-tabulated. In addition to the default statistics, the analysis will report the overall value of the chi-square test statistic and its significance value. The variable SEX will be on the vertical side of the table and the variable SCHSIZE will be on the horizontal side of the table.
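
The program that produced this result is not reproduced here. A sketch consistent with the description, using the FREQ procedure and assuming a data set named SCHOOLS with hypothetical values, might look like this:

```sas
/* Hypothetical data set, created only for illustration */
DATA schools;
   INPUT sex $ schsize $;
   DATALINES;
M Small
F Large
F Small
M Large
F Large
M Small
;
RUN;

/* The first variable forms the rows, the second the columns. */
/* CHISQ requests the chi-square test of association. */
PROC FREQ DATA=schools;
   TABLES sex*schsize / CHISQ;
RUN;
```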

The PLOT Procedure
The PLOT procedure is used to graph variables in an xy-plane. Generally, use this procedure with continuous data. The graph shows one point for every xy-coordinate. This procedure does not produce presentation-quality graphs. Its purpose is to give the analyst a quick view of the data. For presentation graphs you should use the SAS/GRAPH product.

Form: PROC PLOT DATA=sasdataset;

The PLOT Statement
Include at least one PLOT statement in the PLOT procedure. The PLOT procedure uses only those variables listed on this statement.

Form: PLOT y-axis variable name * x-axis variable name / options;
Options: HREF = list of values at which to draw vertical reference lines
VREF = list of values at which to draw horizontal reference lines
VAXIS = list of values for tick marks on y-axis
HAXIS = list of values for tick marks on x-axis
OVERLAY overlays multiple plots on a single set of axes
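
A minimal sketch of a PLOT statement with reference lines and custom tick marks is shown below. The data set GROWTH, its variables, and the reference values are hypothetical.

```sas
/* Hypothetical data set, created only for illustration */
DATA growth;
   INPUT height weight;
   DATALINES;
60 110
63 125
66 140
69 160
72 185
;
RUN;

/* WEIGHT is on the y-axis and HEIGHT on the x-axis. */
/* HREF= draws a vertical line at height 65; */
/* VREF= draws a horizontal line at weight 150. */
PROC PLOT DATA=growth;
   PLOT weight*height / HREF=65 VREF=150
                        HAXIS=55 TO 75 BY 5;
RUN;
QUIT;
```

Listing two requests such as weight*height and a second pair with the OVERLAY option would draw both on the same set of axes.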