BY group

Most SAS procedures support the BY statement, which allows you to create a report or analysis for each distinct value of a variable in your data set. The syntax is simple, and SAS procedures are usually tuned to do a good job of processing the data efficiently.

However, BY processing has a limitation: it's difficult to interleave BY output from multiple steps within your program.

For example, suppose you want to show a PROC PRINT output followed by a PROC SGPLOT chart for each value of a variable in a data set. Using the BY statement in each of these two steps would produce a series of PROC PRINT results, followed by a series of PROC SGPLOT results. (Oh, and don't forget that you can use SGPLOT and SGPANEL for classification plots without explicit BY processing.)

There is a SAS programming pattern that allows you to extend the concept of BY processing to larger segments of your SAS program. In pseudo-code, this allows you to implement program logic such as:
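A rough outline of that logic, written as SAS comments because it is pseudo-code rather than runnable code. VAR, DATASET, and the step names are placeholders, not real identifiers:

```sas
/* Pseudo-code outline only: nothing here executes.        */
/*                                                          */
/* for each distinct value of VAR in DATASET:               */
/*    run STEP 1 (for example, a report) for that value     */
/*    run STEP 2 (for example, a plot) for that value       */
/*    ...                                                   */
/* end                                                      */
```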

Let's take a simple code example through this transformation. Let's combine a PROC FREQ step with a PROC SGPLOT step. Here's the example without BY processing or classification. The results show a report and plot for ALL values of Type within SASHELP.CARS. It's okay, but it doesn't provide much insight into the different classifications of car Type (Hybrid, Truck, Sedan, and so on).
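A minimal sketch of such a starting point. The choice of Origin for the frequency table and of MPG_City versus MPG_Highway for the plot is an assumption made for illustration; any variables in SASHELP.CARS would do:

```sas
/* Report and plot across ALL car types: no BY, no WHERE */
title "Cars by Origin";
proc freq data=sashelp.cars;
  tables Origin;
run;

title "MPG_City by MPG_Highway";
proc sgplot data=sashelp.cars;
  scatter x=MPG_City y=MPG_Highway;
run;

title; /* clear the title so it doesn't leak into later steps */
```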

Step 1. Get your program working for one value using WHERE=

Before introducing any macro statements into the mix, it's a good idea to get your non-macro logic correct. Things are much easier to debug and troubleshoot before you add any macro processing. This modification shows what the program would look like with one known value of Type:
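As a sketch, here is the same pair of steps restricted to a single value with the WHERE= data set option. "Sedan" is just one value picked for illustration, and the table and plot variables are the same assumed ones as before:

```sas
/* Same two steps, limited to one known value of Type */
title "Cars by Origin for Type=Sedan";
proc freq data=sashelp.cars(where=(Type="Sedan"));
  tables Origin;
run;

title "MPG_City by MPG_Highway for Type=Sedan";
proc sgplot data=sashelp.cars(where=(Type="Sedan"));
  scatter x=MPG_City y=MPG_Highway;
run;

title;
```

Note that the value appears in four places (two titles, two WHERE= options). Those are the spots a macro variable would later replace.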

Step 2. Count the distinct values and create macro variables for each

Once you're happy with the output for the single distinct value you created in Step 1, it's time to gather the information you need to repeat that step for each distinct value in the data. Before you can achieve this with a SAS macro loop, you're going to need two bits of information: how many distinct values are there (for the loop index), and what ARE those distinct values (for each iteration). Here's the pattern of code to figure this out:
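One way to gather both pieces is PROC SQL with SELECT INTO. This is a sketch that assumes SAS 9.3 or later (for the TRIMMED keyword and the open-ended `:varVal1-` range); the macro variable names varCount and varVal are chosen for illustration:

```sas
proc sql noprint;
  /* How many distinct values? Stored in &varCount */
  select count(distinct Type)
    into :varCount trimmed
    from sashelp.cars;

  /* What are they? Stored in &varVal1, &varVal2, ... */
  select distinct Type
    into :varVal1-
    from sashelp.cars;
quit;

/* Write the results to the log as a sanity check */
%put NOTE: Found &varCount distinct values of Type.;
%put NOTE: First value is &varVal1..;
```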

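Note how we need to reference our macro variable that contains the distinct value: "&&varVal&index." You need a "double &" to dereference (fancy word for reference the reference) the correct macro variable for the current index value. On the first pass, the macro processor turns && into a single & and replaces &index. with the current number, yielding something like &varVal3; on the second pass, it resolves that to the stored value.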
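Putting the pieces together, a macro loop along these lines would repeat both steps for each distinct value. This is a sketch under assumptions: the macro name reportByType is hypothetical, the table and plot variables are illustrative, and the SELECT INTO syntax assumes SAS 9.3 or later:

```sas
%macro reportByType;
  %local index varCount;

  /* Count the distinct values and store each one in &varVal1, &varVal2, ... */
  proc sql noprint;
    select count(distinct Type) into :varCount trimmed from sashelp.cars;
    select distinct Type into :varVal1- from sashelp.cars;
  quit;

  /* Run both steps once per distinct value */
  %do index = 1 %to &varCount;
    title "Cars by Origin for Type=&&varVal&index.";
    proc freq data=sashelp.cars(where=(Type="&&varVal&index."));
      tables Origin;
    run;

    title "MPG_City by MPG_Highway for Type=&&varVal&index.";
    proc sgplot data=sashelp.cars(where=(Type="&&varVal&index."));
      scatter x=MPG_City y=MPG_Highway;
    run;
  %end;

  title;
%mend reportByType;

%reportByType;
```

Double quotes matter here: macro references inside single-quoted strings are not resolved.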

Step 5. (OPTIONAL) Keep tinkering towards perfection

Now you've "rolled your own" BY group processing, but don't stop there! With a few more tweaks we can make it even better:

I added ODS LAYOUT statements (officially experimental, but works well in HTML) to generate two-column output, with tables and charts side-by-side. I added SYSECHO statements so that I can track progress of the program as it runs in SAS Enterprise Guide. If you've got lots of data with lots of distinct values, it can take a while. You might appreciate the running status message. And I added some ODS GRAPHICS options to control the name of the output image files, just to make my results easier to track on the file system.
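A sketch of how those three tweaks might fit into the loop. It assumes SAS 9.4, where the gridded layout is written as ODS LAYOUT GRIDDED (earlier releases used the experimental ODS LAYOUT START form), and an ODS destination such as HTML that supports layout. The macro name and the "mpg_" image prefix are hypothetical:

```sas
%macro reportByTypeLayout;
  %local index varCount;

  proc sql noprint;
    select count(distinct Type) into :varCount trimmed from sashelp.cars;
    select distinct Type into :varVal1- from sashelp.cars;
  quit;

  %do index = 1 %to &varCount;
    /* Status message shown by clients such as SAS Enterprise Guide */
    sysecho "Processing &&varVal&index. (&index of &varCount)";

    /* Predictable image file names, one per distinct value */
    ods graphics / reset=index imagename="mpg_&&varVal&index.";

    /* Two columns: table on the left, chart on the right */
    ods layout gridded columns=2;

    ods region;
    title "Cars by Origin for Type=&&varVal&index.";
    proc freq data=sashelp.cars(where=(Type="&&varVal&index."));
      tables Origin;
    run;

    ods region;
    title "MPG_City by MPG_Highway for Type=&&varVal&index.";
    proc sgplot data=sashelp.cars(where=(Type="&&varVal&index."));
      scatter x=MPG_City y=MPG_Highway;
    run;

    ods layout end;
  %end;

  title;
%mend reportByTypeLayout;

%reportByTypeLayout;
```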

Here's an example of my final report. Can you take it even further? I'll bet that you can!