
In my current role I have the privilege of managing the Performance Lab in SAS R&D. Helping users work through performance challenges is a critical part of the Lab’s mission. This spring, my team has been actively testing new and enhanced storage arrays from EMC along with the Veritas clustered file system. We have documented our findings in SAS Usage Note 42197, “List of Useful Papers.”

The two flash-based storage arrays we tested from EMC are the new DSSD D5 appliance and the XtremIO array. The bottom line: both arrays performed very well with a mixed analytics workload. For more details on the test results, along with tuning guidelines for using SAS with this storage, please review these papers:

As with all storage, please validate that the array can deliver all the “bells and whistles” you will need to support the failover and high-availability needs of your SAS applications.

In addition to the storage testing, we tested the latest version of the Veritas InfoScale clustered file system. We had great results in a distributed SAS environment with several SAS compute nodes all accessing data in this clustered file system. We learned a great deal in this testing and captured it in the following paper:

My team plans to continue testing new storage and file system technologies throughout the remainder of 2016. If there is a storage array or technology you would like us to test, please let us know by sharing it in the comments section below or by contacting me directly.


From time to time we’ll hear from customers who are encountering performance issues. SAS has a sound methodology for resolving these issues and we are always here to keep your SAS system humming. However, many problems can be resolved with some simple suggestions. This blog will discuss different types of performance issues you might encounter, with some suggestions on how to effectively resolve them.

Situation: You are a new SAS customer or are simply running a new SAS application on new hardware.
Suggestion: Be sure you’ve read and applied all the guidelines in the various tuning papers that have been written:

Making sure you understand the performance issues will help us determine what the next steps are. It’s worth noting that 90% of performance issues occur because the hardware, operating system and/or storage has not been configured according to the tuning guidelines listed above. In a recent case, we achieved a 20% performance gain in a long-running ETL process by adjusting two RHEL kernel parameters that have been documented for many years in our tuning paper.

Situation: Your SAS application has been running and gets slower over time.
Suggestion: Determine whether the number of concurrent SAS sessions/users has increased and/or the volume of data (both input and lookup tables) has increased. This is the top reason for a gradual slowdown.

Situation: Your SAS application took a significant performance hit overnight or in a short time frame.
Suggestion: The first thing to do is check whether any maintenance (tweaking of your system, a hotfix, a patch, …) has been applied to your operating system, VMware, and/or storage arrays. Many customers have applied maintenance (not to SAS) and found that SAS suddenly runs 2-5 times longer. You’ll want to verify that all the operating system settings, mount options, and VMware settings are the same after the maintenance as they were before.

In conclusion, if you are having performance issues, check the suggested tuning guidelines. Also, be sure to record all the settings for your hardware and storage infrastructure before applying maintenance, so that you can confirm these settings are the same afterwards as they were before.

Of course, if you have followed the guidelines and maintenance is not the reason for your performance issues, please contact us. We are here to help.

SAS FULLSTIMER is a SAS system option that collects operating system performance information while a SAS process runs and writes that information to the SAS log. Using it can add up to 10 additional lines to your SAS log for each SAS step—so why would I recommend turning it on?

This additional information includes memory utilization, a date/time stamp for when each step finished, context-switch counts, and other operating-system-specific information about the SAS step that just finished. Why would you need this much information?
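On UNIX, for example, a FULLSTIMER note for a single step looks roughly like the following (the field names match the documented output; all timings and counts here are illustrative):

```
NOTE: DATA statement used (Total process time):
      real time                     0.06 seconds
      user cpu time                 0.02 seconds
      system cpu time               0.01 seconds
      memory                        512.50k
      OS Memory                     14392.00k
      Timestamp                     06/01/2016 10:15:30 AM
      Step Count                    1  Switch Count  2
      Voluntary Context Switches    12
      Involuntary Context Switches  3
      Block Input Operations        0
      Block Output Operations       264
```

The exact set of fields varies by operating system; Windows, for instance, reports a different mix of OS-level counters.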

This data is very useful in helping your SAS administrator and SAS support personnel determine why a SAS process may be running slower than expected. Having this information collected every time a SAS job runs means the data can be used to determine which SAS step ran slower, at what time, and under what circumstances.

Since the IT staff of most organizations collect hardware monitoring data on a daily basis, they can use the information from the SAS log to pinpoint what time of day the performance issue occurred, on which system, and using which file systems.

Again, this is just one way SAS users can be proactive in trying to solve any future performance issues. And all you need to do is add -FULLSTIMER to your SAS configuration file or to the SAS command line that you use to invoke SAS.
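You can also toggle the option from within a program, which is handy when you only want the extra detail around a suspect section of code (a minimal sketch):

```sas
/* Turn FULLSTIMER on for the steps you want to investigate */
options fullstimer;

/* ... the SAS steps you are measuring run here ... */

/* Turn it back off once you have gathered what you need */
options nofullstimer;
```

On the command line it is the same option, for example `sas -fullstimer myprog.sas` (the exact invocation syntax varies by platform, and the program name here is hypothetical).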

If you have any questions on the above, please let us know. Here are additional resources if you want to learn more about SAS FULLSTIMER and its use:

New Year to me is always a stark reminder of the inexorability of Time. In day-to-day life, time is measured in small denominations - minutes, hours, days… But come New Year, and this inescapable creature – Time – makes its decisive leap – and in a single instant, we become officially older and wiser by the entire year’s worth.

I thought I could write a post showing how to be efficient and kill two birds with one stone. The birds here are two of the New Year’s resolutions Raithel proposed:

#2 Volunteer to help junior SAS programmers.

#12 Reduce processing time by writing more efficient programs.

To combine the two, I could have titled this post “Helping junior SAS programmers reduce processing time by writing more efficient programs”. However, I am not going to “teach” you efficient coding techniques, which are a subject deserving of a multi-volume treatise. I will just give you a simple tool that is a must-have for any SAS programmer (not just junior ones) who considers writing efficient SAS code important. This simple tool has always been the ultimate judge of any code’s efficiency, and it is called a timer.

What is efficient?

Of course, if you are developing one-time code to generate an ad-hoc report or produce results for uniquely custom computations, your efficiency criteria might be different, such as “as long as it ends before the deadline” or at least “does not run forever”.

However, in most cases, SAS code is developed for some applications, in many cases interactive applications, where many users run the code over and over again. It may run behind the scenes of a web application with a user waiting (or rather not wanting to wait) for results. In these cases, SAS code must be really fast, and any improvement in its efficiency is multiplied by the number of times it is run.

What is out there?

SAS provides the following SAS system options to measure the efficiency of SAS code:

STIMER. You may not realize that you use this option every time you run a SAS program. It is turned on by default (use NOSTIMER to turn it off) and controls the information written to the SAS Log by each SAS step. By default, each step of a SAS program generates a NOTE like the following in the SAS Log:
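For instance, a DATA step might produce a note of this shape (timings illustrative):

```
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds
```

Real time is elapsed wall-clock time; cpu time is the processor time the step actually consumed.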

While the FULLSTIMER option provides plenty of information for SAS code optimization, in many cases it is more than you really need. On the other hand, STIMER may provide quite valuable information about each step, thus identifying the most critical steps of your SAS program.

Get your own SAS timer

If your efficiency criterion is how fast your SAS program runs as a whole, then you need an old-fashioned timer, with start and stop events and the time elapsed between them. To achieve this in SAS programs, I use the following technique.

At the very beginning of your SAS program, place the following line of code that effectively starts the timer and remembers the start time:

/* Start timer */
%let _timer_start = %sysfunc(datetime());

At the end of your SAS program place the following code snippet that captures the end time, calculates duration and outputs it to the SAS Log:
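A minimal version of that snippet, assuming the `%let _timer_start` line shown above has already run, looks like this:

```sas
/* Stop timer: compute elapsed time and write it to the SAS Log */
data _null_;
  dur = datetime() - &_timer_start;
  put 30*'-' / ' TOTAL DURATION:' dur time13.2 / 30*'-';
run;
```

The `time13.2` format renders the duration as hours:minutes:seconds, which is easier to read in the log than a raw number of seconds.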

Despite its utter simplicity, this timer is a very convenient tool for improving your SAS code efficiency. You can use it to compare or benchmark your SAS programs in their entirety.

Warning. In the above timer, I used the datetime() function, and I insist on using it instead of the time() function that I have seen in many online resources. Keep in mind that the time() function resets to 0 at midnight. While time() will work just as well when the start and stop times fall within the same date, it will produce completely meaningless results when the start time falls on one date and the stop time on another. You can easily fall into this trap when you submit your SAS program right before midnight and it ends after midnight, which will result in an incorrect, even negative, duration.
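To see the pitfall concretely, compare the two approaches across a simulated midnight boundary (a sketch; the literals stand in for actual time() and datetime() calls, and the dates are arbitrary):

```sas
data _null_;
  /* time()-style values: seconds since midnight, reset each day */
  start_t = '23:59:00't;                 /* just before midnight */
  stop_t  = '00:01:00't;                 /* just after midnight  */
  dur_t   = stop_t - start_t;            /* -86280 seconds: wrong */

  /* datetime()-style values: seconds since 01JAN1960, never reset */
  start_dt = '31DEC2016:23:59:00'dt;
  stop_dt  = '01JAN2017:00:01:00'dt;
  dur_dt   = stop_dt - start_dt;         /* 120 seconds: correct  */

  put dur_t= dur_dt=;
run;
```

The time()-based difference goes negative, while the datetime()-based difference is the true two-minute duration.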

I hope using this SAS timer will help you write more efficient SAS programs.

For those of you who have followed my SAS administration blogs, you will know that setting up your IO subsystem (the entire infrastructure from the network/fibre channels in your physical server, across your connections, to the fibre adapters into the storage array, and finally to the physical disk drives in the storage array) is very near and dear to my heart. As my team learns better ways to configure the IO infrastructure, we like to document them in new white papers or by updating existing ones.

We also clarified our position regarding the use of NFS and SAS. NFS can be a reasonable choice for small SAS Grid Manager implementations and/or when performance is less of a concern for the customer. In general, it is not a good idea to place SASWORK on an NFS file system where performance is a concern.

If you’ve used SAS Environment Manager, you know what kind of information it’s capable of providing – the metrics that show you how the resources in your SAS environment are performing. But what if it could do more? What if it could automatically collect and standardize metric data from SAS logs for your SAS applications? What if it could automatically collect and standardize metric data about the computing resources that make up your SAS system? And what if it could store all of that data in a single location, where it could be used to generate detailed predefined reports or to perform your own analysis?

All of this is possible with the SAS Environment Manager Service Management Architecture, which is a new feature in SAS Environment Manager 2.4. These new functions enable SAS Environment Manager to be a part of a service-oriented architecture (SOA) and can help your organization meet your IT Infrastructure Library (ITIL) reporting and measurement requirements.

You can think of the core of this new feature as being made up of three broad functional areas: data collection, data storage and data reporting.

Data collection

Data collection is handled by extract, transform and load (ETL) processes. These processes obtain metric data from your system, convert the data to a standard format and load the data into storage. Three ETL packages are included with SAS Environment Manager:

The Audit, Performance, and Measurement (APM) ETL is used to collect, process and store information from SAS logs. The APM metric data can show you things such as:

which SAS procedures are used most often and how much time each one takes

the top ten users of your SAS Workspace Server and the details of each user’s usage

the average response time of stored processes

The Agent-Collected Metric (ACM) ETL is used to collect, process, and store information about the computing resources in your SAS system. It handles metric data from resources such as servers and disk storage. The ACM metric data can show you things such as:

the disk service time of each file mount

the number of calls to each IOM server

the memory usage for your Web Application Server

The Solution Kits ETL is used to collect, process and store information from specific SAS solutions and applications as those ETL processes are developed and delivered by SAS.

Data storage

The SAS Environment Manager Data Mart handles the data storage function. The Data Mart is made up of a set of predefined SAS data sets that store the metric data that is collected by the ETL processes. Once in the Data Mart, the data is used by the built-in reporting feature of SAS Environment Manager Service Management Architecture. However, the data is also available for you to perform your own analysis using SAS programs and applications or third-party monitoring tools.

Data reporting

The data reporting function uses the Report Center, which provides a wide variety of predefined reports that are tailored to the metric data collected by the ETL processes. These reports enable you to visualize the metric data and more easily identify potential problems. You can also create your own custom reports.

Other capabilities

In addition to these core functions, the SAS Environment Manager Service Management Architecture includes these capabilities:

Setup based on best practices – Alerts, resource definitions, resource groups and metric collection changes (based on service monitoring best practices) are all automatically applied to SAS Environment Manager. This feature doesn’t require you to use the ETL processes, so you can easily optimize your SAS Environment Manager configuration even if you don’t use any other part of the Service Management Architecture.

Event exporting – You can export events from SAS Environment Manager for use by third-party monitoring tools.

Event importing – Your SAS solutions and external applications can generate events that you can import into SAS Environment Manager.

SAS Visual Analytics autofeed – You can specify that the metric data from the SAS Environment Manager Data Mart is automatically copied to a specified location where SAS Visual Analytics can then access it and load it into the application. This feature enables you to use the powerful analysis and reporting capabilities of SAS Visual Analytics with the metric data from SAS Environment Manager.

SAS Environment Manager Service Management Architecture is provided as part of SAS Environment Manager 2.4, but it’s not active by default. To use these new features, you have to follow an initialization procedure for each of the components that you want to use.

When SAS is used for analysis on large volumes of data (in the gigabytes), SAS reads and writes the data using large block sequential IO. To gain the optimal performance from the hardware when doing these IOs, we strongly suggest that you review the information below to ensure that the infrastructure (CPUs, memory, IO subsystem) are all configured as optimally as possible.

Operating-system tuning. Tuning Guidelines for working with SAS on various operating systems can be found on the SAS Usage Note 53873.

CPU. SAS recommends the use of current generation processors whenever possible for all systems.

Memory. For each tier of the environment, SAS recommends the following minimum memory guidelines:

SAS Compute tier: A minimum of 8GB of RAM per core

SAS Middle tier: A minimum of 24GB, or 8GB of RAM per core, whichever is larger

SAS Metadata tier: A minimum of 8GB of RAM per core

It is also important to understand the amount of virtual memory that is required in the system. SAS recommends that virtual memory be 1.5 to 2 times the amount of physical RAM. If, in monitoring your system, it is evident that the machine is paging a lot, then SAS recommends either adding more memory or moving the paging file to a drive with a more robust I/O throughput rate compared to the default drive. In some cases, both of these steps may be necessary.

IO configuration. Configuring the IO subsystem (disks within the storage, adapters coming out of the storage, the interconnect between the storage and processors, and input into the processors) to deliver the IO throughput recommended by SAS will keep the processors busy, allow workloads to execute without delays, and make your SAS users happy. Here are the recommended IO throughput rates for the typical file systems required by the SAS Compute tier:

IO throughput. Additionally, it is a good idea to establish baseline IO capabilities before end users begin placing demands on the system, as well as to monitor the IO if end users begin reporting changes in performance. To test IO throughput, platform-specific scripts are available:

File system. The Best Practices for Configuring IO paper above lists the preferred local file systems for SAS (i.e. JFS2 for AIX, XFS for RHEL, NTFS for Windows). Specific tuning for these file systems can be found in the operating system tuning papers above.

For SAS Grid Computing implementations, a clustered file system is required. SAS has tested SAS Grid Manager with many file systems, and the results of that testing along with any available tuning guidelines can be found in the A Survey of Shared File Systems (updated August 2013) paper. In addition to this overall paper, there are more detailed papers on Red Hat’s GFS2 and IBM’s GPFS clustered file systems on the SAS Usage Note 53875.

Due to the nature of SAS WORK (the temporary file system for SAS applications), which does large sequential reads and writes and then destroys its files at the termination of the SAS session, SAS does not recommend NFS-mounted file systems for it. SAS WORK has a history of file-locking issues on NFS, and the network can negatively influence the performance of SAS when accessing files across it, especially when doing writes.

Storage array. Storage arrays play an important part in the IO subsystem infrastructure. SAS has several papers on tuning guidelines for various storage arrays, available through SAS Usage Note 53874.

Miscellaneous. In addition to the above information, there are some general papers on how to set up the infrastructure to best support SAS; these are available for your review:

Finally, SAS recommends regular monitoring of the environment to ensure ample compute resources for SAS. Additional papers provide guidelines for appropriate monitoring; these can be found in SAS Usage Note 53877.

Scalability is the key objective of high-performance software solutions. “Scaling out” is accomplished by adding more server machines to a solution so that multiple processes can run in dedicated environments concurrently. This post briefly touches on several scalability concepts that affect SAS.

Functional roles

At SAS, we have a number of different approaches to tackle the ability to scale our software across multiple machines. As we often see with our SAS Enterprise Business Intelligence solution components, we’ll split up the various functional roles of SAS software to run on specific hosts. In one of the most common examples, we’ll set aside one machine for the metadata services, another for the analytic computing workload, and a third for web services.

While this is more complicated than deploying everything to a single machine, it allows for a lot of flexibility in providing responsive resources which are optimized for each role. Now, we’re not limited to just three machines, of course.

Clusters

For each of these functional roles – Meta, Compute, and Web – we can scale them out independently of the others. Depending on the technology involved, different techniques must be employed. The Meta and Web functional roles, in particular, are well-equipped to function as clusters.

Generally speaking, a software cluster is composed of services that present as peers to the outside world. A cluster offers scalability and improved availability: any node can perform the requested work, and the cluster continues to offer service in the face of failure of one or more nodes (depending on configuration), among other features.

Grids

The Compute functional role has some built-in ability to act as a cluster if the necessary SAS software is licensed and properly configured – which is pretty great already – but this ability can be extended even further to act as a grid. A grid is a distributed collection of machines that process many concurrent jobs by coordinating the efficient utilization of resources, which may vary from host to host.

With proper implementation and administration, grids are very tolerant of diverse workloads and a mix of resources. For example, it’s possible to inform your grid that certain machines have certain resources available and others do not. Then, when you submit a job to the grid, you can declare parameters on the job that dictate the use of those resources. The grid will then ensure that only machines with those resources are utilized for the job. This simple illustration can be implemented in different ways depending on the kind of resources and with a high-degree of flexibility and control.

Another common component of clusters and grids is the use of a clustered file system. A clustered file system is visible to and accessed by each machine in the grid (or cluster) – typically at the exact same physical path. This is primarily used to ensure that all nodes are able to work with the same set of physical files. Those files might range from shared work product to software configuration and backups, even to shared executable binaries. The exact use of the clustered file system can of course vary from site to site.

Massively Parallel Processing

Extending grid computing even further is the concept of massively parallel processing (or MPP). As we see with Hadoop technology and the SAS In-Memory solutions, a number of benefits can be realized through the use of carefully planned MPP clusters.

One common assumption behind MPP (especially in the implementation of the SAS In-Memory solutions) has historically been that all participating machines are as identical as possible. They have the same physical attributes (RAM, CPU, disk, network) as well as the same software components.

The premise of working in an MPP environment is that any given job (that is, something like a statistical computation or data to store for later) is simply broken into equal-size chunks that are evenly distributed to all nodes. Each node works on its chunk individually, sharing none of its own CPU, RAM, etc. with the others. Since the ideal is for all nodes to be identical and for each to get the same amount of work without competing for any resources, complex workload management capabilities (such as those described for grid above) are not as crucial. This assumption keeps the administrative overhead required for workload management to a minimum.

Hadoop and YARN

Looking forward, one of the challenges of assuming dedicated, identical nodes and equal-size chunks of work in MPP has been that it’s actually quite difficult to keep everything equal on all nodes all of the time. For one thing, this often assumes that all of the hardware is reserved exclusively for MPP use all of the time – which might not be desirable for systems that sit idle overnight, on weekends, etc. Further, while breaking the workload up into equal-size pieces is possible, it’s sometimes tough to keep the work perfectly equal and evenly distributed when there is competition for finite resources.

For these and many other reasons, Hadoop 2.0 introduces an improvement to the workload management of a Hadoop cluster called YARN (Yet Another Resource Negotiator).

The promise of YARN is to better manage resources in a way accessible to Hadoop as well as various other consumers (like SAS). This will help mature the MPP platform, evolving it from the old MapReduce framework into a more flexible platform that can handle a wider variety of workload and resource management challenges.

And of course, SAS solutions are already integrating with YARN to take advantage of the capabilities it offers.

Most organizations enjoy a plethora of SAS user types—batch programmers and interactive users, power users and casual ones—and all variations in between. Each type of SAS user has its own needs and expectations, and it’s important that your SAS Grid Manager environment meets them all.

One common solution to this dilemma is to set up separate configurations based on a mix of requirements for departments, client applications and user roles. The grid options set feature in SAS 9.4 makes this task much easier. A grid options set is a convenient way to name a collection of SAS system options, grid options and required grid resources that are stored in metadata.

Why it’s important to tune SAS Grid Manager for interactive users

SAS Enterprise Guide users running interactive programs typically expect results to be returned almost immediately. However, the out-of-the-box grid options are set for long-running batch jobs. These options include a latency of 20 seconds on the start of every server session, so SAS Enterprise Guide users may experience unhappy delays.

The good news is that SAS Enterprise Guide and other SAS software products are grid-aware. Once the optimal grid options set is defined and named, it is applied automatically whenever a user accesses the application and submits a job.

In this post, I’ll use Platform RTM for SAS to walk you through a few simple steps and provide a set of options that you can use as a baseline for tuning your SAS grid for SAS Enterprise Guide users.

1) Reduce grid services sleep times.

The first tuning to perform is usually at the cluster level, to reduce grid services sleep times so that the interactive session starts faster. In Platform RTM, select Config►LSF►Batch Parameters and edit these settings:

MBD_SLEEP_TIME

SBD_SLEEP_TIME

MBD_REFRESH_TIME

JOB_SCHEDULING_INTERVAL

Never set these values to 0. You should tailor the actual values to your grid, considering factors such as number of nodes, number of concurrent users, patterns of utilization and so forth. You may need multiple iterations to tune performance to suit the needs of your SAS user type. Figure 1 shows a recommended starting point.

Figure 1. Reduce grid services sleep time
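In LSF terms, these settings live in the lsb.params file. As a rough illustration of the kind of entries involved (the values below are placeholders, not official recommendations; use the guidance above and your own testing to choose actual values):

```
# lsb.params (fragment) -- placeholder values for illustration only
Begin Parameters
MBD_SLEEP_TIME = 3            # mbatchd scheduling interval, in seconds
SBD_SLEEP_TIME = 3            # sbatchd polling interval, in seconds
MBD_REFRESH_TIME = 5          # how often mbatchd reloads configuration
JOB_SCHEDULING_INTERVAL = 1   # delay between scheduling cycles
End Parameters
```

Platform RTM writes these files for you when you edit the batch parameters, so editing through RTM is preferable to hand-editing on most sites.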

2) Increase the number of job slots.

SAS Enterprise Guide and SAS Add-In for Microsoft Office are designed to keep the server session open for the full duration of the client session unless a user explicitly chooses to disconnect from the server. For SAS Grid Manager, this open session means that one job slot on that server is taken.

Therefore, for SAS Enterprise Guide use, you have to increase the number of job slots for each machine (use the MXJ parameter) from a default of 1 per core up to 5 or even 10 per core, depending on volume of usage. This step will increase the number of simultaneous SAS sessions on each grid node.

Interactive workloads are usually sporadic and intermittent, with short CPU bursts followed by periods of inactivity while the user reviews the results or explores the data. Because these jobs are not I/O- or compute-intensive like large batch jobs, more of them can safely run on each machine.

3) Implement CPU utilization thresholds for each machine.

Next, it is advisable to implement CPU utilization thresholds for each machine to prevent servers from being overloaded. With this limit in place, even if many users submit CPU-intensive work at the same time, SAS Grid Manager can manage the workload by suspending some jobs and resuming them when resources are available.

Changes in Step 2 and Step 3 are made at the host level. In RTM, select Config►LSF►Batch Hosts►default, edit Max Job Slots value and add the Advanced Attribute ut. See Figure 2.

Figure 2. Increase the number of job slots and set CPU utilization thresholds.
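The equivalent hand-edited configuration lives in lsb.hosts. A sketch of what RTM generates for Steps 2 and 3 might look like this (illustrative only: the 80 slots assume a hypothetical 16-core host at 5 slots per core, and ut is a CPU-utilization fraction):

```
# lsb.hosts (fragment) -- illustrative values, tune for your own hosts
Begin Host
HOST_NAME    MXJ    ut    DISPATCH_WINDOW
default      80     0.9   ()
End Host
```

With ut set to 0.9, LSF stops dispatching new work to a host whose CPU utilization exceeds 90%, which is what allows SAS Grid Manager to suspend and resume jobs as described above.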

4) Create dedicated queues.

Even with this tuning, one user can easily use up all of the slots in a grid by starting many SAS Enterprise Guide sessions or by writing code that uses all the available slots for a single SAS session. When a machine runs out of slots, it is closed for use and work is routed to the next machine with a free slot. If all machines are closed and none has a free slot, no user can get another workspace. It doesn’t matter that the user with many open sessions is not actually using the resources; he or she might go to lunch, leaving a session open on a results page with no CPU, no I/O, nothing used on the server.

The best way to prevent this is to create a dedicated queue, called EGDefault, with a UJOB_LIMIT parameter set low enough (for example, 3 slots, as shown in Figure 3). After that, each user will be limited to 3 concurrent server sessions, whether started from the same client or from different SAS Enterprise Guide instances. When using SAS Enterprise Guide parallel features, the value of UJOB_LIMIT should be higher, provided that proper server sizing has been performed to accommodate the additional resources required.

In RTM, you can create this queue by selecting Config►LSF►Queues►Add. To make this the default queue for SAS Enterprise Guide users, all you have to do is create a grid options set in SAS Management Console and add the EGDefault queue to it as a grid option.

Figure 3. Set job limits in an EGDefault queue.
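Behind the RTM screens, the queue ends up as an lsb.queues entry along these lines (a sketch; the priority and description are illustrative, and UJOB_LIMIT = 3 matches the example above):

```
# lsb.queues (fragment) -- illustrative EGDefault definition
Begin Queue
QUEUE_NAME   = EGDefault
PRIORITY     = 40
UJOB_LIMIT   = 3       # at most 3 concurrent server sessions per user
DESCRIPTION  = Interactive queue for SAS Enterprise Guide sessions
End Queue
```

UJOB_LIMIT caps slots per user across the whole queue, which is exactly the behavior needed to stop one user from monopolizing the grid.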

5) Create other grid options sets as needed.

There will always be ad hoc users or projects that do not fit into the default categories (for example, jobs that have a high priority or jobs that require a large amount of computing resources). For users who require higher priority or more computing resources, it is just a case of defining a new queue such as EGPower. To prevent misuse, it's common to limit access to this special queue to selected users.

In previous releases, additional queues would have been created by defining a special user group and then adding it to the USERS parameter in the queue definition. While effective, this has the disadvantage of duplicating user-related management in both metadata and grid configuration files. With SAS 9.4, it is possible to apply metadata security to grid options sets, keeping everything in one place—that is, in metadata.

6) Set options for other interactive and batch queues.

Finally, if you have other queues (for example, ones dedicated to SAS® Data Integration Studio users or to batch processing), put job slot limits there too, to compensate for the large increase in the Max Job Slots parameter we made for the default hosts. Figure 4 shows the Advanced Attribute PJOB_LIMIT added to a batch queue, to enforce a limit of one batch job per physical core on every host.

Figure 4. Set job slots parameter for batch queue.
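For completeness, a batch queue with that limit might be defined like this in lsb.queues (illustrative; the queue name and description are placeholders, and PJOB_LIMIT = 1 matches the one-job-per-core limit described above):

```
# lsb.queues (fragment) -- illustrative batch queue with a per-core limit
Begin Queue
QUEUE_NAME   = batch
PJOB_LIMIT   = 1       # at most 1 job slot per processor on each host
DESCRIPTION  = Batch queue limited to one job per core on every host
End Queue
```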

When you have all queues defined, your final configuration may look like the following: