Troubleshooting NT Performance Monitoring

This month, I'll explore issues related to Windows NT performance monitoring and resource use. I'll concentrate on the Performance Monitor (Perfmon) tool, but I'll also discuss other tools. Bottlenecks are common in performance monitoring, and you must prevent them to improve task and system performance.

Physicist Werner Heisenberg's uncertainty principle states that measuring both the exact position and the exact momentum of a particle at the same time is physically impossible. A similar effect applies to performance monitoring: the act of measuring changes what you measure. When you monitor performance, you use additional CPU time; if you enable disk monitoring, disk access becomes slower. However, the resource use is usually negligible and rarely affects your results.

If you plan to install new servers, you need to create a baseline performance log of the system for later comparison when the server is running other software or handling greater loads. You can then compare the baseline performance with current use to detect large changes, such as high disk access.
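The baseline comparison can be sketched in a few lines of Python. This is only an illustration, not a Perfmon feature: it assumes you have already reduced a baseline log and a current log to average counter values, and the counter values and the 50 percent threshold are hypothetical.

```python
def large_changes(baseline, current, threshold=0.5):
    """Return counters whose value changed by more than `threshold`
    (expressed as a fraction of the baseline value)."""
    flagged = {}
    for counter, base_value in baseline.items():
        if counter not in current or base_value == 0:
            continue  # nothing to compare against
        change = (current[counter] - base_value) / base_value
        if abs(change) > threshold:
            flagged[counter] = change
    return flagged


# Hypothetical averages taken from a baseline log and a current log.
baseline = {"% Disk Time": 20.0, "Pages/sec": 5.0, "% Processor Time": 30.0}
current = {"% Disk Time": 65.0, "Pages/sec": 6.0, "% Processor Time": 33.0}

for counter, change in large_changes(baseline, current).items():
    print(f"{counter}: {change:+.0%} versus baseline")
```

With these sample values, only the disk counter is flagged, which is the kind of large change a baseline is meant to expose.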

Task Manager (taskmgr.exe) lets you view all the processes and applications that are currently running, including the amount of memory and CPU time they're using. It also displays total memory use and CPU percentage use. This tool lets you quickly evaluate your system's condition and identify which processes are using memory and CPU time. To run Task Manager in NT 4.0, press Ctrl+Alt+Del and select Task Manager, or right-click the taskbar and select Task Manager.

Performance Monitor (Perfmon) is the best tool for evaluating resource use. Perfmon has exhaustive counters, and unlike Task Manager, it can log results to a file so that you can view and diagnose the results later. With the proper privileges, you can also monitor resource use for other computers on the network. Several software products add counters to Perfmon to augment the tool's monitoring abilities. You'll find Perfmon in the Administrative Tools program group. (For more information about performance monitoring, see "Related Articles in Windows NT Magazine," page 166.)

Process Explode (pview.exe) monitors all aspects of a process, such as the number of threads the process is using and the type and amount of committed mapped memory. This tool might interest developers. However, Process Explode is not useful for general users or administrators who want an overview of system resource use.

Quick Slice (qslice.exe) is a simple application that graphically displays the percentage of CPU that each active process uses. This tool gives only basic information, but it is useful for a quick graphical overview of per-process CPU use.

The resource kit includes Process Explode and Quick Slice. Several of Microsoft's Visual Development tools (e.g., Visual C++) include Process Explode.

Q: How can I identify a bottleneck on a Windows NT system?

Bottlenecks are common and can affect system performance. A bottleneck typically involves a system's memory, processor, disk storage system, network, or process. Bottlenecks might also involve more general I/O devices, such as serial ports and display. You can use Performance Monitor (Perfmon) to evaluate resources and their counters and determine which resource is causing the bottleneck.

Monitor the whole system rather than only the resource that you suspect is causing the problem. For example, you might view only the disk object and see high disk use, implying that you need faster disks or controllers. However, if you evaluate other usage statistics, you might discover that your disk access is high because of another problem. You might evaluate memory use and find that system paging is high (memory areas are being written to and read from disk) because the system lacks available physical memory; thus, instead of replacing disks, you must add memory.
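The reasoning in this example can be written as a small decision function. The sketch below is an assumption-laden illustration: the threshold values are hypothetical placeholders (they do not come from Table 1), and you would substitute values appropriate to your own systems.

```python
def likely_bottleneck(disk_time_pct, pages_per_sec, available_mb,
                      busy_disk_pct=80, heavy_paging=20, low_memory_mb=4):
    """Busy disks combined with heavy paging and little free memory
    point to memory, not to the disks themselves.
    All thresholds are hypothetical defaults."""
    if disk_time_pct < busy_disk_pct:
        return "disk is not the bottleneck"
    if pages_per_sec > heavy_paging and available_mb < low_memory_mb:
        return "memory (paging is keeping the disks busy)"
    return "disk"


# Same disk load, two different causes.
print(likely_bottleneck(disk_time_pct=95, pages_per_sec=60, available_mb=2))
print(likely_bottleneck(disk_time_pct=95, pages_per_sec=3, available_mb=40))
```

The two calls show the same high disk use leading to different conclusions once the memory counters are taken into account.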

Perfmon includes several counters for each resource. (Resources are also called objects. See Table 1 for counters that you can use to determine which resource on the system is causing the bottleneck.)

You can easily set up Perfmon to view resource use. From the Start menu, select Programs, Administrative Tools, Performance Monitor. Make sure Perfmon is in chart view (from the View menu, select Chart). You can then add counters to the chart by clicking the plus button or selecting Add To Chart from the Edit menu. A dialog box lets you choose objects and add counters, as Screen 1 shows. As you change the object choice, the counters change.

Select the object you want to monitor, select the relevant counters, and click Add. The chart will display the counters you have selected. By default, Perfmon updates counters once a second. To change this setting, select Chart from the Options menu and change the update time interval.

After you identify the resource that is causing the bottleneck, you need to identify the process that is causing the resource use. To monitor the resources that individual processes are using, select Process as the object and select the instance you want to monitor. You can use the instance option to view a list of all the processes that are running, because each process has an instance. You can monitor several of the main counters at a process level.

The Perfmon chart view requires you to constantly view the performance monitor to look for the guilty resource. An alternative is the log mode, which I explain in my next response.

Q: How do I set up a Performance Monitor log?

Usually you'll want to monitor performance over a long period and then review it. The log view lets you configure the required counters, and then Performance Monitor (Perfmon) sends the output to a file for later review.

Start Perfmon. From the View menu, select Log to see the log view. To add objects, click the plus button or select Add to Log from the Edit menu. A dialog box lets you select objects; you do not specify counters at this time.

From the Options menu, select Log. Select a location for Perfmon to store the log file in, and specify a filename. Select a disk that has sufficient free space, because the log might become large if you let it run for a long time. For example, if you monitor the objects LogicalDisk, Memory, Process, and System, the log grows by about 10KB each time Perfmon updates it. If you monitor every 10 seconds, after a 12-hour period the log will be about 40MB.
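You can check the log-size arithmetic before you pick a disk. The following minimal sketch assumes that every update adds a fixed amount to the log, which is a simplification; the function name and parameters are hypothetical.

```python
def estimated_log_size_kb(kb_per_update, interval_seconds, hours):
    """Estimate log growth, assuming one fixed-size record per update."""
    updates = hours * 3600 // interval_seconds
    return updates * kb_per_update


size_kb = estimated_log_size_kb(kb_per_update=10, interval_seconds=10, hours=12)
print(f"{size_kb} KB, about {size_kb / 1024:.0f} MB")
```

For the figures in the example above (10KB per update, every 10 seconds, for 12 hours), this works out to 4,320 updates and roughly 42MB. Lengthening the interval to 15 seconds reduces the estimate to about 28MB.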

You must select the update interval. You might be tempted to choose once a second, but each reading uses resources. The default setting of 15 seconds is reasonable.

After you enter all the details, start the log. From the Options menu, select Log and click Start Log, but do not close Perfmon. After you collect enough data, stop the log. From the Options menu, select Log and click Stop Log.

To view the Perfmon log data, you must use the chart mode. From the View menu, select Chart. From the Options menu, select Data From. Then, click Log File and select the data source. Click OK. You can then add counters; however, you will see only the objects that you saved to the log.

You cannot simultaneously view a current activity chart and a log chart. To view both, you must run two Perfmon instances: one for the current chart, and one for the log chart.

When you monitor disks, you introduce overhead on the system that causes slower disk access. By default, the system does not activate the extra driver and timer that it needs. You can add disk counters, but they will not collect information. (They will display zeros.) The disks are not idle, but Performance Monitor (Perfmon) has no mechanism to collect data.

If you have software-enabled RAID volumes, you must install the new driver below ftdisk.sys (the fault-tolerant driver) in the I/O manager disk-driver stack so that you can gather information about the physical disks. If you are not using RAID, you must place the driver above the fault-tolerant driver.

To enable the extra driver, start a command session (cmd.exe). If you are not using RAID, enter

diskperf -y

If you are using RAID, enter

diskperf -ye

You will see the message "Disk Performance counters on this system are now set to start at boot. This change will take effect after the system is restarted." You must then restart the computer.

To disable the disk-monitoring driver, enter

diskperf -n

After you reboot the computer, the driver will not start. The diskperf command changes the computer's Registry value for HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Diskperf\Start (0x0 for on; 0x4 for off).

Q: Can I set logs to automatically start and stop?

The Microsoft Windows NT Resource Kit supplies utilities called datalog.exe and monitor.exe. You can use these utilities to set performance monitoring logs to automatically start and stop. You can also start and stop monitoring from the command line, but you must first create a Performance Monitor (Perfmon) workspace.

Start Perfmon, select log mode, add the objects you want to monitor, and set the update interval and file location. Then, from the File menu, select Save Workspace. Enter a filename with a .pmw extension, and save it in the %systemroot%\system32 directory.

To create the datalog service, copy datalog.exe and monitor.exe from the resource kit area to the %systemroot%\system32 folder.

To install the service, go to the command shell on the system from which you want to run the datalog service, and type

monitor setup

To use the monitor service from the command prompt, you must first tell the service which configuration file to use (the workspace file you created; e.g., monitor.pmw). You don't need to enter an explicit path, because you copied this file to the %systemroot%\system32 folder. Type

monitor

To start the monitor, type

monitor start

To stop the monitor, type

monitor stop

You can use the at command to schedule the start and stop commands. For example:

at 10:00 /every:M,T,W,Th,F "monitor start"

at 20:00 /every:M,T,W,Th,F "monitor stop"

In this example, the monitor starts at 10:00 a.m. and stops at 8:00 p.m. on weekdays. Use Perfmon to view the log file that the monitor creates.

Q: How do I use the Excel performance macro?

Users who are inexperienced with performance monitoring might want to try the Excel macro; you can use it to make recommendations on your system configuration. You can download the Excel macro from http://www.savilltech.com/download/perfmon.zip. This macro is basic and is useful only as a guide.

Create a log file of your system, including the following counters:

Memory—Pages/sec

Memory—Available Bytes

PhysicalDisk—% Disk Time

PhysicalDisk—Current Disk Queue Length

Processor—% Processor Time

Processor—Interrupts/sec

System—Processor Queue Length

Save the log file with a .csv extension. Start Excel. From the File menu, select Open and load perfmon.xla.

After you load the file, you will see a new menu item: Planning. From the Planning menu, select Open, and load the log file. After you load the log file, select Bottlenecks from the Planning menu. You will see a dialog box, which will show any system comments or warnings.

You can also use the Excel macro to create a chart. From the Planning menu, select Create Chart, and select the counters you want to include.

Q: Can I view information about a process that started after I started the Performance Monitor log?

If you specified Process as one of the objects when you created the log file, you can later add counters about the processes (instances) that are running. However, the only instances you can see are those that were running when you started the log. You must tweak the system to add processes that started later.

Start Performance Monitor (Perfmon) and select chart view. To load the log that you created earlier, select Options, Data From, and the file. Then, add the counters you want to view: click the plus button or select Add To Chart from the Edit menu. The only instances you will see under the Process object are those that were running when you started the log.

If you want to investigate a particular area (e.g., a spike in CPU use, disk I/O), alter the time window to start from the peak. From the Edit menu, select Time Window. Then, move the left bar until the left line is in the correct place on the chart (i.e., at the spike), as Screen 2, page 166, shows. Click OK.

From the Edit menu, select Add To Chart. You will see the processes that were running when you altered the time window. You can then identify the process that is causing the problem. After you reset the time window to the original start time (i.e., when you started the monitor log), the chart will still display the problem process. The chart will display counter information for this process only after the time when you started the process.
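The effect of moving the time window can be modeled with a few lines of Python. This is a simplified sketch of the behavior described above, not of Perfmon's internals; the sample data, process names, and function name are hypothetical.

```python
# Hypothetical log samples: (seconds since log start, total CPU %, processes running).
samples = [
    (0, 12, {"System", "explorer"}),
    (15, 14, {"System", "explorer"}),
    (30, 97, {"System", "explorer", "runaway"}),
    (45, 95, {"System", "explorer", "runaway"}),
]


def instances_at_window_start(samples, window_start):
    """Return the processes present in the first sample at or after window_start."""
    for seconds, _cpu, processes in samples:
        if seconds >= window_start:
            return processes
    return set()


# Time of the highest CPU reading, i.e. where you would drag the left bar.
peak_time = max(samples, key=lambda s: s[1])[0]

print(sorted(instances_at_window_start(samples, 0)))
print(sorted(instances_at_window_start(samples, peak_time)))
```

With the window at the start of the log, the hypothetical runaway process is missing from the instance list; with the window moved to the spike, it appears and you can chart it.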

Q: How do you configure performance alerts?

Windows NT includes several alerts (e.g., when the boot partition free disk space falls below 50MB, NT generates an alert). You can also create alerts.

Start Performance Monitor (Perfmon). From the View menu, select Alert. To create a new alert, click the plus button or select Add To Alert from the Edit menu. In the dialog box, select the object and counter. In the Alert If section, enter a value and select Over or Under. For example, select % Processor Time, enter 90, and choose Over; NT will then generate an alert if CPU use rises above 90 percent. You can also specify a program to run when the alert occurs, such as one that writes to the event log (e.g., sqlalrtr -E if you have SQL Server installed). When you finish, click Done.
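The Over and Under settings amount to a simple threshold test on each reading. The following sketch illustrates that logic only; it is not how Perfmon is implemented, and the function name and sample readings are hypothetical.

```python
def alert_fires(value, threshold, condition):
    """Mimic the Alert If setting: `condition` is "over" or "under"."""
    if condition == "over":
        return value > threshold
    if condition == "under":
        return value < threshold
    raise ValueError("condition must be 'over' or 'under'")


# Hypothetical % Processor Time readings, one per update interval.
readings = [35, 72, 91, 100, 64]
alerts = [r for r in readings if alert_fires(r, 90, "over")]
print(alerts)
```

Only the readings above 90 trigger the alert. A free-disk-space alert would use the same test with "under" instead.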

Q: Can non-administrators monitor the performance of other computers in the network?

When you add counters to a chart or log, you can specify the computer name if you want to monitor a remote computer's resource use. You must be an administrator to add counters for Windows NT Server machines; however, an administrator can configure a computer to give non-administrators access.

Log on to the computer that you want non-administrators to monitor remotely. If the boot partition is NTFS, move to %systemroot%\system32 (e.g., d:\winnt\system32). Right-click the files perfcnnn.dat and perfhnnn.dat, and select Properties. (The variables nnn are numbers that vary depending on which language version of NT you have installed—e.g., 009 for U.S. English. This modification corresponds to the Registry entry HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\nnn.) Click the Security tab, and select Permissions. Make sure that all the users who will be remotely monitoring have at least read access.

Start the Registry editor, regedt32.exe (regedit.exe does not let you modify security on keys). Select HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib, and from the Security menu, select Permissions. Again, give the users at least read permission, and select Replace Permission On Existing Subkeys. Click OK. Repeat the previous step, this time on the Registry setting HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SecurePipeServers\winreg. For NT 4.0, you need Service Pack 3 (SP3); for NT 3.51, you need SP5 with the security hotfix. After you configure the computer, non-administrators can remotely monitor it.

Q: What information does Task Manager give?

Task Manager has a limited built-in performance monitor that shows CPU and memory use. Click on the Performance tab to view the charts and usage bars.

When you run Task Manager, the program displays a small CPU usage bar on your taskbar, next to the time. Task Manager displays this usage bar until you exit the program.

You can configure Task Manager to show CPU use by the kernel. From the View menu, select Show Kernel Times. The kernel display appears as a red line on the CPU graph.

Click the Processes tab to see all active processes, the total amount of CPU time they have used, and the amount of physical memory they are currently using. You can stop a troublesome process by selecting the process and clicking End Process. Be careful, though: ending a process that the system depends on can bring down the system, and you will lose any unsaved work. The columns you see in Screen 3 are defaults. To add columns, choose Select Columns from the View menu, and check the columns you want to add. You can also remove the columns you don't want.