Where should you store your PST files?

This article has been contributed by Ryan Doon, a PFE working with Microsoft Canada. In this post, he talks about why storing PST files on a file server will lead to performance issues, and presents some options to proactively avoid those issues.

As a Premier Field Engineer at Microsoft, I am always looking for ways to optimize an environment. Whether that environment is a single computer or many servers, there are always best practices that can help improve its performance and reliability.

The optimal location to store a PST is quite a hot topic. The Windows Server Performance Team created a very detailed blog post about this topic. In this blog post, I have attempted to simplify the message for easier understanding and I also detail possible options that you have for avoiding those issues.

Introduction

PST files were created to allow users to maintain a copy of their messages on their local computers. PST files also serve as a message store for users who do not have access to a Microsoft Exchange Server computer (for example, POP3 or IMAP email users). PST files were never designed to be a network-based solution. It is possible to specify a network location for a PST file, but this is not meant for long-term, continuous use.

I/O patterns

PST files are accessed through a file-access-driven method: Outlook uses the file access commands that the operating system provides for reading and writing files. This works very well for files on local disks, but it is a poor fit for files on a file server reached over a LAN or WAN. Traffic between systems is better served by a network-access-driven method, which uses operating system commands designed to send and receive data between systems connected to the network.

Because a PST is a file-access-driven message store, it is efficient when stored locally and inefficient over LAN or WAN links, which are built around network-access-driven methods. When a PST file is remote (reached over a network link), Outlook still uses file commands to read from and write to it. The operating system must then send those commands over the network because the file is not on the local computer. This creates a great deal of overhead and increases the time required to read and write the file.
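As a rough way to find PST files that are opened over the network, the sketch below flags paths written in UNC form (\\server\share\...). It is only an illustration: the function names and sample paths are hypothetical, and a mapped drive letter that points at a file share looks like a local path and would not be caught by this check.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True if the path is in UNC form (server and share prefix)."""
    # For a UNC path, PureWindowsPath reports the server and share as the drive.
    drive = PureWindowsPath(path).drive
    return drive.startswith("\\\\")


def find_network_psts(paths):
    """Return the PST paths from the list that are written as UNC paths."""
    return [
        p for p in paths
        if p.lower().endswith(".pst") and is_unc_path(p)
    ]


# Hypothetical sample data, for example gathered from Outlook profiles.
sample_paths = [
    r"C:\Users\alice\Documents\Outlook Files\archive.pst",
    r"\\fileserver\users\bob\mail\archive.pst",
    r"\\fileserver\users\bob\notes.txt",
]
network_psts = find_network_psts(sample_paths)
```

Any path this returns is a candidate for moving to local storage or for one of the import options discussed later in this article.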

What can happen?

First, using a PST file over a network connection may corrupt the file if the connection degrades or fails. In addition, writes to a PST can take four times longer than reads.

Organizations that host PST files on a file server often see system hangs or resource depletion. The hangs typically occur when Outlook and the PST files are heavily used, for example in the morning when users log on and launch Outlook. As mentioned above, placing PST files on the network creates a lot of overhead because PST files are file-access-driven and not network-access-driven. When many users open their PST files at once, the server receives requests for many gigabytes of data at the same time. These requests require a lot of disk and network I/O to be processed simultaneously. The file server “freezing” while it tries to service all the requests is a very common scenario.

The queuing in the server service work queues is what causes this temporary hang. The server service uses work items to handle I/O requests that come in over the network - for example: a request to extend a PST file. These work items are queued in the server service work queues, and from there they are handled by the server service worker threads. The work items are allocated from a kernel resource called Non-Paged Pool (NPP).

The server service sends these I/O requests down to the disk subsystem. If, for reasons mentioned above, the disk subsystem does not respond in time due to heavy requests, the incoming I/O requests are queued via work items in the server work queues. Since these work items are allocated from NPP, eventually this resource runs empty. Running out of NPP causes systems to hang eventually.
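The queue buildup described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of how the Server service is implemented: the function name, the fixed per-tick rates, and the pool size expressed as a number of work items are all made up for illustration.

```python
def ticks_until_pool_exhausted(arrivals_per_tick, completions_per_tick,
                               pool_size, max_ticks=10_000):
    """Return the first tick at which queued work items exceed the pool.

    Returns None if the queue never outgrows the pool within max_ticks.
    """
    queued = 0
    for tick in range(1, max_ticks + 1):
        # New I/O requests arrive and are queued as work items.
        queued += arrivals_per_tick
        # The disk subsystem completes at most a fixed number per tick.
        queued -= min(queued, completions_per_tick)
        # Each queued work item holds memory from the (modeled) pool.
        if queued > pool_size:
            return tick
    return None


# Disk keeps up: the queue drains every tick and the pool is never exhausted.
healthy = ticks_until_pool_exhausted(80, 100, pool_size=1000)

# Disk falls behind by 20 requests per tick: the backlog grows steadily.
overloaded = ticks_until_pool_exhausted(120, 100, pool_size=1000)
```

The point of the model is that even a small, sustained gap between incoming requests and completed disk I/O eventually consumes any fixed pool; a larger pool only delays the hang.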

On file servers that host PST files, you will often see SRV errors in the event logs. These errors occur because the Server service cannot keep up with the demand for network work items that are queued by the network layer of the I/O stream. The Server service cannot pass the requested network I/O items to the hard disk quickly enough and exhausts the available resources, which leads to system hangs.

Other considerations

From an Exchange Server perspective, storing PST files anywhere poses several issues in addition to those listed above:

Data duplication: When network PSTs become corrupt, they are often restored. Typically the analyst makes a copy of the original corrupted PST and leaves it behind “just in case”, which duplicates the data.

Data leakage: PST files allow users to take data offsite, sometimes with good intentions and sometimes with malicious intent to steal data. PSTs can be password protected, but these passwords can normally be broken within minutes with very little effort. A lost or stolen laptop or USB drive could therefore mean compromised corporate data.

E-discovery: In HR or legal discovery cases, PSTs stored locally or on a network file share cannot be included in typical discovery methods.

PST bloat: PSTs increase storage requirements. The same data takes up less space in an Exchange database than in a PST. This bloat can reach 25% or more, and is caused by the overhead of the PST structure itself and the way it stores the data.
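To make the bloat figure concrete, here is a small arithmetic sketch. The function name and the sample size are hypothetical, and the 25% default is simply the figure quoted above; actual overhead varies with the data.

```python
def estimated_pst_size_gb(exchange_size_gb, bloat_fraction=0.25):
    """Estimate the PST size for data that occupies exchange_size_gb in Exchange."""
    return exchange_size_gb * (1 + bloat_fraction)


# 40 GB of mailbox data in Exchange, exported to PST with 25% bloat.
pst_size = estimated_pst_size_gb(40)
extra_storage = pst_size - 40
```

Under these assumptions, 40 GB of Exchange data needs about 50 GB as PST files, so 10 GB of storage goes to overhead alone.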

How do we fix this?

While there are several options for addressing the issue of what to do with all the PST data, each brings a different set of challenges.

Option 1: Store PST data locally on workstations

While this may be the easiest and least expensive option to implement, it poses the most risk in terms of data loss and protection.

Backup and recovery of local PSTs becomes more complex and requires an investment in technology and support.

Data leakage, for example through a lost or stolen laptop or a user stealing corporate data. Encryption technologies such as BitLocker Drive Encryption help minimize the impact of such cases.

No facility for e-discovery in legal or HR cases. PSTs would have to be collected and searched individually.

Option 2: Import PST data into Exchange

Each archival option below is described together with its advantages and the considerations that apply to it.

In the first variation of this option, we import PST data back into the Exchange user’s primary mailbox.

Advantages:

End users and administrators need no additional training, since the data simply becomes part of the existing database infrastructure.

Considerations:

Mailbox quotas must increase, which has implications for primary storage.

This can also lead to extremely high item counts per folder in some situations, depending on how large an individual’s PST datasets are. Performance on the Exchange servers starts to suffer at approximately 20,000 items per folder for Exchange 2007 and at around 100,000 items per folder for Exchange 2010.

Growing Exchange stores to larger sizes can also create challenges for backup and recovery windows, depending on the technology employed.
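Before importing, it can help to compare expected folder sizes against the per-folder thresholds mentioned above. The sketch below assumes you already have item counts per folder from some inventory step; the function name, the dictionary layout, and the sample counts are hypothetical, and the limits are the approximate figures quoted in this article.

```python
# Approximate per-folder item counts before performance degrades,
# as quoted in this article.
ITEM_LIMITS = {"2007": 20_000, "2010": 100_000}


def folders_over_limit(folder_item_counts, exchange_version):
    """Return the names of folders whose item count exceeds the limit."""
    limit = ITEM_LIMITS[exchange_version]
    return sorted(
        name for name, count in folder_item_counts.items()
        if count > limit
    )


# Hypothetical item counts after merging a user's PST data into the mailbox.
counts = {"Inbox": 25_000, "Sent Items": 8_000, "Archive 2009": 120_000}

risky_on_2007 = folders_over_limit(counts, "2007")
risky_on_2010 = folders_over_limit(counts, "2010")
```

Folders that exceed the threshold could be split into subfolders (for example by year) before or during the import.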

You can also consider importing PST data back into the Exchange user’s Archive mailbox (Exchange 2010 only).

Advantages:

This allows the use of lower cost storage.

There is little to no training requirement, because the Archive mailbox looks and feels the same as a PST does today. It is seamless to the end user and to the administrator.

The Archive mailbox is simply another Exchange store and is administered as such.

It can also participate in high availability with no additional requirements.

Considerations:

Growing Exchange stores to larger sizes can also create challenges for backup and recovery.

High item counts are also a concern as archive mailboxes grow large; however, the performance impact can be isolated from the primary mailbox.

You can also consider importing PSTs back into Exchange using a third-party archiving solution such as Enterprise Vault (see note 1 below).

Advantages:

This can be done without storing anything in Exchange other than possibly a stub object in user folders.

Considerations:

Care must be taken with the high item count issues listed above: a stub item carries the same cost as a native item in Exchange, regardless of the size of the item.

The use of third-party technology is a potential support overhead, and it carries increased storage costs as the archive grows. High availability configuration for this solution is more complex than for the previous options.

Finally, you could consider using cloud facilities to provide a hybrid solution (compatible with Exchange 2010 only).

Advantages:

This maintains the existing on-premises Exchange infrastructure but uses Microsoft online services to store Archive data. The benefits are predictable cost models and no care and feeding of servers (for example, no associated power and physical location costs), and therefore lower support costs.

Considerations:

One of the challenges of this approach is the political aspect of storing critical corporate data elsewhere, potentially on foreign soil.

1 Please note that this is a third party product and is mentioned here as-is for illustration purposes. Microsoft does not endorse or recommend a specific third party product.

Conclusion

To ensure file server health, PST files should never be hosted on a file server. The question is not whether hosting PST files on a file server will cause an issue, but when.

We have also presented some server-side options that make mail easier to manage. Using the information presented in this article, you can decide which method of managing mailbox content works best for your setup.

I am having an issue with my .pst file disappearing from Outlook after I exit the program and open it again. Whenever I open Outlook, I have to open the .pst or add it again from the account settings window. The .pst file is located on a network hard drive that is backed up every day, not locally on my workstation.

I was wondering if this is something you have seen before and if saving the data file locally would prevent it from disappearing.

Corruption is a serious issue with Outlook .pst files and sometimes causes serious data loss. To handle this problem, Microsoft provides the built-in scanpst tool, but it does not always work, and users can still face data loss in the repaired .pst file. There are, however, third-party tools on the market that resolve corruption and data loss problems in MS Outlook. One such tool that did a great job in our case: http://www.outlook-pstrepair.com/

I am happy to see all my emails and other data back in my folders. It’s really superb! I would recommend this utility to those who are not able to fix corruption or data loss issues in Outlook with the scanpst tool.

We have been storing Outlook 2003 PST files on our file servers for years with no problems. However, with 2010 we now have loads of data corruption where the PST zeroes back to 64 KB.
I get the feeling that this is a deliberate ploy by MS. If this happens with our 2003 files from now on, I will know.