DFSR: How to properly Size the Staging Folder and Conflict and Deleted Folder

The Distributed File System Replication (DFSR) service is a new multi-master replication engine that is used to keep folders synchronized on multiple servers.

Replicating data to multiple servers increases data availability and gives users in remote sites fast, reliable access to files. DFSR uses a protocol called Remote Differential Compression (RDC). RDC is a “diff over the wire” protocol that can be used to efficiently update files over a limited-bandwidth network: it detects insertions, removals, and rearrangements of data in files, enabling DFSR to replicate only the deltas (changes) when files are updated.

DFSR offers many benefits, from high data availability on local networks to faster file access at remote sites. One important thing to consider, though, is how to properly size the staging folder.

Optimizing the size of the staging folder can improve performance and reduce the likelihood of replication failing. If a staging folder quota is configured to be too small, DFS Replication might consume additional CPU and disk resources to regenerate the staged files. Replication might also slow down, or even stop, because the lack of staging space can effectively limit the number of concurrent transfers with partners.

Microsoft recommends sizing the staging folder to the combined size of the 32 largest files in your replicated folder. For read-only members, you only need the space of the 16 largest files. This becomes tricky, as you can imagine. In my case, for example, I store .IMG files for installation purposes that can be about 4 GB each. Multiply 4 by 32 and you’ve got yourself quite a large staging quota (128 GB). It is important to understand that staging would benefit from the full 128 GB during initial replication, but in day-to-day operation it comes down to how actively files change. If most of my files are Word documents and daily changes don’t amount to even 1 GB, almost all of that staging area is going to waste. Another approach you can follow is to assign the calculated staging folder size initially and then resize it as you see fit, by observing the staging folder size and the event logs.
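If you go the resize-later route, the staging quota can be adjusted with the DFSR PowerShell module (Windows Server 2012 R2 and later). A minimal sketch, where the group, folder, and server names are placeholders you would replace with your own:

```powershell
# Hypothetical replication group, folder, and member names -- substitute your own.
# StagingPathQuotaInMB sets the staging quota in megabytes; 128 GB = 131072 MB.
Set-DfsrMembership -GroupName 'RG01' -FolderName 'Installers' `
    -ComputerName 'FS01' -StagingPathQuotaInMB 131072
```

On older servers the same change can be made through the DFS Management console on the membership's Staging tab.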

During normal operation, if the event indicating that the staging area is above its configured quota (event ID 4208 in the DFS Replication event log) is logged multiple times in an hour, increase the staging quota by 20 percent. Remember, this only applies during normal operation, not while you are doing an initial replication to a server. There are also low-watermark and high-watermark messages you can use to gauge how close you are getting:

The DFS Replication service has detected that the staging space in use for the replicated folder at local path U: is above the high watermark. The service will attempt to delete the oldest staging files. Performance may be affected.

Ideally you want to avoid error 4208 at all costs, but warnings such as the one above are acceptable. During initial replication you might get them frequently, but during normal operation you need to watch them carefully: if they are common and you then lose network connectivity, the resulting backlog can push you into error 4208. Conversely, if you are constantly below the low watermark, you might have oversized your staging folder. I know this sounds like a lot of trouble, but proper sizing is going to give you the best performance. Size it horribly wrong and not only will performance be awful, you will also risk DFSR not doing its job properly. If it comes down to a choice between the two, I recommend over-sizing rather than under-sizing.
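The event-log check described above can be scripted rather than eyeballed. A minimal sketch, assuming the default "DFS Replication" event log name and that it is run on the replication member itself:

```powershell
# Count staging-quota-exceeded events (ID 4208) logged in the last hour.
$since  = (Get-Date).AddHours(-1)
$events = Get-WinEvent -FilterHashtable @{
    LogName = 'DFS Replication'; Id = 4208; StartTime = $since
} -ErrorAction SilentlyContinue

# Multiple 4208 events per hour is the trigger for the 20% quota increase.
if ($events.Count -gt 1) {
    "4208 logged $($events.Count) times in the last hour; consider raising the staging quota by 20%."
}
```

The same filter with the high-watermark warning's event ID would let you track how often you approach the quota without exceeding it.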

Below is a PowerShell command you can run to determine the combined size of your largest files:
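A sketch along these lines, where D:\Share is a placeholder for your replicated folder path (use -First 16 instead of -First 32 for a read-only member):

```powershell
# Sum the sizes of the 32 largest files under the replicated folder
# and report the result in GB. D:\Share is a placeholder path.
Get-ChildItem 'D:\Share' -Recurse -File |
    Sort-Object Length -Descending |
    Select-Object -First 32 |
    Measure-Object -Property Length -Sum |
    ForEach-Object { '{0:N2} GB' -f ($_.Sum / 1GB) }
```

The result is the minimum staging quota per the sizing rule above; round up to leave headroom for churn.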