REPLICATION LATENCY WARNING [Replications Check,<DC name>] This replication path was preempted by higher priority work. From <source DC> to <destination DC> Naming Context: DC=<DN path> The replication generated an error (8461): The replication operation was preempted. Replication of new changes along this path will be delayed. Progress is occurring normally on this path.

Symptom 2:

REPADMIN.EXE reports that the last replication attempt was delayed for a normal reason, result 8461 (0x210d).

REPADMIN commands that commonly cite the 8461 status include but are not limited to:

REPADMIN /REPLSUM

REPADMIN /SHOWREPL

REPADMIN /SHOWREPS

REPADMIN /SYNCALL

Symptom 3:

Repadmin /rehost fails and generates status code 8461.

Symptom 4:

Repadmin /queue output reveals one or more tasks that have a status of "PREEMPTED."

The "replicate now" command in Active Directory Sites and Services (DSSITE.MSC) fails and generates the following error message:

The replication operation was preempted.

Dialog title: Replicate Now

Message text: The following error occurred during the attempt to synchronize naming context <DNS name of directory partition> from domain controller <source DC> to domain controller <destination DC>: The replication operation was preempted.

The following number of operations is waiting in the replication queue. The oldest operation has been waiting since the following time. Time: <date> <time> Number of waiting operations: <value> This condition can occur if the overall replication workload on this domain controller is too large or the replication interval is too small.

NTDS Replication 2094

Performance warning: replication was delayed while applying changes to the following object. If this message occurs frequently, it indicates that the replication is occurring slowly and that the server may have difficulty keeping up with changes.

Source: NTDS Replication

Event ID: 1839

Type: Warning

Description: The following number of operations is waiting in the replication queue. The oldest operation has been waiting since the following time.

Time:

Number of waiting operations:

This condition can occur if the overall replication workload on this domain controller is too large or the replication interval is too small.

Note Event 1580 is also logged when long-running replication tasks complete if verbose replication diagnostics logging has been set to a value of 0x1 or greater.

A long-running Active Directory Domain Services inbound replication task has finished with the following parameters.

Elapsed time (minutes): 84

Operation: Synchronize Replica

Options: 0x21000051

Parameter 1: DC=Contoso,DC=com

Parameter 2: <source DC's NTDS Settings object GUID>

Parameter 3:

Parameter 4:

A long-running replication task may also occur when a system has been unavailable or a directory partition has been unavailable for an extended period of time. A long-running replication task may indicate a large number of updates, or a number of complex updates occurring at the source directory service. Performing these updates during non-critical times may prevent replication delays.

A long running replication task is normal in the case of adding a new directory partition to Active Directory Domain Services. This can occur because of a new installation, global catalog promotion, or a connection generated by the Knowledge Consistency Checker (KCC).

Additional Data

Error value:

The operation completed successfully.

Cause

This replication status is returned when there are higher priority replication tasks in the destination DC's inbound queue. It does not indicate a failure condition. The replication task is not cancelled; instead, it is put into a holding pattern until the higher priority work is completed. It is normal to see this message returned periodically in larger environments, and the condition is usually transient.

While this issue is common and usually transient, some other AD replication problems can cause a backlog in the queue. If this occurs, you might start seeing replication tasks being preempted frequently. Frequent logging of event 2094 "Performance warning" (sample events are shown in the "Symptoms" and "Resolution" sections) is another indication that troubleshooting may be needed.

Investigate those problems

Explanation of repadmin /queue and replication priority

Analyze queue output over time to determine whether tasks are being processed

Explanation of repadmin /showrepl /verbose

Active Directory replication has been preempted.

The progress of inbound replication was interrupted by a higher priority replication request, such as a request generated manually with the repadmin /sync command.

Replication Load

Frequent updates combined with inter-site change notification which results in a high rate of redundant change notifications

Small replication schedule window

Performance-based issue

Disk and/or memory performance

Network performance

Resolution

This status does not indicate a failure condition. This is a temporary issue in many cases and there are no resolution steps required.

If the status 8461 never clears, considerable analysis is needed to determine the correct path to take. This issue requires advanced knowledge of multiple troubleshooting tools. It may be necessary to seek the help of Microsoft Support to assist with the data analysis process.

You will have to determine the cause before you implement any steps to resolve an underlying problem. Replication status 8461 can occur in any of the following scenarios:

Transient condition

Replication load

Consistent Load

Temporary Load

Performance issue

OS Performance

Disk Performance

Network Performance

Determine whether this is only a transient condition. Document the time that manual replication is initiated, and find the corresponding tasks in the repadmin /queue output. Some time later, run repadmin /queue again, and determine whether the manually initiated tasks are still present.

If replication tasks are queued, look at the currently running task and investigate it.

Use event log data, repadmin output, and Performance Monitor to help isolate the cause of the problem. Determine how quickly updates are being processed and at what rate changes are occurring.

Replication Load

Consistent

The domain controller has too many replication partners or is under too great a replication load. Symptoms of an overloaded DC include:

A replication queue that never clears even though replication tasks are processed in a timely fashion

Note You can use repadmin /queue over time and correlate this with performance data to identify this scenario.

Excessive replication

Attributes that are very frequently updated. Identify attribute updates using verbose replication event log events (or using repadmin /showchanges) and then correlate with repadmin /showobjmeta for several objects on the destination DC. Look at the attribute identified in the event and look for a high version number, or get multiple logs for the same object throughout the day and see if the version number increases frequently.

913676 Event ID 2008 is logged in the Application log on a domain controller after you install Exchange 2003 Service Pack 2.

Temporary

Infrequent bulk changes

After hosting a partition for the first time or during a rehost

Performance-based issue

Common symptoms of performance-induced queue buildup include:

Event ID 2094

Event Type: Warning

Event Source: NTDS Replication

Event Category: Replication

Event ID: 2094

Description: Performance warning: replication was delayed while applying changes to the following object. If this message occurs frequently, it indicates that the replication is occurring slowly and that the server may have difficulty keeping up with changes.

Object DN: CN=JUSTINTU,OU=Workstations,DC=contoso,DC=com

Object GUID: 2530ee74-85e3-4276-15f2-ba6a310471eb

Partition DN: DC=contoso,DC=com

Server: 4e901384-2aa1-3008-c5a2-e37f9f67d7e0._msdcs.contoso.com

Elapsed Time (secs): 13

User Action: A common reason for seeing this delay is that this object is especially large, either in the size of its values, or in the number of values. You should first consider whether the application can be changed to reduce the amount of data stored on the object, or the number of values. If this is a large group or distribution list, you might consider raising the forest version to Windows Server 2003, since this will enable replication to work more efficiently. You should evaluate whether the server platform provides sufficient performance in terms of memory and processing power. Finally, you may want to consider tuning the Active Directory database by moving the database and logs to separate disk partitions. If you wish to change the warning limit, the registry key is included below. A value of zero will disable the check.

Additional Data

Warning Limit (secs): 10

Limit Registry Key: System\CurrentControlSet\Services\NTDS\Parameters\Replicator maximum wait for update object (secs)

This document and data collection strategy is meant to be used for troubleshooting Slow AD Replication.

Symptoms of slow AD Replication

Data collection

Repadmin Data

Use Repadmin /queue to document the queued replication tasks. Monitor the queue to see if there is a delay in processing replication tasks. Log all repadmin /queue output to the same text file so you have good historical data.

Review the output to see whether replication tasks are processed in a timely manner. The top of the file contains the currently running task and the length of time it has been running. If the same task is always at the top of the output, you can use verbose output of repadmin /showrepl to monitor the progress.
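As a rough illustration of the review described above, the following Python sketch checks whether the same task stays at the top of successive repadmin /queue captures. The snapshot structure (a capture time plus an ordered list of task labels) is hypothetical. It assumes you have already pulled those values out of your saved text file by hand or with your own parser; the sketch does not parse repadmin output itself.

```python
from datetime import datetime, timedelta

def stuck_head_task(snapshots, min_duration):
    """Return (task, duration) if one task heads every capture for at
    least min_duration; otherwise return None.

    snapshots: list of (capture_time, [task_label, ...]), oldest first.
    The task labels are whatever you chose to record (hypothetical shape).
    """
    if len(snapshots) < 2:
        return None
    # An empty capture means the queue drained at some point.
    if any(not tasks for _, tasks in snapshots):
        return None
    first_time, first_tasks = snapshots[0]
    head = first_tasks[0]
    if any(tasks[0] != head for _, tasks in snapshots):
        return None
    duration = snapshots[-1][0] - first_time
    return (head, duration) if duration >= min_duration else None

# Made-up captures taken 15 minutes apart.
start = datetime(2024, 1, 1, 9, 0)
captures = [
    (start, ["SYNC DC=corp,DC=contoso,DC=com", "SYNC CN=Configuration"]),
    (start + timedelta(minutes=15), ["SYNC DC=corp,DC=contoso,DC=com", "SYNC CN=Schema"]),
    (start + timedelta(minutes=30), ["SYNC DC=corp,DC=contoso,DC=com"]),
]
result = stuck_head_task(captures, timedelta(minutes=20))
print(result)
```

If a task is reported here, that is only a prompt to look at it with the verbose repadmin /showrepl output. A long-running task can still be making normal progress.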

Repadmin changes

Repadmin /showrepl

Use Repadmin /showrepl and the /verbose option to monitor the last replication status and the number of changes that remain to be replicated.

Repadmin /showrepl /verbose DCName DomainDN

Repadmin /showrepl /verbose 5thwardCorpDC dc=corp,dc=contoso,dc=com

To limit the output so that only the desired Source DC is displayed, use the following:

Create a new User Defined Data Collector Set in Performance Monitor that uses the AD Diagnostics template.

The steps here detail how to set up a good set of baseline DC performance counters. You will need to modify some of the settings, such as duration and sample interval to fit your specific scenario.

How to Create a User Defined Data Collection Set

Open Server Manager on a Full version of Windows Server 2008 or a later version.

Expand Diagnostics > Performance > Data Collector Sets.

Right-click User Defined and select New > Data Collector Set.

Type in a name such as AD DS Diagnostics, and leave the default selection of Create from a template (Recommended) selected. Then, select Next.

Select Active Directory Diagnostics from the list of templates and then select Next and follow the Wizard prompts (making any changes you think are necessary).

Right-click the new User Defined data collector set and view the Properties.

To change the run time, modify the Overall Duration settings in the Stop Condition tab, and then click OK to apply the changes.

Options to consider:

Overall Duration: you can have the data collector stop after collecting for a set amount of time.

Limits: you can have the data collector stop after a time or size threshold is reached (with the option to have it automatically restart). Setting limits is advantageous when you want to limit the log size.

Here you have the options of changing the Sample interval and adding or removing additional counters.

For this scenario, the default sampling interval of 3 seconds should be sufficient. However, for much longer collection times, 3 seconds is too frequent an interval.

All recommended counters are included in the default AD Diagnostics collector set, with three exceptions:

Database ==> Instances(lsass/NTDSA)\*

LogicalDisk(*)\*

For LogicalDisk, not all instances are required. At a minimum, include the system drive and the drives where the database and logs are stored.

Security System-Wide Statistics\*

To add the AD DS database counters to the User Defined Data Collection Set

In Performance Counter Properties, select Add.

Expand Database ==> Instances (all counters should be highlighted).

Under Instances of selected object, select lsass/NTDSA

Select Add, and then click OK.

Add the LogicalDisk and Security System-Wide Statistics objects also.

After the settings are configured to your liking, you can run the new data collector set directly from Server Manager or export it and deploy it to specific servers.
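When you choose the sample interval and overall duration for the collector set, it can help to estimate how many samples a run will produce. The short Python sketch below does that arithmetic. The 3-second interval and 5-minute run come from this article; the 24-hour duration and 60-second interval are illustrative assumptions only, not recommended settings.

```python
def sample_count(duration_seconds, interval_seconds):
    """Number of whole sample intervals that fit in a collection run."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return duration_seconds // interval_seconds

DAY = 24 * 60 * 60  # seconds in an assumed 24-hour run

five_minutes_at_3s = sample_count(5 * 60, 3)   # 100 samples
one_day_at_3s = sample_count(DAY, 3)           # 28800 samples
one_day_at_60s = sample_count(DAY, 60)         # 1440 samples (assumed interval)

print(five_minutes_at_3s, one_day_at_3s, one_day_at_60s)
```

Each sample records every counter in the set, so the log size grows roughly in proportion to the sample count. That is why a 3-second interval suits a short capture but becomes excessive for a long one.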

Command Line instructions: Gather AD Diagnostics from the command line:

To start a collection of data from the command line, run this command from an elevated command prompt:

logman start "user defined\AD DS Diagnostics" -ets

To stop the collection of data before the default 5 minutes, run this command (get at least one full five-minute sample for this issue):

logman stop "user defined\AD DS Diagnostics" -ets

The default logging enabled in the Directory Services event log is useful for monitoring events that indicate slow application of changes. (EVENT X) Verbose diagnostic logging can be enabled to see which changes are currently being applied. Enabling diagnostic logging at the level mentioned in this article causes the log to fill up quickly, so enable it only while actively troubleshooting this condition. The "Testing" section gives an idea of the rate of events logged at this level of verbosity.

Export the Directory Services event log shortly after you receive the status 8461, and reduce the diagnostic logging to a suitable level.

Review the event log for the following:

How quickly are attribute values written to the database? Check Event ID 1412 in the Directory Services event log or, even better, use the performance counter DirectoryServices\DRA Inbound Properties Applied/sec.

At diagnostic level 5 for Replication Events, creating a user object writes about 25 instances of event 1412 (one per attribute value; the exact number depends on what was written when the user was created). When all attributes have been added, the object creation event is logged (Event ID 1365).

The Property section contains both the attributeID and the lDAPDisplayName of the attribute.

One event is written per value at this debug log level. Filter on the events and determine how many entries occur in a given period. Review the event details in order to determine if we are writing values for multiple attributes in order to instantiate an object or if we are writing to the same attribute across multiple objects. While this level of analysis can seem cumbersome, it can be useful in determining root cause. As an example, if you see that we are only writing a few events per second then that could indicate that transactions are being written to the database slowly or perhaps we have too many partners that are sending redundant changes (event ID 1239).

Notice that it is perfectly normal to see event ID 1239 when replication diagnostics is set to 0x5. If you filter out event 1239 and you see nothing else and you have a fairly long event log history then that may indicate a problem. This issue was observed by a customer with a large Active Directory environment that had inter-site change notification enabled. If you determine that there is a large number of events per second, replication is probably not affected by a performance problem.
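The filtering and counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a hypothetical list of (timestamp, event ID) pairs that you have already extracted from an exported Directory Services log, and event 1239 is excluded by default because it is expected at this logging level.

```python
from collections import Counter
from datetime import datetime, timedelta

def events_per_second(events, skip_ids=(1239,)):
    """Average events per second, per event ID, over the span of the export.

    events: iterable of (timestamp, event_id) pairs (hypothetical shape).
    skip_ids: event IDs to leave out of the result.
    """
    events = list(events)
    if len(events) < 2:
        return {}
    times = [when for when, _ in events]
    span = (max(times) - min(times)).total_seconds()
    if span <= 0:
        return {}
    counts = Counter(eid for _, eid in events if eid not in skip_ids)
    return {eid: n / span for eid, n in counts.items()}

# Made-up sample: 44 x event 1412 spread over 10 seconds, plus a few others.
t0 = datetime(2024, 1, 1, 12, 0, 0)
sample = [(t0 + timedelta(seconds=i % 11), 1412) for i in range(44)]
sample += [(t0 + timedelta(seconds=5), 1365)] * 2
sample += [(t0 + timedelta(seconds=s), 1239) for s in range(1, 6)]

rates = events_per_second(sample)
print(rates)
```

A result that is empty after event 1239 is filtered out, despite a long log history, corresponds to the problem pattern described above. A high rate for event 1412 suggests that changes are being applied quickly.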

Object Metadata

If an event is logged that indicates a change is taking a long time to process Event ID, record the objectGUID, and then get the following output:

Replication metadata:

Repadmin /showobjmeta * "<GUID=ObjectGUID>" >objectmeta.txt

Review the output for recently modified attributes. Pay particular attention to attributes whose version numbers change frequently. An attribute that has a very high version count could indicate that frequent changes are being made to the attribute. If you suspect this, you can either view the attribute value to get some context as to why the attribute was changed, or you can let some time pass and then get additional repadmin /showobjmeta output to check whether the version of the same attribute on the same object has increased further.
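Comparing two captures for the same object is easy to get wrong by eye when the attribute list is long. The Python sketch below shows one way to do the comparison, assuming you have already copied the attribute names and version numbers from two repadmin /showobjmeta captures into dictionaries. The dictionary shape, the attribute names, and the version numbers are all made up for illustration.

```python
def version_growth(earlier, later):
    """Return {attribute: increase} for attributes whose version rose.

    earlier / later: dicts mapping attribute name -> version number, taken
    from two captures of the same object on the same DC (hypothetical shape).
    """
    growth = {}
    for attribute, new_version in later.items():
        old_version = earlier.get(attribute)
        if old_version is not None and new_version > old_version:
            growth[attribute] = new_version - old_version
    # Largest increase first, so the busiest attribute is easy to spot.
    return dict(sorted(growth.items(), key=lambda item: item[1], reverse=True))

# Made-up version numbers from a morning and an evening capture.
morning = {"description": 3, "userCertificate": 4821, "displayName": 2}
evening = {"description": 3, "userCertificate": 5310, "displayName": 3}

growth = version_growth(morning, evening)
print(growth)
```

An attribute whose version climbs by hundreds between captures is a candidate for the "very frequently updated" case. A single increment is ordinary activity.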

Object and attribute data:

Use a utility to output the object and attribute values. Then, review the attribute data for the attributes that have recently modified data. The following examples present two methods to do this.

Connect and bind to the server in LDP and copy all the output for the object to a text file

ldpoutput.txt

Network related data

Tasklist /svc >nets.txt

Netstat -anob >>nets.txt

Data Analysis

Key AD Replication specific Performance Counters

DRA Inbound Full Sync Objects Remaining

DRA Inbound Objects Applied/Sec

DRA Inbound Objects / Second (Inbound Replication)

DRA Inbound Objects Filtered / Sec (Suggests all New Attributes)

DRA Outbound Bytes Total Since Boot

Replication Queue:

DirectoryServices\DRA Pending Replication Synchronizations

Indicates the number of directory synchronizations that are queued for this server. This counter helps identify replication backlogs—the higher the number, the larger the backlog.

This counter should be as low as possible. If it is not, the server hardware is probably slowing replication.

Use this counter to determine the replication queue. Repadmin /queue DCName also reports this information.

Gauging Current Performance:

DRA Inbound Objects Applied/sec

Shows the number of objects received from neighbors through inbound replication and applied.

DRA Inbound Properties Applied/sec

Shows the total number of object properties (attributes) applied from inbound replication partners.

You can use the two counters to monitor how quickly changes are being applied to the database.

Database:

Server Performance:

DirectoryServices\DRA Inbound Object Updates Remaining in Packet

Indicates the number of object updates received in the most recent directory replication update packet that have not yet been applied to the local server.

This counter indicates that the monitored server is receiving changes, but it is taking a long time to apply them to the database. This counter should be as low as possible. If it is not, it usually indicates that server hardware is slowing replication.

Network:

Object\counter

Description

Guidelines

DirectoryServices\DRA Inbound Bytes Total/sec

Indicates the total number of bytes received per second through inbound replication. This number is the sum of the bytes of uncompressed and compressed data received during inbound

Testing

Scenario

Two DCs were isolated (no client or other server activity).

15,000 users were created from script with the minimal attributes populated on one DC.

The connection between the two DCs was then enabled.

To give an idea of rate of events logged with this level of verbosity:

A Directory Services event log configured to 100 MB in size wrapped in less than two minutes (1 minute 27 seconds). The log contained 195,728 events. Of all events, 189,340 were event ID 1412 (attribute addition). The number of event 1412s per second:

In one minute, 4,630 user objects were created (consisting of 138,900 attributes), or about 77 objects per second.
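The per-second rates implied by the figures above can be worked out directly. The Python sketch below is derived arithmetic, not an additional measurement. It assumes that the 1 minute 27 seconds it took the log to wrap is the period that all 195,728 events cover.

```python
# Figures quoted in the test description above.
total_events = 195_728
event_1412_count = 189_340
wrap_seconds = 60 + 27           # 1 minute 27 seconds (assumed event span)
users_per_minute = 4_630
attributes_per_minute = 138_900

events_per_sec = total_events / wrap_seconds
event_1412_per_sec = event_1412_count / wrap_seconds
objects_per_sec = users_per_minute / 60
attributes_per_object = attributes_per_minute / users_per_minute
attributes_per_sec = attributes_per_minute / 60

print(f"all events/sec:        {events_per_sec:.0f}")         # 2250
print(f"event 1412/sec:        {event_1412_per_sec:.0f}")     # 2176
print(f"objects/sec:           {objects_per_sec:.0f}")        # 77
print(f"attributes per object: {attributes_per_object:.0f}")  # 30
print(f"attributes/sec:        {attributes_per_sec:.0f}")     # 2315
```

The roughly 30 attributes per object is in the same range as the "around 25 or so" instances of event 1412 per user creation mentioned earlier. The two event-based rates are averages over the whole log.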

An understanding of NTDS performance counters is needed in order to troubleshoot this issue effectively.

Object creations per second is obtained via the following performance counters:

NTDS / DRA Inbound Objects Applied/sec

Database adds/sec

NTDS / DRA Inbound Values (DNs only)/sec This number includes objects that reference other objects. Values for distinguished names, such as group or distribution list memberships, are more expensive to apply than other kinds of values because a group or distribution list object can include hundreds or thousands of members. In contrast, a simple object might have only one or two attributes. A high number from this counter might explain why inbound changes are slow to be applied to the database.

Attribute creations per second is:

NTDS / DRA Inbound Properties Applied/sec

Special Condition Frequently Encountered

Repadmin /rehost results in status 8461:

This issue occurs when the GC being rehosted is busy accepting updates for other partitions. The sourcing of writable partitions, including the schema, configuration, and domain partitions, is by nature higher priority work than the rehosting of a read-only domain partition.

Repadmin /queue output should show that the request to add the partition has been queued and will eventually be processed. However, sometimes it is necessary to use an alternative method of partition rehost:

Repadmin /unhost

Wait for event ID 1660

Disable KCC connection translation

Repadmin /add

If the process is preempted before /add is complete, you can disable inbound replication and use repadmin /replicate and the /readonly and /force options to get the partition re-hosted before you re-enable inbound replication.