I have recently taken over a position with a new company. It is, for better or worse, a mess. We have 5 DCs at the corporate office and one DC at each remote site. They are set to replicate, and I actually made a map of how they do it; to me it looks like a disaster. Servers are replicating to each other, and back again, and to others that are already being replicated to.

We have 1 domain. My co-workers say that the last network engineer promoted ALL servers to DCs. We currently run at the Windows 2000 functional level, because one of our servers at a remote site is a 2k server. I have already built a 2003 server to replace it. We have several 2k8 servers too, some of them promoted to DCs.

We are currently experiencing an issue with mapped drives and email. When the problem happens, the drives are no longer accessible, and neither is email. The drives are mapped by the hostname of the server. Now, it's only the drives from one particular server that this happens to. I am able to map drives on nearly every other server without a problem. When this happens, I can still ping that server by both hostname and IP. I CAN map the drive by IP as well. I thought a quick fix would be to just change the logon script to use the IP, but that doesn't help with the email issue. The same happens there: the name of the server cannot be found, but it can be pinged, and I can map a drive by its IP but NOT by hostname. Rebooting the PC has been our fix up until now, but that no longer seems to work.

I've googled and googled but haven't found anything that has helped yet. So far, I have made sure time is syncing (all PCs are in sync, to within a second of each other and of the servers). I have checked for duplicate entries for PCs in AD, but haven't found any yet. I was going to remove the server with the mapped drives from the domain and rejoin it, but I wasn't sure if that would help with the Exchange server.

I agree with Frank - dump the 2k box and see if you can elevate the domain to at least a 2k3 level. You can also try running dcdiag and see if that tells you anything. I would make sure everything is replicating between the DCs correctly as well. After you've brought it back to a usable level (across the whole domain) then I would look at migrating up to 2k8r2 if you can. I would suggest only having two DCs at the main location, with a single DC at each remote site, but maybe that's just me?

Don't want to oversimplify the fix, but it sounds like the 2k server is the issue. Get it swapped out with the new server and get the domain off the 2k functional level.

That's my gut feeling as well.

DNS runs on the same DC that the drive shares are on. I tried running dcdiag /test:DNS, but it says that is not a valid test. From what I can tell DNS is OK, but how can I check its health without using dcdiag?
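One rough way to sanity-check name resolution from an affected PC without dcdiag is to compare a forward and a reverse lookup for the server (nslookup does the same job by hand). The sketch below is only an illustration: `FILESERVER01` and `192.0.2.10` are made-up placeholders, not names from this thread, and Python's `socket` module asks the OS resolver, which may also consult the hosts file or other name services, so this is not a pure DNS test.

```python
import socket


def check_name(hostname, expected_ip,
               resolve=socket.gethostbyname,
               reverse=socket.gethostbyaddr):
    """Compare forward and reverse lookups for one server.

    The resolver functions are parameters so the check can be
    exercised without touching the network.
    """
    result = {"forward_ok": False, "reverse_ok": False}

    # Forward lookup: does the hostname resolve to the IP we expect?
    try:
        result["forward_ok"] = resolve(hostname) == expected_ip
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        pass

    # Reverse lookup: does the IP map back to the same short name?
    try:
        name, _aliases, _addresses = reverse(expected_ip)
        short_found = name.split(".")[0].lower()
        short_wanted = hostname.split(".")[0].lower()
        result["reverse_ok"] = short_found == short_wanted
    except OSError:
        pass

    return result


if __name__ == "__main__":
    # Hypothetical server name and address; substitute your own.
    print(check_name("FILESERVER01", "192.0.2.10"))
```

If the forward lookup fails on a problem PC at the moment the drives drop, that points at name resolution; if both lookups succeed and the mapping still fails, the cause is more likely authentication than DNS.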

Events: a lot of failure audit events (ID 675, failure codes 0x19 and 0x18) on the DC that holds the shares.
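For reference, the failure code in event 675 is a Kerberos error number, so it can be decoded against the error names in RFC 4120. The lookup below is a small sketch covering only a handful of codes, with my own short descriptions; the function name `describe_failure` is made up for illustration. As a rule of thumb, 0x18 usually means a wrong password, while 0x19 is often just part of the normal pre-authentication exchange.

```python
# A few Kerberos error codes as defined in RFC 4120 (not a complete list).
KERBEROS_FAILURE_CODES = {
    0x06: ("KDC_ERR_C_PRINCIPAL_UNKNOWN", "client not found in the Kerberos database"),
    0x07: ("KDC_ERR_S_PRINCIPAL_UNKNOWN", "server not found in the Kerberos database"),
    0x17: ("KDC_ERR_KEY_EXPIRED", "password has expired"),
    0x18: ("KDC_ERR_PREAUTH_FAILED", "pre-authentication information was invalid"),
    0x19: ("KDC_ERR_PREAUTH_REQUIRED", "additional pre-authentication required"),
    0x25: ("KRB_AP_ERR_SKEW", "clock skew too great"),
}


def describe_failure(code_text):
    """Turn a failure code string such as '0x19' into a readable line."""
    code = int(code_text, 16)  # base 16 accepts the '0x' prefix
    name, meaning = KERBEROS_FAILURE_CODES.get(
        code, ("UNKNOWN", "not in this short table"))
    return f"{code_text}: {name} ({meaning})"


if __name__ == "__main__":
    for seen in ("0x19", "0x18"):
        print(describe_failure(seen))
```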

The PDC has the same failure audit events. There are some replication warnings/errors, such as: The File Replication Service has enabled replication from CLTDC01 to CLTEXCH for c:\winnt\sysvol\domain after repeated retries.

We don't recall noticing anything that would have caused all of this; it seems to have started slowly and gotten worse over about a week.

Testing server: Charlotte\CLTEXCH
Starting test: Replications
* Replications Check
REPLICATION LATENCY WARNING
ERROR: Expected notification link is missing.
Source HICKORYDC01
Replication of new changes along this path will be delayed.
This problem should self-correct on the next periodic sync.
REPLICATION LATENCY WARNING
ERROR: Expected notification link is missing.
Source HICKORYDC01
Replication of new changes along this path will be delayed.
This problem should self-correct on the next periodic sync.
* Replication Latency Check
DC=ForestDnsZones,DC=McLeod,DC=local
Latency information for 7 entries in the vector were ignored.
7 were retired Invocations. 0 were either: read-only replicas and are not verifiably latent, or dc's no longer replicating this nc. 0 had no latency information (Win2K DC).
DC=DomainDnsZones,DC=McLeod,DC=local
Latency information for 7 entries in the vector were ignored.
7 were retired Invocations. 0 were either: read-only replicas and are not verifiably latent, or dc's no longer replicating this nc. 0 had no latency information (Win2K DC).
REPLICATION-RECEIVED LATENCY WARNING
CLTEXCH: Current time is 2012-06-21 12:19:58.
CN=Schema,CN=Configuration,DC=McLeod,DC=local
Last replication recieved from MONROEDC01 at 2012-06-10 14:21:43.
Latency information for 23 entries in the vector were ignored.
23 were retired Invocations. 0 were either: read-only replicas and are not verifiably latent, or dc's no longer replicating this nc. 0 had no latency information (Win2K DC).
CN=Configuration,DC=McLeod,DC=local
Last replication recieved from MONROEDC01 at 2012-06-10 14:21:43.
Latency information for 23 entries in the vector were ignored.
23 were retired Invocations. 0 were either: read-only replicas and are not verifiably latent, or dc's no longer replicating this nc. 0 had no latency information (Win2K DC).
DC=McLeod,DC=local
Last replication recieved from MONROEDC01 at 2012-03-15 10:18:09.
WARNING: This latency is over the Tombstone Lifetime of 60 days!
Latency information for 23 entries in the vector were ignored.
23 were retired Invocations. 0 were either: read-only replicas and are not verifiably latent, or dc's no longer replicating this nc. 0 had no latency information (Win2K DC).
* Replication Site Latency Check
REPLICATION-RECEIVED LATENCY WARNING
Source site: CN=NTDS Site Settings,CN=Monroe,CN=Sites,CN=Configuration,DC=McLeod,DC=local
Current time: 2012-06-21 12:20:06
Last update time: 2012-06-10 13:45:15
Check if source site has an elected ISTG running.
Check replication from source site to this server.
Site CN=NTDS Site Settings,CN=Marion,CN=Sites,CN=Configuration,DC=McLeod,DC=local was skipped because it has no servers in it.
Site CN=NTDS Site Settings,CN=Hickory,CN=Sites,CN=Configuration,DC=McLeod,DC=local was skipped because it has no servers in it.
......................... CLTEXCH passed test Replications
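To put numbers on the dcdiag output above, here is a quick sketch that works out how stale each naming context is on CLTEXCH, using the timestamps and the 60-day tombstone lifetime exactly as dcdiag reported them. The helper name `latency` is mine.

```python
from datetime import datetime, timedelta

TOMBSTONE_LIFETIME = timedelta(days=60)  # value reported by dcdiag above
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def latency(now_text, last_text):
    """Time elapsed between the last inbound replication and 'now'."""
    now = datetime.strptime(now_text, TIME_FORMAT)
    last = datetime.strptime(last_text, TIME_FORMAT)
    return now - last


# Timestamps copied from the dcdiag output for CLTEXCH (source MONROEDC01).
current_time = "2012-06-21 12:19:58"
last_received = {
    "CN=Schema,CN=Configuration,DC=McLeod,DC=local": "2012-06-10 14:21:43",
    "CN=Configuration,DC=McLeod,DC=local": "2012-06-10 14:21:43",
    "DC=McLeod,DC=local": "2012-03-15 10:18:09",
}

for naming_context, last in last_received.items():
    gap = latency(current_time, last)
    if gap > TOMBSTONE_LIFETIME:
        verdict = "OVER tombstone lifetime"
    else:
        verdict = "within tombstone lifetime"
    print(f"{naming_context}: {gap.days} days ({verdict})")
```

The domain partition comes out at 98 days since the last replication from MONROEDC01, well past the 60-day limit, while the schema and configuration partitions are only about 10 days behind.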

Well, we noticed that we are only getting the issues at the corp office; none of the remote sites are experiencing them.

We have decided: the tombstoned DC is being demoted, and the 2k machine as well. We have a 2k8r2 server that will take over as the PDC, and CLTEXCH, which is currently the PDC, will be demoted too. Hopefully this will help put us on the right track.

OK, we have removed the 2k machine, demoted the tombstoned one, raised the domain to the 2k3 functional level, and made our 2k8 R2 server the FSMO role holder. Things are looking much better: no more errors when running dcdiag. Also, the problem machines we were dealing with today stopped showing the original error on drive mappings and started saying "drive not connected". After a reboot, everything appeared fine. We will see what tomorrow's load of users logging in does.

Thanks to all you guys for the support and for pointing me in the right direction. I'm ever so glad that the Spiceworks community exists. I will post back results tomorrow after the system has had time to do its job.

Someone mentioned it earlier but I didn't see any followup on it - make sure the sites and site links are set properly in AD Sites and Services. Without doing that, you may have workstations at the main site trying to authenticate with DCs at other locations and vice versa. That could cause some of the trouble you are describing at the central site.

That's right, and to follow on from that, double-check the replication interval on those site links. By default it is 3 hours; if you have a half-decent connection you will want to change that to 15 or 30 minutes.

You generally want it to mirror your physical links. When DCs are connected via slower WAN links, put them in separate sites and set up a site link between the sites. If you have DCs at separate sites that are not directly connected, then don't put a site link between them - they can replicate via a DC in the middle.
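To illustrate the "DC in the middle" point, here is a small sketch that treats site links as edges in a graph and finds the path a change would take between two sites. The site names come from the dcdiag output earlier in the thread, but the hub-and-spoke link layout is purely an assumption on my part, not the poster's real topology, and `replication_path` is a made-up helper.

```python
from collections import deque

# Assumed hub-and-spoke layout: every remote site links only to Charlotte.
SITE_LINKS = [
    ("Charlotte", "Hickory"),
    ("Charlotte", "Monroe"),
    ("Charlotte", "Marion"),
]


def replication_path(links, source, target):
    """Breadth-first search for the shortest chain of site links."""
    neighbours = {}
    for site_a, site_b in links:
        neighbours.setdefault(site_a, set()).add(site_b)
        neighbours.setdefault(site_b, set()).add(site_a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_site in sorted(neighbours.get(path[-1], ())):
            if next_site not in visited:
                visited.add(next_site)
                queue.append(path + [next_site])
    return None  # no chain of site links connects the two sites


if __name__ == "__main__":
    print(replication_path(SITE_LINKS, "Hickory", "Monroe"))
```

With this layout, Hickory and Monroe have no direct site link, yet changes still get from one to the other by way of Charlotte, which is why a link between every pair of sites is unnecessary.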