Hello, I have a problem that's been stumping me for a while, so I'm reaching out for some help. I have this problem at two totally different sites. They're both small (one with 5 people, one with 13), they both use a Dell PowerEdge T310 server running Windows Server 2008 R2, and they both have a mix of Windows 7 x64 and Windows XP client machines.

The problem they run into is that, seemingly at random throughout the day, users get disconnects or hangs when using the server. For example, Outlook (with PSTs stored on the server) will hang, database applications with the database stored on the server will hang, and pretty much any app (Word, Excel, AutoCAD, SolidWorks) with files open from the server will hang or complain that a network connection was lost. The network connection doesn't seem to actually drop: users are able to open the files again immediately, there are no event log errors anywhere regarding a dropped network connection, and there's no other indication that a connection dropped apart from the application's error message.

I've tried many things. At first I thought it was a physical issue, so I replaced the NICs on the servers (Broadcom to Intel NICs) with no change. We tested all network runs. We tried a different NIC on the workstation side, with no change (just slower, since the new NIC was a wireless card).

I've done a lot of testing, and it now appears the problem only happens with Windows 7. At both sites the Windows 7 systems are x64, so I'm not sure yet whether it will happen on 32-bit (x86). Windows XP Mode running on 7 also shares the problem with its Windows 7 host, which isn't completely surprising.

I've tried a bunch of stuff (disabling autotuning, TCP chimney and such) but I'd like to know what you all might recommend trying. Thanks!

Answer Wiki

Turn off the autodisconnect feature on the server side. This will stop the clients from going to “disconnected” status. Note that in large implementations you can run into resource issues on the server with this setting. See: KB297684
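
For reference, the autodisconnect setting can be viewed and changed from an elevated command prompt on the server. This is a minimal sketch based on the command documented in KB297684; it has not been tested against the servers described in this thread. The current timeout shows up as the idle session time in the first command's output.

```batch
REM Show the current Server service settings, including "Idle session time (min)"
net config server

REM Turn off autodisconnect for idle sessions (-1 disables the timeout)
net config server /autodisconnect:-1
```

To go back to a timeout later, the same switch takes a value in minutes, for example /autodisconnect:15.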

Discuss This Question: 16 Replies

At the site I've been primarily working on, we actually replaced the switch. We've also done a lot of work to clean up their networking closet: no loops were found, and we made sure nothing was twisted or otherwise damaged. There are only 13 users and a couple of printers, so nothing too complicated. I should also say that we dropped the new server in place of an older Dell server (running Server 2003), and they never had any sort of issue like this with the old one.

There are a few additional things I would look at in this case. First, I would monitor the bytes in and out of the server to see whether anything is using a large portion of what the network can handle and slowing everything else down. At the same time, I would monitor CPU and memory utilization on the server to see if spikes correspond to the outages.
I've seen situations where something as simple as a scheduled anti-virus scan makes the system appear nonexistent to clients, because the scan is set at a higher priority than other system tasks. It can appear random because task start times can be offset by a chosen number of hours, precisely so that users don't catch on that their systems bog down at 1pm every day.
If you do notice that system or network resources spike when the outage is being observed, you can then monitor individual processes for private bytes memory usage, or monitor network utilization on all the clients, which may reveal a trigger.
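
One way to collect the server-side counters mentioned above is the built-in typeperf tool. The sketch below is an illustration only: the 15-second interval and the output path C:\PerfLogs\server_baseline.csv are assumptions, not values from this thread.

```batch
REM Sample CPU, free memory and NIC throughput every 15 seconds and log to CSV.
REM Run from an interactive prompt on the server; inside a .bat file, double the % sign.
REM Press Ctrl+C to stop, then compare the timestamps with the times users report hangs.
typeperf "\Processor(_Total)\% Processor Time" "\Memory\Available MBytes" "\Network Interface(*)\Bytes Total/sec" -si 15 -f CSV -o C:\PerfLogs\server_baseline.csv
```

If a spike lines up with a reported hang, the same tool can be pointed at per-process counters to narrow down which process is responsible.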

What is your main switch in that branch? Does it allow any kind of monitoring (even externally, by SNMP)? This might be the best place to track down the problem.
Try setting up a simple network monitor (you can try The Dude from MikroTik, which is easy and helpful).
Configure it to check every network-connected device: routers, printers, PCs, servers, access points, everything.
Did you change any NIC recently (before the debugging tests you ran after the problem appeared)? Check whether there is any duplicated MAC address (yes, I know they're supposed to be unique, but trust me, they might not be).
I had a similar issue back in the late 90s, and the problem was two NICs with the same MAC within the same physical LAN segment.
We'll be waiting for any feedback.
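
A rough way to check for duplicate MAC addresses from one admin workstation is to collect each machine's addresses with getmac and sort the result. This is a sketch under assumptions: hosts.txt is a hypothetical file with one computer name per line, macs.csv is a hypothetical output file, and the account running the script needs administrative rights on each remote machine.

```batch
@echo off
REM Start with an empty output file
if exist macs.csv del macs.csv

REM Append each machine's MAC addresses (CSV, no header row) to macs.csv
for /f "usebackq delims=" %%H in ("hosts.txt") do (
    echo Querying %%H
    getmac /s %%H /fo csv /nh >> macs.csv
)

REM Sort so that identical MAC addresses end up on adjacent lines
sort macs.csv
```

Each output line starts with the physical address, so two machines sharing a MAC should appear as neighbouring lines in the sorted listing.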

Thanks for the feedback.
I feel like we've mostly eliminated a physical problem: the fact that XP workstations work fine while Windows 7 systems exhibit the problem suggests to me that something else is going on. Nonetheless, I'm going to see if I can reproduce the problem in a short time frame, and if so, I'm going to connect a workstation directly to the server with a crossover cable and try the test again. That should rule out networking once and for all.
I'll look into monitoring hardware utilization on the server, I haven't done much with that in the past.
They have just a dumb switch, no way to monitor anything on it.
The MAC address idea is an interesting one too that I'll look into.
At the site I've been troubleshooting, they replaced all 13 or so workstations at the same time as the server, which is part of what has made this difficult to troubleshoot. I eventually got them to have someone use an old workstation as a test (the XP workstation which worked fine). So, yeah, more or less everything is new.

Since this is only occurring with the Windows 7 machines, try turning off IPv6 to see if there is still a problem.
Also, I noted you said the PST files were stored on the server. This is not a supported configuration. Move the PST files back to the local machine.

Ima have to agree with Stevesz: It doesn't appear to be a layer 1-4 issue - the hosts ARE connecting, just randomly dropping off. Your network, as you've mentioned several times, is clearly not the issue.
You mentioned you replaced the server and hosts at the same time - have you tried reconnecting the old server and trying to replicate the errors?

Well, at this point we are almost clueless.
Since this might be related only to the Windows 7 machines, try turning them on one by one, reproducing on each one the normal activity of a working day.
One other thing that just came to mind: could it be related to power save settings?
Just try it: go to Device Manager, open the network adapter's properties, select the Power Management tab, and disable "Allow the computer to turn off this device to save power". Also disable any vendor-specific software that might be doing similar things.
Please post back.

I had another flash.
What's the NIC on those machines?
Do you have the latest drivers installed?
Please update all those drivers and post the results. I've seen NIC driver-related problems that you can't imagine...

Saturno,
Your comment about the NIC drivers just reminded me of something. A few years ago, I was having a similar problem, but it was on a peer-to-peer network. Updating the NIC drivers did not help, nor did anything else I tried. Somewhere I ran across that updating the BIOS may help, so I checked and there was a newer version of the BIOS available. I downloaded it and applied it, and the problem disappeared.

Other items you can do on the client to clean up your environment and improve connectivity:
[CODE]
REM Allow mapped drives to be used by standard and admin tokens.
REM Enable Linked Connections
CMD /C reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v "EnableLinkedConnections" /t REG_DWORD /d 1 /f
REM If you have older hardware / OS you will want this or you may be regularly rebooting / powercycling them
REM Q323582 - Fix for Net3101 Error on OS/2 Server Because of SessionSetup SMB
REG ADD "HKLM\SYSTEM\CurrentControlSet\Services\lanmanworkstation\parameters" /v EnableDownLevelLogOff /t REG_DWORD /d 1 /f
REM Use if you have non-Windows 7 / Server 2008 infrastructure.
REM This is from Vista and Windows 2008 RTM testing
REM Change the TCP/IP AutoTuningLevel to restricted
netsh int tcp set global autotuninglevel=restricted
REM Unless you have an IPv6 infrastructure, change to prefer IPv4
REM This will reduce DNS load and issues finding services
REM since systems do not wait for IPv6 requests to timeout first
REM Set to prefer IPv4 over IPv6 (0x20; a bare 20 would be read as decimal)
REM A reboot is required for this value to take effect
reg add "HKLM\SYSTEM\CurrentControlSet\Services\tcpip6\Parameters" /v DisabledComponents /t REG_DWORD /d 0x20 /f
REM Unless you are using / managing the tunnel adapters, turn them off.
REM Keeping them on increases network traffic and
REM opens the attack surface into your systems.
REM Setting the 6to4 adapter state to disabled
netsh interface 6to4 set state disabled
REM Setting the teredo adapter state to disabled
netsh interface teredo set state disabled
REM Setting the isatap adapter state to disabled
netsh interface isatap set state disabled
[/CODE]
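
After running the script above and rebooting, the resulting settings can be checked with the read-only commands below. This is a sketch for verification only; it assumes the same registry value and tunnel adapters that the script changes, and the reg query line reports an error if DisabledComponents was never set.

```batch
REM Show the current TCP global settings, including the receive window auto-tuning level
netsh int tcp show global

REM Show the IPv6 DisabledComponents value, if it has been set
reg query "HKLM\SYSTEM\CurrentControlSet\Services\tcpip6\Parameters" /v DisabledComponents

REM Show the state of each IPv6 transition technology
netsh interface 6to4 show state
netsh interface teredo show state
netsh interface isatap show state
```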

Are your users utilizing the Windows file Sync utilities? I find that in my environment we had to tune down the Sync Center settings with Windows 7 users from checking for a slow network connection from every 5 minutes to every 30 or 60 minutes. That resolved our complaints about going offline at inappropriate times.

Sorry for the long delay in answering!
The NIC is a Broadcom. I've tried toying with the drivers, as well as a totally different NIC (I tried an Intel 1G server NIC for about a month), and that didn't seem to make any difference.
I've tried disabling IPv6 on both the clients and server, and that didn't seem to make a difference.
Slack400, what are the sync utilities you mentioned? I did a quick Google search but didn't see anything for Windows Sync Utilities.

Have you checked this on Windows 7?
Go to Control Panel > All Control Panel Items > Administrative Tools > Computer Management > Local Users and Groups, change the Guest account's rights and, if possible, remove the Guest account (if it is there).
Then go to Local Policy and deny or allow guest access as needed.

Reporting back now that we finally fixed it. It ended up being Win7 SP1 on the desktops that fixed it. SP1 on the Win2008r2 server did nothing, installing all available updates (critical, recommended, everything) on desktops did nothing, but something included in SP1 did it. This has been confirmed across a number of systems. Thanks for all the suggestions!
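
To confirm which desktops already have the service pack, the service pack level can be read from a command prompt on each Windows 7 machine. This is a small sketch using built-in tools, not something the poster reported running.

```batch
REM Print the OS name and service pack level; the CSDVersion column is blank without a service pack
wmic os get Caption,CSDVersion

REM Print the version string; build 7601 is SP1, build 7600 is the original release
ver
```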
