“TCP Provider: An existing connection was forcibly closed by the remote host”.

We found this issue after SQL Server 2005 Beta2 when w2k3sp1 (Windows 2003 server service pack 1) was one of our default OS for test. Lately, quite a few customers have reported seeing this issue during connection scalability test of SQL Server 2005 or on SQL Server 2000 production server, after upgrade base OS to w2k3sp1. This issue results from a subtle winsock behavior change caused by that w2k3sp1 server has enabled SynAttackProtect by default. This key is configurable through “HKEY_LOCAL_MACHINESYSTEMCurrentControlSetServicesTcpipParametersSynAttackProtect{DWORD}”. As documented in SQL Server 2005 Book-On-Line, one way to resolve this issue is to disable it by set SynAttackProtect value to 0. Note that, event though we document this solution in SQL Server 2005 BOL, it applies to both SQL Server 2005 and SQL Server 2000. In this post, I want to explain what is really happening under the cover and help you to understand the risk to turn this key off.

What is the Sync Flood Attack?

Sync-Flood-Attack is a network Deny-Of-Service attack. It takes advantage of TCP connection handshake protocol by sending large amount of TCP Sync packets to a victim server with spoofed IP source address. In standard TCP implementation, once a new sync packet is received, a half opened connection handle and its associated context is allocated and queued in corresponding TCP listening socket backlog queue. With combination of high frequency of sync packet arrivals and the default timeout value of half open connection entries, the backlog queue, max to 200 by default on w2k3 standard server, can easily be exhausted; as result, sync flood attack can cause server become non-responsive to normal connection requests. Here is some more info, http://www.microsoft.com/technet/community/columns/cableguy/cg1204.mspx.

A solution to Sync Flood Attack

To mitigate such issue, a solution is to hold the half opened connection handle in a cache and only allocate full TCP connection context and en-queue the handle when a connection is fully established after TCP three way handshakes. The cache implemented in Windows is based on a hash table that can virtually hold unlimited half-opened connections. The cache also uses smaller timeout value for connection entries that stay in the cache so that unresponsive half open connection entry will be evicted faster.

The side effect to the normal connections

For normal connections, without SynAttackProtect, as soon as the backlog queue is full, server side TCP will send TCP reset in response to new TCP sync packet. The client side winsock connect() call that send first TCP sync packet to the serverl will receive the TCP reset and fail with ERROR_CONNECTION_REFUSED. When SynAttackProtect is on, even the backlog queue is full, a new connection request arrived at the server will still be accepted because the half connection cache is visually unlimited. However, once the handshake succeeds, the connection can still be rejected because the server TCP find out that the backlog queue is actually full when moves its entry from the cache to the backlog queue. With the timing shifted to send TCP reset from during the connection establishment handshakes to after, the client succeeds when calling winsock connect() but fails with WSAECONNRESET when performing the first recv() or send() on established connections.

What is the impact to SQL Server Client Driver?

SQL Server client drivers, including MDAC and SNAC, have retry logic for failure when connect() fails, a logic to improve user experience when there is transient network failure. However, they do not retry when first recv() or send() fail because of a TDS protocol compatibility issue with older versions of SQL server such as 7.0. With SyncAttackProtect enabled, due to the timing shifted, more first recv() or send() are failing inside the SQL Server client driver; and thus, more GNEs are raised to applications.

Solutions to this GNE issue.

As mentioned above, server can disable the SynAttackProtect in w2k3sp1. But keep in mind this solution makes the server subject to sync flood attack if it is open to public access. Another solution is to add a retry-logic to the SQL server client application to alleviate the problem. I am glad some applications are already doing that. Database application developers also need to keep in mind that the retry logic should be designed in such a way not to overwhelm the server.

One way to help users to diagnose this kind of problem is to be able to demonstrate it and exacerbate it under test conditions.

Can you cobble together a simple script that will stress the SynAttack logic to be able to demonstrate the problem with a live application. This test can then be included with some UAT acceptance testing of applications.

The GNE is the type of error message with MDAC. With SNAC and ADO.net2.0, the error messages are improved and lots of issues can be identified much faster and easier. If you see GNE issues, please post your finding here or on the forum,

I’m facing the same problem. When i make changes to the registry, it resolved the problem at internal environment. However changes on live environment had not effect. The error still occur. Any ideas what else is missed?

We are using MS SQL 2000 but the operating system is Windows Server 2000 SP2. Can anybody tell me if the reason of this problem on Windows 2000 server is also same or can it be due to some other reason?

Compile and run the app, hopefully a Database named "MyDatabase" will be created

Now start SQL Server Management Studio (for 2005) or Enterprise Manager (for 2000) and delete the Database just created, now run the app created earlier, you might get the dreaded error message, try to run the app atleast 3 times you will get the same error message.

Now question is , how do i correct the problem, simple – Reboot the system and try running the app again.

Indeed a painful way of overcomming an error message.

This error message ONLY show up when you have, and has no connection with the Timeout property.

Integrated Security=false

User ID=[Your User Name]

Password=[Your Password]

Note: This is Microsofts general-purpose error message and this solution may not work in all cases.

I get the following error msg while uploading a database to our corporate office’s offline database server running MS SQL server 2000. The corporate office has to manually upload the data from their server itself. This error has been raised since our VSAT provider updated their VSAT configuration (NET modem).

I saw this same error with SQL Server 2005. After much thrashing, I found that the problem was being caused by changes to the TCP/IP registry parameters, specifically

TcpMaxConnectRetransmissions and TcpMaxDataRetransmissions, both of which set to 0. Changing these values both to 2 solved the database problem.

Note: These parameters were originally set to 0 for good reason: to prevent unreasonably long timeouts when trying to access disconnected network devices. And it makes no sense that SQL Server would need to retry in this case, since the client application is on the same machine as the server. But it is what it is, so I have to sacrifice performance of my network data collection application to get SQL Server to work at all.

I got the error message, and we resolved it by changing the "Enable NetBIOS over TCP/IP" setting radio button under the "WINS" tab in the Advanced TCP/IP settings. Also, be sure and check your binding order and NIC cards under Device Manager: we had one disabled.

This SQL base was created in SQL 2000 MSDE (2x per day ERORR), than restored in SQL 2005 Express (ERROR after few mins) and now restored in SQL 2008 Std. (Error also in few minutes). Can be database, UTP cables?