Did you check the client for any errors? It looks like the client does not accept packet 87, since there is a gap of almost 300 ms between packets 87 and 88. When Exchange sends packet 88 (which is a duplicate of packet 87), the client doesn't seem to accept that either. The client then sends a duplicate of packet 86 as packet 89. The connection times out and is reset.

Packets 229 and 230 are a re-send: the client did not reply to packet 229, so the Exchange server re-sent it after 300 ms as packet 230. The client does respond to packet 229 almost immediately after receiving packet 230, so the connection continues.

So it looks like packets are making it to the client, but the client either does not ACK quickly enough or rejects the packet, and the server starts to re-send. Have you run any debugs on the client side? Is this happening for everyone at the site or for a single client? If for everyone, do you add any encapsulation to packets as they traverse the WAN?

You didn't mention whether you use a Cisco router, or what L2 or L3 setup you have between you and the provider. The logical thing to me is not to look for client-to-router issues but to look at the WAN link. You have connectivity, so L3 is OK unless there is a loop, and I guess there isn't. So check the interface for errors. If you have none, it's something at L2.5. Does your provider run MPLS? I would say check the cabling, but if no errors appear under the interface there is no problem there. I can't look at the Wireshark captures because I broke my kernel last night :) Damn it! :p

We are using a 3750 at the server end and a 3560 at the client end; SVIs on both switches handle the routing between the two. No errors anywhere. Provider runs MPLS. Cisco TAC couldn't explain it after 6 weeks of troubleshooting, and the provider doesn't care since there are no packet drops.

The server/client connection behaves normally when going across an IPSec VPN connection, which was the only connection between the two sites until now.

This WAN link has been like this ever since it was put in place.

MTU is fine on switches, tests up to 9000 across provider.

It's just strange how the provider isn't touching frames, but the end points behave differently, as you can see in the captures.

Thanks for all your input. This has been one of those nightmare problems (two months now, and paying for the circuit) where no one knows why it's broken but everyone still wants you to fix it.

There are a lot of retransmissions. Let's get back to basics: a retransmission happens when, from the sender's perspective, the receiver did not ACK the segment or the ACK got lost.
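To make that concrete, here is a minimal Python sketch of how a sender-side re-send can be spotted in a list of captured segments. The record layout (`time`, `seq`, `length`), the sample values, and the name `find_retransmissions` are assumptions for illustration only, and this is not Wireshark's actual algorithm; it just flags any data segment whose sequence number and length were already seen earlier in the same direction.

```python
def find_retransmissions(segments):
    """Return (seq, gap_seconds) for each data segment seen more than once.

    segments: dicts with hypothetical keys "time", "seq", "length",
    taken from ONE direction of ONE TCP stream, in capture order.
    """
    first_seen = {}  # (seq, length) -> time of first transmission
    repeats = []
    for seg in segments:
        if seg["length"] == 0:
            continue  # pure ACKs carry no data, so skip them
        key = (seg["seq"], seg["length"])
        if key in first_seen:
            gap = round(seg["time"] - first_seen[key], 3)
            repeats.append((seg["seq"], gap))
        else:
            first_seen[key] = seg["time"]
    return repeats


# Made-up sample: the second segment is sent again about 300 ms later.
sample = [
    {"time": 0.000, "seq": 1000, "length": 100},
    {"time": 0.010, "seq": 1100, "length": 100},
    {"time": 0.310, "seq": 1100, "length": 100},
]
print(find_retransmissions(sample))
```

The gap between the first copy and the repeat is the interesting number here: a consistent value would point at a sender timer rather than random loss.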

Hmm, with my little brain, I could see that what the client sends is also received by the server, so maybe it's arriving at the server very late for some reason, like congestion on the path. Just an idea from me.

Since it has gone on this long, I would also suppose that you have tried this from different computers and with different TCP services. I would like to know if and when you find a solution to this interesting issue.

It looks like these were simultaneous packet captures? I went ahead and combined them all into one and used a display filter of tcp.stream eq 1. This is likely above my head, but one thing that looks odd to me is seeing the same IP address with a source MAC of two different devices.

When the RPC traffic is going back and forth before the retransmits start, when 172.16.36.9 sends requests, the L2 header shows

This may not have anything to do with it, but keep in mind that Wireshark can "incorrectly" identify duplicated traffic as retransmits. At a quick glance I don't see any real delays between the requests and responses, and the changes in MAC addresses make me wonder if this is somehow duplicated traffic.
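One rough way to tell the two cases apart is the IP identification field. A frame that was simply captured twice (for example on both sides of a routing hop) carries the same IP ID, usually with different MACs and a lower TTL, while a segment the host re-sends typically gets a new IP ID. This is a heuristic, not a guarantee, since some stacks do not increment the ID on every packet. The sketch below assumes the packets have already been exported into plain dicts; the field names, the function name `classify_repeats`, the destination address, and all sample values other than the IP and MACs quoted in this thread are made up.

```python
from collections import defaultdict


def classify_repeats(packets):
    """Label repeated TCP segments as duplicate captures or retransmissions.

    packets: dicts with hypothetical keys src, dst, seq, length, ip_id, src_mac.
    Heuristic only: same IP ID -> captured twice, new IP ID -> re-sent by host.
    """
    groups = defaultdict(list)
    for p in packets:
        groups[(p["src"], p["dst"], p["seq"], p["length"])].append(p)

    labels = []
    for key, pkts in groups.items():
        first = pkts[0]
        for later in pkts[1:]:
            if later["ip_id"] == first["ip_id"]:
                labels.append((key[2], "likely duplicate capture"))
            else:
                labels.append((key[2], "likely retransmission"))
    return labels


# Made-up sample; 192.0.2.10 is a documentation address, not from the captures.
sample = [
    {"src": "172.16.36.9", "dst": "192.0.2.10", "seq": 5000, "length": 120,
     "ip_id": 4242, "src_mac": "00:1c:f6:1d:4c:11"},
    {"src": "172.16.36.9", "dst": "192.0.2.10", "seq": 5000, "length": 120,
     "ip_id": 4242, "src_mac": "9c:af:ca:64:2c:42"},  # same IP ID, other MAC
    {"src": "172.16.36.9", "dst": "192.0.2.10", "seq": 6000, "length": 120,
     "ip_id": 4243, "src_mac": "00:1c:f6:1d:4c:11"},
    {"src": "172.16.36.9", "dst": "192.0.2.10", "seq": 6000, "length": 120,
     "ip_id": 4250, "src_mac": "00:1c:f6:1d:4c:11"},  # new IP ID
]
for seq, label in classify_repeats(sample):
    print(seq, label)
```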

Here are my notes from OneNote after looking at your captures and diagrams; unfortunately the color coding I used won't carry over. I used the first pic as a reference. Please confirm the MAC addresses/IPs, traffic flow, and the portions where I have a question mark so that I have a better understanding:

1. In your server pcap, it shows the server sending traffic off subnet via what I assume is its default GW with MAC 9c:af:ca:64:2c:42?
2. However, in the beginning of the capture, it is receiving traffic from the workstation with source MAC 00:1c:f6:1d:4c:11.

Something there doesn't click. I would expect to see your server's default GW MAC be the source when receiving traffic from off-subnet. Can you please tell me what these MAC addresses are?

9c:af:ca:64:2c:42
00:1c:f6:1d:4c:11

It appears that the app is sensitive to roughly 300 ms with no response.

Are you sure this is a network problem? Looking at the time stamps on the server/client pcaps, it would appear that the server receives everything the client sends and vice versa. There is a ~4 second delta between the two, but that's probably just a difference in clocks. Either that, or it has nothing to do with the problem anyway, because the 4 second delta is also present when you don't have retransmits.
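The clock-difference idea can be checked rather than assumed. A sketch, assuming the client-to-server packets from both captures have been reduced to `(ip_id, timestamp)` pairs (the IP ID as a match key, the function name `estimate_offset`, and the sample numbers are all my assumptions, and IP IDs can wrap or collide in long captures): take the median of the per-packet deltas as the clock offset, then look at how far any single packet strays from it. If the spread stays in the low milliseconds, including around the retransmits, the path is not delaying packets.

```python
from statistics import median


def estimate_offset(client_pkts, server_pkts):
    """Estimate the clock offset between two captures of the same flow.

    Each argument is a list of (ip_id, timestamp) tuples for packets
    travelling client -> server. Returns (median_offset, max_deviation).
    """
    server_times = dict(server_pkts)
    deltas = [server_times[i] - t for i, t in client_pkts if i in server_times]
    if not deltas:
        raise ValueError("no packets in common between the two captures")
    offset = median(deltas)
    spread = max(abs(d - offset) for d in deltas)
    return offset, spread


# Made-up timestamps with a roughly 4 second skew between the two clocks.
client = [(1, 10.000), (2, 10.050), (3, 10.400)]
server = [(1, 14.001), (2, 14.051), (3, 14.402)]
offset, spread = estimate_offset(client, server)
print(f"offset ~ {offset:.3f} s, max deviation {spread * 1000:.1f} ms")
```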

This makes me believe the client isn't sending the ACK the server is looking for, so the server retransmits. It's not as though the client capture shows anything additional being sent and the server retransmitting anyway. Unless I'm missing something.