Basically I'm running a Vanilla server (1.11.2) on my computer and one player has suddenly started having a strange lag/packet loss issue.

Symptoms

10 or so second joining time

The player can open chests and break/place blocks. When they interact, other players can see the interaction immediately, but the affected player does not see the intended action for 10-20 seconds or so.

Sometimes, but not always, the player will leave the server but their player model remains on the server. When this occurs, I (as admin) can do the kick command and it says successfully kicked. However, the player remains on the server and they remain in the tab menu.

On one occasion I attempted to kick the player when this was happening and instead the following appeared in the chat as if I had said it:

internal exception io.netty.handler.codec.decoderexception

At one point the player attempted to join the server and got a similar error message which was repeated in the console (unfortunately I didn't note it)

The player sees slightly worse tracert results to my IP when connected to the server

Other notes/attempted fixes:

Both the player and I have generally stable internet connections (around 10-40ms ping, 5-15 Mbps down and 3-5 Mbps up)

The player has a good connection on other Minecraft servers

The player and I are located within 10kms of each other

No other players experience lag of this nature

We have tried creating Windows Firewall rules on both ends for inbound and outbound connections

We have tried setting the firewall level to "lax" on the host router

We have tried restarting the host router

We have tried destroying all signs on the server (Googling the Java error gives results about this being the problem on 1.8~ Spigot servers but I doubt this is the problem)

I have tried adding more RAM to the server (8 gigabytes in total now)

The player used to have no lag issues, but I cannot think of anything that has changed since it started about a week or so ago

1 Answer

I want to stress this point: the source of the network problem might not be the network.

10 or so second joining time

The player can open chests and break/place blocks. When they interact, other players can see the interaction immediately, but the affected player does not see the intended action for 10-20 seconds or so.

Sometimes, but not always, the player will leave the server but their player model remains on the server. When this occurs, I (as admin) can do the kick command and it says successfully kicked. However, the player remains on the server and they remain in the tab menu.

These are all "lag". It could be low bandwidth, high latency, etc. On their own, these symptoms do not really help narrow down what is going wrong.

On one occasion I attempted to kick the player when this was happening and instead the following appeared in the chat as if I had said it:
internal exception io.netty.handler.codec.decoderexception

At one point the player attempted to join the server and got a similar error message which was repeated in the console (unfortunately I didn't note it)

Ditto.

The player sees slightly worse tracert results to my IP when connected to the server

This sounds like the fastest route between you is saturated and cannot handle the traffic, so the network found a route with more bandwidth but more latency.

Edit: this shouldn't be happening. It suggests that at some point along the network the connection is not good enough.

Both the player and I have generally stable internet connections (around 10-40ms ping, 5-15 Mbps down and 3-5 Mbps up)

Does not seem to be the problem.

The player has a good connection on other Minecraft servers

This suggests that the problem is not the last mile on the client.

The player and I are located within 10kms of each other

This is odd, because if it is not the last mile, we would have to blame the metropolitan area network (MAN) for this. Edit: We have to start from the assumption that the MAN is installed correctly and is not failing.

No other players experience lag of this nature

How far from you are the other players?

We have tried creating Windows Firewall rules on both ends for inbound and outbound connections

We have tried setting the firewall level to "lax" on the host router

Shouldn't be a problem

We have tried restarting the host router

Ok.

We have tried destroying all signs on the server (Googling the Java error gives results about this being the problem on 1.8~ Spigot servers but I doubt this is the problem)

Spigot, from what I have read, is a fork of Bukkit. You said you are "Basically" running Vanilla; are you using Spigot for that? If so, it might be worth investigating.

Since you are not using Spigot, this is not the problem.

I have tried adding more RAM to the server (8 gigabytes in total now)

Did you replace your old RAM, or just add a new card?

If you just added a new card, try without the old one. A RAM card can be defective in such a way that the OS is able to load, but you get memory corruption along the way.

Also, try Memtest (download the ISO, burn it, and boot from it). It will tell you if there is any non-obvious problem with your RAM. Note: Memtest is more thorough than the Windows memory diagnostic.

Side note: In my experience with Minecraft, the hard disk tends to be the bottleneck.

Addendum: Running the Minecraft server with more RAM doesn't necessarily improve performance. It will allow the server to keep more chunks loaded, which in turn translates to more units being spawned, and more units eat more CPU time.

The player used to have no lag issues, but I cannot think of anything that has changed since it started about a week or so ago

I will speculate for you. Consider the following:

It might not be the fault of the server, the ISP, or the client. It could be that at the time the client connects, something else also happens. For example, other people on networks close to the client may have a schedule where they watch online movies or streaming content at the same time the client joins the game.

It might be old hardware that is starting to fail. In this case, neither you nor the client made any intentional change that started the problem. Addendum: Old hardware is not likely to affect a single client, unless we are talking about a problem on some router along the route from the client to the server.

It might be malware. For example, a botnet could be using the network. It might be worth looking for possible malware on both sides. Again, in this case you did not change anything intentionally. Addendum: A botnet that has compromised both client and server could be using the connection between them to move data around, eating bandwidth.

It could be automatic updates. Either an update installed a defective component or, more likely, it added a scheduled task that is adding latency. Addendum: this should not be affecting a single client. Yet, I do not know; bugs are bugs.

It would actually be easier to come up with theories about what the problem is if you had multiple clients with the problem, because then we could try to figure out what they have in common.

This is what you can try:

Diagnose the memory on the server (memtest). Replace any card that reports errors.

If the server is on wireless network, try a wired connection.

Identify and stop any other services that might be listening on the network on both server and client. Try nmap to make sure that only the ports that correspond to Minecraft are being used. There is a GUI tool for Nmap, or if you have to use the terminal... Debian has a good introduction.

Use Autoruns to identify what is running on Windows startup. Configure it to verify the signatures of the entries (to see if they have been tampered with) and to send samples to VirusTotal to see if any antivirus software identifies them as malware.

Also use Process Explorer and configure it to do the same with whatever software is running while the problems happen.

Run sfc /scannow in a terminal with elevated privileges. It will check if any system files have been tampered with, and attempt to repair them. If it fails, move on to Dism /Online /Cleanup-Image /RestoreHealth. If that fails, go to Microsoft's support.

Run Windows Updates.

If you are on Linux, follow the instructions in your distro's documentation to identify and fix broken packages.
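If you cannot install nmap, a rough stand-in for the "what is listening?" check can be sketched in Python with only the standard library. This is a minimal sketch, not a replacement for nmap: it only tries TCP connections, the port list is a hypothetical example, and 25565 is assumed because it is Minecraft's default port (check server.properties if you changed it).

```python
import socket

MINECRAFT_PORT = 25565  # assumed default; adjust if server.properties differs


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


def open_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` accepting TCP connections on `host`."""
    return [p for p in ports if is_port_open(host, p, timeout)]


if __name__ == "__main__":
    # Hypothetical check list: a few commonly exposed ports plus Minecraft's.
    candidates = [21, 22, 80, 135, 139, 443, 445, 3389, MINECRAFT_PORT]
    for port in open_ports("127.0.0.1", candidates):
        note = "(Minecraft)" if port == MINECRAFT_PORT else "(investigate)"
        print(port, note)
```

Anything other than the Minecraft port showing up is not automatically a problem, but it tells you which services to look at first.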

With the client's collaboration:

If the client is on wireless network, try a wired connection.

Have the client try to connect outside their usual times (if they often join at night, try in the morning, or vice versa).

Make sure they are not running any other thing that is taking resources during the gaming session. A tool such as Razer Cortex would suffice. If the client is willing to, they can try nmap too.

Try connecting via VPN. I have successfully used DynVPN for these purposes. This will also allow an easier way to measure traffic between server and client.
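To compare different times of day, or wired versus wireless versus VPN, it helps to have numbers. Here is a minimal sketch the client could run: it times the TCP handshake to the server port a few times and summarizes the result. It assumes the default port 25565, and it only measures connection setup, not the Minecraft protocol itself. The function names are my own, not part of any tool.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=25565, timeout=3.0):
    """Time one TCP handshake to host:port in milliseconds; None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None  # refused, timed out, or unreachable


def summarize(samples):
    """Summarize latencies in ms, where None marks a failed attempt."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


def sample(host, port=25565, count=10, interval=1.0):
    """Take `count` measurements, `interval` seconds apart."""
    results = []
    for _ in range(count):
        results.append(tcp_connect_ms(host, port))
        time.sleep(interval)
    return results


if __name__ == "__main__":
    # Replace 127.0.0.1 with the server's address when run by the client.
    print(summarize(sample("127.0.0.1", count=3, interval=0.1)))
```

If the median is fine but there are failures or a very high maximum at the usual playing time only, that points toward congestion somewhere along the route rather than toward either machine.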

Advanced diagnostics:

If the options above did not work, we are in diminishing returns territory. What I could suggest is to capture network traffic. However, remember that the culprit may not be under your control.

Use Wireshark on both ends and capture a test gaming session (join the server, make some change in the world, leave the server). You might want to also try this over the VPN. If there are dropped packets, if there is network latency, if there are corrupted packets... you will see it there. HowToGeek has a good starting guide on Wireshark.

Use Scapy. It allows for more powerful analysis than Wireshark, but it is harder to use... because it is a Python library! If you are familiar with Python, follow the Scapy tutorial to get started.
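As a starting point for the Scapy route, here is a minimal sketch that looks for likely TCP retransmissions in a saved capture. The heuristic is my own simplification: a data segment whose flow, sequence number, and payload length have already been seen is counted as a repeat. Real analyzers are more careful (for example, Ethernet padding can inflate the payload length of tiny frames). The counting function is plain Python; only `load_segments` needs Scapy installed.

```python
from collections import Counter


def count_retransmissions(segments):
    """Count repeated TCP data segments per flow.

    `segments` is an iterable of
    (src, dst, sport, dport, seq, payload_len) tuples.
    """
    seen = set()
    repeats = Counter()
    for src, dst, sport, dport, seq, payload_len in segments:
        if payload_len == 0:
            continue  # pure ACKs legitimately repeat sequence numbers
        flow = (src, sport, dst, dport)
        key = (flow, seq, payload_len)
        if key in seen:
            repeats[flow] += 1  # same data sent again on this flow
        else:
            seen.add(key)
    return dict(repeats)


def load_segments(pcap_path):
    """Yield TCP segment tuples from a capture file (requires Scapy)."""
    from scapy.all import IP, TCP, rdpcap  # third-party, imported lazily

    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt:
            yield (pkt[IP].src, pkt[IP].dst,
                   pkt[TCP].sport, pkt[TCP].dport,
                   pkt[TCP].seq, len(pkt[TCP].payload))
```

Usage would be something like `count_retransmissions(load_segments("session.pcap"))`, where `session.pcap` is a hypothetical file name for the capture of the test session. Many repeats on the server-to-client flow only would match the symptom where the affected player sees their own actions late.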

Thanks for the detailed response. To clear things up a bit, this is a server running on my own PC. By "adding more RAM" I meant I launched the server with 8 GBs rather than 1 GB of RAM. The OS is Windows 10 Home 64 bit and the server is running the default vanilla java file, not Bukkit or Spigot. I'll try and work through the solutions posted. Under your speculation about what could've changed since this started occurring a week or so ago, I don't understand why these issues would affect just the one player on my server and not the client's connection to any other servers.
– user16421, Apr 15 '17 at 1:59

Also, I mentioned that the client isn't seeing their own actions while other players are, because to me this suggests that the problem relates to outbound traffic. Is this a sound assumption?
– user16421, Apr 15 '17 at 2:01

@user16421 it is equally odd. If there is a problem with outbound traffic on the server, it should affect other players. If it were a problem with inbound traffic on the client, it should be a problem with other servers. That all suggests that it is a problem with nodes in the middle between client and server.
– Theraot, Apr 15 '17 at 5:49