I got a little further last night. I'm not using conventional turns, so this might get a bit confusing and might just be a load of old rubbish, but I'm going to try it anyway.

1) All the clients connect to the server

2) One client chooses to "start the game" - for now, after this point other clients can't join in. It seems there are ways round this, but for now I don't care about it.

3) When the "start game" message reaches the server, it records the real time the game started, which I'll call "Server Start Time". It forwards the message on to the clients.

4) When the clients receive the "start game" message, they record the time and call this "Client Start Time".

---- Now the game is running ---

5) The client sends a command: "I'd like to move my player to position X". This sends off a command object to the server.

6) The server receives the command and schedules it for 200ms in the future, based on its "Server Start Time" and the current system time. The scheduled time is placed in the command and the command is sent out to the clients.

7) The clients receive the message and add it to their queue.
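For concreteness, steps 3-7 might be sketched something like this. All class, field and method names here are illustrative assumptions, not taken from the actual implementation:

```java
// Hypothetical sketch of steps 3-7: the server stamps each incoming
// command with a simulation time 200ms ahead, relative to its own
// "Server Start Time", then broadcasts it to all clients.
import java.util.ArrayList;
import java.util.List;

class Command {
    final String action;   // e.g. "MOVE_TO X"
    long scheduledTime;    // simulation time (ms since game start) to action it

    Command(String action) { this.action = action; }
}

class Server {
    static final long LAG_BUFFER_MS = 200;
    private long serverStartTime;                 // real time the game started
    private final List<Command> outbound = new ArrayList<>();

    // Step 3: remember the real time the game started
    void startGame() { serverStartTime = System.currentTimeMillis(); }

    // Step 6: schedule the command 200ms into the simulation's future
    void onCommandReceived(Command cmd) {
        long simNow = System.currentTimeMillis() - serverStartTime;
        cmd.scheduledTime = simNow + LAG_BUFFER_MS;
        outbound.add(cmd);                        // to be broadcast to all clients
    }

    List<Command> pendingBroadcast() { return outbound; }
}
```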

CLIENT LOOP:

The clients are sat rendering the game. I get the delta for the frame in milliseconds (accurate system timing required here - LWJGL's Sys.getTicks()) and pass that into the game state progressor. This takes the delta and splits it into steps of 10 milliseconds (remembering the remainder for the next frame). The game state is progressed each "turn", which lasts 10 milliseconds - this moves actors around in the game world.

Just before starting a 10 millisecond "turn", the queue is evaluated for commands scheduled in that timeframe. Any found are actioned and removed from the queue.
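A minimal sketch of that client loop, assuming the 10ms turn length and a queue ordered by scheduled time (the names are mine, not from the real code):

```java
// Sketch of the client loop described above: the frame delta is split
// into fixed 10ms turns (remainder carried to the next frame), and
// before each turn any commands scheduled inside that turn are actioned.
import java.util.Comparator;
import java.util.PriorityQueue;

class GameClient {
    static final long TURN_MS = 10;
    long simTime = 0;          // current simulation time in ms
    long remainder = 0;        // leftover delta from the previous frame
    final PriorityQueue<long[]> queue =   // entries: {scheduledTime, commandId}
        new PriorityQueue<>(Comparator.comparingLong(c -> c[0]));

    void update(long deltaMs) {
        long total = remainder + deltaMs;
        long turns = total / TURN_MS;
        remainder = total % TURN_MS;       // remember the remainder for next frame
        for (long i = 0; i < turns; i++) {
            // Action every command scheduled within this 10ms turn
            while (!queue.isEmpty() && queue.peek()[0] < simTime + TURN_MS) {
                applyCommand(queue.poll()[1]);
            }
            stepWorld();                   // advance actors by one 10ms turn
            simTime += TURN_MS;
        }
    }

    void applyCommand(long id) { /* mutate game state deterministically */ }
    void stepWorld() { /* move actors around the world */ }
}
```

Because every machine consumes commands at the same simulation turn, the fixed step is what keeps the simulations in lockstep regardless of frame rate.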

Should the game time start getting close to the time of the scheduled commands being received, I'll lag the client's loop a bit to compensate and adjust the client's "start time".

It would work if the probability of a late command packet was really low.

However, what if you assemble a packet for transmission (including the timestamp) and then a context switch occurs (another thread or even another process)? Maybe a garbage collection occurs. The packet then goes out (late) with the original timestamp. If that delay reaches 200ms, then there are going to be difficulties. It makes the code kind of fragile.

The basic problem is that timestamping a packet and sending it is not an atomic operation. The same goes for reading the packet. This all adds to unpredictable network latency. It really messes with my ping measurements, which is why I'm averaging them.

SharpShooter Arena throws away time-expired packets and prints a message on the Java console. You get some on start-up, when the latency measuring code hasn't stabilised, and also if you change from full-screen to windowed (or vice versa), due to the long time this takes (it also messes with my time sync code, which makes it worse - I might modify the sync code to ignore ludicrously long pings to reduce that effect). Other than that, they don't seem to happen (unless one player has massive lag spike problems).

I don't see how that'd prevent having to make a client wait. For deterministic lockstep *everyone* has to use the exact same inputs at exactly the same time. Otherwise you get divergence. Enforced delays can make the gameplay smoother but worst case your buffer will empty and you'll have to wait.

Quote

Should the game time start getting close to the time of the scheduled commands being received, I'll lag the client's loop a bit to compensate and adjust the client's "start time".

I don't see how you'll do this without breaking your determinism? If you adjust the lag/buffer you need to do that on all machines at the same time (which would be possible).

Re: timestamping - the timestamping isn't to do with real time but rather simulation time. So if it does get a bit delayed, well, that's fine - the latency buffer should account for that. And if the simulation at the client end starts getting a bit close to the time of the commands they're receiving (because the client machine has gotten a little bit further ahead than it should have, due to maybe server processing lag or whatever), I'll slow the running of the simulation down a bit so it goes back to where it's meant to be.

My understanding isn't that the commands must be actioned at the same real time, but rather they must always be actioned at the same point in the simulation - to keep the whole thing deterministic.

Quote

Quote

Should the game time start getting close to the time of the scheduled commands being received, I'll lag the client's loop a bit to compensate and adjust the client's "start time".

I don't see how you'll do this without breaking your determinism? If you adjust the lag/buffer you need to do that on all machines at the same time (which would be possible).

The simulation will still run the same on all machines, it'll just run a bit behind on some?


And I agree, this algo could be fragile.

Kev

That's the issue I'm worried about. When a latency-buffered lock-step game falls behind by longer than the predetermined max latency, it pauses.

It seems to me that if any of your clients ever fall behind to the point where they are generating user events at a time later than another player's current simulation, you have to either roll that other player back in time or abort the entire simulation.

See?

I think you're just thinking about the output and not considering input that's happening in those "past" time frames. The only other thing you can do is to always consider input as "current time", but then those players whose simulations are displaying the past are in the position of having to guess the future on their screens and react to it before they see it.

Control lag, particularly a predictable and constant lag, can easily be adjusted for. But control into the future? The human animal isn't all that reliable at predicting future events... if we were, we'd all be playing the stock market for a living.

Got a question about Java and game programming? Just new to the Java Game Development Community? Try my FAQ. It's likely you'll learn something!

Lock-step seems to have this caveat: "if one player lags, they all lag - or at least pause".

What the system I describe above gives me: "if one player lags, it takes longer for his/her commands to get actioned".

This is because when a command is sent from a player, its scheduled simulation time is set by the server. So even if the client has managed to keep very close to the server's simulation progression, if it takes a long time for the command to reach the server, the command will be scheduled further in the future. If the client is lagging behind simulation time, then the command will be perceived as taking a long time to action - even though other clients might be further along in the deterministic simulation and have already seen it happen.

The only protection I need to provide is that clients don't manage to catch up to server simulation time and hence receive commands scheduled for times in the past. I don't see this as possible, since clients will always incur >0 lag on the "start game" message and hence will always be running the simulation behind server simulation time (assuming millisecond-accurate timing on client and server).

However, to protect against simulations running into the future, I'm considering only allowing simulations to proceed while there is at least one command scheduled in the future, and having the server send out keep-alive commands.
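That gating rule could look something like this sketch - the client may only advance while something in its schedule list lies ahead of the current simulation time. Names are hypothetical; the real check would live inside the client loop:

```java
// Sketch of the "don't run into the future" gate: the simulation only
// progresses while at least one command is scheduled ahead of sim time.
// Keep-alives from the server keep this condition true in normal play.
import java.util.ArrayList;
import java.util.List;

class KeepAliveGate {
    final List<Long> scheduledTimes = new ArrayList<>(); // queued command times

    // If the server sends a keep-alive every lagBuffer/2 ms, this only
    // returns false when the command stream has stalled - stalling the
    // simulation rather than letting it diverge from the server.
    boolean mayProgress(long simTime) {
        for (long t : scheduledTimes) {
            if (t > simTime) return true;  // something to work towards
        }
        return false;                      // wait for the next command
    }
}
```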

I'm going to try to find time at lunch to draw a diagram of how I think this works.

Kev

PS. I've noticed my forum tone of recent times has become absolute. This isn't intentional - as always, I'm hoping someone can find the flaws.

I think you're missing something critical here... but maybe I'm the one who is missing something.

Let's take a gedanken experiment.

We have two points in time: server time and Client 1 time.

The server is actually the "real" time base, in the sense that any command arriving is considered to have "just occurred" in server time and is timestamped as such, right?

Now let's imagine Client 1 falls far behind the server - 10 seconds, for argument's sake. What do I see as the user of Client 1 at this point in time? I see a time 10 seconds in "the past" as far as the server is concerned. Seeing that situation, I react and send a command.

That command goes to the server. The server, however, sees it as a command occurring at ITS time. So while the user was reacting to something ten seconds ago, their reaction is effectively delayed 10 seconds in how it affects the world situation. It's sent back to the client timestamped 10 seconds ahead of the client.

For 10 seconds I sit there banging the key while the state progresses and my input is apparently ignored. FINALLY, 10 seconds later, my input "arrives" and affects the game. By now I've probably over-corrected, banging at the keyboard...

All sounds very frustrating to me.


I should mention that it's worse than a 10 second delay. A fixed lag is annoying but can be accounted for by the user. In your case, though, the lag is both variable and unbounded. This WILL drive a user nuts, even ignoring the other issues.


Agreed, and I've just experienced it (by pausing my local game simulation and letting it get behind intentionally). However, remember that my objective is to make only the laggy player experience lag. So, for me, that's exactly what should happen: a player is so laggy that they get well behind simulation time and hence get a laggy response. However, what I hadn't really thought about is how lag spikes get compounded in this system.

If the player gets one lag spike they get a bit behind time. If they get another they get further behind time. Hence that 10 seconds is getting closer and closer - it was the compound nature I hadn't thought about. So, yes, there is a problem.

Time for a coping strategy.

1. For another reason I've added a keep-alive message. Every so often (lagBuffer / 2) the server sends out a keep-alive command. The clients aren't allowed to progress unless they've got a command in their local scheduled list to work towards; the keep-alive keeps something in their lists. This prevents clients running off into the future (and hence screwing up the whole distributed simulation).

2. Here comes the science... or something. I haven't implemented this yet (some point today), so I don't know if it works. A client sits there receiving its commands (at minimum one every lagBuffer / 2 (*)) and it knows what the lag buffer is. Let's say the buffer is set at 200ms. It knows that when it receives a command, the command should be scheduled 200ms in the future of its current simulation time. So now it knows how far in front of/behind the simulation it's running (well, plus the transmission time from server to client).

So let's say it's detected it's got 100ms behind where it should be, i.e. it's received a command that's scheduled 300ms in the future (instead of the expected 200).

Each frame render, my client goes to the game world and tells it to update by how much time has passed. Say 25ms has passed since the last frame rendered. My game world currently moves forward in increments of 10ms, so normally I divide the amount of time passed by 10ms and work out the number of cycles to update through the data. Dividing by 10ms and working it out in terms of a set number of cycles is what allows me to keep the simulations synchronised.

So, normally this would mean I need to run 2 simulation turns and remember that I've got 5ms of time outstanding for the next update. However, this time I know I'm an additional 100ms behind, so naively I might run an extra 10 turns this update. This naive approach could of course mean that the rendering of the client's world would suddenly speed up by 10 additional moves. I'm going to implement it this way initially; I expect I'll need to temper it down a bit to smooth things out. I think the neat thing about this is that if my player gets too far ahead in the simulation I can detect this, miss a few turns next update, and the sim comes back in line. So hopefully I'll be tending towards a consistent lag at all times while coping with lag spikes.

Moreover, if I were to add a player mid-game, I could send them all the commands that have occurred in the game (maybe cache them at the server or something). They'd get a command scheduled for a long time in the future. The above algorithm would detect this and on the next game update would progress their simulation all the way through to the point the other players are at. Since it's all deterministic, it should place them in the current game state.
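The catch-up idea in the paragraphs above might be sketched like so - estimating drift from the offset between a command's scheduled time and the expected simTime + lagBuffer, then folding it into the next update's turn count. This is the naive, unsmoothed version with illustrative names:

```java
// Sketch of drift detection and naive catch-up: commands should arrive
// scheduled lagBuffer ms ahead of local sim time; any excess means this
// client has fallen behind, and the next update runs extra 10ms turns.
class DriftCorrector {
    static final long TURN_MS = 10;
    static final long LAG_BUFFER_MS = 200;
    long behindMs = 0;    // positive: we are behind server simulation time

    // Called whenever a command arrives from the server
    void onCommand(long scheduledTime, long simTime) {
        // Expected: scheduledTime == simTime + LAG_BUFFER_MS; the excess
        // (ignoring server->client transmission time) is our drift
        behindMs = scheduledTime - (simTime + LAG_BUFFER_MS);
    }

    // Naive version: fold the whole drift into this update's turn count.
    // A negative drift (running ahead) causes turns to be skipped instead.
    long turnsToRun(long deltaMs) {
        long total = deltaMs + behindMs;
        behindMs = 0;
        return Math.max(0, total / TURN_MS); // never run a negative turn count
    }
}
```

With a frame delta of 25ms and a detected 100ms of drift, this runs 12 turns instead of 2, which matches the "extra 10 turns" example above; a smoothed version would spread those extra turns over several updates.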

Apologies for the long post - I hope it makes enough sense to be considered.

Kev

PS. On advice from Elias I've added checksums for the game world to ensure the simulations are staying consistent. It turns out "CHECKSUMS ARE YOUR FRIENDS"!
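As an aside, a world checksum along those lines can be as simple as a CRC over the deterministic state, compared across machines each turn to catch divergence early. This is just one possible sketch, not the actual code from the post:

```java
// Minimal world-checksum sketch: fold each actor's integer position
// into a CRC32 of the whole game state. If two machines running the
// same deterministic simulation ever report different values, they
// have diverged.
import java.util.zip.CRC32;

class WorldChecksum {
    static long checksum(int[][] actorPositions) {
        CRC32 crc = new CRC32();
        for (int[] pos : actorPositions) {
            for (int v : pos) {
                // Feed all four bytes of each coordinate into the CRC
                crc.update(v & 0xFF);
                crc.update((v >>> 8) & 0xFF);
                crc.update((v >>> 16) & 0xFF);
                crc.update((v >>> 24) & 0xFF);
            }
        }
        return crc.getValue();
    }
}
```

Note that floating-point state must be handled carefully (or avoided) for this to work, since FP results can differ across JVMs and CPUs.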

(*) Note that this message is very, very small (6 bytes), so the impact isn't going to be too scary.

I've just tried a pair of clients in the UK against a server running on a machine we rent in Canada. As you'd expect the TCP connection out there isn't the fastest or for that matter the most consistent.

With the test client (screenshot here) I can watch the simulations running (red dots), move my player around (yellow dot) and watch other clients move (green dot). I can see the simulation time being displayed, the checksum for the game world and the offset to the server time this client is currently running at (not including the time taken for a message to go from server to client).

The observed behaviour is that the TCP connection is relatively stable though you get odd spikes here and there where the client will get slightly ahead of time (which the algorithm auto adjusts for).

Cooler - I can pause the simulation running locally, let the server run off into the future, let the client get way behind where it should be. On starting the client running again it runs a whole bunch of simulation cycles which allow it to catch up with the other client. Once the client has caught up (close to instant) the commands become as responsive as always.

Seems to be OK - though I guess I won't know until there are more players, more operating systems, more network traffic and a real game to play.

Update, if anyone's interested: I've got this system running with a real client, AI/steering for the monsters and interactions from the player (i.e. you can attack and kill the monsters).

So far, so good. I get the same sort of performance and predictability as a lock-step game. However, when one person lags in the game, only they're affected. One of my testers' connection to my home machine is very laggy (West Coast US -> UK). The TCP stream can pause for seconds at a time. Once he reaches the lag buffer limit there aren't any messages available, hence he sees lag in the form of his game being frozen (eventually I'll add a message explaining what's going on) - however, once the connection frees up again the simulation catches up and his game continues.

During this lag period everyone else is free to continue playing. It's just that their friend might get left behind on the adventure.

I'm still not convinced it is just lag causing the problem (since I've never seen a TCP connection that lags for 3-10 seconds), but so far there isn't another explanation.

I'm also not entirely convinced there isn't something I've missed and this method is going to fail in some ornate way eventually.

Quote

During this lag period everyone else is free to continue playing. It's just that their friend might get left behind on the adventure.

Several MMORPGs do this. It's usually catastrophic - a friend de-lags in the middle of a group-only dungeon to find themself totally outclassed and dead the moment they walk around a corner and meet any monster at all.

Quote

During this lag period everyone else is free to continue playing. It's just that their friend might get left behind on the adventure.

Quote

Several MMORPGs do this. It's usually catastrophic - a friend de-lags in the middle of a group-only dungeon to find themself totally outclassed and dead the moment they walk around a corner and meet any monster at all.

Well, since during that pause he's not sending moves either, you could attack that by either (a) making him temporarily invulnerable or (b) making the monsters only target responding players.

As for his friends leaving him behind - if they do that, then they aren't very good friends, are they?


Interesting outcome - my TCP version did indeed suffer large lag spikes, which I believe was down to back-off and resend times in the TCP stack (which I can't configure from the user API).

So, I went on and implemented a UDP version which ensures in-order delivery of commands - basically the same properties as TCP, but nowhere near as complicated. I get better granularity of packet control than just turning Nagle's algorithm off, and I get to choose back-off values (i.e. I don't have any - yes, this would screw up a low-bandwidth network) and resend times (currently a constant). It's given me a smooth stream of commands that arrive in a timely fashion, and the lag spikes are nowhere to be seen.

I think, basically, that TCP's streaming properties are designed to adapt to busy networks by detecting lost packets based on timeouts and then backing off before resending. While this is a great strategy for improving network usage, I'm finding that across a long-distance network connection the configuration of the TCP stack isn't suitable for keeping a consistently smooth stream of information.

I think the UDP vs TCP debate resolution still stands - only implement a streaming UDP protocol when you absolutely have to. I just had to in this case.

I'm using classic/normal IO at the moment. I fear change - or rather, I fear the bugs that seem to be associated with NIO at the moment, and since I'm talking 4, maybe 8 players, I just don't see the point in NIO for this.

Yep, ack = acknowledgement. However, in this case (like Q3 networking) I only need to ack the highest sequence number received, since all the previously unacked messages are sent in each packet.
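That scheme might look roughly like this sketch (illustrative names; a real Q3-style implementation also carries the sequence numbers inside the packet itself):

```java
// Sketch of the Q3-style reliability scheme described above: every
// outgoing packet carries all commands not yet acknowledged, and the
// receiver acks only the highest sequence number it has seen, which
// implicitly acknowledges everything before it.
import java.util.Collection;
import java.util.TreeMap;

class ReliableSender {
    private final TreeMap<Integer, byte[]> unacked = new TreeMap<>();
    private int nextSeq = 0;

    void queue(byte[] command) { unacked.put(nextSeq++, command); }

    // Every still-unacked command rides in every packet until acknowledged,
    // so a single lost packet costs nothing once a later one gets through
    Collection<byte[]> buildPacket() { return unacked.values(); }

    // One ack of the highest sequence received clears everything up to it
    void onAck(int highestSeqReceived) {
        unacked.headMap(highestSeqReceived, true).clear();
    }

    int pending() { return unacked.size(); }
}
```

The trade-off is bandwidth for latency: commands are sent redundantly, but since each command is tiny, the stream recovers from loss in one round trip instead of waiting for a TCP-style timeout and back-off.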
