Objective

To support matchmaking systems that require a game server to register itself with the matchmaker, such as first-party matchmakers.

Background

Many matchmakers that console and other companies provide as hosted services have a workflow in which:

A game server will register itself with the matchmaker when ready (usually for a specific period of time), with its IP and port(s) for gameplay

The matchmaker will choose a game server from the pool of those that have registered themselves

The matchmaker will communicate to the game server that the matchmaker has chosen it for gameplay

Players will then play on the game server.

This would be an alternative flow to FleetAllocation or GameServerAllocation, which will remain the preferred method of GameServer allocation so that Agones can retain fine control of scheduling within the cluster. Since this is quite a prevalent workflow, however, Agones should also support it, with appropriate documentation on the tradeoffs.

Requirements and Scale

The design and implementation should have no potential race conditions, and should actively prevent users from introducing race conditions in their own usage.

Design Ideas

To support this within Agones, we will need to add three enhancements:

A new ReservedGameServer state

A new SDK function Reserve(seconds)

A new SDK function Allocate()

Reserved GameServer State

The Reserved state signifies that the GameServer cannot be deleted, as it may move to Allocated within a given time frame. Therefore:

When scaling down a Fleet, Reserved instances will not be deleted

When autoscaling a Fleet, Reserved instances will be counted towards the current buffer, and therefore a change of state to Reserved will not incur an increase in GameServers in the Fleet.

This will mean that if a GameServer is not selected for a game session by the matchmaker, it can move back to Ready in a timely manner, and can then be scaled down as needed.
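The two rules above can be sketched in a few lines of Go. This is a minimal illustration only: the GameServer struct, the deletable and bufferCount helpers, and the three-state enum are hypothetical stand-ins, not the Agones types or controller code.

```go
package main

import "fmt"

// State is a simplified stand-in for a GameServer's status.state value.
type State string

const (
	Ready     State = "Ready"
	Reserved  State = "Reserved"
	Allocated State = "Allocated"
)

// GameServer is a hypothetical, minimal record used only for this sketch.
type GameServer struct {
	Name  string
	State State
}

// deletable returns the servers a scale-down may remove: only Ready ones.
// Reserved and Allocated servers are skipped.
func deletable(fleet []GameServer) []GameServer {
	var out []GameServer
	for _, gs := range fleet {
		if gs.State == Ready {
			out = append(out, gs)
		}
	}
	return out
}

// bufferCount counts the servers that satisfy the autoscaler buffer.
// Reserved servers count alongside Ready ones, so a Ready -> Reserved
// transition does not trigger a scale-up.
func bufferCount(fleet []GameServer) int {
	n := 0
	for _, gs := range fleet {
		if gs.State == Ready || gs.State == Reserved {
			n++
		}
	}
	return n
}

func main() {
	fleet := []GameServer{
		{"gs-1", Ready}, {"gs-2", Reserved}, {"gs-3", Allocated},
	}
	fmt.Println("deletable:", len(deletable(fleet)))
	fmt.Println("buffer:", bufferCount(fleet))
}
```

Note that moving gs-1 from Ready to Reserved leaves bufferCount unchanged, which is the property the design asks for.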

SDK Function: Reserve(seconds)

This new SDK function, Reserve, will set the GameServer record to the Reserved state for the given number of seconds (0 indicating forever). When that time period has ended, the GameServer shall revert back to Ready.

It is assumed that when working with a matchmaker, the developer will mark the GameServer as Reserved for slightly longer than it is registered with the matchmaker, so as to avoid scale down race conditions.

Technical Details

Rather than implementing this with a queue, this should be a synchronous call to the Kubernetes API, with in-built retry and a timeout (30s) on failure. Otherwise there is potential for race conditions between calling the SDK function, and the GameServer being moved to Reserved state.

SDK Function: Allocate()

This new SDK function allows a game server to mark itself as Allocated when called.

Technical Details

Rather than implementing this with a queue, this should be a synchronous call to the Kubernetes API, with in-built retry and a timeout (30s) on failure. Otherwise there is potential for race conditions between calling the SDK function, and the GameServer being moved to Allocated state.

SDK Function: Ready()

As it currently does, Ready() should return the GameServer to the Ready state, but it should also remove any timeout left in place by a Reserve(n).

Proposed Matchmaker Workflow

The following would be the workflow for a game server process as it is integrated with Agones and a first-party matchmaker.

In this workflow, there is no requirement for a GameServer to mark itself as Ready: it can Reserve(n) itself as soon as it is about to register with the matchmaker.
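One pass through this workflow, from the game server's point of view, could be sketched as follows. The SDK and Matchmaker interfaces, their signatures, and the fakes are assumptions made for illustration; in particular, a real matchmaker would notify the game server of its selection through its own mechanism rather than a return value.

```go
package main

import "fmt"

// SDK lists the two calls proposed in this design. The method names follow
// the text; the signatures are assumptions.
type SDK interface {
	Reserve(seconds int64) error
	Allocate() error
}

// Matchmaker is a hypothetical first-party matchmaker client. Register
// advertises the server for the given window and reports whether the
// matchmaker selected it before the window closed.
type Matchmaker interface {
	Register(addr string, seconds int64) (selected bool, err error)
}

// runOnce reserves the server for slightly longer than the registration
// window, registers with the matchmaker, and allocates if selected.
// It returns false when the window passed without a selection, in which
// case the caller may run it again to re-register.
func runOnce(sdk SDK, mm Matchmaker, addr string, window, margin int64) (bool, error) {
	if err := sdk.Reserve(window + margin); err != nil {
		return false, fmt.Errorf("reserve: %w", err)
	}
	selected, err := mm.Register(addr, window)
	if err != nil {
		return false, fmt.Errorf("register: %w", err)
	}
	if !selected {
		return false, nil
	}
	if err := sdk.Allocate(); err != nil {
		return false, fmt.Errorf("allocate: %w", err)
	}
	return true, nil
}

// fakeSDK records which calls were made, for demonstration only.
type fakeSDK struct{ calls []string }

func (f *fakeSDK) Reserve(seconds int64) error {
	f.calls = append(f.calls, "Reserve")
	return nil
}

func (f *fakeSDK) Allocate() error {
	f.calls = append(f.calls, "Allocate")
	return nil
}

// fakeMatchmaker always answers with a fixed selection result.
type fakeMatchmaker struct{ selected bool }

func (f fakeMatchmaker) Register(addr string, seconds int64) (bool, error) {
	return f.selected, nil
}

func main() {
	sdk := &fakeSDK{}
	// Hypothetical address and timings: a 300s window with a 30s margin.
	allocated, err := runOnce(sdk, fakeMatchmaker{selected: true}, "203.0.113.10:7777", 300, 30)
	fmt.Println(allocated, err, sdk.calls)
}
```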

What happens after the GS returns to Ready state (after timeout)? Should it try to set itself to Reserve state and then Register with MM?

That's up to the game server. But yes, I expect it will re-register, possibly after a short period, to give an opportunity to scale down if there aren't any players currently playing.

I don't understand the purpose of reserving for a limited amount of time though... :-\

It's my understanding that, for many matchmakers, a game server registers itself with the matchmaker for a time period, i.e. "I'm available to play a game on for the next 5 minutes". Once that time has passed, the matchmaker no longer has it available as an option unless it re-registers. This is so that a fleet can scale down as needed if fewer people than anticipated are playing the game.

Matchmaker could also delete those unused gameservers via the k8s API.

I would change Reserve(n) into Reserve() and Unreserve() (or equivalent names), because it's more flexible: it allows both the kind of implementation where we know the reservation time in advance and the kind in which the matchmaker tells a server to shut down.

Not all languages have function overloading, but those that do can do Reserve() and Reserve(n), those that don't could do Reserve(n) and Reserve(0) (where 0 is forever).

Unreserve we already have 😄 it's called Ready() - just need to make some tests to make sure this logic path works, and make any adjustments as need be (like removing the timeout if it exists).

So I think then that the above design should work out well. I added a section on Ready() to indicate that it should remove the timeout, and make explicit its requirements.

Now GameServers can self Allocate!
This is just the implementation of the Go SDK at this stage (although the gRPC
libraries have been regenerated). Other languages can come in later PRs.
This is the first part of 1st Party MatchMaking support (googleforgames#660)

I was digging into this more. In the design, and in the original version of Allocate (and eventually Reserve), the intent was to make these synchronous operations with a 30 second timeout, the idea being to stop race conditions.

Looking at the code, I don't think this is a good idea. They should be async, like Ready and Shutdown. Having some SDK functions that change status.state be sync and others async (a) gives us an inconsistent interface for the SDK and (b) will actually cause the race conditions that I was previously trying to remove.

I'll shift the current implementation of Allocate over to using the queue, like the other implementations, targeted at the next release. The API surface will stay the same though.

This also means it doesn't matter what state change you are trying to implement - you can use the watch command to see when the final change occurs - so you only need one logical path to implement.
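The queue-plus-watch approach can be illustrated with a small in-process model: each SDK call only enqueues the requested state and returns, a single worker applies the requests in order, and the caller observes every final change through one watch stream. The gameServer type, its methods, and the channel-based Watch are hypothetical for this sketch and are not the Agones SDK.

```go
package main

import "fmt"

type State string

// gameServer is a hypothetical model: SDK calls enqueue a requested state
// and return immediately; a single worker applies them in order and
// publishes each applied state to the watch stream.
type gameServer struct {
	queue   chan State
	updates chan State
}

func newGameServer() *gameServer {
	gs := &gameServer{
		queue:   make(chan State, 8),
		updates: make(chan State, 8),
	}
	go gs.worker()
	return gs
}

// worker drains the queue. Publishing to updates stands in for writing to
// the API and for the watch event that follows the write.
func (gs *gameServer) worker() {
	for s := range gs.queue {
		gs.updates <- s
	}
	close(gs.updates)
}

// Every state-changing call has the same asynchronous shape.
func (gs *gameServer) Ready()    { gs.queue <- "Ready" }
func (gs *gameServer) Reserve()  { gs.queue <- "Reserved" }
func (gs *gameServer) Allocate() { gs.queue <- "Allocated" }

// Watch returns the stream of applied state changes.
func (gs *gameServer) Watch() <-chan State { return gs.updates }

// Close stops accepting requests; the watch stream ends once drained.
func (gs *gameServer) Close() { close(gs.queue) }

func main() {
	gs := newGameServer()
	gs.Reserve()
	gs.Allocate()
	gs.Close()

	// One logical path handles every state change.
	for s := range gs.Watch() {
		fmt.Println("state is now", s)
	}
}
```

Because a single worker processes the queue, requested changes are applied in the order they were made, regardless of which SDK function issued them.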

Please let me know if anyone has objections, or if that doesn't make sense.

Update dot and generated PNG with the Reserved state, and resultant
flow from there.
Tried to keep it as simple as possible, while still representing
potential state changes.
Should be final work on googleforgames#660 except for missing SDK functions.

The remaining item is support for all the languages/engines. How do we feel about closing this ticket: wait for the SDK functionality to be finished, or track the missing SDK functions in a separate ticket (which we have in #927)?