I don't know if this is the proper forum to discuss this, but lately, after noticing all the great advancements in the field of machine learning, especially deep learning, I was wondering whether anyone has thought about the 'Writing a Wesnoth AI is hard' statement in a different light.

Notwithstanding the complications, the branching factor, and the inherent fuzziness of Wesnoth, traditional AIs with built-in evaluation systems are known to be hard to program. But deep learning methods, which are known to come up with their own evaluation functions, could be a perfect fit for programming the strongest AI for the game, given the gigantic database of games we already have for training.

I'd like to hear more thoughts on this, and whether anyone else has considered the idea. I'm not an expert in machine learning, but I would like to contribute to any effort at writing such an AI.

This is a video game of around 350 MiB; a database for deep learning would be much larger. It would not be possible to have each client create its own, because that would lead to out-of-sync errors.

Also, it's easier said than done. The main people who program the AI are mattsc and SeattleDad (not sure about the second), and mattsc told me that he is just an amateur who has gained some experience over the time he's been working on it. There's no team of AI scientists to create something like that.

The AI is somewhat evolving: new changes are tried to see if they perform better, but this evolution is human-controlled.

I don't think this idea is so bad that it should be dismissed out of hand. Actually, it sounds pretty interesting. First, as milwac said, this database already exists. We don't need anybody to create one. I'm sure it is very large, but you don't need to include it with the game, you would just use the statistics to help create a new AI. I'm aware that there are problems with that, but milwac is not just telling the developers to do it, s/he is also volunteering to help. So, milwac, do you have any idea how to code this or how the machine learning algorithm will evaluate moves? If you don't, the idea really will be dead in the water, but if you do, it might go somewhere.

The biggest issue with any ML AI would be overfitting. I would not be surprised to discover that almost all of the samples in the database come from a very small fraction of the possible maps (for instance, I don't think there is a good source of campaign replays). In addition, most of the games probably come from a small subset of the possible factions and eras.

So you could probably get a very good AI without too much difficulty for the default era on most two-player maps, Isar's Cross, and other popular multiplayer maps, but one which may exhibit interesting and exploitable behaviour outside that context. In addition, any map change or balance change to the units will likely cause the AI to play worse again until it can be retrained.

In fact, SeattleDad did do exactly that for recruiting, possibly with the addition of AI-vs-AI games to generate more training data. As I recall, the result was only able to recruit for the default era and could not recruit any unit outside it.

That said, it's certainly possible to spend more effort to come up with a feature set that avoids the default-only problem, and to play AI-vs-AI games across all maps with all currently known factions to fix any issues arising from relying only on the existing collections of games. Note that the unit characteristics for old versions of the game would have to be known while the AI learns from the replays - a case where the various balance changes that have occurred over time could actually help produce a better AI.

Oh, and Aethermaw and other maps that change over time in predictable ways will probably always give the AI difficulties.

I was thinking about it and came to the opinion that deep learning may be too advanced and demanding (assuming the deep learning mentioned was deep learning in the computer-science sense, not just any machine learning); an evolutionary algorithm could possibly work instead.

An evolutionary algorithm requires some code that can mutate. A classical algorithm, if changed randomly, would easily run into a hard-to-handle problem like a broken loop or a segfault, so the usual kind of algorithm would be rather bad for this.

The mutable algorithm has to be general enough to express almost everything, including most of the current AI, so that it could find a really good algorithm no matter what that algorithm looks like. It must also be able to execute quickly. It should express a numerical rating of where each unit should move or attack (an attack could be expressed as a path-dependent rating of the hex where the enemy is standing), trying only some of the possibilities (such as moving by several steps, starting from the unit and moving to the nearest hex with a better rating, ending when a minimum is reached or the moves are depleted). I thought about a recursive functional data structure, something like this (written in C for readability):

Here operation_type marks the type of operation; operations may interpret the operands differently. In most cases, the operands would be smart pointers to other nodes (reference counts would live in some central repository), and the operation would mark some kind of operation between them. It could be a constant function, where one operand would be a double and the other unused (a double is typically 64-bit, like a (smart) pointer). It might be a game-related value, like a unit (with special operations defined on it: threat, distance, etc.), a list of units (with filtering and folding operations defined on it), or hex properties (village, ownership). It might be a logic function, like a comparison (yielding a bool, where true is > 0.5 and false is < 0.5) or branching (if operand_1 is true (> 0.5), operand_2 is computed, otherwise the result is zero; if/else could be expressed as a sum of two such branches), but no cycles.

The mutations would modify constants and/or the structure of functions. Changes to constants would either multiply by a small factor (something like a random value between 0.666 and 1.5, or perhaps the square of such a value, to concentrate the distribution toward the start of the range) or add something (so that a constant could get from negative to positive and back, which multiplication alone cannot do; for large values a small addition does nothing). The alterations to functions would randomly remove or add operations somewhere, with default values that do not change the structure much. A mutation might also change one operation into another, but only in some defined way (so that the result does not become completely broken).

It would start from something hand-written that has some tactical ability. The testing would match two variations against each other; the loser becomes a mutation of the winner (deliberately so, even though the winner may have won by luck). Variations that execute slower than their opponent would be penalised. Matches would use fully random sides on one of a set of defined maps, with two of the set of tested variations (20?), and with combat animations and movement not shown. It would need a large number of games, maybe tens of thousands, but 'trainers' could keep several clients open in the background and mix their variations so that breakthroughs spread quickly.

I don't think it's easy or trivial, but it does not look absurdly hard, nor like it requires some really complex science.

Sorry for replying more than a year later. I think this question is still relevant today, and in the past year or so, after gaining my PhD in Computer Science, I think I am slightly better equipped to provide some ideas and brainstorm with all of you.

I previously mentioned using the old game database to train an AI, but right now, with the current advancements in reinforcement learning techniques (especially those developed by DeepMind), I think the same approach can very well be applied to Wesnoth. A key component of any reinforcement learning algorithm is the input representation. I am not too aware of how the current AI in Wesnoth refers to units, but I feel that a generic framework, where a unit is given by its time-of-day orientation, attack types, damage, HP, MP, number of strikes, defenses on different terrains, and resistances to different attack types, would work best. In this way we can deal with units from all factions and eras in the same way and would not have to write ad-hoc statements like 'recruit more skeleton archers against cavalry spam' and so on. I am just mentioning this since I don't know how the current AI works; probably the generic version already exists.

That was about how a unit can be represented. Now, an atomic move, as I see it, is one of: 1. moving a single unit by a single hex, 2. attacking an enemy unit, and 3. recruiting a single unit. All of these actions change the game state, and the AI must determine the correct sequence of these atomic moves from every game state within a single turn. So, unlike in Chess or Go, a player turn in Wesnoth consists of a number of atomic moves which must be performed in the sequence resulting in the best possible outcome.

I will have to read more of the existing papers and books on reinforcement learning before discussing how the reward system would work here. The primary reward is of course leader kills in N-v-N games, and attaining the stated scenario objectives in campaign scenarios. Once that is settled, all that remains is to make the AI play against itself and improve its evaluation function for any game state.

More thoughts are welcome! If a sizable number of people are willing to work on this together, it might be worth a shot to try this out!

Last edited by milwac on December 13th, 2017, 5:22 pm, edited 3 times in total.

@Elder but we have asked for the disclosure of all the cheese and we treat it seriously. Everyone on this site can get banned, everyone.

ElderofZion wrote: I'm just a player, but isn't Wesnoth too non-deterministic for that?

Yes, Wesnoth is very non-deterministic, but that doesn't mean an AI can't be developed for it. For example, poker-playing AIs exist and are pretty good at the game. Basically, a Wesnoth AI would play the game much like a human does: make optimal decisions based on the parts of the map it can see, calculate battle outcomes based on probabilities, and try to maximize those probabilities whenever possible.

milwac wrote: Sorry for replying more than a year later. I think this question is still relevant today, and in the past year or so, after gaining my PhD in Computer Science, I think I am slightly better equipped to provide some ideas and brainstorm with all of you.


The biggest practical issue I can see with any ideal Wesnoth AI (not just ML-based ones) is that units and maps can and do have arbitrary WML associated with them. While adding some form of search (probably Monte Carlo, given the ridiculous branching factor) can address this, the simulation is frustratingly expensive.
Looking ahead a full turn needs a search depth of 20+ moves (10 units per player * 2 players) by about turn 3, and for some maps eventually well over 100 (e.g. any 8+ player map), just to see what the next turn might look like.
Because turn schedules are arbitrary, as is map terrain by turn (Aethermaw), simulating to the next turn at some level is required. And unfortunately, the map on the next turn can be randomized (see the multiplayer map Dark Forecast for one of the more dramatic examples), so the ideal approach is to do full simulations into the next turn, not just a quick single check that ends the turn and provides the next map state as additional input.

Something else I'd be curious about is how much disk space the training data would need. Even adding just a few gigabytes of pre-computed training data for the AI to reference for the default maps and era wouldn't really be ideal, I wouldn't think, since that would by itself be several times larger than the entire rest of the game.


You'd only need the training data if you were generating a new version of the AI. The final weights (which are all the binary needs) would be very small, and not a significant issue compared to the vast amount of art already present. If you're asking about size for developers, there's no reason the training data can't live in a separate repository, or even just stay on the current server using the existing multiplayer replay storage, with the expectation that rerunning the training requires downloading it from there first.

I imagine it's possible to write the AI and feed it the games it needs.

Let's ignore the months (years?) of programming effort codifying the maps and game rules in a form an AI can use, or changing the user interface and coding the AI to play the game directly, and jump to the end.

Why don't you rent a few thousand AWS VMs and have at it? You'd probably only need them for a few months per scenario before you came up with a usable dataset to preload into your AI. Don't forget to run it against all the UMC maps. And don't forget that some of the maps have random changes, so you'll need to rerun them a number of times before your AI comes up with schemes to handle those maps.

My point: just because something is feasible does not mean it's affordable.

Seems to me your most affordable course of action would be to divert all those Bitcoin botnets to running your AI simulations instead. Of course, that means no Bitcoins being produced, so you might upset whoever is running those botnets.