Hello, I am trying to create a very basic evolution simulation, but there are a few things I don't clearly understand.

I am creating "worms" and "mushrooms" and putting them into my virtual world. At each frame I give every worm the coordinates of the nearest mushroom, and each worm updates its direction and location accordingly.
So the inputs to each "worm" are 4 doubles: the nearest mushroom's x and y coordinates and its own x and y coordinates. The best solution would be to choose the direction as atan2(y2-y1, x2-x1), so in bare-bones terms I am trying to teach my neural network how to calculate the atan2 function.
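For reference, here is the target function in a minimal Python sketch (the function name and coordinate conventions are my own; the worm is at (worm_x, worm_y), the mushroom at (mush_x, mush_y)):

```python
import math

def desired_heading(worm_x, worm_y, mush_x, mush_y):
    """Ideal direction toward the mushroom, in radians in (-pi, pi]."""
    return math.atan2(mush_y - worm_y, mush_x - worm_x)

# Worm at the origin, mushroom to the north-east:
angle = desired_heading(0.0, 0.0, 1.0, 1.0)  # pi / 4
```

Note that atan2 takes the y-difference first and the x-difference second.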

I am using NeuronDotNet in C#, but I don't know whether these things are common across NN libraries, since this is the first one I have used.
I created a LinearLayer as the input layer and sigmoid layers for the hidden and output layers. The input has 4 neurons and the output has 1 neuron; these are fixed (I guess). I experimented with 1, 2, and 3 neurons in the hidden layer but couldn't spot a big difference.

So my question is: would it be more successful if I decreased the input neurons to 2 and gave the difference pair (x2-x1, y2-y1) as input, since those are all the atan2 function needs? But I assume that no matter what, after training, my NN should be able to work out the relation between those "pairs" on its own.

Also, how can I decide which type of layer I should use? The direction is between 0 and 2*pi, so I assumed a sigmoid output layer, with the output multiplied by 2*pi, would do the job. However, linear layers are giving better results, strangely (or expectedly?).

About the genetic algorithm part: I first assign random weights to all neurons, but I couldn't find anything in the documentation about the range of the weight and bias values. Should I use unit-range values? So far I have been assigning values between -20 and +20, but every time I test a random chromosome it gives me a value either very close to 0 or very close to 1. Is it because of the range I use, or is it just VERY likely that random weights produce a sigmoid output that is saturated like this?
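A quick experiment suggests it is the range. With weights in [-20, 20] and a handful of inputs, the weighted sum is usually many units away from zero, and the sigmoid saturates; shrink the range and the outputs spread out. This sketch (trial counts and thresholds are my own choices) counts how often a random neuron lands within 0.01 of 0 or 1:

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fraction_saturated(weight_range, n_inputs=4, trials=1000):
    """Fraction of random neurons whose output is within 0.01 of 0 or 1."""
    extreme = 0
    for _ in range(trials):
        weights = [random.uniform(-weight_range, weight_range) for _ in range(n_inputs)]
        inputs = [random.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        out = sigmoid(sum(w * v for w, v in zip(weights, inputs)))
        if out < 0.01 or out > 0.99:
            extreme += 1
    return extreme / trials

random.seed(1)
wide = fraction_saturated(20.0)   # weights in [-20, 20]: largely saturated
narrow = fraction_saturated(1.0)  # weights in [-1, 1]: mostly mid-range
```

With the narrow range the pre-activation can never exceed 4 in magnitude here, and sigmoid(4) is about 0.98, so no output gets pinned to the extremes.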

First, let me point out that, no matter how large a network you build, every linear network is equivalent to a one-neuron network. That is, every linear network is just a function

f(x,y) = a x + b y

for some weights a and b. So that's all we need to think about.

Next, an observation: if we are talking about the version of atan2 that outputs in the range [-pi, pi), then the very simple linear function

f(x,y) = 0 x + 1 y

has the same sign as atan2 everywhere.
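That sign claim is easy to check numerically (a sketch; the sampling range is arbitrary):

```python
import math, random

def signs_agree(samples=1000):
    """Check sign(f(x, y)) == sign(atan2(y, x)) at random points."""
    random.seed(0)
    for _ in range(samples):
        x = random.uniform(-10.0, 10.0)
        y = random.uniform(-10.0, 10.0)
        f = 0.0 * x + 1.0 * y      # the linear candidate: just y
        a = math.atan2(y, x)
        if y != 0.0 and (f > 0) != (a > 0):
            return False
    return True
```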

This means that, if agents choose their heading angle as

theta = clamp(f(x,y), -pi, pi)

then their y-error will always be decreasing. Once the y-error is zero, we have

f(x,0)=0

so the agents will at that point walk due east (I'm assuming theta=0 is east). If they are west of the goal, this will get them there; if they are east, they will be walking away from it. So choosing their angle according to this simple, linear function works about half the time. When you're evolving the weights from totally random values, and you see your success rate move up from low numbers all the way to 50%, you might feel encouraged -- but the above is sufficient to explain it! Moreover, if your world wraps around (i.e., has the topology of a torus), then you'll have 100% success with this linear function.
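The works-about-half-the-time behaviour can be seen in a tiny simulation of that policy (goal at the origin; the step size, speed, and step count are my own assumptions):

```python
import math

def closest_approach(start_x, start_y, steps=2000, speed=0.1):
    """Goal at the origin; heading = clamp(-y, -pi, pi); return min distance."""
    x, y = start_x, start_y
    best = math.hypot(x, y)
    for _ in range(steps):
        theta = max(-math.pi, min(math.pi, -y))
        x += speed * math.cos(theta)
        y += speed * math.sin(theta)
        best = min(best, math.hypot(x, y))
    return best

west = closest_approach(-5.0, 3.0)  # starts west of the goal: passes through it
east = closest_approach(5.0, 3.0)   # starts east of the goal: never gets close
```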

One more thing to add to emergent's posts: the linear function may be working better because your inputs are out of range for the sigmoid activation function. Most NNs want inputs in the range -1 to 1. I don't know anything about the library you mentioned, but I wouldn't be surprised if that is what it expects.
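A normalization step like the following is one way to do that, assuming you know (or can bound) the size of the world; the constant and function names here are hypothetical:

```python
WORLD_SIZE = 100.0  # assumed largest coordinate magnitude in the simulation

def normalize(value, max_abs=WORLD_SIZE):
    """Scale a raw coordinate (or coordinate difference) into [-1, 1]."""
    return max(-1.0, min(1.0, value / max_abs))

inputs = [normalize(v) for v in (250.0, -30.0, 99.0, 0.0)]
```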

Something else you might want to try before synthesizing arctan: why not make your bots into little tanks? You would have two outputs, one for each tank tread, controlling how much that tread should spin. It's an easier problem for your GA/NN to figure out, since overloading an output just results in spinning (going nowhere).
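The two-tread idea is standard differential-drive kinematics; a sketch of one update tick (parameter names and scales are my own):

```python
import math

def tank_step(x, y, theta, left, right, wheel_base=1.0, dt=0.1):
    """Advance a two-tread 'tank' one tick from its tread speeds."""
    v = (left + right) / 2.0             # forward speed
    omega = (right - left) / wheel_base  # turn rate
    theta += omega * dt
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    return x, y, theta

# Equal tread speeds drive straight; opposite speeds (v = 0) spin in place:
x, y, theta = tank_step(0.0, 0.0, 0.0, 1.0, 1.0)
```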

It doesn't help that most people's way of thinking of neural networks is they're just magical artificial brains.

In fact Neural Networks are mostly bullshit, created to solve problems that you don't want to or are not intelligent enough to code up a proper solution to. In these cases you just accept approximate solutions and state that you want any inputs to be mapped to outputs that are known to arise from similar inputs.

From what I recall, sigmoid functions are used to convert a floating-point number into something almost binary: likely to be very close to 0 or 1, but occasionally in the middle. This kind of output is great if you want to make a binary decision (e.g. do I go up or down?), but not so great when the output should be an evenly distributed floating-point number.


TL;DR: NNs are way over-hyped.

While I agree with your sentiment, having just read two posts of yours where your tone was obnoxious, it's hard to take you seriously. Back it down a notch.


Over-hyped, yes. Bullpoo, no. Misunderstood, definitely.

An ANN is an approximation of a function based on observations. So is a linear regression. So is a running average. So is almost all of statistics. Hardly bull poo.

The problem is that an ANN tends to have too many parameters, with the consequent complications in training and risk of overfitting the data. There is a class of ANNs that I find very useful: They have a single neuron, and are more commonly called multilinear regression (if the activation function is linear) or logistic regression (if the activation function is a sigmoid). Beyond that, things get messy very quickly, at least with traditional back-propagation networks.
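That single-neuron equivalence is concrete in code (a sketch): a sigmoid neuron over a weighted sum plus bias is exactly the logistic regression model.

```python
import math

def logistic_neuron(inputs, weights, bias):
    """One sigmoid neuron: the logistic regression model P(y=1 | x)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

p = logistic_neuron([1.0, 2.0], [0.0, 0.0], 0.0)  # zero weights -> 0.5
```

Drop the sigmoid and keep z itself, and the same neuron is multilinear regression.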

Alvaro: When I'm hitting data for the first time I always build an OLS model first. It's rare that it's ever the top contender, though it does happen. The only model type that seems to consistently reign supreme is a boosted ensemble of trees.

Regardless, in many stats packages (mine included) there is no hand-tuning necessary. Provide data, click go, and you have a model that has been validated, etc. Even the choice of activation function at each layer can be automated these days. If you had to write everything from scratch, it definitely would be more challenging than multivariate OLS.

Dave: I really can't argue with you, as I've never put any form of regression model in a game. You are right that it's pretty much impossible to hand-tune after the fact. A decision tree would be better suited for what you want, but only if you have a bunch of data and don't know what it means beforehand.

I rarely run across a video game where I think "this needs some machine learning", but it does happen. Many board games have terrible AI and they get boring really fast. Some "sandbox" games, like MineCraft, could definitely use some novel approaches to NPCs. These sorts of games don't have a specific win condition, and players don't want scripted or repetitive things happening.

It's still early days for video games. I'm not ready to call "case closed" on statistical inference just yet.

You can easily get non-scripted, non-repetitive behaviour out of traditional AI techniques. It's actually not hard at all through things like parametric noise, weighted random decisions, etc. Then just let chaos theory kick in.