Best method for agent learning?


I've made an artificial life simulator where agents begin with simple, randomly generated neural networks but develop strategies and complex behaviors through natural selection. However, they keep the same behavior until they die and cannot change it within their lifetime. I was hoping to modify the neural networks, use reinforcement learning algorithms, or do something similar so that they could adapt as they lived. The inputs right now are whether a herbivore/carnivore/plant is to their left/right/front/in proximity, plus their health status. Energy is gained by eating the type of food they're supposed to eat. There is no way for herbivores to judge whether carnivores are dangerous right now, since they just get eaten, but I guess I could change it so that a herbivore has a certain probability of surviving an attack. What would be the best way to implement adaptive learning for this kind of simulator?
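Roughly, the input vector looks something like this (simplified; these aren't the actual names in my code):

[code]
// Simplified sketch of the agent's inputs -- names are just for illustration.
enum Direction  { LEFT, RIGHT, FRONT, PROXIMITY, NUM_DIRECTIONS };
enum EntityType { HERBIVORE, CARNIVORE, PLANT, NUM_TYPES };

struct AgentInputs {
    // 1.0 if an entity of that type is sensed in that direction/zone, else 0.0
    double sensed[NUM_TYPES][NUM_DIRECTIONS];
    // current health, scaled to roughly [0, 1]
    double health;
};
[/code]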

Nah, this needs to be actual AI learning, not just probability actions. By the way, I'm sending this to MIT too, and I care more about impressing them than my school. I have until the end of January to finish this for MIT. Since I now have study hall every day at school, I should have enough time to add some sort of AI learning. The basic simulator itself I finished in two weeks over the summer.

I don't see how that would work for any agent action besides eating, though. How would a herbivore, for example, know that turning when a carnivore is in front of it is a good thing? From what I've been reading, maybe reinforcement learning would be a better idea than neural networks. What do you think?

>Nah, this needs to be actual AI learning, not just probability actions

Boy, are you gonna be disappointed when you take your first AI class. Have a look at Bayesian nets... they're just a bunch of probabilistic decisions where the "learning" updates the probabilities. Or neural nets, where training data defines how strongly the nodes (neurons) activate.
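For example, "learning" a probability can be as simple as keeping counts and updating them after each observation (toy example, nothing to do with your simulator specifically):

[code]
// Toy example: "learning" a probability is just updating counts.
// e.g. P(survive | ran from carnivore) = successes / trials.
struct ProbEstimate {
    int successes = 0;
    int trials    = 0;

    void observe(bool success) {        // the "learning" step
        ++trials;
        if (success) ++successes;
    }
    double probability() const {        // falls back to 0.5 before any data
        return trials > 0 ? static_cast<double>(successes) / trials : 0.5;
    }
};
[/code]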

OK, I looked around for a while, and I tried using temporal difference learning with neural networks, kind of like in TD-Gammon. Here's the algorithm I used:

1. The agent acts based on which output cell has the highest value.
2. Store the agent's inputs at the moment it acted.
3. Set the reward equal to the change in the agent's health from before it acted to its current health.
4. Re-perceive the new state of the agent.
5. Store the new output cell with the highest value.
6. Error = reward + learningRate * (value of the new highest output cell) - (value of the output cell from the agent's previous action).
7. Find the weights from every non-zero input at the time the agent acted to the output cell of its action, and add the error to each of those weights.
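In rough code terms, one update looks like this (made-up names just to show the steps; the weight matrix and the perception/forward-pass parts stand in for whatever my simulator actually does):

[code]
#include <algorithm>

// Helper: index of the largest value in an array
int argMax(const double* values, int n) {
    return static_cast<int>(std::max_element(values, values + n) - values);
}

// One update, mirroring steps 1-7 above.
// oldInputs / oldOutputs: perception and net outputs when the agent acted
// newOutputs: net outputs after re-perceiving the new state
// weights[i][a]: weight from input i to output cell a
void tdUpdate(const double* oldInputs, const double* oldOutputs,
              const double* newOutputs, double** weights,
              int numInputs, int numOutputs,
              double healthBefore, double healthAfter,
              double learningRate)
{
    int action      = argMax(oldOutputs, numOutputs);                    // 1-2
    double reward   = healthAfter - healthBefore;                        // 3
    double newValue = newOutputs[argMax(newOutputs, numOutputs)];        // 4-5
    double error    = reward + learningRate * newValue - oldOutputs[action]; // 6

    for (int i = 0; i < numInputs; ++i)                                  // 7
        if (oldInputs[i] != 0.0)
            weights[i][action] += error;
}
[/code]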

Does this sound right? The agents that do this don't really seem to be doing any better than the ones that just evolve, maybe even a bit worse.

That sounds about correct to me. TD-learning doesn't necessarily guarantee good results, and at the outset it won't beat your evolution approach either. Think of how TD-Gammon got so good: it started out terrible, but by playing for a long time it became very good.

It could be that you're not carrying over the data from previous experiments to subsequent ones.

I think I'm screwing up how the weights are adjusted. I looked at some of the agent weights in the simulator and the numbers were huge. I can see how this would happen: if the error keeps getting added to the weights, the state values increase as well, so everything just keeps growing (or shrinking) like crazy. Is there a better way to scale the updates so that the weights stay within a reasonable range?
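I'm guessing the fix is something like this, where the new state's value gets multiplied by a discount factor below 1 and the weight change gets scaled by a small learning rate instead of adding the raw error, but I'm not sure (same placeholder names as the sketch above):

[code]
// discount ~ 0.9, learningRate ~ 0.01 -- both keep the updates from blowing up
double error = reward + discount * newValue - oldValue;

for (int i = 0; i < numInputs; ++i)
    if (oldInputs[i] != 0.0)
        weights[i][action] += learningRate * error * oldInputs[i];
[/code]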