"There is a chasm between the two sets of people who commonly create artificial neural network (ANN) models. Computational neuroscientists are, by definition, interested in creating models that are biologically realistic, although the focus is on the elucidation of the computational and information processing properties and functions of the biological neural network (BNN) structures being modeled. On the other hand, most machine learning and artificial intelligence researchers are generally not fussed about biological realism, opting for neural network models that, while biologically inspired, seem to be based on an understanding of the brain from many decades ago (a primary example of which is the almost ubiquitous use of a rate-based encoding scheme for neuron action potentials).
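The rate-based encoding mentioned above can be contrasted with a spike-based one in a few lines. The sketch below is purely illustrative (the function names and constants are mine, not from any particular published model): a rate-coded unit outputs a single scalar "firing rate", while a simple leaky integrate-and-fire neuron emits discrete spikes whose timing, not just count, can carry information.

```python
import math

def rate_neuron(weighted_input):
    """Classic rate-coded unit: output is a scalar 'firing rate' in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-weighted_input))

def lif_spike_train(input_current, steps=100, dt=1.0, tau=10.0, threshold=1.0):
    """Leaky integrate-and-fire neuron: the membrane potential leaks toward
    zero, integrates the input, and emits a discrete spike (then resets)
    whenever it crosses the threshold."""
    v, spikes = 0.0, []
    for t in range(steps):
        v += dt * (-v / tau + input_current)  # leak + input integration
        if v >= threshold:
            spikes.append(t)
            v = 0.0  # reset after spiking
    return spikes
```

A stronger input produces a denser spike train, which is roughly what the rate abstraction summarizes into one number; everything about individual spike timing is discarded in the rate model.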

Why have ML and AI researchers and teachers been so singularly focused, for the most part, on vastly simplified models? (Of course, there are some exceptions, but they are just that.) Why haven't the models we typically use been kept up-to-date with findings from neuroscience? Perhaps it's simply because a lack of computational power has made more biologically realistic models too cumbersome to work with? Perhaps it's simple inertia, using the same model we've always used, and that everyone else is using, and that we understand well? Perhaps most of us think more biologically plausible models are not necessary?

As noted in the first blog post, my current research project will explicitly attempt to make use of known functional properties of biological neural networks in order to create artificial neural networks that perform online learning, by way of self-modification mechanisms modelled on those found in biological neural networks. I've been collating a list of these known low-level functional properties and their probable/speculated (depending on level of available evidence) role in various learning mechanisms employed in biological neural networks. And let me tell you, I found it quite an eye-opener! That is, once I had deciphered some of the neuroscience lingo. Perhaps that's one more barrier to ML and AI researchers incorporating these functional properties into their models?

Message 2 of 16
, May 10, 2012

Hi Oliver, that's a good discussion topic. There's a lot of disagreement and confusion over the question of what to take from biology and what to abstract away. It sounds like you are interested in a number of low-level phenomena that researchers often do not incorporate into neural models. Some of these are undoubtedly critical (such as plasticity) while others may or may not be important (such as polarity). But I guess what's interesting is whether there is anything we can say in general on what we should care about. I've found it interesting sometimes to start from the assumption that the seemingly most essential phenomena are actually unnecessary. That is, it's easy and natural to kind of look at a list of low-level natural features and say "hey, your model is missing this or that," but it's much more challenging to start by saying, "all of this could be irrelevant and I have to find an argument for why it's relevant."

But even more interesting than that, I think, is to try to abstract away phenomena that seem essential into something that looks very different. For example, a CPPN is a kind of abstraction of development through growth and local interaction, yet the CPPN does not itself involve any growth or local interaction. So you get a kind of new perspective on what really makes development what it is (and the answer may turn out to be something much more abstract, like simply function composition).
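The "simply function composition" point can be made concrete with a toy example. This is not Stanley's actual CPPN formulation (real CPPNs have evolved topologies and activation functions; this one is fixed and hand-written), but it shows how composing a few simple functions queried over coordinates yields developmental-looking regularities with no growth process at all:

```python
import math

def toy_cppn(x, y):
    """A tiny, fixed 'CPPN': a composition of simple functions queried at
    coordinates (x, y). The Gaussian of x yields left-right symmetry; the
    sine of y yields repetition -- regularities reminiscent of developed
    structures, produced with no growth or local interaction."""
    symmetry = math.exp(-x * x)      # symmetric about x = 0
    repetition = math.sin(3.0 * y)   # periodic along y
    return math.tanh(symmetry * repetition)

# Querying the composition at every point yields a spatial pattern (e.g.
# an intensity or connectivity pattern, as in HyperNEAT's use of CPPNs).
pattern = [[toy_cppn(x / 5.0 - 1.0, y / 5.0 - 1.0) for x in range(11)]
           for y in range(11)]
```

The symmetry and repetition are properties of the composed functions themselves; nothing was grown, yet the output has the kind of regularity development produces.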

Nature chooses specific phenomena that are necessitated by physics to embody its essential functions, but those functions themselves may be significantly more abstract than the physical apparatus that embodies them. The challenge is to find the right level of abstraction to get the same kind of power without the cost.

ken

--- In neat@yahoogroups.com, "olivercoleman04" <oliver.coleman@...> wrote:
>
> Hi all,
> I've been doing some research into neuroscience and the functional
> properties of biological neural networks that might be useful for
> online-learning / adaptive networks. I'd love to hear your thoughts if
> you're interested in this; I wrote a blog post about it at
> http://ojcoleman.com/content/models-brains-what-should-we-borrow-biology
> Below is a snippet...
>
> Cheers,
> Oliver
>
> "There is a chasm between the two sets of people who commonly create
> artificial neural network (ANN) models. [...] So I now present these
> eye-opening properties, and their possible relevance to
> online-learning..." Read more at
> http://ojcoleman.com/content/models-brains-what-should-we-borrow-biology
>

Oliver Coleman

Message 3 of 16
, May 10, 2012

Hi Ken,

Yes, I'm pretty sure that not all of the phenomena I listed are important; and that a good starting point in general is to assume that they are not. I also agree with your argument that a lot of the low-level phenomena we see may be a result of implementation with particular physical systems (and I would add perhaps as a result of evolutionary happenstance). The CPPN is a particularly compelling example of significant abstraction of developmental processes, producing many of the same features as the end results of those processes. One thing it does abstract away, in the context of plastic networks, is the effect of external input on the developmental process (which may or may not be an issue depending on details of implementation, problem domain, etc.).

Perhaps we could also assume that, rather than some specific set of functions being the only workable set, what matters is having a workable combination of functions, and that there are many possible combinations that would work equally well. In this framework we could assume that biological neural networks represent at least a reasonably good combination of low-level functions, and so we could use this combination as a guide (but of course this doesn't answer which functions in this combination are actually important, or what can be abstracted away). Also, some combinations may be workable but far harder to evolve solutions with, or may require much larger networks, etc. (e.g. evolving networks that incorporate neuromodulation of synaptic plasticity can be much easier for some tasks than evolving networks without it).
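To make the neuromodulation example concrete, here is a minimal sketch of a parameterised Hebbian rule gated by a modulatory signal, of the general form used in work on neuromodulated plasticity. The function name and parameter names are illustrative, not taken from any specific paper; the coefficients are exactly the kind of per-connection quantities that evolution could be left to determine:

```python
def hebbian_update(w, pre, post, m, eta=0.1, A=1.0, B=0.0, C=0.0, D=0.0):
    """Generic parameterised Hebbian rule with neuromodulation:
        dw = m * eta * (A*pre*post + B*pre + C*post + D)
    'm' is the output of a modulatory neuron that gates plasticity on or
    off (or reverses its sign). With m fixed at 1 and B = C = D = 0 this
    reduces to the plain Hebbian product rule; evolving (eta, A, B, C, D)
    lets search decide which plasticity behaviour, if any, a synapse gets."""
    return w + m * eta * (A * pre * post + B * pre + C * post + D)
```

The appeal for evolved adaptive networks is that the same rule family covers "no plasticity" (eta = 0), plain Hebb, anti-Hebb, and modulated variants, so the combination question becomes a parameter-search question.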

I'm intending to run some experiments to explore these questions (which phenomena are important, acceptable level of abstraction, etc), but of course to try and thoroughly explore all of these functions in many combinations would be a massive undertaking, and is not my main interest, so at some point I will have to pick a model and run with it after only a few, hopefully well chosen, experiments... Perhaps one approach is to create flexible parameterised versions of these functions, and let evolution determine what combination is right (like your approach described in "Evolving adaptive neural networks with and without adaptive synapses", but perhaps more flexible and applied to more functions).

Do you mind if I post/quote some/all of this discussion in the comments of my blog post?

Cheers,

Oliver

Ken

Message 4 of 16
, May 15, 2012

Hi Oliver, please feel free to quote or post this discussion on your blog; I don't see any harm in that.

ken


Jeff Clune

Message 5 of 16
, Jul 10, 2012

Hello Oliver,

I'm much delayed in reading all of this as I have been insanely busy lately, but I have a few thoughts that might help you out:

1) Be careful with meta-evolution (evolving the parameters of evolutionary algorithms). It sounds good in theory, but can be tricky in practice because evolution is short-sighted and conservative, preferring exploitation over exploration, which can be very harmful vis a vis long-term adaptation. Check out my PLoS Computational Biology paper for a smoking gun on this front (the evolution of mutation rates). You may face the exact same problem if you go down this road. [Note, however, that using a divergent search algorithm like novelty search may allow you to take better advantage of meta-evolution: see Joel and Ken's 2012 alife review article on that subject.]
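The short-sightedness described here can be sketched with a minimal evolution-strategies-style self-adaptation scheme (names and constants are illustrative only, not from the paper mentioned). Each genome carries its own mutation step size, which is itself perturbed before being used, so selection acting on immediate fitness also shapes how much exploration future offspring do:

```python
import math
import random

def mutate(genome, tau=0.3):
    """Self-adaptive mutation: the genome's own step size 'sigma' is
    perturbed log-normally (so it stays positive), then used to mutate the
    parameters. Because selection rewards immediate fitness, lineages with
    smaller sigma tend to win in the short term -- exploitation beats
    exploration, which is the conservatism being warned about."""
    sigma = genome["sigma"] * math.exp(tau * random.gauss(0.0, 1.0))
    params = [p + random.gauss(0.0, sigma) for p in genome["params"]]
    return {"params": params, "sigma": sigma}
```

The same structural risk applies to any evolved meta-parameter: the quantity controlling long-term adaptability is selected only through its short-term effects.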

2) Another reason people do not throw all of the biology into the soup to see what happens is because scientifically you end up in an impenetrable quagmire where you can't figure out what is going on and you end up not learning much/anything. The scientific method demands keeping all else equal, and if you have a lot of variables you don't perfectly understand, it takes years to figure out what is going on if you are lucky! Even if you keep all else equal, if you are doing so against a backdrop that involves a lot of complexity you don't understand, any difference you see may be due to an interaction effect with one of the features in your backdrop...and that may invalidate generalizing your result to other backdrops of interest. In my limited experience, I have found that most new scientists want to throw a million things into their model--especially biologically motivated phenomena--to see what happens, and as they grow older/more jaded/wiser/more experienced/gun shy/etc. they increasingly keep things as simple as possible. In fact, a pretty good heuristic for good hypothesis-testing science is to keep things absolutely as simple as possible while allowing the question to be asked. However, that may not be a good heuristic for more exploratory science where you just set out and see what you discover.

3) You should check out Julian Miller's papers on evolving a checkers player (with his student M. Khan, I believe). Or, better, email/Skype him (he's an extremely nice guy and I'm sure he would be happy to talk to you). He decided in the last few years that he is running out of time as a scientist; he has tenure, and after spending years keeping things as simple as possible he now just wants to do what he originally wanted to do when he started: throw as much biology in the soup as possible and see if a golem crawls out. He has incorporated a ton of biologically inspired low-level mechanisms in evolving neural networks. From what I recall, however, it did become very difficult to figure out which ingredients were essential and exactly what was going on because of all the involved complexity. He may have updated results since I last checked in, however. So, you may benefit from the actual work that he has done on this front and, more generally, from his opinions on the general scientific approach you are proposing.

I hope that helps. Best of luck, and I look forward to hearing what you learn!


Oliver Coleman

Message 6 of 16
, Jul 12, 2012

Hi Jeff,

Thanks for taking the time to write your thoughts. :)

Perhaps I did not describe my intentions clearly: I don't intend to evolve the parameters of the EA, but rather the parameters of some neural network components or functions (e.g. the parameters of synaptic plasticity update rules).

Yes, I've been wondering how best to go about testing the usefulness of multiple functional properties in a model. I'm aware of Miller and Khan's work. They certainly did throw just about everything into the pot, and it was hard to draw many solid conclusions as a result. The best approach I've come up with so far (and I'm open to criticism or other ideas) is to create a model with (computationally tractable abstractions of) most or all of the biological phenomena/properties for which there is evidence of a role in learning, and/or which are computationally cheap, and then perform an ablative study where each property is disabled in turn: if disabling a property reduces the efficacy of the model on some learning task (in terms of evolvability, quality of solutions, or some other metric), then that property is useful. (I believe Stanley and Miikkulainen used this approach with their NEAT algorithm.) This way one doesn't have to try every combination of properties, and it avoids the problem where one property might only be useful in combination with one or more other properties (which would be a problem if only one property were enabled at a time).
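The ablation protocol described above can be sketched in a few lines. Here `evaluate` is a hypothetical stand-in for the real experiment: a function taking the set of enabled properties and returning a performance score (repeated runs are averaged to tame the variance of evolutionary runs):

```python
def ablation_study(features, evaluate, runs=20):
    """Lesion study: score the full model, then re-score with each feature
    disabled in turn. 'evaluate' is a placeholder for the actual experiment
    (train/evolve a model with the given feature set, return a metric such
    as final fitness or generations-to-solution)."""
    def mean_score(enabled):
        return sum(evaluate(enabled) for _ in range(runs)) / runs

    baseline = mean_score(set(features))
    drops = {}
    for f in features:
        lesioned = mean_score(set(features) - {f})
        drops[f] = baseline - lesioned  # positive drop => feature was useful
    return baseline, drops
```

Note this tests each feature against the all-features backdrop only, so it needs a number of runs linear in the feature count, but (as discussed) it can miss features that only matter in particular smaller combinations.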

Comparing the performance of the new model against other existing models on the same learning tasks would help demonstrate whether the new model as a whole is an improvement or not, although like you say this doesn't avoid the issue of "interaction effect[s] with one of the features in your backdrop". Perhaps testing against many kinds of learning tasks would help alleviate this effect?


Message 8 of 16
Jul 19, 2012

You can check out my NEAT implementation which I just uploaded to the file section of the group (NEAT.tar.bz2). It is written in C++ and has Python bindings for running generational evolution (later versions will feature running rtNEAT/HyperNEAT/Novelty Search from Python).

--- In neat@yahoogroups.com, Madan Dabbeeru <iitk.madan@...> wrote:
>
> Hello,
>
> I am looking for NEAT in Python. I am not able to find any file in the
> download link provided (http://code.google.com/p/neat-python/downloads/list).
>
> Please share it with me if anybody has this package.
>
> Thanks & Regards,
> Madan
>
> On Wed, Jul 11, 2012 at 11:42 AM, Jeff Clune <jeffclune@...> wrote:
>
> > Hello Oliver,
> >
> > I'm much delayed in reading all of this as I have been insanely busy
> > lately, but I have a few thoughts that might help you out:
> >
> > 1) Be careful with meta-evolution (evolving the parameters of evolutionary
> > algorithms). It sounds good in theory, but can be tricky in practice
> > because evolution is short-sighted and conservative, preferring
> > exploitation over exploration, which can be very harmful vis a vis
> > long-term adaptation. Check out my PLoS Computational Biology paper for a
> > smoking gun on this front (the evolution of mutation rates). You may face
> > the exact same problem if you go down this road. [Note, however, that using
> > a divergent search algorithm like novelty search may allow you to take
> > better advantage of meta-evolution: see Joel and Ken's 2012 alife review
> > article on that subject.]
> >
> > 2) Another reason people do not throw all of the biology into the soup to
> > see what happens is because scientifically you end up in an impenetrable
> > quagmire where you can't figure out what is going on and you end up not
> > learning much/anything. The scientific method demands keeping all else
> > equal, and if you have a lot of variables you don't perfectly understand,
> > it takes years to figure out what is going on if you are lucky! Even if you
> > keep all else equal, if you are doing so against a backdrop that involves a
> > lot of complexity you don't understand, any difference you see may be due
> > to an interaction effect with one of the features in your backdrop...and
> > that may invalidate generalizing your result to other backdrops of
> > interest. In my limited experience, I have found that most new scientists
> > want to throw a million things into their model--especially biologically
> > motivated phenomena--to see what happens, and as they grow older/more
> > jaded/wiser/more experienced/gun shy/etc. they increasingly keep things as
> > simple as possible. In fact, a pretty good heuristic for good
> > hypothesis-testing science is to keep things absolutely as simple as
> > possible while allowing the question to be asked. However, that may not be
> > a good heuristic for more exploratory science where you just set out and
> > see what you discover.
> >
> > 3) You should check out Julian Miller's papers on evolving a checkers
> > player (with his student M. Khan, I believe). Or, better, email/Skype him
> > (he's an extremely nice guy and I'm sure he would be happy to talk to you).
> > He decided in the last few years that he is running out of time as a
> > scientist and has tenure and he has spent years keeping things as simple as
> > possible, and he now just wants to do what he originally wanted to do when
> > he started: throw as much biology in the soup as possible and see if a
> > golem crawls out. He has incorporated a ton of biologically inspired
> > low-level mechanisms in evolving neural networks. From what I recall,
> > however, it did become very difficult to figure out which ingredients were
> > essential and exactly what was going on because of all the involved
> > complexity. He may have updated results since I last checked in, however.
> > So, you may benefit from the actual work that he has done on this front and,
> > more generally, from his opinions on the general scientific approach you
> > are proposing.
> >
> > I hope that helps. Best of luck, and I look forward to hearing what you
> > learn!
> >
> > Best regards,
> > Jeff Clune
> >
> > Postdoctoral Fellow
> > Cornell University
> > jeffclune@...
> > jeffclune.com
> >
> > On May 11, 2012, at 6:34 AM, Oliver Coleman wrote:
> >
> > > Hi Ken,
> > >
> > >
> > > Yes, I'm pretty sure that not all of the phenomena I listed are
> > important; and that a good starting point in general is to assume that they
> > are not. I also agree with your argument that a lot of the low-level
> > phenomena we see may be a result of implementation with particular physical
> > systems (and I would add perhaps as a result of evolutionary happenstance).
> > The CPPN is a particularly compelling example of significant abstraction of
> > developmental processes, producing many of the same features of the end
> > result of developmental processes. One thing it does abstract away, in the
> > context of plastic networks, is the effect of external input on the
> > developmental process (which may or may not be an issue depending on
> > details of implementation, problem domain, etc...).
> > >
> > > Perhaps we could also assume that, rather than some specific set of
> > functions being the only workable set, what matters is having a workable
> > combination of functions, and that there are many possible combinations
> > that would work equally well. In this framework we could assume that
> > biological neural networks represent at least a reasonably good combination
> > of low-level functions, and so we could use this combination as a guide
> > (but of course this doesn't answer what functions in this combination are
> > actually important, or what things can be abstracted away). Also, some
> > combinations may be workable, but are far harder to evolve solutions with,
> > or require much larger networks, etc (eg evolving networks incorporating
> > neuromodulation of synaptic plasticity can be much easier for some tasks
> > than for those without this type of neuromodulation).
> > >
> > > I'm intending to run some experiments to explore these questions (which
> > phenomena are important, acceptable level of abstraction, etc), but of
> > course to try and thoroughly explore all of these functions in many
> > combinations would be a massive undertaking, and is not my main interest,
> > so at some point I will have to pick a model and run with it after only a
> > few, hopefully well chosen, experiments... Perhaps one approach is to
> > create flexible parameterised versions of these functions, and let
> > evolution determine what combination is right (like your approach described
> > in "Evolving adaptive neural networks with and without adaptive synapses",
> > but perhaps more flexible and applied to more functions).
> > >
> > > Do you mind if I post/quote some/all of this discussion in the comments
> > of my blog post?
> > >
> > > Cheers,
> > > Oliver
> > >
> > >
> >
> >
> >
> > ------------------------------------
> >
> > Yahoo! Groups Links
> >
> >
> >
> >
>

Jeff Clune

Message 9 of 16
Jul 21, 2012

> Thanks for taking the time to write your thoughts. :)
>

No problem. I'm actually on my honeymoon in remote jungles of Indonesia right now, so please excuse my delayed response.

> Perhaps I did not describe my intentions clearly: I don't intend to evolve the parameters of the EA, but rather the parameters of some neural network components or functions (e.g. the parameters for synaptic plasticity update rules).
>

My instincts tell me that you will run into problems any time there is a tradeoff between exploration and exploitation. If such tradeoffs exist within the parameters you talk about, which they likely do, you may find worse performance than picking these parameters yourself. Note that I am talking about evolving within a run (e.g. evolving a different parameter for each organism on its genome during the course of the run). If you fix parameters for an entire run, and then have an outer loop that is evolving those parameters, that works just fine. Gus Eiben and his students have done some interesting work showing that. In any case, I recommend that you compare your meta-evolution results to experiments where you manually or algorithmically select good parameters and fix them instead of evolving them.
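The distinction here (evolving a parameter per organism within a run, versus an outer loop that holds parameters fixed for an entire run and selects among whole runs) can be sketched as a toy example. Everything below is an illustrative assumption, not taken from any of the work mentioned: a (1+1)-EA on the OneMax task, with mutation rate as the parameter under selection.

```python
import random

def inner_ea_run(mutation_rate, generations=50, genome_len=20, seed=0):
    """One complete (1+1)-EA run on OneMax with a FIXED mutation rate;
    returns the best fitness found (fraction of 1-bits)."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(genome_len)]
    for _ in range(generations):
        child = [1 - g if rng.random() < mutation_rate else g
                 for g in best]
        if sum(child) >= sum(best):  # keep the child if it is no worse
            best = child
    return sum(best) / genome_len

def outer_loop(candidate_rates, runs_per_rate=5):
    """Outer loop in the sense described above: each candidate parameter
    value is held fixed for whole runs, and values are compared by
    average run quality rather than evolving on the genome mid-run."""
    scores = {r: sum(inner_ea_run(r, seed=s)
                     for s in range(runs_per_rate)) / runs_per_rate
              for r in candidate_rates}
    return max(scores, key=scores.get), scores

best_rate, scores = outer_loop([0.001, 0.05, 0.5])
```

Because the parameter never varies within a run, the short-sighted exploitation pressure described above cannot drive it toward conservative values mid-run; the cost is the extra compute of evaluating whole runs per candidate value.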

> Yes, I've been wondering how best to go about testing the usefulness of multiple functional properties in a model. I'm aware of Miller and Khan's work. They certainly did throw just about everything into the pot. And it was hard to draw many solid conclusions as a result. I think the best approach I've come up with so far (and I'm open to criticism or other ideas) is to create a model with (computationally tractable abstractions of) most or all of the biological phenomena/properties for which there is evidence of a role in learning, and/or which are computationally cheap, and then perform an ablative study where each property is disabled in turn: if disabling a property reduces the efficacy of the model in some learning task or other (in terms of evolvability, quality of solutions, or some other metric), then that property is useful (I believe Stanley and Miikkulainen used this approach with their NEAT algorithm). This way one doesn't have to try every combination of properties, and avoids the problem where one property might only be useful in combination with one or more other properties (which would be a problem if only one property is enabled at a time).
>

That is a great strategy. You don't completely eliminate interaction effects (one mechanism may be very helpful, but only in the absence of other mechanisms), but it is a good first crack at the problem. You may actually want to look at algorithms that have been developed for SVM feature selection: they have the same problem of figuring out which features are helpful, and there are often many interaction effects. I believe the best current policy involves both adding and subtracting features in a certain order, but I forget the details of the algorithm and its name. I could find it out for you if you like, or maybe someone else on the list knows more about this subject.
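The ablative strategy under discussion (disable each mechanism in turn and measure the drop relative to the full model) can be sketched generically. The mechanism names, scores, and the built-in interaction effect below are invented purely for illustration:

```python
def evaluate(mechanisms):
    """Toy stand-in for 'evolve the model with this set of mechanisms
    enabled and measure its efficacy'. The numbers are arbitrary; note
    the deliberate interaction effect between two mechanisms."""
    score = 1.0
    if "plasticity" in mechanisms:
        score += 2.0
    if "plasticity" in mechanisms and "neuromodulation" in mechanisms:
        score += 1.5  # neuromodulation only helps when plasticity is on
    if "heterosynaptic" in mechanisms:
        score += 0.1
    return score

def ablation_study(all_mechanisms):
    """Disable each mechanism in turn and report the performance drop
    relative to the full model; a large drop suggests the mechanism
    matters (at least in the presence of all the others)."""
    full = set(all_mechanisms)
    baseline = evaluate(full)
    return {m: baseline - evaluate(full - {m}) for m in all_mechanisms}

drops = ablation_study(["plasticity", "neuromodulation", "heterosynaptic"])
```

Single-mechanism ablation catches the big effects, but as noted it cannot reveal a mechanism that only matters in combinations far from the full model, which is where feature-selection-style add-and-remove schedules come in.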

> Comparing the performance of the new model against other existing models on the same learning tasks would help demonstrate whether the new model as a whole is an improvement or not, although like you say this doesn't avoid the issue of "interaction effect[s] with one of the features in your backdrop". Perhaps testing against many kinds of learning tasks would help alleviate this effect?
>

Ken Stanley has strong opinions on why comparing different algorithms on one or a few tasks tells us very little, which he may want to chime in with. I generally agree with him that it is not terribly informative, although I tend to think it is still somewhat valuable, while he thinks it is mostly worthless! (Sorry if I am incorrectly paraphrasing you, Ken.) Ken is right that different algorithms perform very differently on different problems, so a few tests provide too small a sample size to learn much. Moreover, every researcher inadvertently knows their own algorithm much better than what they are comparing against, so they keep tuning their algorithm to the benchmarks being used until they win, reducing the value of the comparison. There's no great alternative, in my opinion, so I still do it...but I increasingly agree with Ken that our time as scientists can better be spent on other chores (such as showing the new, interesting properties of our new algorithms...an example being HyperNEAT genomes scaling up to very large networks without substantial performance drops).

> [I'm wondering if all of this is a little off-topic for this list...]
>

It's not at all off-topic, in my opinion! In fact, I've had similar conversations (e.g. with Ken Stanley and Julian Miller) at many evolutionary conferences.

Best of luck, and I look forward to hearing more about your work as you begin it.
Jeff

> Cheers,
> Oliver
>


Oliver Coleman

Message 11 of 16
Jul 22, 2012

Hi Jeff,

Thanks so much for your interesting and useful comments on parameter evolution and the exploration/exploitation trade-off; I will have to take this into consideration and read up on the work you mention. In reviewing the literature on the evolution of plastic networks I found that often, and counter-intuitively, increasing the complexity/parameter space of the network model improved evolvability (speed and reliability of producing solutions), but only to a point. I have been somewhat swayed by this finding, and concluded that introducing many evolvable parameters would not significantly affect evolvability, but clearly I need to consider this more carefully. (I've submitted my findings in a review paper for a conference but haven't heard back about acceptance yet; I hear it can be hard to get review papers accepted. I'd be happy to provide it here if you're interested.)
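For reference, the "many evolvable parameters" at issue in plastic networks are often the coefficients of a generalised Hebbian rule, delta_w = eta * (A*x*y + B*x + C*y + D), a parameterisation that recurs in the evolving-plastic-networks literature. The function and values below are an illustrative sketch, not code from any paper mentioned in this thread:

```python
def plasticity_update(w, x, y, params):
    """Generalised Hebbian weight update with evolvable coefficients:
    delta_w = eta * (A*x*y + B*x + C*y + D), where x is presynaptic and
    y is postsynaptic activity. Placing (eta, A, B, C, D) on the genome
    lets evolution pick the learning rule rather than the experimenter."""
    eta, A, B, C, D = params
    return w + eta * (A * x * y + B * x + C * y + D)

# Plain Hebbian learning is the special case A=1, B=C=D=0:
hebb = (0.1, 1.0, 0.0, 0.0, 0.0)
w = plasticity_update(0.5, x=1.0, y=1.0, params=hebb)  # correlated activity strengthens w
```

With these coefficients on the genome, mutation can move a synapse between Hebbian, anti-Hebbian, pre- or post-gated, and constant-drift behaviours without any change to the network simulator, which is one way the parameter space grows without the model itself becoming more complex.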

Also thanks for suggestion re interaction effects and SVM feature selection, very interesting!

As it turns out, I've been working on a blog post on the significance of positive results on one or a few tasks. It had been stagnating for a few weeks because I wasn't sure about its significance, but your comments have provided me with the motivation to get it out the door (as well as a few more interesting words to add to it): http://ojcoleman.com/content/are-positive-results-one-or-two-tasks-significant. Thanks!

You can check out my NEAT implementation which I just uploaded to the file section of the group (NEAT.tar.bz2). It is written in C++ and has Python bindings for running generational evolution (later versions will feature running rtNEAT/HyperNEAT/Novelty Search from Python).

> On Wed, Jul 11, 2012 at 11:42 AM, Jeff Clune <jeffclune@...> wrote:
>
> > Hello Oliver,
> >
> > I'm much delayed in reading all of this as I have been insanely busy
> > lately, but I have a few thoughts that might help you out:
> >
> > 1) Be careful with meta-evolution (evolving the parameters of evolutionary
> > algorithms). It sounds good in theory, but can be tricky in practice
> > because evolution is short-sighted and conservative, preferring
> > exploitation over exploration, which can be very harmful vis a vis
> > long-term adaptation. Check out my PLoS Computational Biology paper for a
> > smoking gun on this front (the evolution of mutation rates). You may face
> > the exact same problem if you go down this road. [Note, however, that using
> > a divergent search algorithm like novelty search may allow you to take
> > better advantage of meta-evolution: see Joel and Ken's 2012 alife review
> > article on that subject.]
> >
> > 2) Another reason people do not throw all of the biology into the soup to
> > see what happens is because scientifically you end up in an impenetrable
> > quagmire where you can't figure out what is going on and you end up not
> > learning much/anything. The scientific method demands keeping all else
> > equal, and if you have a lot of variables you don't perfectly understand,
> > it takes years to figure out what is going on if you are lucky! Even if you
> > keep all else equal, if you are doing so against a backdrop that involves a
> > lot of complexity you don't understand, any difference you see may be due
> > to an interaction effect with one of the features in your backdrop...and
> > that may invalidate generalizing your result to other backdrops of
> > interest. In my limited experience, I have found that most new scientists
> > want to throw a million things into their model--especially biologically
> > motivated phenomena--to see what happens, and as they grow older/more
> > jaded/wiser/more experienced/gun shy/etc. they increasingly keep things as
> > simple as possible. In fact, a pretty good heuristic for good
> > hypothesis-testing science is to keep things absolutely as simple as
> > possible while allowing the question to be asked. However, that may not be
> > a good heuristic for more exploratory science where you just set out and
> > see what you discover.
> >
> > 3) You should check out Julian Miller's papers on evolving a checkers
> > player (with his student M. Khan, I believe). Or, better, email/Skype him
> > (he's an extremely nice guy and I'm sure he would be happy to talk to you).
> > He decided in the last few years that he is running out of time as a
> > scientist and has tenure and he has spent years keeping things as simple as
> > possible, and he now just wants to do what he originally wanted to do when
> > he started: throw as much biology in the soup as possible and see if a
> > golem crawls out. He has incorporated a ton of biologically inspired
> > low-level mechanisms in evolving neural networks. From what I recall,
> > however, it did become very difficult to figure out which ingredients were
> > essential and exactly what was going on because of all the involved
> > complexity. He may have updated results since I last checked in, however.
> > So, you may benefit the actual work that he has done on this front and,
> > more generally, from his opinions on the general scientific approach you
> > are proposing.
> >
> > I hope that helps. Best of luck, and I look forward to hearing what you
> > learn!
> >
> > Best regards,
> > Jeff Clune
> >
> > Postdoctoral Fellow
> > Cornell University

> > jeffclune@...

> > jeffclune.com
> >
> > On May 11, 2012, at 6:34 AM, Oliver Coleman wrote:
> >
> > > Hi Ken,
> > >
> > >
> > > Yes, I'm pretty sure that not all of the phenomena I listed are important, and that a good starting point in general is to assume that they are not. I also agree with your argument that a lot of the low-level phenomena we see may be a result of implementation with particular physical systems (and I would add, perhaps, a result of evolutionary happenstance). The CPPN is a particularly compelling example of significant abstraction of developmental processes, producing many of the same features of the end result of developmental processes. One thing it does abstract away, in the context of plastic networks, is the effect of external input on the developmental process (which may or may not be an issue depending on details of implementation, problem domain, etc.).
> > >
> > > Perhaps we could also assume that, rather than some specific set of functions being the only workable set, what matters is having a workable combination of functions, and that there are many possible combinations that would work equally well. In this framework we could assume that biological neural networks represent at least a reasonably good combination of low-level functions, and so we could use this combination as a guide (but of course this doesn't answer which functions in this combination are actually important, or what things can be abstracted away). Also, some combinations may be workable but far harder to evolve solutions with, or require much larger networks, etc. (e.g. evolving networks incorporating neuromodulation of synaptic plasticity can be much easier for some tasks than for those without this type of neuromodulation).
> > >
> > > I'm intending to run some experiments to explore these questions (which phenomena are important, acceptable level of abstraction, etc.), but of course to try and thoroughly explore all of these functions in many combinations would be a massive undertaking, and is not my main interest, so at some point I will have to pick a model and run with it after only a few, hopefully well chosen, experiments... Perhaps one approach is to create flexible parameterised versions of these functions and let evolution determine what combination is right (like your approach described in "Evolving adaptive neural networks with and without adaptive synapses", but perhaps more flexible and applied to more functions).
> > >
> > > Do you mind if I post/quote some/all of this discussion in the comments of my blog post?
> > >
> > > Cheers,
> > > Oliver

Ken


Message 12 of 16
, Jul 25, 2012

Hi Jeff and Oliver, nice discussion and definitely relevant to the group. Jeff mentioned my "strong opinions" about algorithm comparisons, so I thought it can't hurt to follow up on what Jeff said:

"Ken Stanley has strong opinions on why comparing different
algorithms on one or a few tasks tells us very little, which he may
want to chime in with. I generally agree with him that it is not
terribly informative, although I tend to think it is still somewhat
tvaluable, while he thinks it is mostly worthless! (Sorry if I am
tincorrectly paraphrasing you Ken). Ken is right that different
algorithms perform very differently on different problems, so a few tests provides too small a sample size to learn much. Moreover, every researcher inadvertently knows their own algorithm much better than what they are comparing against, so they keep tuning their algorithm to the benchmarks being used until they win, reducing the value of the comparison. There's no great alternative, in my opinion, so I still do it...but I increasingly agree with Ken that our time as scientists can better be spent on other chores (such as showing the new, interesting, properties of our new algorithms...an example being HyperNEAT genomes scaling up to very large networks without substantial performance drops)."

I agree with these concerns but as Jeff hints I'd go farther with it. The problem here is more fundamental than simply that it's hard to tell which algorithm is "better" from a few comparisons. The problem is that it's not even clear what "better" means no matter how many comparisons there are. Quantitative comparisons imply that "better" means that an algorithm scores better on average on some performance metric. But for those who are pursuing revolutionary advances in AI, I'm skeptical that it really matters which algorithm scores better even across many benchmarks.

The reason is that to me "better" should mean "leads to the most new algorithms in the future." In other words, it has little or nothing to do with performance. "Better" means creating a foundation for new ideas and a new research direction. We know it when we see it. We're talking about primitive AI algorithms here that are about 3 inches into a 10-million-kilometer marathon to the pinnacle of AI. If you're looking at two different algorithms then in effect you're comparing two different points in the vast space of all possible algorithms. Given that there are probably light years of advances to go in the direction of either one of them, why would you cut the path of either one of them off regardless of the "results" if both of them are interesting ideas?

If you were running an evolutionary algorithm with diversity maintenance of some kind, then how one arbitrary point in the search space compares to another would hardly matter. So why do we care about apple-and-oranges comparisons in AI?
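Ken's evolutionary-algorithm analogy maps onto a concrete mechanism. Under fitness sharing, a standard diversity-maintenance scheme, an individual's raw fitness is divided by a "niche count" that measures how crowded its neighbourhood is, so an isolated point can outrank higher-scoring but crowded ones. The sketch below is purely illustrative (the function, population, and numbers are invented for the example):

```python
def shared_fitness(population, raw_fitness, sigma=2.0):
    """Divide each individual's raw fitness by its niche count.

    The niche count sums a triangular sharing kernel over the whole
    population, so individuals in crowded regions are penalised and
    isolated individuals are protected, whatever their raw rank."""
    shared = []
    for xi, fi in zip(population, raw_fitness):
        niche_count = sum(
            max(0.0, 1.0 - abs(xi - xj) / sigma) for xj in population
        )
        shared.append(fi / niche_count)
    return shared

# Three crowded points with high raw fitness, plus one isolated point
# that scores worse on raw fitness alone.
pop = [0.0, 0.1, 0.2, 5.0]
raw = [1.0, 1.0, 1.0, 0.5]
sf = shared_fitness(pop, raw)
```

Here the lone point at 5.0 has the lowest raw fitness yet ends up with the highest shared fitness, which is the sense in which, under diversity maintenance, how one arbitrary point compares to another "hardly matters".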

I think it has become a convenient way to avoid the sobering reality that most algorithms don't have any exciting ideas behind them. So the only thing you can do is look at a pointless comparison. For those algorithms that do have interesting ideas behind them, I don't even need a comparison to know they're interesting, and even if they perform worse than something else, the last thing I want to do is throw out an interesting idea. Who knows where it might lead?

So yes comparisons are very overrated. One type of comparison I do think can be useful once in a while is to compare an algorithm with a variant of itself (which includes ablations). That can give a sense of what a new ingredient adds. But even then, if the idea isn't inspirational, the performance gain won't matter much in the long run. Because in the long run we aren't interested in performance gains but rather in stepping stones to new frontiers. These things (i.e. performance and where an idea leads) are not correlated in any complex search space and therefore we should not be running the whole field of AI research like a naive giant hill-climbing algorithm. The irony here is that the world's greatest experts in search are doing exactly that at the meta-level (i.e. at the level of how the community searches for new algorithms) by focusing so intently on comparative performance results.
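The worry about "naive giant hill-climbing" can be made concrete with a toy sketch (my own illustration; the landscape is invented for the example): a greedy climber on a deceptive landscape converges on the nearby low peak and cuts itself off from the far higher one, which is the sense in which short-term performance and long-term potential can come apart.

```python
def hill_climb(f, x0, step=1, iters=100):
    """Greedy hill-climber: move to the best neighbour only while it
    strictly improves the score, otherwise stop."""
    x = x0
    for _ in range(iters):
        best = max((x - step, x, x + step), key=f)
        if f(best) <= f(x):
            break
        x = best
    return x

def deceptive(x):
    """A deceptive 1-D landscape: a small peak (height 10 at x=5) sits
    near the start, while a far taller peak (height 100 at x=50) lies
    beyond a region the greedy climber never enters."""
    if x < 10:
        return 10 - abs(x - 5)
    return max(0, 100 - abs(x - 50))

x_final = hill_climb(deceptive, x0=0)
# The climber settles on the low nearby peak at x=5 and never sees x=50.
```

Selecting research directions purely by current benchmark score is analogous to this climber's acceptance rule: anything that doesn't improve the metric right now is discarded, including the stepping stones toward much better regions.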

The one other kind of performance result I think is useful is when an algorithm does something completely unprecedented. Of course, in that case, you don't need a comparison because there's nothing to compare with. Though that won't stop traditionalists from clamoring for a comparison anyway.

ken

Oliver Coleman


Message 13 of 16
, Jul 25, 2012

Hi Ken,

I agree overall that qualitative results are much more interesting and necessary than quantitative results at least at this stage of advancement in AI (I'm strongly swayed by your arguments; what have you got to say Jeff?).

However, perhaps comparative performance results can help to provide insight into the characteristics of one approach versus another and provide useful information to help improve them; for example, if two approaches fail on different kinds of tasks, perhaps we can look at how they each succeed in different ways in order to improve one or both approaches. In the search space of AI algorithms, comparative results can provide information about which algorithms should have a crossover operator applied to them to produce potentially better ones (the apple and orange may combine to produce a new inspirational fruit ;)). Perhaps this is too vague to be useful...

Oliver

P.S. I've quoted your comment in a reply on my blog. I'm assuming that your permission to post your comments on my blog earlier in this discussion applies to ongoing replies; let me know if this is not okay...


Evert Haasdijk


Message 14 of 16
, Jul 26, 2012

All,

An interesting paper on the subject of the use(lessness) of performance comparisons in our research was published by Hooker:

Hooker J (1995) Testing heuristics: We have it all wrong. Journal of Heuristics 1:33-42, URL http://dx.doi.org/10.1007/BF02430364

Just in case you hadn't read that yet.

Cheers,

Evert

Colin Green


Message 15 of 16
, Jul 26, 2012

Hi,

From the abstract this looks like an interesting read, thanks. I understand the point and largely agree with Ken's and others' view that focusing on test metrics can tend to be a distraction, although I'm not in academia and have yet to submit a paper for publication, so my understanding of the problem at the journal-paper-acceptance level is zero.

Are papers really rejected based on whether they present a comparison
or comparisons with existing methods? I would hope that a novel method
that shows that it succeeds on a reasonably difficult test would be
published regardless of any comparison, the point being that such a
method is a new idea, qualitatively different and therefore worthy of
publication to spread the idea and maybe spark further ideas. Whereas
a paper that makes a tweak and improves performance by say 50% (e.g.
solved problem in 1 CPU hour instead of 2), is maybe interesting and
worthy for those working on that particular method but is probably not
of wider interest.

That said, I would like to throw the idea out there that maybe it's not all that bad a situation: if a novel method is head and shoulders above the current best, then presumably this would be evident in a range of problem domains. But yes, how do you follow a path of research that gets to that method if you're too focused on metrics? How very 'meta' :)

But as all of us on here will be aware, evolution in nature consists of both large disruptive jumps that cause a flurry of new solutions to be searched in a relatively short span of time (e.g. the Cambrian explosion) and smooth, steady incremental improvements, and yes, periods of relative stagnation. A single paper could, and eventually will, make a large jump that shakes up the entire research community. I think the novelty search research 'track' is a good candidate for spawning that paper.

Colin

Ken


Message 16 of 16
, Jul 31, 2012

Hi Evert, thanks for sharing the article. As you can guess, I agree with a lot of what it says and I liked it. But my concerns go even farther in the sense that this article still voices a concern with the accuracy of comparative results whereas my question is: Even if we could be sure that the results of any comparison were accurate, what good would that be if the long-term potential of a research direction has no correlation with whether it compares well or poorly with some arbitrary alternative method? In other words, if method X performs worse than method Y, method X still might be entirely more innovative and lead to many more new ideas than dead-end method Y. So even a lack of accuracy isn't the deepest issue.

My opinion (which in part echoes the article) is that we spend significantly more effort than we should on trying to prove to each other that our comparisons are accurate as opposed to spending time on discussing the future potential created by the ideas behind our methods.

Best,

ken

--- In neat@yahoogroups.com, Evert Haasdijk <evert@...> wrote:
>
> An interesting paper on the subject of the use(lessness) of performance comparisons in our research was published by Hooker:
>
> Hooker J (1995) Testing heuristics: We have it all wrong. Journal of Heuristics 1:33-42, URL
> http://dx.doi.org/10.1007/BF02430364
