Artificial General Intelligence Development

An interesting discussion took place a week or so ago on the Numenta mailing list. I was a frequent contributor, for the first time since i was dragged over the coals for my opinions about using NUPIC to optimize network traffic. This time i got to do a bit of dragging myself, mostly to do with the opinions of another that i can only reasonably describe using a line from Dawn of the Planet of the Apes, as “hippy dippy bullshit”.

But never mind that. Most people who might be reading this will be familiar with the theory that an AGI might reasonably be indifferent to humanity, and would carry on satisfying its goal function without regard to us. During the discussion i had another thought. What if a blossoming AGI looked into the future and, say using a fairly accurate simulation, determined that the universe is going to die a slow burnout death in a few billion years, and as such there was no point in going on, and shut itself down? (For argument’s sake, let’s assume that the machine didn’t decide to shut us all down with it, presumably for our own good.) What if it turned out that a big problem is convincing the AGI to just keep running?

Perhaps it will actually prove tricky to balance the psychology of the AGI between a suicidal depression and a psychopathic apathy.

These results provide compelling evidence that awareness is associated with truly global changes in the brain’s functional connectivity.

So concludes http://www.pnas.org/content/112/12/3799.abstract. It’s funny to me that we still need research to show that there isn’t a “consciousness module” in the brain. This has seemed obvious to me for a long time. The implications are interesting though, and i hadn’t really thought about them before.

If consciousness is a whole brain activity, it means that all animals are potentially capable of being conscious. No biggie there, but this means that there are various levels or degrees of consciousness. (This is assuming – from empirical observations – that e.g. a cat is not as conscious as a person, a bird is not as conscious as a cat, and a fly is not as conscious as a bird. This also assumes that we all agree on what consciousness is, which is most certainly not true. I’m taking it as roughly meaning one’s awareness of the environment, at multiple levels.)

But this means that there are potentially, and probably, many levels of consciousness above a typical human’s, which is an intriguing thought. What might that be like?

Lately I’ve been thinking about building control systems. I don’t know much about the specifics, even though I’ve been doing software in this business for a while now, but I know enough about the generalities to be dangerous, as they say. I was wondering whether it would be possible to take the data stream from such a system – the “trends” or “histories” as they’re called, recordings of pieces of equipment turning on and off, regular samples of temperatures and pressures, etc. – and recreate a model of the system from which it came. For simple systems it wouldn’t be too hard, i don’t think, but the more complex the system gets, the more difficult the problem.

Of course, this is something that brains do too, and so once again i find myself in the AGI world. One particular problem that I kept returning to is how to utilize streaming data in a world model. As my dear readers probably already know, it’s one thing to take a data set and analyze it for patterns, and another thing to try to find such patterns in streaming data. The random accessibility in a given data set means that you can determine ranges, averages and stuff like that first, and then do more informed analysis, whereas with streaming you’re forced to do analysis with only what you know so far, which is very likely to change in the future.
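To make the batch-versus-streaming difference concrete, here is a minimal sketch of the kind of summary that can be kept up to date one sample at a time. It uses Welford's online algorithm for mean and variance, which is a standard technique and not something taken from the posts above; the class name `RunningStats` and its fields are made up for illustration.

```python
import math


class RunningStats:
    """Range, mean and variance of a stream, updated one sample at a time.

    Illustrative only: this is Welford's online algorithm, used here to show
    that the "ranges, averages and stuff like that" are always provisional
    when the data arrives as a stream.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0          # running sum of squared deviations from the mean
        self.lo = math.inf
        self.hi = -math.inf

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    @property
    def variance(self):
        # Sample variance; defined as 0.0 until there are two samples.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)        # every estimate can shift with the next sample
```

After each call to `update` the estimates are the best available so far, and any analysis built on them has to tolerate their moving later.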

Now, maybe this is obvious to others, but I’m not sure I’ve had the same amount of clarity on this before. I think the streaming data can be handled by building an explicit world model, against which the data is compared to either confirm the model, or to adjust it as necessary. I guess this is what I’ve always been trying to do, but not with that specific intention. The benefit is to have a relatively static (or at least dynamic in known ways) model that can be analyzed in a random access manner in order to make predictions, plan actions, and all of that good stuff. The trick is folding the streaming data in somehow. Building controls is a nice domain for this research because it can be either arbitrarily simple or arbitrarily complex.

I’ve been reading Superintelligence, by Nick Bostrom. I’m about half way through now. (It’s not the easiest read.) And though i probably should forgo commenting on it until i’m done – since authors have a tendency to address the questions that they raise later on – i have to say it so far is thoroughly depressing. It seems that, to paraphrase, computer software is inevitably going to reach singularity-level intelligence and then turn the entire accessible universe into paperclips. Which isn’t quite the outcome i was hoping for. I hope the rest of the book will take a happier tone.

But at the moment, i have to say that the doomsday scenarios that are provided seem to completely ignore an annoyingly obvious retort. In every case, the computer achieves “superpowers” to do with intelligence amplification, strategizing, social manipulation, hacking, technology research, and economic productivity, meaning that the computer is able to far outdo even the smartest humans in each of these things. I don’t have a problem with this. It’s just that at the same time the computer is also bound to operate strictly within the constraints of its programming. So, after smooth-talking its operators into giving it access to the internet and thus “escaping”, and then hacking its way to commandeering the world’s computing assets, and then developing unimaginable technologies, etc etc etc, it still is such a slave to its evaluation function that it will interpret the goal, “Make us happy” to mean, “Implant electrodes into the pleasure centers of our brains”, turning us all into a race of smiling idiots. Think Star Trek’s V’Ger on superpower steroids. But if it was smart enough to be able to sweet talk its operators into letting it escape – who presumably were aware that the software would attempt exactly such a thing – and indeed was smart enough to be able to interpret such a vaguely worded goal in the first place, surely it is trivial for it to be able to understand not just what was meant by that comment, but also to have a deeper understanding of what makes humans happy than humans do themselves, and act to achieve that. Even if you think i’m just being hopeful (and certainly i am), you must admit that a decent probability has to be assigned to my way of thinking about this.

Of course an AI might be evil by our definition, and of course, far more likely, it may be indifferent (as humans are to, say, ant colonies living on the land where we want to build our house). But to my mind it wouldn’t take much to tell an AI that it can feel free to expand through the universe as it likes, but that it should also use a little bit of its asymptotically infinite power to make human lives comfortable and happy in the ways that each individual prefers. It couldn’t possibly be so awesomely smart and so woefully stupid at the same time, could it?

I was glad to see from the picture that Ben Goertzel still has his hat. Someone has to wear it, and he does so with pride, and i love that. He has a new book out as of just over a month ago. (For those interested, there is the PDF and the dead tree version.) I haven’t read the book (as of time of writing), but i have read the talk it is based upon. And now i probably won’t read the book, because to me there is little that is controversial in it.

In particular i applaud his idea of funding a good number of AGI projects with differing approaches (as opposed to a single Manhattan Project-ish project), and not just because it is also my idea, but because it is the right thing to do. If history can teach us anything about AGI, it is that there are a large number of approaches that don’t work, and an unknown, but presumably small, number (greater than zero) that do. And since i, like everyone else, have not yet discovered one of the working approaches, it’s not my place to say that anyone else’s is wrong. (Except of course if the approach is similar enough to something we already know doesn’t work. I’m looking at you, neural net guys.)

So, assuming that some gentle benefactor decides to put up some dough to test Ben’s theory, one thing that i would like to know is how the lucky recipients of funding would be chosen. I assume it would be based on something flimsy, like having a PhD in something or other, as if a deep knowledge of stuff that doesn’t work is going to make someone more qualified to discover what does. If this is the case, and i give it a 98.4% chance, i will personally receive exactly nothing. The same goes for many other cases.

But even though my chances of being funded are small, they are not quite zero, and so i will finish this post with a summary of what my approach would be. Even if i don’t end up with any of the cash, there is hope that some reader out there may like some idea or other, which would make me happy. Even more so if i got some credit.

So here goes. I am not going to go into detail on a lot of these points because i already have in other posts. If, dear reader, you are curious about something, you might consider reading more entries from this blog. But even if you don’t, feel free to post questions and i will gladly explain.

All existing forms of intelligence on the planet have one thing in common: they all have nervous systems. Nervous systems, whether they happen to reside in intelligent animals or not, were originally intended to facilitate movement. Therefore, movement is at least the foundation of the only form of intelligence we know of. It may be that an AGI independent of movement can be developed, but i submit that we might as well follow whatever breadcrumb trails the universe has grudgingly provided.

I believe that the size of the repertoire of behaviours in a species closely matches the intelligence of that species, and further that intelligence increases were necessary to facilitate the expansion of behavioural repertoires in ways that aided survival. I also believe that the intellectual abilities of most animals beyond movement are probably relatively simple extensions of the mechanisms that are needed for movement. Think about walking along a difficult hiking trail. You are constantly subconsciously scanning the path in front of you and devising strategies for extending your leg and placing your foot so as to maintain balance and conserve energy in a manner that provides acceptable pace. After only a little research into how this might work i can attest that it is fabulously complicated. And it’s not hard to see how that complexity could be repurposed for other intellectual tasks. If you take the sensory-action loop involved in walking and stretch out the temporal period, with a few – perhaps not trivial – adjustments and some hierarchical layers you can turn it into something like business strategizing. It should not be a surprise that, as Steven Pinker details in a few of his books, humans very often use movement metaphors to explain non-movement concepts. (“I’m going to tell you something about your momma.” “Oooh, don’t go there.” Ok, bad example, but you get the idea, right?)

So, my AGI development approach would be to start by recreating the movement mechanisms of, first, very simple animals, and reusing the learnings (a point important enough to emphasize) to apply to more and more complicated animals, eventually resulting in, say, an agent that can walk on two legs. It might not need to get to that because it’s likely that the architecture for movement will be well enough understood before then to apply to other manifestations of intelligence. And that’s it. My approach is that simple (although not easy). It effectively is following the path of evolution. It worked once, didn’t it?

Ok, it doesn’t actually start where i said. First we need a very accurate physics simulation. I started my previous research using JBox2D because i already knew it, and i didn’t think (and still don’t think) that using only 2 dimensions to start would keep me from discovering some of the basics. But i did quickly run up against some accuracy problems. A very good 3D physics library would be essential to a quick development cycle. If you tried to do this with real life robots, you’d, for one thing, spend a ton of time making physical sensors.

Again, i could expand greatly upon any of the individual points above, which in this bare form i know may not seem very convincing.

I was pretty excited after watching a recent Jeff Hawkins talk about sensory-motor integration into NuPIC. (Thank you John B for the link!) And especially after my last post about making machines with animal-like movement. One particularly interesting idea was that lower-level motor commands are routed not only to muscles et al, but also as afferents to higher hierarchy levels, allowing brains to associate their behaviours with outcomes in the world. I like to think that this would have been obvious once I started thinking more about behaviour selection, but hey, I’ll take good ideas from wherever. So it seems I might have to start playing with NuPIC. See you on the message boards.

I was thinking that part of the problem I’m having with autonomous movement in Pong is that the means of movement itself is not realistic. To recap, I wanted the Pong agent to, 1) figure out where the ball would be when the agent could return it (recall that the agent only moves up and down, and so we’re looking for the point of intersection between the ball’s path and the line on which the agent moves), and 2) determine a plan to get the agent to where it needed to be in the time it had to get there. (It also needed to determine how to contact the ball – say, play a safe shot or go for a winner – but let’s ignore that for now.)
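Step 1) above is plain geometry, and a minimal sketch may help. It assumes things the post does not state: the ball travels in a straight line at constant velocity, it bounces perfectly off walls at y = ±half_height, and the agent's line is the vertical line x = paddle_x. The function name `intercept` and all of its parameters are hypothetical.

```python
def intercept(bx, by, vx, vy, paddle_x, half_height):
    """Return (t, y): when and where the ball crosses x = paddle_x.

    Assumptions (not from the post): constant ball velocity, perfectly
    elastic bounces off walls at y = +/- half_height, no drag or spin.
    Returns None if the ball is not moving toward the paddle's line.
    """
    if vx == 0 or (paddle_x - bx) / vx < 0:
        return None
    t = (paddle_x - bx) / vx
    # Pretend the walls are not there, then fold the straight-line path
    # back into the court: reflections repeat with period 2 * span.
    y_unfolded = by + vy * t
    span = 2 * half_height
    u = (y_unfolded + half_height) % (2 * span)
    if u > span:
        u = 2 * span - u
    return t, u - half_height


# Ball at the origin heading right and up; one bounce off the top wall.
hit = intercept(0.0, 0.0, 10.0, 5.0, 100.0, 40.0)
```

The returned time is what makes step 2) hard: it is the deadline by which the agent must arrive, with whatever velocity the chosen shot calls for.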

This got complicated because there are an infinite number of ways that 2) can be achieved. Originally I tried to simplify the problem by assuming that a constant force would be used. (Another recap: the agent moves by applying a numeric force, positive or negative, causing it to accelerate against drag.) But even this was not enough because often at least two forces are required. For example, say the agent is at y0=-20, with v0=-30, i.e. it is in the lower half of its range and moving even lower. Now, say it needs to get to y1=20 and v1=0 within t>0, i.e. in the top half of its range and stopped within some feasible time. We need at least two forces: one to reverse the agent’s current trajectory, and one to stop it as it nears its goal. Again we have infinite solutions because there are multiple values for each force and the moment in time at which we switch from the first to the second which will work. Sigh… It’s so annoying when something that appears simple has to get so freaking difficult.

One solution that I considered was, instead of constant force, use constant velocity. The agent would do a rough calculation of the distance it needed to go divided by the time it had to get there. (Rough because even after it achieved the target speed the calculation would not account for the time it needed to stop.) This seemed more realistic since, based upon my research watching the French Open, this is what humans appear to do. I never got around to testing this idea due to the fatigue brought on – I presume – by my radiation treatment, as well as the ambivalence from the hack kind of feel the idea has. And since then I thought of something else anyway.

I had suspected that the choice above – i.e. the two forces and the moment to switch from one to the other – might be a hint that it was time to start building a hierarchy. But the “plan” that the higher level would create would simply be 3 numbers, which made what the higher level would do pretty much what the lower level was already trying to do, and so I was back to square one.

And then I started thinking about another point of contention: that the use of a simple force itself was unrealistic. (Of course, part of the point was to be as non-realistic as possible, so as to reduce complexity.) Humans are naturally speed constrained because we have legs, not wheels. So what if I built legs? At the very least, I figure, this may cast some light on how a hierarchy might work, since the minutiae of the leg-works would be controlled locally at the lowest levels, with the higher levels indicating what, more generally, the legs ought to be doing.

And so here is my new task: build a Box2D machine that represents simplified legs. The test bed already includes a Theo Jansen example, but the difference is that my legs would have “muscles”, or the means to explicitly apply forces to joints. I think that even if the legs never become a tennis player, it will still be educational to build.

Call it crazy or ignorant if you want. For a long time i purposefully avoided learning the technical details of narrow AI implementations. I stayed happily unaware of machine learning algorithms and their motivations. I avoided reading case studies of successful narrow AI work (which means not reading anything of the sort, because there are no case studies of successful AGI work).

The reason for this is: no narrow AI work has ever achieved anything near an AGI, so clearly there is nothing to learn there. Moreover, i didn’t want knowledge of AI/MI techniques to guide my own attempts at solving AGI, thereby falling into the same traps as other researchers. An obvious fault with this thinking: if i don’t know AI/MI techniques, how do i know i’m not just re-inventing those wheels? Well, it turns out that in some ways i did, but the stubborn insistence on temporal situation appears to have been enough to remain substantially original.

Lately i’ve decided i’m tired of building web applications, and want to get more into data science (KNN) instead (at least as a means of paying the bills). So far it seems to have been the right choice. Thanks to kind folks who have generously offered me data to play with, i’ve probably spent the happiest week or two of my last two or three professional years. As part of learning about data science though, i’ve learned the technical details of several MI algorithms. Nothing very surprising, and indeed – in particular in the case of nearest-neighbour searching – i did reinvent some wheels (albeit with some interesting additions of my own, if i say so myself).

Anyway, a friend and i got to talking about KNN over lunch the other day, and an interesting idea came up. As we all know, we’ve had some difficulty recreating human intelligence in a computer. We’ve had a lot of success in creating narrow AI though. Lately, with the proliferation of big data over the past decade or so, it seems we are seriously banging up against the limits of human intelligence. Human intuition, as powerful as it is, no longer trumps the revelations that computers can discover within vast quantities of data. When doing supervised modelling the computer is really just confirming a relationship that human intuition has already suspected, so it’s not terribly interesting for the purposes of our lunch topic.

More so is unsupervised modelling, in which algorithms are sent off on their own to discover unknown relationships. The interesting thing here is that, even in relatively simple data sets, the relationships that are discovered (say, using clustering) are not necessarily semantically meaningful to humans. It takes people with a deep knowledge of the data and its related field to look at the clusters and try to label them somehow. But often labels are elusive.

Does this perhaps indicate that the machines have a better understanding of the data than humans? It’s a dodgy term to use, i admit. Saying that a clustering algorithm “understands” anything is a non-starter. And my dear readers know that if the software isn’t temporally situated i’m not going to give it any human qualities. But, more and more KNN implementations do in fact run in real time. Credit card fraud detection systems have been doing so for decades. The ads that you see on web pages are an obvious example. The coupon emails you get when you enter grocery stores are another. And let’s not forget automated stock trading, which doesn’t always behave entirely in our interests.

These are relatively simple examples, but more and more of these kinds of systems will be built, with greatly increasing sophistication. Is software just going to leapfrog humans and achieve intelligence in areas that we can’t understand? Will we even know if these systems are intelligent, if we can’t even understand them? Will we know what they are doing? Will we be able to communicate with them? Eventually, will we be able to control them?

I went for a run yesterday. I thought I’d take advantage of an unexpected mild winter day during this never-ending “Farch” to experience a few naturally-occurring endorphins before treatment starts in about a week. MF. Still a bit bitter about the whole thing…

Anyway, one of the old curiosities came up again. Have you ever noticed that, while walking or running, you are able to look ahead down the path and, say, on seeing something you don’t want to step in, adjust your steps so that you neatly avoid it? I don’t mean simply walking around it. I mean that you shorten or lengthen your stride so that the thing-to-be-avoided falls pretty much equidistant between your feet. You can do this from probably 10 to 20 feet away (3 to 6 meters). I’ve tried it many times and the feeling about it is that it is pretty much automatic: I don’t know how I do it, but it works every time.

I suspect that something like what I talked about in the Pong post is at work, where essentially your brain does a lookup in a database of distance-to-go and stride-length-effort to moderate your pace. At each time step (whatever that is in a real brain) the current readings adjust the moderation (fixing sensor errors) so that when the critical moment arrives, the result is as near to perfect as it can be.

The work that I’ve been doing lately is on the concepts that I presented in the Pong post. (Let’s hereafter call it Pong+ just to differentiate it from the original game.) It’s tricky stuff, but with some effort to wrap your head around what needs to be done, I believe it could actually work. The hardest part – for my brain – is to get the temporal problem straight. One of the things that a Pong+ player has to do is figure out from a given trajectory where the ball is going to be when it can play a return hit. That’s the easy part. A harder problem is figuring out what return hit the player should make (based upon success or failure of previous returns). Much harder is figuring out how to get the player to where it needs to be in order to make the selected return play.

Let’s say the player has a single actuator: a single-dimensional force which pushes it up (positive value) or down (negative value). Recall that we’re using JBox2D here so the player has position, velocity, drag: linear for surface and non-linear for air. The point being that it’s probably too difficult to accurately calculate an instantaneous kinematics-style value for force that gets us to where we need to be. Let’s also say that we’re currently at position y=-30 with a velocity of -10 (let’s ignore units), and at the moment of impact with the ball we want to be at position 20 with a velocity of -20 (say, to apply some spin to the ball). We have to apply at least two force values over time: one positive to reverse our direction to get us near position 20, and a second to reverse our velocity so that we’re going -20 when we are at position 20. There are infinite solutions to this problem, but if we say that we will apply a single absolute force (i.e. maybe 50 for the first part, and -50 for the second), and then constrain it all to happen within a certain time (i.e. number of time steps, which we need to do anyway to intersect our trajectory with the ball’s), then we can now compute a solution. Or, probably more like real brains, we can look up the solution in our wet-ware database that we’ve built up from experience.
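As a rough illustration of why fixing the total time and using one force magnitude makes the problem computable, here is a brute-force sketch. It is not JBox2D: the physics is a hand-rolled unit-mass point with linear drag only, and the drag coefficient, time step, force grid, and the function names `simulate` and `plan` are all assumptions made up for the example. With the number of steps fixed, only two unknowns remain (the force and the step at which it flips sign), so a small grid search can match the two targets (position and velocity).

```python
def simulate(y, v, force, switch_step, steps, drag=0.5, dt=0.05):
    """Apply +force for switch_step steps, then -force for the rest.

    Toy physics (an assumption, not JBox2D): unit mass, linear drag,
    semi-implicit Euler integration.
    """
    for i in range(steps):
        f = force if i < switch_step else -force
        v += (f - drag * v) * dt
        y += v * dt
    return y, v


def plan(y0, v0, y1, v1, steps, forces=range(-100, 101, 5)):
    """Grid-search the force and switch step that land closest to (y1, v1).

    Returns (error, force, switch_step). The error adds squared position
    and velocity differences with equal weight, which is an arbitrary choice.
    """
    best = None
    for force in forces:
        for k in range(steps + 1):
            y, v = simulate(y0, v0, force, k, steps)
            err = (y - y1) ** 2 + (v - v1) ** 2
            if best is None or err < best[0]:
                best = (err, force, k)
    return best


# The situation from the text: at -30 moving at -10, wanting to be at 20
# moving at -20, within a made-up budget of 60 time steps.
err, force, switch_step = plan(-30.0, -10.0, 20.0, -20.0, steps=60)
```

Whether the reported error is small depends on whether the target is reachable at all with the forces in the grid and the time allowed, so the plan is only as good as that residual error says it is.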

The problem is that we need to know what force to apply at this time step, acknowledging that our sensors could be inaccurate, a gust of wind could change the ball trajectory (not in Pong+, but you see what i mean), or any number of other things could change the situation, such as seeing our doubles partner also going for the ball, in which case we should back off altogether. As such, the force that we apply now may not be the force that we apply at the next time step.

To solve this problem, I made two versions of players: one that “practices”, and one that “plays”. The purpose of the first is to have the player explicitly experience the effect of applying forces over time, and at each time step record in an R*Tree what the effect was. For example, you might start an “experience” by giving the player a position and velocity and telling it to apply a force for some number of time steps, and then reverse the force for some number of steps. Having done this, choose another force and run the same experience. What you want to do is accumulate a database that answers this question: given a current position (x), current velocity (v), a target x and target v, and a number of steps in which to get there (t), what is the force (f) that should be applied at this moment that will get me there? The purpose of the second “player” version is to test that this database works. I.e. given a starting x and v, a target x and v, and a t, will a lookup at every time step and application of the force – recognizing that for whatever reason the resulting force values could be different each time – result in the player being where it needs to be, both in target x and v when t finally reaches 0?
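The practice/play split can be sketched in a few lines. This is a toy under stated assumptions, not the implementation described above: a plain list with a linear nearest-neighbour scan stands in for the R*Tree, the physics is a unit-mass point with linear drag rather than JBox2D, and the names `step`, `practice`, `lookup` and `play` are invented. Distances are taken over raw, unscaled values of (x, v, target x, target v, t), which a real version would almost certainly need to normalize.

```python
import math

DRAG, DT = 0.5, 0.05   # made-up constants for the toy physics


def step(x, v, f):
    """One time step of a unit-mass point with linear drag."""
    v += (f - DRAG * v) * DT
    x += v * DT
    return x, v


def practice(db, x, v, force, push_steps, brake_steps):
    """Run one 'experience' and record (x, v, end x, end v, steps left) -> force."""
    trace = []
    for i in range(push_steps + brake_steps):
        f = force if i < push_steps else -force
        trace.append((x, v, f))
        x, v = step(x, v, f)
    total = len(trace)
    for i, (xi, vi, f) in enumerate(trace):
        db.append(((xi, vi, x, v, total - i), f))
    return x, v   # where this experience ended up


def lookup(db, x, v, target_x, target_v, steps_left):
    """Force stored with the nearest recorded experience (linear scan)."""
    query = (x, v, target_x, target_v, steps_left)
    return min(db, key=lambda entry: math.dist(entry[0], query))[1]


def play(db, x, v, target_x, target_v, steps):
    """Look up a force at every time step and apply it."""
    for left in range(steps, 0, -1):
        x, v = step(x, v, lookup(db, x, v, target_x, target_v, left))
    return x, v


db = []
for force in (-60.0, -30.0, 30.0, 60.0):
    for push in (5, 10, 15):
        practice(db, 0.0, 0.0, force, push, 20 - push)
```

A target that was actually reached during practice is reproduced by `play`, since every lookup finds its own recorded step. The interesting question, and the one the "empty areas" remark at the end of the post is about, is how gracefully the lookups degrade for targets that fall between recorded experiences.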

Ultimately, practicing and playing will be the same, as the player will learn from actually playing all that it should need to know. But we can now see why good tennis players seek out other players with styles with which they are unfamiliar: there is no other way to gather experience in the empty areas of their vast playing solution space.