Wednesday, November 28, 2012

Jeff Hawkins Is Close to Something Big

Splitting Patterns at Numenta

In The Myth of the Bayesian Brain, I wrote that there is a trick to learning perfect patterns but that I cannot divulge it at this time. The trick has to do with properly factoring patterns so as to maximize the reuse of low-level patterns in the composition of higher-level patterns. In my opinion, the pattern hierarchy used by the brain is the ultimate classification and data compression mechanism in existence. The trick is amazingly simple and I wondered whether anybody else had thought about the problem or come up with a solution. In an unusually revealing article in today's New York Times Bits about Numenta's Grok technology, I was pleasantly surprised to read that Numenta's founder, Jeff Hawkins, has thought about it:

Patterns of one or the other are reinforced over time. As new data streams in, the brain figures out if it is capturing more complexity, which requires either modifying the understanding of the original pattern or splitting it into two patterns, making for new knowledge. Sometimes, particularly if it is not repeated, the data is discarded as irrelevant information. Thus, over time, sounds become words, words occupy a grammatical structure, and ideas are conveyed.

Assuming that my definition of pattern (a concurrent group of related signals) is the same as Hawkins', I find the above amazing. Not only does Hawkins understand the power of pattern hierarchies (what others are calling deep learning), he also seems to grok the need for efficient composition. Grok's ability to split a pattern into two constituents is what caught my attention. If Numenta really knew how to do this automatically and efficiently, then they would have a genuine breakthrough. Maybe it is time that Hawkins considers applying Grok to other perceptual learning problems such as speech or visual recognition. I mean, if Grok's underlying technology is as great as he portrays it to be, it should be able to solve something like the cocktail party problem. That would truly be impressive.

The Bayesian Curse

Having said that, there is no doubt in my mind that whatever solution Hawkins came up with is handicapped by the Bayesian mindset that afflicts the artificial intelligentsia. It is, in all likelihood, a complicated kludge. I say this because of the way the problem is phrased in the NYT Bits article. Knowing what I know, the fact that Grok needs to split patterns into smaller patterns tells me that Hawkins is aware of the problem but his Bayesian glasses prevent him from seeing the correct solution. The latter does not involve the splitting of complex patterns into smaller patterns because it can automatically prevent the formation of patterns that are more complex than their levels within the hierarchy require. Hawkins is so close, yet so far.

The Future Is Not What It Used to Be

Brain-like artificial intelligence will arrive on the world scene much sooner than the AI community expects and it will come from a most unexpected and inconvenient source. Stay tuned.

8 comments:

Louis it's great to have you back! Have you read Jeff Hawkins' book, "On Intelligence"? Here's a quote:

"Neural networks had been around since the late 1960s in one form or another, but neural networks and the AI movement were competitors, for both the dollars and the mind share of the agencies that fund research. AI, the 800-pound gorilla in those days, actively squelched neural network research. Neural network researchers were essentially blacklisted from getting funding. It is hard to know exactly why there was a sudden interest in neural networks, but undoubtedly one contributing factor was the continuing failure of artificial intelligence."

As you can see he shares your belief that AI has been going down the wrong path for the past 60 years. He also only mentioned Bayesian networks once in his book. He described how they can be used to make predictions, which MAY be useful since the cortex is a prediction machine.

Also have you read about Spaun?

www.canada.com/technology/Canadian+scientists+create+functioning+virtual+brain/7628440/story.html

A video of Spaun in action:

I only read excerpts of On Intelligence, enough to realize that Hawkins is probably the best thing to have happened to AI research in many decades. The man is no dummy, that's for sure. However, he does believe in the Bayesian Brain myth. Or, at least, he used to. Bayesian statistics is an integral part of Numenta's previous effort, the Hierarchical Temporal Memory or HTM. I don't know if the same holds true for Grok, however. Judging from the NYT Bits article, I suspect that it does. HTM was partly the brainchild of Bayes worshiper Dileep George, who has since moved on to form his own AI company, Vicarious Systems.

I remember when Hawkins first discovered the importance of sparse distributed memory to AI. I could never understand his many attempts at explaining SDM, expecting it to be something exotic. Then I realized that SDM was really a simple concept that I had been using in my own research efforts all along. Essentially, SDM means that not all the bits (signals) that comprise a pattern are needed to activate the pattern. In an uncertain sensory space, the pattern with the highest number of activated signals is most likely the correct pattern for a given sensory input. I always assumed that everybody understood this because it is such an obvious and simple truth. Hawkins is right, though, SDM is essential to AI.
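The matching rule described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not Numenta's code or anyone's actual implementation: patterns are modeled as sets of signal IDs, and the names `best_match`, `PATTERNS` and the `min_active` cutoff are hypothetical. The pattern with the most active signals wins, even when some of its signals are missing.

```python
from typing import Dict, FrozenSet, Optional, Set

# Hypothetical pattern store: each pattern is a set of signal IDs.
PATTERNS: Dict[str, FrozenSet[int]] = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({3, 4, 5, 6}),
    "C": frozenset({7, 8, 9}),
}


def best_match(
    patterns: Dict[str, FrozenSet[int]],
    active: Set[int],
    min_active: int = 2,
) -> Optional[str]:
    """Return the pattern with the most active signals, or None.

    min_active is an assumed cutoff (not from the post): a pattern with
    fewer active signals than this is not considered activated at all.
    Ties are resolved in favor of the pattern seen first.
    """
    best_name: Optional[str] = None
    best_count = 0
    for name, signals in patterns.items():
        # Count how many of this pattern's signals are currently active.
        count = len(signals & active)
        if count >= min_active and count > best_count:
            best_name, best_count = name, count
    return best_name


# Signal 4 is missing and signal 9 is noise, yet "A" still wins
# because it has more active signals than any other pattern.
print(best_match(PATTERNS, {1, 2, 3, 9}))
```

With the input `{1, 2, 3, 9}`, pattern A has three active signals while B and C have one each, so A is selected despite being incomplete. A lone stray signal such as `{9}` activates nothing under the assumed cutoff.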

Hawkins has been trying to develop an AI product based on his ideas for at least five years, I believe. That it took him so long to release Grok is a sign that he ran into major difficulties. And Grok is not the breakthrough that everyone was expecting. I believe that Hawkins can blame the Bayesian mindset that currently permeates the AI community for all his problems. Bayes is a red herring, a false god. The sooner he abandons the Bayesian heresy, the better off he will be.

You say that Jeff is afflicted by the "Bayesian Curse"; however, I don't recall you offering an alternative to the Bayesian framework anywhere on your blog. Can you please give some insight into your alternative theory?

First read The Myth of the Bayesian Brain. I'll be happy to answer other questions here, time permitting. Note that there are a lot of similarities between my approach and that of Hawkins. Both use temporal hierarchies and both assume that intelligence is mostly about discrete signal processing and prediction. The main difference is that the Bayesian model assumes that every event in the world is probabilistic whereas the Rebel Science approach assumes that every event in the world is perfectly consistent and deterministic.

Well, there you have it. The future is here & now while the mainstream, unfortunately, continues to hype its misconceptions. Creativity Machines/SuperNets (which are self-connecting and grow/train non-algorithmically) could do anything, as they autonomously learn from raw data and, most importantly, can break the limitations of this learned information by creating totally surprising, but useful, ideas via perturbation (just like organic brains do). Since they can easily 'clone' new neural networks indefinitely within the "STANNO"-paradigm, they could, in principle, become infinitely intelligent. No other neural technology out there I've seen so far, from Hawkins to Schmidhuber, is able to do the same to such an amazing degree.

Nope, I am not him and I am sorry if my post offended you or came off as "advertising". It's purely the concepts that interest me (which I assume you haven't looked into yet, but that's okay).

I just stumbled upon your blog and thought that many of your ideas on parallel computing, utopian & dystopian possibilities of AI, etc. are also mentioned by him and that this might be of interest to you too. Apparently not, so I am not wasting your time any further. Take care.