
We are the music makers,
And we are the dreamers of dreams,
Wandering by lone sea-breakers,
And sitting by desolate streams;—
World-losers and world-forsakers,
On whom the pale moon gleams:
Yet we are the movers and shakers
Of the world for ever, it seems.

While reeling from the scoop, depressed and doing some preliminary market research, I happened upon a gem of a blog post by none other than our favorite search company, Google. Before proceeding with my post, I recommend that you read the blog post by Steve Baker, a software engineer at Google. I think he does an excellent job describing the problems Google is currently having and why they need such a powerful search quality team.

Here’s what I got from the blog post: Google, though they really want them, cannot have fully automated quality algorithms. They need human intervention…and A LOT of it. The question is, why? Why does a company with all of the resources, power, and money that Google has still need to hire humans to watch over search quality? Why have they not, in all of their genius, created a program that can do this?

Because Google might be using methods that sterilize away meaning right out of the gate.

Strangely enough, it may be that Google’s core engineering mindset is holding them back…

We can write a computer program to beat the very best human chess players, but we can’t write a program to identify objects in a photo or understand a sentence with anywhere near the precision of even a child.

This is an engineer speaking, for sure. But I ask you: What child do we really program? Are children precise? My son falls over every time he turns around too quickly…

The goal of a search engine is to return the best results for your search, and understanding language is crucial to returning the best results. A key part of this is our system for understanding synonyms.

We use many techniques to extract synonyms, that we’ve blogged about before. Our systems analyze petabytes of web documents and historical search data to build an intricate understanding of what words can mean in different contexts.

Google does this using massive dictionary-like databases. They can only achieve this because of the sheer size and processing power of their server farms of computing devices. Not to take away from Google’s great achievements, but Syntience’s experimental systems have been running “synthetic synonyms” since our earliest versions. We have no dictionaries and no distributed supercomputers.

As a nomenclatural [sic] note, even obvious term variants like “pictures” (plural) and “picture” (singular) would be treated as different search terms by a dumb computer, so we also include these types of relationships within our umbrella of synonyms.

Here’s the way this works, super-simplified: there are separate “storage containers” for “picture”, “pictures”, “pic”, “pix”, “twitpix”, etc., all in their own neat little boxes. This separation removes the very thing Google is seeking…meaning in their data. That’s why their approach doesn’t seem to make much sense to me for this particular application.

An engineer’s job would be to write code that, in a sense, tells the computer to create a new little box and put the new word in a list of associated words. Shouldn’t the computer have some sort of continuous, flowing process that lets it break out of the little boxes and allow for some sort of free association? Well, the answer is “Not using Google’s methods.”
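To make the “little boxes” picture concrete, here is a minimal, hypothetical sketch of that kind of dictionary-like approach — my own toy illustration, not Google’s actual implementation. Each canonical term owns a hand-maintained container of variants, and any word not already filed into a box by an engineer or a rule simply falls outside the system:

```python
# A toy illustration of the "little boxes" approach described above.
# This is a hypothetical sketch, not any real search engine's code.

synonym_boxes = {
    "picture": {"picture", "pictures", "pic", "pix", "twitpix"},
    "photo": {"photo", "photos", "photograph"},
}

def canonicalize(term):
    """Return the canonical term whose box contains `term`, or None."""
    for canonical, variants in synonym_boxes.items():
        if term in variants:
            return canonical
    # No free association: a word outside every box is simply unknown
    # until someone writes code (or a rule) to file it into one.
    return None

print(canonicalize("pix"))       # found in the "picture" box
print(canonicalize("snapshot"))  # None -- no box, no meaning
```

The point of the sketch is the last line: the system can only relate words an engineer has already boxed together, which is exactly the rigidity the paragraph above is complaining about.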

You see, Google models the data to make it easily controllable…that, and for many, MANY other reasons. But by doing so, they have put themselves in an intellectually mired position. Monica Anderson does a great analysis of this in a talk on the Syntience site called “Models vs. Patterns”.

So, simply and if you please, rhetorically:

How can computer scientists ever expect a computer to do anything novel with data when there is someone (or some rule/code) telling them precisely what to do all the time?

I do have an original post in the mix which talks a bit about some of the unseen things at work behind the unemployment numbers being posted, but for now here are the words of Monica Anderson talking about inventing a new kind of programming. From Artificial Intuition:

In 1998, I had been working on industrial AI — mostly expert systems and Natural Language processing — for over a decade. And like many others, for over a decade I had been waiting for Doug Lenat’s much hyped CYC project to be released. As it happened, I was given access to CYC for several months, and was disappointed when it did not live up to my expectations. I lost faith in Symbolic Strong AI, and almost left the AI field entirely. But in 2001 I started thinking about AI from the Subsymbolic perspective. My thinking quickly solidified into a novel and plausible theory for computer based cognition based on Artificial Intuition, and I quickly decided to pursue this for the rest of my life.

In most programming situations, success means that the program performs according to a given specification. In experimental programming, you want to see what happens when you run the program.

I had, for years, been aware of a few key minority ideas that had been largely ignored by the AI mainstream and started looking for synergies among them. In order not to get sidetracked by the majority views I temporarily stopped reading books and reports about AI. I settled into a cycle of days to weeks of thought and speculation alternating with multi-day sessions of experimental programming.

I tested about 8 major variants and hundreds of minor optimizations of the algorithm and invented several ways to measure whether I was making progress. Typically, a major change would look like a step back until the system was fine-tuned, at which point the scores might reach higher than before. The repeated breaking of the score records provided a good motivation to continue.

My AI work was excluded as prior invention when I joined Google.

In late 2004 I accepted a position at Google, where I worked for two years in order to fill my coffers to enable further research. I learned a lot about how AI, if it were available, could improve Web search. Work on my own algorithms was suspended for the duration but I started reading books again and wrote a few whitepapers for internal distribution at Google. I discovered that several others had had similar ideas, individually, but nobody else seemed to have had all these ideas at once; nobody seemed to have noticed how well they fit together.

I am currently funding this project myself and have been doing that since 2001. At most, Syntience employed three paid researchers including myself plus several volunteers, but we had to cut down on salaries as our resources dwindled. Increased funding would allow me to again hire these and other researchers and would accelerate progress.

My 90-year-old Nana (Paternal Grandmother) is an inventor and her inventions work.

For example, one of my Nana’s inventions is a color-coded flagging system for dog doo-doo left in her front yard by neighbors who don’t clean up their pets’ mess. The system is simple: if it is a fresh dog dropping, the marker (a tomato stake and a colored plastic bag) is yellow, which warns people not to step there lest they need to clean said droppings off their shoe.

As the dropping starts to “mature” (or dry out and become easier to pick up), my Nana replaces the yellow flag with an orange one to mark which ones are ready to pick up that week. These flags, in concert with a systematic lawn-checking walking pattern done on a weekly basis, keep shoes clean and dog doo-doo marked for elimination.

Did I mention each flag has “Doo-Doo” written on the plastic in blue Sharpie?

The invention of this system is made real by the operations of my Nana’s mind. This process of invention is inherently “Model Free”, meaning that my Nana did not need to know differential equations or string theory to make her idea manifest. What does “Model Free” mean, anyway?

“Model Free” means you do not need a PhD or a command of hard sciences like physics to solve a problem. You just observe the problem and a solution comes to you.

Many in the fields of Economics, Neurology, and Computer Science have been trying to come up with ways to solve complex problems using descriptive models. However, nothing seems to work as well as good ol’ fashioned gray matter. If you wonder why this is, don’t assume it is because these complex problems cannot be solved. We just need to change our perspective to understand the operations of the “Model Free”, so we might expand our tool sets to encompass the methods of Creation, Natural Construction, Emergence, and Complexity. Innovation also falls in this category…to a point.

So is innovation “Model Free”?

Since innovation encompasses ideas and inventions applied successfully in practice, I have to say: not so much. Innovation can be “Model Free” if it is implemented in a Model Free environment, but it quickly becomes subject to the introduction of models once it is tied to a corporate agenda or to the scientific method. To use the example of my Nana, she successfully implemented a working invention that made her life much easier. “Model Free” innovation can quickly become “Model Rich” innovation as soon as someone says:

PROVE IT! Tell me how this makes life easier! (Or how it saves/makes me money…)

In the case of business or science, this means MEASUREMENT. So, I’m expanding the definition of “Model Free” to include the absence of measurement. As soon as an innovation starts including aspects of measurement, it ceases to be completely “Model Free”. Can you see the guys in the white lab coats and the consulting khakis approaching a 90-year-old woman and attempting to get her to prove the “value proposition” associated with her design? Ridiculous, but measurement in its ivory tower has overwhelmed natural processes of creation and has brought us to the extreme brink.

When did we start believing measurement and models are the source of invention and innovation, and not the other way around?