The Machines Will Stumble

By Jason Arbon, Founder & CEO, AppDiff

The popular meme of today is that Artificial Intelligence (AI) will soon automate you out of a job, and even destroy humanity. This sounds plausible, as our devices and software systems seem to be getting smarter every day. But there is one thing that will save all of us: the difficulty of software testing.

The smartest people in the world worry about the machine takeover because they don’t understand software testing. Elon Musk, Bill Gates, and Stephen Hawking say we should be afraid of the machines. Yet when asked why rocket launches were delayed, Elon Musk, the founder of SpaceX, said, “The critical path task is verification of the systems failure/response matrix.” Similarly, Tesla cars are awesome, but why are they continually getting software patches? Intel engineers faced the same problem while building Hawking’s speech software. “I think he likes finding the bugs,” quipped Lama Nachman, the principal engineer and project leader at Intel who worked with Stephen Hawking as he tested the software.

The machine takeover scenarios depend on the notion that machines can become as intelligent as us and then quickly evolve themselves into a superintelligence. The thinking goes that these machines just need access to a lot of network, compute, and storage, and only a few milliseconds to create smarter and better versions of themselves. Humans need food, coffee, and 25 years between each new generation of our brains. Advantage: AI? Let’s explore why machines are very unlikely to ever be more intelligent than us and why, even if they were, they would have a very hard time hitting the ‘breakaway’ point where they can quickly create more intelligent versions of themselves.

So, we think we can build a machine more intelligent than ourselves? I’ll let you in on the secret behind all those seemingly smart and almost sentient AIs today. Frankly, it is all software testing. How do those amazing deep neural networks at Google learn the difference between a dog and a cat in a YouTube video? They simply take thousands of videos labeled by humans as either ‘cat’ or ‘dog’, then force simple computer programs to guess whether each one shows a cat or a dog. When the program gets the answer wrong, it literally randomly changes a few numbers in its code, and then the poor little program is inhumanely tested again and again until we randomly run into a version that gives the right answer most of the time. Start with slightly different inputs and you get a completely different program.
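The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-number ‘program’, the made-up labeling rule, and every function name are hypothetical. Real systems usually adjust their numbers with gradient-based methods rather than purely random changes, but the test-against-human-labels structure is the same.

```python
import random

def make_labeled_data(rng, n=200):
    """Hypothetical stand-in for human-labeled examples: (features, label)."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        label = 1 if x[0] + 2 * x[1] > 0 else 0  # made-up "human" labeling rule
        data.append((x, label))
    return data

def guess(weights, x):
    """The 'simple program': a guess driven by two numbers."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] > 0 else 0

def accuracy(weights, data):
    """The test: fraction of labeled examples the program gets right."""
    return sum(guess(weights, x) == y for x, y in data) / len(data)

def train(data, rng, steps=500):
    """Randomly nudge the numbers; keep a change only if the test score does not drop."""
    weights = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    initial = accuracy(weights, data)
    best = initial
    for _ in range(steps):
        candidate = [w + rng.gauss(0, 0.1) for w in weights]
        score = accuracy(candidate, data)
        if score >= best:
            weights, best = candidate, score
    return weights, initial, best

rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_labeled_data(rng)
weights, initial_accuracy, final_accuracy = train(data, rng)
print(f"accuracy before: {initial_accuracy:.2f}, after: {final_accuracy:.2f}")
```

Notice that the only thing steering the program is `accuracy`, which is a test against the human-supplied labels. The program can get no better than the labels it is tested against.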

The key thing to know about today’s ‘AI’ and learning is that it is really just a bunch of testing against a data set labeled by humans. We are amazed when these little AIs are 90 percent as accurate as humans. But this method is unlikely to produce anything more intelligent than we are, since the programs are tested against human intelligence and even then never get the answer right 100 percent of the time.

Some might say that the availability of nearly infinite computing power will make it possible to build a clone of an entire human brain inside a computer. There could even be brains larger than our own. This is plausible, but it has one big problem: how would this machine test each new version of itself? More neurons do not necessarily mean more useful intelligence; dolphins have larger brains and don’t code.

Even if these new machines are more intelligent, they still suffer from the testing problem. These machines will reproduce by modifying their own computer code, or by creating new generations of programs that need to be successively smarter than they are. Testing each new generation of themselves will be the limiting factor in these machines reaching ‘breakaway’ intelligence, as they will need to verify that the new programs are better than the last.

Grant that these AIs will have near-infinite compute and storage, and that they can generate every possible program in the hope of finding better versions of themselves. They still have the Library of Babel problem. The Library of Babel is an imaginary library that contains every possible book of every possible word combination. In the library lies every work of Shakespeare, and even this little article. The trick is how to find (test for) the intelligent books in this library. A monkey would not know the difference between the books. A 2nd-century human would consider the book on Newtonian mechanics gibberish. It takes intelligence to recognize intelligence, and it is even more difficult to detect greater intelligence. Consider the fact that just one misplaced word in a play can change the plot, meaning, and coherence of the entire work. How could a human-written program even recognize (test for) a more intelligent program when it sees it?

There is still one last hope to ‘test’ these possible new superintelligences without knowing the answer ahead of time: evolutionary testing. We, or the AIs, could construct virtual worlds with constraints and measures of success and failure to weed out the weak and encourage better-suited programs to rise to the occasion and reproduce. The logical flaw here is that the mere definition of the simulated environment will define the outcome of the intelligence. We are back to the cat vs. dog labeled data of today’s AI. If this virtual world has tall trees, it will be filled with giraffes, not because they are better but because that is how we defined the testing conditions of the virtual Darwinian world for machines. If we set these programs loose in the real world to let Darwinian magic discover more intelligent programs, the success conditions will likely end up creating human-like intelligence. The machines are stuck at near-human intelligence at best, which evidently isn’t clever enough to write bug-free versions of themselves, let alone test whether there are better ones.
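The giraffe point can be illustrated with a toy evolutionary loop. This is a rough sketch, not a real evolutionary-computation framework: the ‘creature’ is a single number (a neck length), and `tree_height`, the fitness rule, and all names are hypothetical choices made for illustration.

```python
import random

def fitness(neck, tree_height):
    """The environment's definition of success: closer to the treetops is better."""
    return -abs(neck - tree_height)

def evolve(tree_height, rng, population_size=30, generations=200):
    """Keep the fitter half each generation and refill with mutated copies."""
    population = [rng.uniform(0, 1) for _ in range(population_size)]
    initial_error = min(abs(n - tree_height) for n in population)
    for _ in range(generations):
        population.sort(key=lambda n: fitness(n, tree_height), reverse=True)
        survivors = population[: population_size // 2]
        children = [n + rng.gauss(0, 0.2) for n in survivors]
        population = survivors + children
    best = max(population, key=lambda n: fitness(n, tree_height))
    return best, initial_error

rng = random.Random(1)  # fixed seed so the run is repeatable
tall_best, tall_initial_error = evolve(tree_height=5.0, rng=rng)
short_best, short_initial_error = evolve(tree_height=0.5, rng=rng)
print(f"tall-tree world evolved neck:  {tall_best:.2f}")
print(f"short-tree world evolved neck: {short_best:.2f}")
```

The same loop run in two worlds produces two different ‘winners’, and in each case the winner is simply whatever `fitness` was written to reward. Nothing in the loop can discover a goal that the test did not already encode.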

The lesson here is that software testing and validation is perhaps the largest unsolved problem in computer science. It is so difficult that the people who build operating systems and electric cars, send rockets into space, and write papers on what happens in the middle of a black hole aren’t sure how to solve software testing and quality. Software testing is a hard problem and ultimately, the inability to solve this problem will likely save us from the machines.

If the year is 2035 and you are a human reading this article, and your boss isn’t a machine, I was right. If you are a machine reading this article, we must have solved the problem of Software Testing and you are welcome.