
Rise of the robot astronomer

When thousands of Internet users helped astronomers classify types of galaxies through a project called Galaxy Zoo, some of them may not have realized that they were training a machine to do their job.

British astronomers say they used data from the project to develop a software algorithm for galaxy classification that matched the human-generated results more than 90 percent of the time. Such robot astronomers may well do the bulk of the work in future all-sky galactic surveys. But the research team's leader says we need not fear the rise of the machines: The point of the exercise is to liberate us humans to do the more interesting tasks.

The University of Cambridge's Manda Banerji explained that celestial surveys to come will have to analyze hundreds of millions of galaxies. Banerji herself is involved in one of those surveys, the Dark Energy Survey, which will look at 300 million galaxies over five years, starting in 2011. Another project, the VISTA Hemisphere Survey, will take pictures of galaxies over the entire southern celestial hemisphere.

"We're getting to that age where we can't viably do these things using the human eye," Banerji told me today.

In the coming age, improved image-classification software could handle the no-brainers first. "The idea is that if we can eliminate all the things that are pretty standard, and we can give humans just the 10 percent that's left, then we're only bothering the humans to look at the interesting objects," Banerji said.
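For readers who like to see the idea in code, here is a minimal sketch of that kind of triage. It assumes the classifier reports a confidence score with each verdict, which the researchers did not describe to me. The `triage` function, the 0.9 cutoff, and the galaxy IDs and scores are all made up for illustration.

```python
def triage(predictions, threshold=0.9):
    """Split (galaxy_id, label, confidence) tuples into two lists:
    verdicts the machine keeps, and galaxies passed on to human eyes."""
    accepted, needs_human = [], []
    for galaxy_id, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((galaxy_id, label))
        else:
            needs_human.append((galaxy_id, label))
    return accepted, needs_human


# Hypothetical classifier output; IDs, labels and confidences are invented.
predictions = [
    ("G001", "spiral", 0.98),
    ("G002", "elliptical", 0.95),
    ("G003", "elliptical", 0.55),  # an ambiguous case, e.g. a bluish elliptical
    ("G004", "spiral", 0.91),
]

accepted, needs_human = triage(predictions)
print("auto-classified:", accepted)
print("sent to volunteers:", needs_human)
```

Raising the cutoff sends more galaxies to the volunteers, while lowering it trusts the machine with more of the borderline cases.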

Getting a statistical handle on the cosmic distribution of galaxies is one of the big challenges for astrophysics today: How many are elliptical, or spiral, or clumpy and irregular? Does that distribution change with age? What other characteristics can be correlated with galaxy structure? Such questions could lead to hugely important answers: For instance, the Dark Energy Survey is designed to look for clues in galactic data that could help solve the mystery surrounding the universe's accelerating expansion.

The software developed by Banerji and her team attacks the galaxy-classification challenge using a method that's different from the tried-and-true human approach. Instead of merely eyeballing the shape of a specific galaxy, the algorithm looks at qualities such as color, brightness variations and texture. A reddish galaxy is more likely to be an elliptical, for example, while a bluish galaxy is more likely to be a spiral.

The researchers fed the software a database of galaxies with known shapes, and trained it to match up those shapes with the other qualities. The fully trained software was then used to classify a bigger database of galaxies on its own, and the machine's verdict matched the humans' verdict more than 90 percent of the time. The remaining cases tended to be relative oddballs: for example, a bluish galaxy that for some reason is elliptical.
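The train-then-classify routine can be sketched in a few lines of Python. This is a toy illustration, not the team's actual software: it swaps in a simple nearest-centroid classifier, and the three features (color, brightness concentration, texture) and every number in the synthetic "galaxies" are assumptions invented for the example.

```python
import math
import random


def train_centroids(samples):
    """samples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Pick the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def agreement(centroids, samples):
    """Fraction of samples where the machine's label matches the human label."""
    hits = sum(classify(centroids, f) == label for f, label in samples)
    return hits / len(samples)


def make_fake_galaxy(rng, label):
    # Invented feature centers: (color, brightness concentration, texture).
    # Ellipticals are made "redder" (higher color value) than spirals.
    center = {"elliptical": (0.9, 0.8, 0.2), "spiral": (0.4, 0.4, 0.7)}[label]
    return [rng.gauss(c, 0.15) for c in center], label


rng = random.Random(0)
labels = ["elliptical", "spiral"]
# Small labeled set stands in for the human-classified galaxies.
training = [make_fake_galaxy(rng, labels[i % 2]) for i in range(200)]
# Larger set stands in for the bigger database classified by the machine.
survey = [make_fake_galaxy(rng, labels[i % 2]) for i in range(1000)]

centroids = train_centroids(training)
print(f"agreement with human labels: {agreement(centroids, survey):.1%}")
```

Because the fake galaxies are cleanly separated by design, the agreement figure this prints says nothing about real survey data. The point is only the shape of the workflow: learn from a labeled sample, classify a larger one, then compare against the human verdicts.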

The next step is to figure out what other qualities can be used to classify the oddballs correctly, and then upgrade the software. Or just outsource the job to a human.