That mystery may not last much longer. Researchers from Carnegie Mellon University announced this week that they've developed a method to help uncover the biases that can be encoded in those decision-making tools.

Machine learning algorithms don't just drive the personal recommendations we see on Netflix or Amazon. Increasingly, they play a key role in decisions about credit, healthcare, and job opportunities, among other things.

So far, the workings of those algorithms have remained largely obscure, prompting increasingly vocal calls for what's known as algorithmic transparency, or the opening up of the rules driving that decision-making.

Some companies have begun to provide transparency reports in an attempt to shed some light on the matter. Such reports can be generated in response to a particular incident -- why an individual's loan application was rejected, for instance. They could also be used proactively by an organization to see if an artificial intelligence system is working as desired, or by a regulatory agency to see whether a decision-making system is discriminatory.

But work on the computational foundations of such reports has been limited, according to Anupam Datta, CMU associate professor of computer science and electrical and computer engineering. "Our goal was to develop measures of the degree of influence of each factor considered by a system," Datta said.

CMU's Quantitative Input Influence (QII) measures can reveal the relative weight of each factor in an algorithm's final decision, Datta said, leading to much better transparency than has been previously possible. A paper describing the work was presented this week at the IEEE Symposium on Security and Privacy.

Here's an example of a situation where an algorithm's decision-making can be obscure: hiring for a job where the ability to lift heavy weights is an important factor. That factor is positively correlated with getting hired, but it's also correlated with gender. The question is, which factor -- gender or weight-lifting ability -- is the company using to make its hiring decisions? The answer has substantive implications for determining whether the company is engaging in discrimination.

To answer the question, CMU's system keeps weight-lifting ability fixed while allowing gender to vary, thus uncovering any gender-based biases in the decision-making. QII measures also quantify the joint influence of a set of inputs on an outcome -- age and income, for instance -- and the marginal influence of each.
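The idea of holding one input fixed while varying another can be illustrated with a small sketch. This is not CMU's code: the applicant pool, the two toy hiring rules (`hire_by_lifting`, `hire_by_gender`) and the `influence` function are all hypothetical, and the measure shown is a simplified stand-in for QII. It swaps in other applicants' values for a single input, leaves everything else untouched, and counts how often the decision changes.

```python
from typing import Callable, Dict, List

Applicant = Dict[str, object]

# Hypothetical applicant pool in which lifting ability is correlated with gender.
POPULATION: List[Applicant] = (
    [{"gender": "M", "can_lift": True}] * 3
    + [{"gender": "M", "can_lift": False}]
    + [{"gender": "F", "can_lift": True}]
    + [{"gender": "F", "can_lift": False}] * 3
)


def hire_by_lifting(person: Applicant) -> bool:
    """Toy rule that looks only at lifting ability."""
    return bool(person["can_lift"])


def hire_by_gender(person: Applicant) -> bool:
    """Toy rule that looks only at gender."""
    return person["gender"] == "M"


def influence(model: Callable[[Applicant], bool],
              population: List[Applicant],
              feature: str) -> float:
    """Share of (applicant, donor) pairs whose decision flips when the
    applicant's `feature` is replaced by the donor's value and every
    other input is held fixed."""
    changed = 0
    total = 0
    for person in population:
        original = model(person)
        for donor in population:
            altered = dict(person)              # copy, so other inputs stay fixed
            altered[feature] = donor[feature]   # intervene on one input only
            changed += model(altered) != original
            total += 1
    return changed / total


if __name__ == "__main__":
    for name, model in [("lifting rule", hire_by_lifting),
                        ("gender rule", hire_by_gender)]:
        for feature in ("gender", "can_lift"):
            print(name, feature, influence(model, POPULATION, feature))
```

In this made-up pool both rules hire three of the four men and one of the four women, so the outcomes alone look the same. The intervention separates them: gender has an influence of 0.0 on the lifting rule and 0.5 on the gender rule.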

"To get a sense of these influence measures, consider the U.S. presidential election," said Yair Zick, a post-doctoral researcher in CMU's computer science department. "California and Texas have influence because they have many voters, whereas Pennsylvania and Ohio have power because they are often swing states. The influence aggregation measures we employ account for both kinds of power."
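Zick's election analogy can be made concrete with a toy voting game. The article does not name the aggregation measure the researchers use, so the sketch below assumes one standard choice from cooperative game theory, the Shapley value, applied to a weighted vote. The function name `shapley_power`, the three voters and the quota are all hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from typing import Dict


def shapley_power(weights: Dict[str, int], quota: int) -> Dict[str, Fraction]:
    """Shapley value of each voter in a weighted voting game.

    For every ordering of the voters, the 'pivotal' voter is the one whose
    weight first brings the running total up to the quota. A voter's power
    is the fraction of orderings in which that voter is pivotal.
    """
    players = list(weights)
    pivots = {p: 0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        running = 0
        for p in order:
            running += weights[p]
            if running >= quota:
                pivots[p] += 1
                break
    return {p: Fraction(pivots[p], len(orderings)) for p in players}


if __name__ == "__main__":
    # Hypothetical three-voter game: 100 votes in total, 51 needed to win.
    print(shapley_power({"A": 50, "B": 49, "C": 1}, quota=51))
```

With these made-up numbers, A's size earns it two-thirds of the power, while C, holding a single vote, ends up with the same one-sixth share as B with 49, because C tips the outcome just as often. That is the swing-state effect in miniature.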

The researchers tested their approach against some standard machine-learning algorithms that they used to train decision-making systems on real data sets. They found that QII provided better explanations than standard associative measures for a host of scenarios, including predictive policing and income estimation.

Next, they're hoping to collaborate with industrial partners so that they can employ QII at scale on operational machine-learning systems.

Katherine Noyes has been an ardent geek ever since she first conquered Pyramid of Doom on an ancient TRS-80. Today she covers enterprise software in all its forms, with an emphasis on cloud computing, big data, analytics and artificial intelligence.