Job hunting is a matter of Big Data, not how you perform at an interview

Firms crunching numbers and algorithms offer a hi-tech route to employees who will fit in – and not quit

Daniel Kahneman proved that numerical analysis of a simple psychological questionnaire on its own was far more efficient in predicting successful candidates than the interview-based model. Photograph: Nullplus/Getty Images

How do we end up in the jobs we end up in? And why did we miss those opportunities we had set our hearts on? If most of us look back, the reality is likely to be as fraught with chance as any other aspect of our biography. Our working lives are essentially fictive constructs, born out of the fantasy and chemistry of CV and interview, the lucky break or wrong call, the age-old laws of square pegs and round holes, or, just occasionally, of "perfect fit".

Where such randomness exists now, of course, "big data" – that amalgam of all that we and our fellow digital citizens do online, the gigabyte human traces we bequeath second by second to machines – is certain to follow.

None of us would like to think of our essential self – our talents and skills, traits and quirks, education and experience, those all-important extracurricular passions and hobbies – as being reducible to a series of data points, a set of numbers and correlations. But what if such information could help us find our perfect workplace, our ideal match?

One man trying to bring data to bear on our careers is Alistair Shepherd, an engineering graduate of Southampton University. In 2009 he won a place at Harvard Business School where he planned to develop an idea for a company based on a wave-power innovation he had created. But he was diverted by two things that his professor, Noam Wasserman, said to him. Wasserman is author of The Founder's Dilemmas and a guru of the reasons businesses go wrong.

He told Shepherd that 83% of all new companies failed and that the evidence showed the primary reason for failure in two-thirds of those cases was not the quality of the business idea or lack of funding. It was a failure in the personal business relationship of the company's founders. Shepherd decided he would abandon his engineering project and attempt some social engineering instead.

He is explaining some of this in his shared office on a floor of the Google incubator campus just off "Silicon Roundabout" at the junction of City Road and Old Street in London. The business he is trying to incubate is called Saberr – tagline: "Optimise your workforce" – and it is one of the more interesting entrants into the increasingly crowded field of "people analytics", the attempt to apply big data to human performance, in an effort to optimise productivity, success (and happiness) in the workplace. Shepherd speaks with something of the infectious excitement of the early adopter.

The social science of human interaction at work that Shepherd discovered at Harvard, he suggests, was a minefield of competing psychological models, mostly from the middle of the last century. As a would-be MBA, he was already familiar, for example, with Myers-Briggs (the model, based on the work of Carl Jung, that was initially developed by Katharine Briggs and Isabel Myers to assign appropriate jobs to women entering the workforce during the second world war) as well as the more evidence-based theory of the components of an ideal team researched from observation by Meredith Belbin at Henley College in the 1970s. Neither they, nor their later variations, however, had ever proved particularly useful in predicting real-world business success.

Shepherd looked for a significant source of data on what might make successful teams. "Online dating," he says, "seemed to me a great place to start. It provides a digital record on a very large scale that answers a very simple question: which two people will have a successful relationship?"

He spent months digging through what research he could access of the characteristics of the most successful online matches (which he defined as when both parties committed to cancelling their online dating accounts, settling for what they had). Using the kind of questionnaire that helped to make those matches in the first place – "Do you like horror movies?", "Would you have sex on a first date?" – stripping out as much romance as possible, and combining it with the latest academic thinking in behavioural science, he worked out a rudimentary algorithm for what might spark a successful business relationship.

All he then needed was a forum in which to test his theory, culled from the data, that a measurable "resonance" of shared core values, rather than the grouping of any particular personality types, was the key driver of the most creative partnerships and teams – and that such a resonance could be predicted using analysis based on his blind online questionnaire.

The first place he tried was at a business competition called the Bristol Spark, a week-long competition for young entrepreneurs in which there were eight teams of eight people. The people in the teams had never met before. The idea was to come together to produce business ideas by the end of the week and present them before a panel of investors. Several thousand pounds were at stake.

Before the competition started, Shepherd got permission to have everyone answer the 25 online questions he had formulated from the dating research. Then he was told who was in each team, so worked out their "resonance" and, based on his "pretty scrappy algorithm", ranked the teams in order one to eight, presented the results to the judges in a sealed envelope and asked them to open it after they announced their decision.

Shepherd never met any of the people involved. He had no knowledge of their skills, experience, education, demographic or, crucially, what ideas they were proposing. He had nothing but their answers to questions such as, "Do spelling mistakes annoy you a great deal or just somewhat?" It turned out he was correct not only in predicting which team won, but also the exact ranking of all the other teams.
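The article does not disclose Shepherd's formula, so the following is only a minimal sketch of the sealed-envelope idea under loudly stated assumptions: "resonance" is approximated here as the average share of questions on which each pair of team-mates gave the same answer, and the match between the predicted order and the judges' order is checked with a Spearman rank correlation. The function names, the coded answers and the judges' ranking are all invented for illustration.

```python
from itertools import combinations


def resonance(team):
    """Mean fraction of questions on which each pair of members agrees.

    This is an assumed stand-in, not Saberr's actual measure.
    """
    pairs = list(combinations(team, 2))
    agreement = [sum(a == b for a, b in zip(p, q)) / len(p) for p, q in pairs]
    return sum(agreement) / len(agreement)


def rank_teams(teams):
    """Return team names ordered from highest to lowest resonance."""
    return sorted(teams, key=lambda name: resonance(teams[name]), reverse=True)


def spearman(predicted, actual):
    """Spearman rank correlation between two orderings of the same names (no ties)."""
    n = len(predicted)
    position = {name: i for i, name in enumerate(actual)}
    d2 = sum((i - position[name]) ** 2 for i, name in enumerate(predicted))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Made-up data: each member is a tuple of coded answers to four questions.
teams = {
    "A": [(1, 0, 1, 1), (1, 0, 1, 1), (1, 0, 1, 0)],
    "B": [(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0)],
    "C": [(0, 1, 1, 0), (0, 1, 1, 1), (1, 1, 1, 0)],
}
predicted = rank_teams(teams)   # the "sealed envelope"
judges = ["A", "C", "B"]        # hypothetical judges' ranking
print(predicted, spearman(predicted, judges))
```

A correlation of 1.0 would mean the envelope matched the judges exactly, as Shepherd says happened in Bristol; a value near zero would mean the prediction was no better than chance.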

He has refined his formula and repeated the sealed envelope exercise many times at innovation competitions. "The longest we did was the Microsoft Imagine Cup," he says, "which is an eight-month student development competition for the Windows platform. We have a greater than 95% accuracy in predicting the ranking of teams." Last September, he did his trick at the London Seedcamp competition, which won Saberr its office space at this Google campus, among other things, and the chance to see if it could make money from data-driven clairvoyance.

Shepherd is at the very beginning of this commercial experiment, in the process of persuading companies to let him identify their high performers and engineer their most successful teams. He pleads innocence to the idea that he might just have created a monster – another way for companies to justify with data their "up or out" philosophies – and is some way from making it pay, but his ambition is to feed the growing appetite to apply quantitative rigour to some of the more intangible aspects of human potential.

The advance of algorithms into recruitment and "talent management" – workplace Moneyball – is the latest encroachment of the power of big data in areas traditionally policed, for better and worse, with a large element of human instinct and subjective decision-making. If you trace its history, you get back to Daniel Kahneman, the Nobel prize-winning psychologist, who, as he documented in his game-changing book, Thinking, Fast and Slow, was tasked at 21 with the job of organising recruitment for the Israeli defence force.

He proved that numerical analysis of a simple psychological questionnaire on its own was far more efficient in predicting successful candidates than the interview-based judgment of expert senior officers (for which, wonderfully, the correlation between actual and predicted performance of recruits was precisely zero).

Digital applications of that insight have the advantage of huge data resources and programmable machine intelligence. They also have all of the dangers – to privacy, to the notion of the individual, to long fought-for rights in the workplace and beyond – that blind number-faith in arenas of complex human activity always involves. Any employee knows there are few more chilling sights than time-and-motion operatives with clipboards. As our work and our career paths move online, those anonymous monitors of our progress and productivity (or its perceived lack) are more and more likely to be embedded and alert to our every interaction.

Proponents of big data invite us to think of the information gained from such insight as value neutral. They argue, with the persuasive clarity of the digitally born-again, that it will offer certainty in spheres of doubt. And what more dubiously scientific process is there than that of job recruitment?

Lauren Rivera, a sociologist at Northwestern University in America, spent three years from 2006 studying the recruitment practices of global investment banks, management consultancies and law firms, which spent millions of dollars on apparently objective processes to secure "top talent", and concluded that among the most crucial factors in decision-making were "shared leisure interests". Squash players hired squash players. Golfers chose golfers. "Assessors purposefully used their own experiences as models of merit," she concluded.

Listening to Shepherd talk about "cultural resonance", I wonder how his algorithm would counter such biases?

"To achieve that shared spark," he says, "you use the data to maximise behavioural diversity, but to minimise value diversity." His interest is in the alignment of values rather than the values themselves. "When you see companies with the words 'integrity' or 'trust' written on the wall in big type you know straight away that's a load of nonsense, because how do you measure it? We all say we value trust and freedom but do we, say, value trust more than freedom? It is at those points that the data begins to let you see those fundamental values that allow very different kinds of people to work very successfully together." Shepherd is evangelical about the possibilities. "If you think of your workforce as a machine that delivers all your goals, then for you not to pay mathematical attention to how best it fits together is madness."

That perceived attention deficit is being filled at a rapid rate. The science and the pseudo-science of business performance originated in the post-Henry Ford world of US commerce and it is there that most of the analytics pioneers are pimping their algorithms. Ever since there have been corporations and companies, leaders and managers have been invited to think of their employees – arguably always to the detriment of the latter – as numbers in a system.

The complexity and sheer volume of those numbers are expanding exponentially. Max Simkoff is the co-founder of a fast-growing Silicon Valley company called Evolv whose website keeps a ticker of the number of data points its computers are assessing (489,562,868 as I write and rising by the second).

Evolv – tagline: "The first predictive analytics app to proactively manage your employee workforce" – was created in response to a particular problem. Simkoff and his business partner worked for a small-scale private health company employing the majority of its workers on an hourly rate. The biggest and most expensive challenge the company faced was the fact that on average those entry-level staff stayed less than a year. Simkoff assumed there must be software packages available that analysed employee data and helped employers discover which attributes made the people they hired more likely to stay in the job.

In 2006, he discovered there wasn't, really. "People were relying on intuition; looking at how many jobs someone had had, for example, and trying to decipher from that whether a person would be a productive and long-term employee," he says, with an analyst's disdain for gut instinct.

Little was being measured, so Evolv set about measuring whatever it could. Its customers mostly supply internal company data daily: how many interactions each employee has had, whether they have sold anything, and so on. This is added to information about how long each individual has been at the company, how they have performed, who their managers have been, any pay increases, subjective appraisal and engagement scores, along with "ongoing macro-economic data on everything from regional labour markets to median home prices".

And then, Simkoff explains, with the satisfaction of the statistician: "Every single night our machine learning engine runs hundreds of millions of reports to look at individual differences even within the same company – some explainable by cultural differences, some by regional differences."

By morning, he says: "If a customer has thousands of people in similar job types, our system can predict accurately on a given day which individuals are most likely to quit." In response, Evolv then offers employers "what-if types of analysis" by which if they change certain incentives – a bonus, training scheme, change in environment – they can see exactly what effect it is likely to have on a particular person's behaviour. In this way Evolv advertises average reduced employee attrition rates among its clients, who include one fifth of Fortune 100 companies, of up to 15%.
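Evolv's models are not described in any detail here, so what follows is only a hedged sketch of what a "what-if" analysis can look like in principle. It swaps in a simple logistic model of the probability that an employee quits; the feature names and every coefficient are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical coefficients, invented for illustration; not Evolv's.
WEIGHTS = {
    "months_in_role": -0.05,  # assumed: longer tenure lowers quit risk
    "got_bonus": -0.8,        # assumed: a bonus lowers quit risk
    "commute_km": 0.03,       # assumed: a longer commute raises quit risk
}
INTERCEPT = -1.0


def quit_probability(employee):
    """Logistic estimate of the chance that this employee quits."""
    z = INTERCEPT + sum(WEIGHTS[k] * employee[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))


def what_if(employee, **changes):
    """Return (baseline, counterfactual) quit probabilities after the changes."""
    return quit_probability(employee), quit_probability({**employee, **changes})


worker = {"months_in_role": 6, "got_bonus": 0, "commute_km": 20}
baseline, with_bonus = what_if(worker, got_bonus=1)
print(f"baseline {baseline:.2f}, with bonus {with_bonus:.2f}")
```

The point of the design is that the same fitted model is queried twice, once with the employee's real record and once with a single incentive altered, so the difference between the two numbers is the predicted effect of that incentive.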

As well as apparently inveigling its way into all sorts of (anonymised) aspects of individual employee behaviour, Evolv data undermines certain truisms, among them the idea of the serial job-hopper. "The number of jobs you have previously had," Simkoff says, "and even whether you're employed or not at the time of application, has zero statistical correlation with how successful you will be or how long you will stay." The statistics also show, he suggests, counter-intuitive correlations, including evidence that former prison inmates are among the most reliable employees.

Evolv has not attempted to use its model in professional salaried jobs because, Simkoff says, the data is less reliable as there are rarely hard and fast productivity measures. "Things get squishier in the professional end," he suggests. "But two-thirds of all jobs are entry level, so there is no shortage of opportunity…"

One of the great challenges facing recruiters and job seekers at that squishier end of the market is the vast increase in the number of applications for positions in the few years since the process became digital. Steve Weston, group technology director at the global recruitment firm Hays, suggests that the number of applications per job has risen tenfold in the past five years. Hays sees 30,000 new CVs uploaded every day. Its algorithmic search engine can filter its 250m CV database in less than a second. The machine can learn precise keyword matches for education and experience; it can give precise location matches for individuals; but it struggles to find that Holy Grail of job qualification: culture fit. Weston suggests that is the "10 terabyte" question facing recruiters around the world; one solution, particularly for the new generation of job seekers, might lie in "gamification".

Among many companies attempting that particular quest (including Saberr) is another American startup, Knack, which this month established an office in London. Knack – tagline: "We're reimagining how people discover their potential, from education to work to innovation" – uses immersive online video games to collect data to quantify how creative, cautious, flexible and curious, etc, potential job applicants are, and offers thereby to funnel tens of thousands of applicants for clients. Its Wasabi Waiter game, for example, casts the applicant on service at a sushi restaurant with multiple demands on his time and many orders to fill.

Knack began when its Israeli founder, Guy Halfteck, failed to land a job with a large New York hedge fund after a six-month process of interviews and tests based on the common fallacy that the more information human decision-makers have, the better judgments they can make. Beyond a certain point, the information just becomes noise, Halfteck argues, with a data-driven weight of evidence on his side. He looked for a different model, one that could "transcend resumés, transcend interviews, transcend what people say about themselves and cut to some data that is actually credible …"

Wasabi Waiter is derived from economic game theory and because it aims to reveal how people behave and make decisions in real time "is not about what you say you do, but what you do". The data views the game as a stream of micro-decisions and micro-behaviours and micro problem-solving. "We collect thousands of data points in milliseconds," Halfteck claims, in the way that Big Data believers tend to. Wasabi Waiter was developed by his (handpicked) Ivy League team of neuroscientists, data experts, researchers in artificial intelligence and gaming engineers.

"When you compare our data to, say, an interaction between a hiring manager and a candidate, it is orders of magnitude greater," Halfteck says. He has had much success in persuading companies such as Shell that his culture-fit games, and the algorithm that matches individuals to particular roles and particular organisations, are predictors of the future.

Along with all the obvious caveats that such an innovation might require – not least the fact that the long-term efficacy of its model has yet to be tested – one potential advantage of using games is that, unlike in written or IQ tests, language and culture barriers are largely removed. Of the argument that the games favour game-playing "millennials", Halfteck can point to his in-house research that suggests age has no impact at all on outcomes.

"To explain that, if you look at Flappy Birds or whatever, it is a very broad demographic that successfully plays those games from five-year-olds to people in their 80s." What about that significant element of the population that has never held a games console? What about those who are more stressed by the thought of computers than they are by written tests? Halfteck is predictably adamant that it does not replace one barrier to entry with another.

The premise of Knack games is that "the way you do something is the way you do everything". It claims to be a method of extending and smartly filtering the pool of talent from which companies can draw, by measuring intangibles not covered by paper qualifications.

"At the moment, at the top end of the market companies are competing for people from this very small competitive pool of people from Oxford or Harvard or whatever," Halfteck says. "We are suggesting a way that they can find others who have great or greater potential from way beyond that pool." Steve Jobs is often presented as the case in point. As a college dropout, the Apple founder would never have received an interview with most blue-chip firms.

If the counter arguments are also quickly clear – why trust the evidence of a game over the centuries-old grind of academic pass and fail – Halfteck is not alone in envisaging a future in which "gamification" becomes the norm in applications of all kinds. "This year we will start living side by side with standard tests in many schools and colleges and universities in the States," he says. "Traditional scores only look at written test abilities. They do not begin to measure the factors likely to have most bearing on your success: social skills or personality traits – how you deal with stress, how you collaborate with other people, how much you listen…"

The next logical step in such a philosophy is the extension of such "gamification" to all aspects of work and life. If this prospect sounds alarming – as alarming perhaps as the knowledge that governments and corporations have long been collecting the data of all our private moves on the internet and applying their algorithms accordingly – then that is already upon us. Closely monitoring and publicly sharing one's health information is part of a growing trend of "the quantified self" movement; motto: "Self-knowledge through numbers."

It stems from the belief that the examined life is now made from data points: blood pressure, heart rate, food consumption, hours of sleep, quality of exercise, as well as the nature and range of our real and social media interactions, add up to the precise data-set of who we are. Or so the belief goes. That philosophy also threatens to invade the workplace (and brings to mind a variation of that Neil Kinnock formulation: "I warn you not to be stressed, I warn you not to be unfit, I warn you not to be disabled or ageing or tired or having an off day…").

Alex Pentland, professor of media arts and sciences at MIT, has gone a stage further than virtual world "gamification" in trying to collect data on real-time human performance, and what makes successful teams. He has gone into banks, call centres and research institutions and persuaded workers to wear an "electronic badge" that monitors the tone and range of their interactions and certain elements of body language and self-quantification over a period of months.

"What I decided to do is to try to study this behaviour in the way you would study ants or apes," Pentland tells me. The results of his work were published in Nature and Science as well as the Harvard Business Review.

"We found that you could pretty accurately predict how well the group or individual would do without knowing any of the group or the content of their work."

The data suggested that the success of teams had much less to do with experience, education, gender balance, or even personality types; it was closely correlated with a single factor: "Does everybody talk to each other?"

Ideally this talk was in animated short bursts indicating listening, involvement and trust – long speeches generally correlated with unsuccessful outcomes. For creative groups such as drug discovery teams or for traders at financial institutions, say, the other overwhelming factor determining success was: do they also talk to a lot of people outside their group? "What we call 'engagement' and 'exploration' appeared to be about 40% of the explanation of the difference between a low-performing group and a high-performing group across all the studies," Pentland says.

It was important that a good deal of engagement happened outside formal meetings. From this data, Pentland extrapolates a series of observations on everything from patterns of home-working (not generally a good idea) to office design (open and collegiate) to leadership. "If you create a highly energetic environment where people want to talk to each other right across the organisation then you have pretty much done your job right there."

Doesn't the wearing of a badge monitoring your every move alter the way people behave, I wonder. Don't people become deliberately more gregarious and promiscuous in their conversations because they know Pentland's algorithm is "listening"?

If that application and outcome sounds relatively benign, potential extrapolations are not hard to imagine. A German startup called Soma Analytics – tagline: "Evidence-based mobile programmes to increase employee emotional resilience" – has identified some of them in pioneering a system to measure the early-warning signs of anxiety and sleep deprivation in individuals (and potentially employees) which they aim to sell as a pre-emptive strike against the number one enemy of global productivity – "stress-related illness".

There is no data yet to support the idea that monitoring stress levels might itself be stress-inducing, still less to gauge the invasion of privacy that such "routine" data collection might involve. Perhaps it is sufficient to remember that the company shares its name with Soma, the drug that maintains the World State's command economy in Aldous Huxley's Brave New World.

The enemy of big data is always privacy. In the ideal world of people analytics, algorithms would have access to all possible "data points" of the lives of employees and potential employees. The argument is that this will ultimately be in the best interests of all parties; employers will recruit ideal candidates and form the best teams, employees will find their most productive role and be most fulfilled. You don't have to believe it for a moment.

Shepherd is not alone in his faith that the best real-time data set by which to examine culture fit and resonance and to optimise teams would be "a semantic examination of email, which the company anyway owns" or uses of social media, which it doesn't, but to which it often has access. "To me," he says, "the ultimate system is not a questionnaire or a game or a badge but an analysis of data that we are already producing all of the time. We like to think of ourselves as special and unique, that a computer cannot tell me who I am, which is wrong because a computer mostly can."

If any computer can do this, it might exist on the Google campus at Mountain View, California. Talking on the phone to Sunil Chandra, Google's vice-president of global staffing and operations, I wonder if search central has any tools that can monitor the semantic code of personal emails? "Not that I know of," he says, a little guardedly.

Google, regularly voted the world's best place to work (and not just for its share options), famously employs a team of industrial-organisational psychologists, behavioural economists and statisticians who use tools including the annual "Googlegeist" survey of every employee to experiment with each detail of campus life – the size of dinner plates, the space between screens. It begins with the data-rich process of recruitment ("Hiring is the most important thing we do," Chandra says. "Everyone is involved"). Google receives around two million job applications a year, and each is analysed systematically. "We certainly try to look at all of them," Chandra says. "We think of recruiting as an art and a science. We are known for the analytics side of it, but we really do have people also look at all the applications we get." The data tells them optimum outcomes are the result of four or five interviews. They used to do 10 or 12.

The data also tells them that exam grades are not predictive of performance at Google at all, so they are disregarded. The urban legend used to be that a Google interview was laced with "brain teasers". "How much should you charge to wash all the windows in Seattle?" did for one candidate. "A man pushed his car to a hotel and lost his fortune. What happened?" for another. Chandra says that approach has also been retired. "It was not predictable performance, and therefore not predictable hiring."

So how do they do it? "There is no secret algorithm," Chandra says. "We use structured behavioural interview techniques, rather than any kind of tests, to look for humble leaders, learners who can work in teams." They guard against bias by having all appointments tested in committee. "We look for cognitive ability, learning capacity and leadership capacity, particularly latent leadership. And then of course for what we call Googleyness …"

As Chandra explains this, I find myself running through the relatively few job applications of my own career (even fewer successful ones). In particular, I'm reminded of my interview for a role at the literary magazine Granta, which involved the then editor Bill Buford pouring me a tumbler full of single malt whisky and employing his own behavioural interview technique which began with: "Do you like women?" My mumbled "yes" seemed to go a long way to convincing him I had the precise culture fit he required in his deputy.

How would you define Googleyness, I ask Chandra.

"We think of it as a characteristic where folks can bring their whole self to work," he says.

Can he imagine a situation where the machines can identify Googleyness without intervention from Googlers?

"No," says the Googlegeister in chief, reassuringly. "We still believe there are a lot of things the data will not tell you."

• This article was amended on 13 May 2014. An earlier version said that Google receives around three million, rather than two million, job applications a year.