Ethical and Social Challenges for Machine Learning

Last Thursday I had the great honour of being invited to give evidence to the Royal Society as part of their policy project on Machine Learning. The Machine Learning Working Group, chaired by Professor Peter Donnelly, had organised a day of oral evidence gathering as part of the project in order to help shape their views on how Machine Learning technology might impact UK society.

The day was broken into three sessions, each covering a different aspect of the potential impact of Machine Learning technology.

The first was titled, “Concentration of power and potential inequalities of access in society”, the second “Human-machine/ software interactions and agency in the near term (5-10 years)”, and the third “Transparency vs opaque practices in issues related to trust”.

My own contribution to the day was in the context of the first session, where we were given the task of examining whether developments in machine learning inadvertently benefit particular social groups while failing to consider others. In particular, the focus was on whether there was an imbalance or inequality in public access, technical literacy and skills development around Machine Learning.

It was a fascinating session to be part of. Giving evidence alongside me were Cat Drew from the Government Data Service and Peter Wells from the Open Data Institute. Some of the thoughts and ideas shared in the sessions have provided me with a wealth of fodder for future articles, but for now I’ll be content just to share the questions we were asked, and a few of my thoughts in relation to them.

The areas we were asked to consider were:

Who are the people and organisations taking advantage of machine learning? Who has the data that is being used in machine learning applications?

Without being too specific, at the time of writing I would break down Machine Learning (ML) enthusiasts into three broad camps: individuals who have the skills and are developing potentially disruptive technology within the start-up community, those doing the same in academia, and organisations with sufficient economic power to attract these skills in order to develop solutions.

I personally don’t see access to data being a challenge: anyone with an internet connection can harvest data that might be interesting or useful for their project. To me the challenge is whether those in the ‘have-not’ category really understand the impact that the data-driven economy might have on their business, and whether they have started to put some of the basic data collection and data integration infrastructure in place for future mining and analysis.

Who has the skills to develop and benefit from machine learning? Are there any groups without the necessary skills to benefit from it?

Expanding on the first question, I believe the good news is that the talent pool is steadily increasing and the skills are becoming more available (albeit not quickly enough). The challenge in any market where demand outstrips supply is that only those with sufficient economic resources are likely to attract the talent needed to ensure the success of their project. This poses specific problems for society as a whole.

In essence, ML technology achieves two significant results. Those who successfully deploy it develop an ability to ‘anticipate’ or ‘predict’ the future based on patterns encoded within their data, and they can potentially perform tasks with a far greater degree of time-leverage than was previously possible. The combination of these two results gives a level of competitive advantage that in itself could lead to rapid dominance in their given market. An example of this is how Uber uses predictive analytics to suggest which parts of town might be a good ‘hunting ground’ for drivers looking for jobs. Being more efficient with traffic routes also means that a greater number of jobs can be completed per driver. The combination of both these factors (along with a positive User Experience, and a positive Viral Marketing coefficient) leads to rapid and complete market dominance.

What is known about the views of the public on who is using their data and how it is being used? Is there an awareness of how to access resources about machine learning and where you are opting in/out of your data being used for machine learning?

Potentially very little, but my view (and I believe the consensus of my fellow panellists) was that people in general are more likely to be concerned about whether their data is being gathered, stored and used appropriately than about the fact that it might be mined by a machine.

I referred to my earlier article on the hidden value of data and argued for creating stronger public awareness of the value of personal data, and perhaps for regulation requiring firms to offer consumers the choice of paying for their services with cash rather than by surrendering their data.

One of the committee members offered the counter-argument that consumers do indeed have a choice not to use services that might use their data in ways they don’t agree with or can’t control. I put the point back: how many of us really have a choice not to use services like LinkedIn (in my case) or Facebook (in the case of a teenager)? Tech firms have a duty not to exploit their market position, and policy makers need to recognise this and act where they see abuse in the market.

Is it important to ensure open-source access to data and open-source access to machine learning algorithms?

It’s clear that the Royal Society’s focus for this evidence gathering workshop was on skills and access to technology, but in my view this is not the area of greatest concern. To me, it doesn’t really matter whether a cabbie who has lost his or her livelihood because of Machine Learning (Uber/ Self-Driving cars) has the ability to download an Open-Source library of machine learning algorithms, or indeed has the skills to understand them. What matters is how wealth is redistributed from the cabbie or local taxi firm, and from consumers like the rest of us, to an overseas company owned by high-net-worth investors outside the purview of domestic regulation or taxation.

To me, what’s important is to ensure open-source access to the capital created through the use of machine learning algorithms. If we’re influencing policy decisions, then can’t we somehow incentivise the cabbie to invest in crowdfunded machine learning projects so that he or she at least has a hedge against their income being lost in the near term? Shouldn’t the state be finding ways of ensuring capital created by this technology stays within its control?