AI is learning from our encounters with nature – and that's a concern

By Andrew Robinson, Communications Scientist and Scholar, Australian National University

The idea seems wonderful - a phone app that allows you to take a photo of a plant or animal and receive immediate species identification and other information about it. A “Shazam for nature”, so to speak.

We are building huge repositories of data related to our natural environments, making this idea a reality.

But there are ethical concerns that should be addressed: about how data is collected and shared, who has the right to share it and how we use public data for machine learning.

And there’s a bigger concern – whether such apps change what it means to be human.

Encounters with dandelions

Oliver Sacks, the brilliant neurologist and author, once arranged to take a group of his patients on a field trip to the New York Botanical Garden. One of his patients, a severely autistic young man named Steve, hadn’t stepped outside the facility for years. He never spoke; indeed, the doctors believed him incapable of speech.

In the gardens with Sacks, however, the invigorated Steve plucked a flower, and to the surprise of everyone, uttered the word “dandelion.”

Over the last decade, this affinity so many of us feel for nature – what the famed biologist Edward Wilson termed “biophilia” – has resulted in an explosion of big data. In the Global Biodiversity Information Facility (GBIF, an online database run out of Copenhagen) there are 682,447 records of human encounters with dandelions. Overall, the database holds more than 850 million observations of over a million different species of flora and fauna.

It’s an impressive achievement, a gestating, global catalogue of life. It allows us to see the world in new ways. For example, just this year, thanks to the more than 42,000 recorded sightings from more than 5,000 participants using WhaleShark.org, we’ve gained unprecedented insight into the behaviour of the world’s largest fish species. Or, on a bigger scale, the millions of bird observations generated through an app called eBird have allowed us to visualise the precise migratory routes of over a hundred different bird species.

At the same time, in an outcome largely unforeseen by its early collectors, info-engineers are using the data to train artificial intelligence (AI), particularly computer vision apps, to help us interpret the plants and animals we see around us. And these tools are raising some interesting, sometimes troubling questions.

Joseph Banks in your pocket

In one sense, of course, such tools are magical. The fictional tricorder of Star Trek is a magnificent device, scanning alien life forms, making them familiar. If we had a version on Earth, it’d be the equivalent of a pocket-sized Joseph Banks, a trusty sidekick of discovery, filling us with a sense of confidence and control.

In China the latest version of the Baidu browser (a so-called Chinese Google) comes with a plant recognition feature built into it. Point your camera at a dandelion and you’ll see the Chinese name for it - 蒲公英. Such apps are triggering a new wave of botanical interest among the general population in China.

But there are also questions about these AI tools interfering with our ability - perhaps a human need - to easily transfer our unique nature expertise to, or gain expertise from, other people. Is the amount of resources going into developing AI matched by what we invest in developing ecological literacy within the billions of supercomputers in people’s skulls?

There are questions about data bias. A disproportionate number of data collectors - often called “citizen scientists” - are first world hobbyists, birdwatchers, camera geeks. Typically then, the data comes from a relatively non-diverse sector of society.

There are questions about ownership, data appropriation, human agency. Who’s going to own and control the AI? Will the people whose expertise has trained the AI be fairly acknowledged, respected, rewarded?

Or is all that data, as the US economist Philip Mirowski recently argued, nothing more than “the donation of unpaid work to privately owned entities” - entities who will digest and then regurgitate the information into yet another online product we can’t live without? If you search the terms of popular citizen science apps, you’re unlikely to find any mention of how your data might be used to train AI systems.

Empire building

There’s a sense of déjà vu here. The botanical classification conducted by such scientific luminaries as Carl Linnaeus and Joseph Banks – the book Systema Naturae was a sort of GBIF of its day – is often associated with the big data activity of empire building. As explained by the essayist Anne Fadiman in Collecting Nature, botanists would travel to remote parts of the world, find a species which had been known by a local name for centuries:

[…] rechristen it with a Latin binomial, and presto! It became a tiny British colony.

Subsequent generations, meanwhile, would grow up in a world where the only meaningful descriptions of nature existed in empire-approved systems of classified truth: museums, libraries, the biology labs of universities.

Perhaps we’ll cede control; perhaps we’ll have it wrested away. For the developers of nature identification apps, what incentive exists to disabuse us of the Star Trek tricorder illusion?

The Colorado-based PlantSnap, for example, claims to be training its AI on “50,000 new species per month, and will have every species on Earth covered by the end of 2017”. You could argue this is not just misleading, it’s impossible. A significant portion of plants are yet to be discovered, and far more have yet to be photographed in the wild.

What is human perception?

the state-of-the-art in computer vision is rapidly approaching that of human perception.

But what is human perception? It’s easy to forget that each record in all that training data represents - like Sacks’s autistic patient in New York - a special act of observation, a sudden spark of curiosity, a unique moment of seeing that belongs to the individual.

One thing’s for sure: when it comes to developing AI, there’s an urgent need for more thinking, more consideration, a broader diversity of viewpoints. In developing AI tools, can we program them to value the creative act of human perception - the authentic, the spontaneous, the unpredictable?

Or maybe, as Amy Webb, a tech futurist at New York University, has recently proposed, we should establish data sanctuaries. Here, as in nature reserves, our data could roam wild and free, forever untouched by AI, governments, corporate interests. Perhaps a similar space - or a duration of time between data input and response - is needed to protect our unique relationship with the natural world.

In Shakespeare’s Twelfth Night, the lovelorn bachelor Orsino makes an interesting observation about violets. Their scent, he declares, is like romantic love: it

makes you want everything, but it makes you sick of things a minute later, no matter how good they are.

It’s an astonishing insight, and four centuries later it was scientifically confirmed: the beta-ionone in violets, researchers discovered, produces an anosmic effect in the human olfactory system, allowing you to perceive the scent one moment, only for it to vanish (like romantic love) the next.

This exploration of the natural world - this observing, comparing, playing, discovering, loving - is an impulse that’s core to our humanity, and one, I’d suggest, we should be careful not to lose.