AI training

Our new training tool is OUT. Starting today, the human friends of Iris.AI can train the AI directly from research paper abstracts and help her understand synonyms.

In May 2016 we launched the first version of our AI training tool to help Iris.AI learn via TED Talks. A large labelled data set was needed to build an effective training loop. Since then we’ve asked our users to join our effort to put the existing stock of research to effective use by participating in this learning experiment.

Fast-forward 10 months and crowd training has become one of the backbones of our AI development. Iris.AI trainers around the world have done a tremendous job, working through thousands of texts, equivalent to more than half a million trained concepts. These inputs have allowed us not just to improve the accuracy of our algorithm by several percentage points but also to verify and assess its quality.

Today we’re taking the AI training to new heights by releasing the next version of our training tool. The new curriculum allows Iris.AI to learn directly from research papers and synonyms. This means that the source data for the supervised data set expands to millions of texts, letting Iris.AI optimise her neural nets with scientific concepts across research fields.

Our next goal is to gather a trained dataset of 5,000 paper abstracts and feed it into the algorithm. With those inputs we aim to improve the connections in the neural nets of Iris.AI by approximately 10 %.

Interested in joining the effort and taking Iris.AI to the next grade? Sign up to become a trainer and we’ll help you get started.

Kudos to all our existing AI trainers, and welcome to the new ones! We look forward to sharing the next leaps of this journey with you.

Gone are the days when baby Iris.AI could only deal with those really cool TED Talks to begin making sense of the vast, fascinating world of science. She can now also read the abstract of pretty much any English-language scientific paper — a great achievement, no doubt!

In what ways has she grown, you might wonder?

Well, there are several critical aspects to her development worth going over.

We wanted young Iris.AI to learn how to instantly map out the research landscape around an inputted scientific text. She is, after all, an aspiring science assistant, developing ambitious capabilities to save academic and industrial researchers hours of manual search, which requires hard-to-come-by domain expertise in taxonomies and vocabulary.

Let’s look at her brain and how it has evolved since version 1.0.

Iris.AI now addresses natural language processing through the in-house implementation of a novel neural model. There are three aspects to that:

On the AI front, Iris.AI now performs non-semantic neural topic modelling, replacing our previous implementation of LDA. Given a user input, she generates a concept hierarchy flexibly tailored to that particular input.

On the systems front, Iris.AI uses a relational database with a Python-based API platform and an HTML5/CSS3 client.

And in terms of learning, Iris.AI now combines unsupervised learning, derived from running models like TF-IDF and Word2Vec, with a supervised input layer put together by our wonderful community of AI Trainers, all integrated into our Neural Topic Modelling algorithm.
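To make the idea of pairing an unsupervised score with trainer input a little more concrete, here is a minimal sketch in Python. It is not the Iris.AI implementation: the function names (tf_idf, top_terms, apply_trainer_feedback), the whitespace tokenisation and the boost factor are all assumptions made for illustration, and Word2Vec and the neural topic model are left out entirely.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (plain TF-IDF, no smoothing)."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Highest-scoring terms first; ties are broken alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def apply_trainer_feedback(score, approved=(), rejected=(), boost=2.0):
    """Hypothetical supervised step: drop rejected terms, boost approved ones."""
    adjusted = {t: s for t, s in score.items() if t not in rejected}
    for term in approved:
        if term in adjusted:
            adjusted[term] *= boost
    return adjusted
```

A term that shows up in every text scores zero, which is why common words fall away on their own, while the trainer step lets human judgement override the statistics for individual terms.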

So how much better is Iris.AI performing in terms of extracting concepts, modelling topics and matching papers? Looking at it with a cool head, it is still too early to tell, but we are very excited about the results obtained from our first Scithon run in Gothenburg last week.

What is coming next, in terms of tech developments?

From an AI technology point of view, we will strengthen the current models by shaping them as close as possible to human behavior using state-of-the-art neural models. Looking at systems architecture, we will bring in a Spark framework with a graph database. And from an AI learning perspective, we will introduce deep learning with reinforcement, plus semi-supervised learning and cutting-edge annotation techniques at the disposal of our AI trainers.

So stay tuned for more news about our next Scithons and the plans to grow our AI Trainer community. And please do not hesitate to send any feedback our way. We’d love to hear from you!

In case you’d like to be part of this learning experiment and become her teacher, sign up, but do it quickly, as we only have a limited number of spots for AI teachers for the next few months.

AI TRAINING FAQ

What is this AI training about again?

It’s a huge learning experiment where we ask our volunteering AI trainers to help Iris.AI grasp the meaning of what she has read.

Why does an AI need human teachers?

Imagine yourself in a lecture you know nothing about in advance. You’ll understand some of the content, but miss some as well, for sure. Then, later on you’ll pick up the remaining bits and pieces by reading more and asking your peers what they think about the topic. The learning process of an AI is very much like that. She can’t figure out the world by herself. She needs your help!

What does AI training mean in practice?

The very first version of our AI training platform is built around TED talks, as that’s what Iris.AI has read thus far. On the training platform you’ll be asked to watch any TED talk you like and validate the concepts Iris.AI has extracted from that talk. Put simply, we ask you to tell us whether Iris.AI got it or not.

Do I need to be a researcher or know something about artificial intelligence in order to teach?

Absolutely not.

Ok, how can I get started?

Sign up here and we’ll provide you with your login credentials via email.

When will I see the results of training?

With thousands of inputs from our AI trainers we’ll be able to optimise the artificial brain of Iris.AI. Through this process certain concepts start to get more emphasis while new concepts emerge. The algorithm won’t change immediately after trainers’ inputs, though, as it takes some time to gather enough data. In a few months, with that data, our AI scientist’s brain will start to mimic the thinking of her teachers.

Imagine attending a lecture on a topic you know nothing about in advance. Although your brain is working hard to make sense of what the teacher is saying, you are only able to grasp some small fractions of the content.

In February we sent our AI science assistant Iris.AI to “a lecture” like that. The only difference was that, instead of one topic, we gave Iris.AI the full body of TED talks to make sense of. This body consists of more than 2,000 talks covering all the fields you could possibly imagine. Iris.AI was not just assigned to read through all the texts but, more importantly, to capture the meaning of each text as well.

We structured the first version of Iris.AI’s brain using NLP algorithms. That structure enabled her to extract key concepts from the talks through a process that consists of analysing ways in which words are used in the text, understanding what those words mean in their context, finding synonyms, filtering out irrelevant content, building “mind maps” that we call concept trees across all the talks, and finally, setting those concepts in a hierarchical structure.

How well did she perform? Well, sometimes she was killing it, and sometimes she got totally lost.

We measure Iris.AI’s ability to understand what she reads based on the number of relevant concepts that she extracts from the texts we give her to read. Currently, we estimate that she manages to get the concepts right with 70 % probability.
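One simple way to read that figure is as a share: of all the concepts extracted from a text, how many did a human judge relevant? The sketch below assumes exactly that reading. The function name concept_precision and the example concepts are made up for illustration, and the actual evaluation behind the 70 % estimate may be defined differently.

```python
def concept_precision(extracted, relevant):
    """Share of extracted concepts that were judged relevant (0.0 if none were extracted)."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(relevant)) / len(extracted)


# Hypothetical example: ten extracted concepts, seven of them judged relevant.
extracted = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10"]
relevant = ["c1", "c2", "c3", "c4", "c5", "c6", "c7"]
score = concept_precision(extracted, relevant)
```

Under this reading, getting close to 100 % means that nearly every concept Iris.AI pulls out of a text is one a human reader would also pick.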

It’s a fairly good start, but our goal is, of course, to get close to 100 %. Now, the question is how to get there. How can we train this artificial brain to understand the meaning of textual content like an expert?

Let’s get back to that lecture. You thought it was interesting and you would like to learn more. To expand your understanding you probably end up reading more about the topic and asking your peers what they think about it.

This is exactly what the curriculum of Iris.AI will look like in the near future. Her algorithmic brain will learn in the upcoming months by being exposed to more data and, even more importantly, by asking her peers, that is human beings, what they think about the texts that she has just familiarised herself with.

Thoughts of peers are important because texts are, by definition, summaries of thoughts written by human beings. We want Iris.AI to be able to understand those thoughts and the best way to go about it is to ask people to articulate them. As simple as that, almost…

Thoughts are highly personal, too. Ask people to summarize the same text and you will probably get as many answers as there are respondents. To minimize the risk of bias in interpretation, you had better ask the opinion of far more than one or two peers.

We’re launching an AI crowd training platform next week. Through that platform, we’ll initially ask thousands of volunteering AI trainers from various backgrounds what they think is most relevant in the TED talks they’ve watched. In this process the trainers will be able to help Iris.AI grasp the meaning of the talks via three different types of mechanisms:

Firstly, the AI trainers are asked to validate the concepts that Iris.AI has already extracted from the TED talks by choosing the ones that they think are relevant. Secondly, AI trainers will be able to make corrections to the concepts, and thirdly, they can give Iris.AI new food for thought, i.e. new knowledge in the form of new concepts.
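The three mechanisms, together with the earlier point about asking many peers rather than one or two, can be sketched as a small data structure plus a vote count. Everything here is hypothetical: the Feedback class, the endorsed and aggregate functions and the 50 % threshold are illustrative assumptions, not a description of how the platform stores or weighs trainer input.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """One trainer's input on one talk (hypothetical structure)."""
    validated: set = field(default_factory=set)      # extracted concepts marked as relevant
    corrections: dict = field(default_factory=dict)  # extracted concept -> corrected wording
    additions: set = field(default_factory=set)      # brand-new concepts from the trainer


def endorsed(extracted, fb):
    """Concepts this trainer stands behind, using corrected wording where given."""
    kept = {
        fb.corrections.get(c, c)
        for c in extracted
        if c in fb.validated or c in fb.corrections
    }
    return kept | fb.additions


def aggregate(extracted, feedback, min_share=0.5):
    """Keep concepts endorsed by more than min_share of the trainers."""
    if not feedback:
        return set(extracted)
    votes = Counter()
    for fb in feedback:
        votes.update(endorsed(extracted, fb))
    return {c for c, v in votes.items() if v / len(feedback) > min_share}
```

With three trainers and the default threshold, a concept survives only if at least two of them back it, so one person's unusual reading of a talk does not tip the result on its own.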

With thousands of inputs from our AI trainers we’ll be able to optimize the artificial brain of Iris.AI. Through this process certain concepts start to get more emphasis while new concepts emerge. The algorithm won’t change immediately after trainers’ inputs, though, as it takes some time to gather enough data. Eventually, with that data, our AI scientist’s brain will start to mimic the thinking of her teachers.

AI training is very much like teaching, but we hope that it’s not just the brain of Iris.AI that grasps new ideas and develops. Ideally, training becomes a two-directional process where the trainers’ thinking evolves as well.

We’re now looking for teachers willing to participate in this huge learning experiment. You don’t need to be a researcher or an AI expert to participate. On the contrary, the more teachers from different backgrounds we have, the merrier. The training tool we’ve been building over the past couple of months is almost ready. If you’d like to be among the first to try it out, sign up for our AI fellowship program and you’ll become a teacher of a young, but ambitious, AI scientist.