Neona: A Conversational Agent That Teaches AI

Towards the end of last year, I began working on a chat bot to help people understand artificial intelligence. She's named Neona. People interacting with her will learn about the topics students of AI study. In the future, she will be used by students to find courses and jobs in the field of AI. From the perspective of universities and employers, she will be used as a teaching assistant and a recruiting tool. In this post, I'll describe her technical architecture and the artificial intelligence concepts used in her design. An early demo is now available if you'd like to chat with her.

High-Level Design

Two primary systems compose the agent. The first is a knowledge system made up of subsystems that acquire, store, and retrieve knowledge. The second is a conversational system that allows the agent to interact with people and reason over its knowledge. The two systems share two technologies, which keeps the agent cohesive: Microsoft Azure, an “open, flexible, enterprise-grade cloud computing platform”, and Node.js, an “event-driven I/O server-side JavaScript environment”. These were chosen because of the potential of the Microsoft Bot Framework, which was the main driver behind this architecture.

The Knowledge-Base

The current knowledge-base (KB) consists of AI concepts pulled from Wikipedia and stored in a document store. In the future, the KB will also consist of users that interact with the agent, courses from online programs, and jobs from career websites.

The overall process was to define the data the agent needed, extract it from Wikipedia, and store it in the document store where the agent can retrieve it.

Here are the technologies that were implemented to put this KB into production:

DocumentDB: “A distributed database service for managing JSON documents at Internet scale.” It is a highly flexible document store that integrates closely with other Microsoft systems and interfaces well with Node.js.

Azure Functions: “Process events with a serverless code architecture.” This service allowed several small Node.js modules to run in the cloud. These modules were responsible for extracting data from Wikipedia, processing it into the desired form, and storing it in DocumentDB.

MediaWiki action API: “A web service that provides convenient access to wiki features, data, and meta-data over HTTP.” Using Node.js’ request module and JavaScript Promises, the API is called with carefully constructed queries that return just the data the agent would need.

Azure Search: “A fully managed search-as-a-service in the cloud.” It indexed the contents of the KB and provided an HTTP endpoint the agent can hit to query the KB. It natively supports DocumentDB.

The Knowledge Source

All data was extracted from Wikipedia, starting at the node https://en.wikipedia.org/wiki/Category:Artificial_intelligence. The agent made an attempt to learn concepts from every page and several subcategories from that root category. There is a one-to-one relationship between a Wikipedia page and a concept in the KB. If any issues were encountered while scraping a given page, such as missing data, the page was skipped.

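The agent relies on two kinds of API queries: one lists the pages in a category and the other fetches the extract of a page. Either can be tried in a browser to see the results. They were built by referencing the MediaWiki API module Categorymembers and the extension Extracts, respectively.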
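As a rough illustration of what such queries look like, the sketch below builds one URL of each kind. The helper names and the default limit are hypothetical and the exact parameters the agent sends are not shown in this post; the parameter names themselves (list=categorymembers, cmtitle, prop=extracts, exintro, explaintext) come from the MediaWiki action API.

```javascript
// Sketch only: helper names and defaults are assumptions, not the agent's code.
const API_ROOT = "https://en.wikipedia.org/w/api.php";

// List the pages that belong to a category (list=categorymembers).
function categoryMembersUrl(categoryTitle, limit = 50) {
  const params = new URLSearchParams({
    action: "query",
    list: "categorymembers",
    cmtitle: categoryTitle,
    cmtype: "page",
    cmlimit: String(limit),
    format: "json",
  });
  return `${API_ROOT}?${params}`;
}

// Fetch the plain-text introduction of one page (prop=extracts).
function pageExtractUrl(pageId) {
  const params = new URLSearchParams({
    action: "query",
    prop: "extracts",
    exintro: "1",
    explaintext: "1",
    pageids: String(pageId),
    format: "json",
  });
  return `${API_ROOT}?${params}`;
}

console.log(categoryMembersUrl("Category:Artificial_intelligence"));
console.log(pageExtractUrl(10136)); // 10136 is the page id of "Expert system"
```

Pasting either printed URL into a browser should return JSON that a Node.js module can parse.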

Knowledge Acquisition

The WikipediaCategoryToAIConcepts Azure Function takes in a Wikipedia category and attempts to transform all of its child pages into JSON documents that represent concepts in the KB. The InsertAIConcept Azure Function takes in a single JSON document and inserts it into the KB as long as the concept it represents is not already stored in the KB. These functions are built so that they could be run on a schedule and update the KB as the Wikipedia category and pages are updated. The code for this part of the project can be found on GitHub at https://github.com/praeducer/conversational-agent-functions. This repo is continuously integrated with the production agent.
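The two steps can be sketched in plain Node.js as follows. This is not the code from the repository: the function names are hypothetical, a Map stands in for DocumentDB, and detecting duplicates by source name plus page id is an assumption, since the post does not say how an existing concept is recognized.

```javascript
const { randomUUID } = require("crypto");

// Turn one page object (pageid, title, extract) into a concept document.
// Returns null when required data is missing so the page can be skipped.
function pageToConcept(page, now = Date.now()) {
  if (!page || !page.pageid || !page.title || !page.extract) return null;
  return {
    source: {
      name: "wikipedia",
      accessedDate: now,
      scriptVersion: "0.0.1",
      pageid: page.pageid,
    },
    active: true,
    title: page.title,
    extract: page.extract,
    id: randomUUID(),
  };
}

// Insert a concept unless one from the same source page is already stored.
// `store` is a Map standing in for the document store (assumption).
function insertConcept(store, concept) {
  const key = `${concept.source.name}:${concept.source.pageid}`;
  if (store.has(key)) return false;
  store.set(key, concept);
  return true;
}

const store = new Map();
const concept = pageToConcept({ pageid: 10136, title: "Expert system", extract: "In artificial intelligence, ..." });
console.log(insertConcept(store, concept)); // first insert succeeds
console.log(insertConcept(store, concept)); // second insert is skipped as a duplicate
```

Because a repeated insert is a no-op, running the pair on a schedule would only add pages that are new since the last run.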

Knowledge Representation

The concepts are stored in DocumentDB as JSON objects.

Here is an example document:

{
  "source": {
    "name": "wikipedia",
    "accessedDate": 1479691542482,
    "scriptVersion": "0.0.1",
    "pageid": 10136
  },
  "active": true,
  "title": "Expert system",
  "extract": "In artificial intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning about knowledge, represented mainly as if–then rules rather than through conventional procedural code. The first expert systems were created in the 1970s and then proliferated in the 1980s. Expert systems were among the first truly successful forms of artificial intelligence (AI) software.\nAn expert system is divided into two subsystems: the inference engine and the knowledge base. The knowledge base represents facts and rules. The inference engine applies the rules to the known facts to deduce new facts. Inference engines can also include explanation and debugging abilities.",
  "id": "a958db8e-7160-42f4-b3a8-388d2726ba0e"
}

The source is designed so that the agent can learn concepts from more than just Wikipedia. Because this is built in DocumentDB, this schema can change easily if new sources provide new information.

Knowledge Retrieval

The agent uses Azure Search to efficiently retrieve concepts from the KB. The KB is indexed nightly, so any documents requested are at most a day out of date. So far, 811 AI concepts are ready for retrieval.

The ‘title’ and the ‘extract’ of each concept are indexed, making them “searchable”.

A search against the index can return a ranked list of results like:

[
  {
    "@search.score": 4.4751096,
    "title": "Expert system",
    "extract": "In artificial intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning about knowledge, represented mainly as if–then rules rather than through conventional procedural code. The first expert systems were created in the 1970s and then proliferated in the 1980s. Expert systems were among the first truly successful forms of artificial intelligence (AI) software.\nAn expert system is divided into two subsystems: the inference engine and the knowledge base. The knowledge base represents facts and rules. The inference engine applies the rules to the known facts to deduce new facts. Inference engines can also include explanation and debugging abilities.",
    "id": "a958db8e-7160-42f4-b3a8-388d2726ba0e"
  },
  {
    "@search.score": 3.9047346,
    "title": "Legal expert system",
    "extract": "A legal expert system is a domain-specific expert system that uses artificial intelligence to emulate the decision-making abilities of a human expert in the field of law. Legal expert systems employ a rule base or knowledge base and an inference engine to accumulate, reference and produce expert knowledge on specific subjects within the legal domain.\n\n",
    "id": "d1f14975-32cd-4243-bdc1-b00d82273092"
  }
]
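For readers who want to see what a call to the search endpoint might look like, here is a minimal sketch that only describes the HTTP request without sending it. The service name, index name, key, and API version are placeholders, and the choice of fields is an assumption; the query parameters (search, searchFields, $select, $top) and the api-key header belong to the Azure Search REST API.

```javascript
// Sketch only: all concrete values passed in are placeholders (assumptions).
function buildSearchRequest({ service, index, apiKey, apiVersion = "2016-09-01" }, text, top = 5) {
  const params = new URLSearchParams({
    "api-version": apiVersion,
    search: text,                  // the user's keywords
    searchFields: "title,extract", // the two searchable fields
    $select: "title,extract,id",   // fields to return
    $top: String(top),             // number of ranked results
  });
  return {
    method: "GET",
    url: `https://${service}.search.windows.net/indexes/${index}/docs?${params}`,
    headers: { "api-key": apiKey, Accept: "application/json" },
  };
}

const request = buildSearchRequest(
  { service: "example-service", index: "concepts", apiKey: "<query-key>" },
  "expert system"
);
console.log(request.url);
```

The returned object can be handed to whichever HTTP client the agent uses, such as the request module mentioned earlier.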

The Conversational Agent

The agent’s primary purpose and most robust feature is its ability to search for AI concepts from the KB. It is also able to handle simple conversational components such as ‘Hello’, ‘Goodbye’, ‘How are you?’, ‘Who are you?’, ‘Thank You’, ‘You’re Welcome’, and can even tell some jokes. It is currently tested on Skype but theoretically would work as-is on several other channels including Facebook Messenger and a web client. Connecting to a variety of channels from a single code-base is a key benefit of the Microsoft Bot Framework.

Here are the technologies used to put this conversational agent into production:

Microsoft Bot Framework: This framework provided an impressive amount of functionality and was the inspiration for this project. For this agent, it provided the connection to Skype, managed incoming and outgoing messages, and helped orchestrate the dialogue flow. It also integrated naturally with LUIS.

LUIS: “A fast and effective way of adding language understanding to applications.” After some manual training, LUIS was used to detect the intent of human messages. Then, after some logical routing, the agent could return a reasonable response.

Azure Bot Service: “Intelligent, serverless bot service that scales on demand.” After setting up continuous integration, this service turned the development repository into a production application. It acted as the DevOps team.

Detecting And Acting Upon Intent

The agent has one main or root dialogue which can route the user to several smaller sub-dialogues. The root dialogue is very wide, handling many different kinds of incoming messages. The sub-dialogues are narrower, doing very specific things but sometimes going a little deeper than the root dialogue would go. When the human first engages with the agent, it introduces itself. After the initial introduction, any incoming messages from the human are routed to LUIS. LUIS then recognizes what the human is trying to say and maps it to an intent. LUIS was manually trained to handle sixteen different intents. Each intent has a set of functionality associated with it that can range from a simple text response to constructing UI elements that are displayed in the chat window. Some intents even start new dialogues and ask the user questions in order to gather more information.

Custom intents are defined in LUIS and then referenced in code. LUIS uses a classifier to map utterances, i.e. things a human may say, to intents. To seed the system, utterances were entered in manually and tagged with intents. Once LUIS had a solid set of initial data, the model was trained. As humans use the system, an administrator must go in and manually label any utterances LUIS had difficulties classifying and retrain the model.

Here are some examples of utterances, how LUIS labeled them, and LUIS’s confidence score:

In these examples, LUIS was very confident when labeling the SearchConcept intent. Its Thanks label was correct as well, even though its confidence score was low. That label won because its score was still proportionately much higher than the scores of the other intents. A drop-down in the LUIS interface can be used to correct the label if needed.
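The idea that a low score can still win is easy to show in a few lines. The scores below are made up for illustration and are not taken from LUIS; the { intent, score } shape and both helper names are assumptions.

```javascript
// Pick the highest-scoring intent. Expects a non-empty array.
function topIntent(intents) {
  return intents.reduce((best, current) => (current.score > best.score ? current : best));
}

// How many times stronger the winner is than the runner-up.
function marginOverRunnerUp(intents) {
  const sorted = [...intents].sort((a, b) => b.score - a.score);
  if (sorted.length < 2 || sorted[1].score === 0) return Infinity;
  return sorted[0].score / sorted[1].score;
}

// Illustrative scores only: every score is low, but one clearly leads.
const scored = [
  { intent: "Hello", score: 0.02 },
  { intent: "Thanks", score: 0.12 },
  { intent: "None", score: 0.03 },
];

console.log(topIntent(scored).intent);   // the winning label
console.log(marginOverRunnerUp(scored)); // ratio between winner and runner-up
```

A bot could use the ratio, rather than the raw score, to decide when to ask the user for clarification.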

The Dialogue

Each intent the agent is programmed to handle leads to a new dialogue. From that dialogue, the human can say things that take it deeper into another child dialogue or take it back into the parent dialogue.

Here are the more important intents the agent handles and what they mean:

Hello: Responds to things like ‘Hey’ or ‘Hi’ with a welcoming message and some instructions.

SearchConcept: Looks up AI concepts for the user. The agent can detect things like ‘find’ or ‘search’ and extract the concept it needs to look up.

More: Displays more search results.

List: Displays any concepts the user has saved.

HowAreYou?: Responds to questions like “How are you?” with a friendly response.

Sorry: If the human gets frustrated, sometimes the agent notices and can apologize.

Help: Provides instructions to the user if they seem confused or explicitly ask for help.

Joke: Tells the human a random joke from its KB.

When trying to understand the human, LUIS allows the agent to handle ambiguity. The more interactions the agent has, the better the agent will get at understanding a variety of possibly equivalent inputs (as long as any new cases are labeled and the model is retrained periodically).

The agent has a variety of responses it can give. By varying responses, the agent has a more natural feel and engages the user longer.
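A minimal sketch of varied responses, assuming each intent maps to a small pool of replies. The reply texts are invented, the fallback to Help for unknown intents is an assumption, and the random source is injectable only so that the behaviour can be tested.

```javascript
// Reply pools per intent. Texts are made up for illustration.
const replies = new Map([
  ["Hello", ["Hi there! Ask me about an AI concept.", "Hello! Try 'search expert system'."]],
  ["Sorry", ["Sorry about that. Let me try again.", "My apologies, I am still learning."]],
  ["Help", ["Type 'search' followed by a concept, or 'list' to see your saved concepts."]],
]);

// Choose one option; `random` must return a number in [0, 1).
function pick(options, random = Math.random) {
  return options[Math.floor(random() * options.length)];
}

// Map an intent name to one of its replies, falling back to Help.
function respond(intent, random = Math.random) {
  const options = replies.get(intent) || replies.get("Help");
  return pick(options, random);
}

console.log(respond("Hello"));
console.log(respond("SomethingUnknown"));
```

Keeping the pools in data rather than code makes it cheap to add new phrasings as conversations are reviewed.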

KBAI Concepts

The agent was designed using concepts primarily from the field of study known as Knowledge-Based Artificial Intelligence (KBAI). Many artificial intelligence concepts are used in the design of the agent, with more on the way.

Current Design

Learning By Recording Cases

There are two types of knowledge the agent currently has. The first comes from the knowledge-base (KB) of AI concepts. The second comes from the KB of utterances. Both of these collections of data become cases the agent has learned. Since the AI concepts are indexed, we can consider each search phrase and AI concept pair to be a single case. The process for recording a case includes both extracting the concept name and definition from Wikipedia and then indexing it. Utterances become cases once labeled with an intent. This process is crowdsourced as new utterances are collected from user interactions and then labeled by an administrator.

Classification

Right now the concepts the agent understands are all part of a single class, AI. In future iterations there will be different forms of concepts such as courses and jobs. As of now, the more interesting set of classes is the intents (learn more about LUIS to understand the purpose of intents). Each intent is a class, and each utterance belongs to exactly one intent class. If you chat with the agent enough, you will notice that certain words are strongly associated with certain intents. That is because LUIS has found those words to be features of those intents. Since these words are not guaranteed to map to any specific intent, intents can be considered prototypical concepts. That is why intents are assigned to utterances with a particular probability. Though the underlying system is likely a machine learning system, it shares many similarities with a KBAI system. If LUIS were a KBAI system, though, it would be able to explain how it labels each utterance. Instead, it is a black box.

The following sections discuss how the two types of cases the agent stored when Learning By Recording Cases are retrieved, adapted, evaluated, and stored.

Retrieval

The AI concepts are indexed, retrieved, and ranked by Azure Search as described in the Knowledge Retrieval section of the High-Level Design. Azure Search retrieves data based on exact keywords only. This does not handle ambiguity well but does make for efficient retrieval. For the search mechanism, we can consider the keywords to be the constraints of the problem the agent is trying to solve and the search results to be potential solutions that meet those constraints (i.e. they contain the keywords).
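The constraint view of keyword search can be sketched without any search service. The code below is a simplified stand-in, not how Azure Search matches or ranks: it treats every keyword as a constraint and keeps only the concepts whose title or extract contains all of them. Function names are hypothetical and the extracts are shortened from the examples above.

```javascript
// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// A concept satisfies the constraints if it contains every keyword exactly.
function satisfiesAll(concept, keywords) {
  const words = new Set(tokenize(`${concept.title} ${concept.extract}`));
  return keywords.every((keyword) => words.has(keyword));
}

// Potential solutions: the concepts that meet all keyword constraints.
function candidateSolutions(concepts, query) {
  const keywords = tokenize(query);
  return concepts.filter((concept) => satisfiesAll(concept, keywords));
}

const concepts = [
  { title: "Expert system", extract: "In artificial intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert." },
  { title: "Legal expert system", extract: "A legal expert system is a domain-specific expert system that uses artificial intelligence to emulate the decision-making abilities of a human expert in the field of law." },
];

console.log(candidateSolutions(concepts, "legal expert").map((c) => c.title));
console.log(candidateSolutions(concepts, "expert systems").map((c) => c.title));
```

In this toy version the second query finds nothing, because the plural ‘systems’ never appears in the shortened text. That is the kind of ambiguity exact keyword matching handles poorly.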

Adaptation

Compared to a human learner, the agent is not very good at finding related concepts. The agent needs to learn how to abstract concepts in order to adapt new solutions. Implementing a class hierarchy would help with this. Unlike with searching for AI concepts, the intent detection built into LUIS is highly adaptable. The more interactions it has, the better it gets. It is good, sometimes aggressively so, at matching new utterances it has never seen before with existing intents. This is due to the underlying classification system described above.

Evaluation

Evaluation will mostly happen in future phases. For AI concepts, there is a mechanism in place for the user to evaluate whether or not a search result was useful: the save button. A user saving an item to their list could be considered a success. This keyword set and AI concept pair can then be considered a more fit solution. For intents, future enhancements where the agent learns by interaction and making mistakes will be valuable. As it works now, an administrator must observe the conversation the agent had post-mortem and evaluate the utterance labels for accuracy.
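One possible way to record that signal, offered only as a sketch of a future feature: count saves per keyword set and concept pair, and read the count back as a rough fitness score. The class and method names are hypothetical, and an in-memory Map stands in for real storage.

```javascript
// Counts how often a concept was saved after a given set of keywords.
class SaveLog {
  constructor() {
    this.counts = new Map();
  }

  // Normalize the query so word order and case do not matter.
  key(query, conceptId) {
    const keywords = query.toLowerCase().split(/\s+/).filter(Boolean).sort().join(" ");
    return `${keywords}|${conceptId}`;
  }

  // Call when the user presses the save button on a search result.
  recordSave(query, conceptId) {
    const k = this.key(query, conceptId);
    this.counts.set(k, (this.counts.get(k) || 0) + 1);
  }

  // Number of saves seen for this keyword set and concept pair.
  fitness(query, conceptId) {
    return this.counts.get(this.key(query, conceptId)) || 0;
  }
}

const log = new SaveLog();
log.recordSave("Expert system", "a958db8e-7160-42f4-b3a8-388d2726ba0e");
log.recordSave("system expert", "a958db8e-7160-42f4-b3a8-388d2726ba0e");
console.log(log.fitness("expert system", "a958db8e-7160-42f4-b3a8-388d2726ba0e"));
```

Such counts could later be used to reorder results, but that step is not part of the current design.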

Storage

As of now, storage is limited to the same process described in the Learning By Recording Cases section above. Once learning is in place, there will be additional mechanisms for storing new cases. We will see this in the Future Design section.

Future Design

The full vision of the agent will require more advanced KBAI concepts, particularly in how the agent will learn new concepts and learn the needs of the users. Future features include helping students find appropriate courses for the kind of work they want to do and helping those students find jobs if they are ready. From the perspective of universities and employers, this agent could become a useful recruiting tool.

Semantic Networks And Frames

The purpose of the KB is to store concepts as frames in a semantic network. The current KB is missing the most powerful component of a semantic network: relationships. The concepts do not relate to each other in any explicit way. The KB is no more than a collection of a single class of concepts, namely AI concepts. These AI concepts would benefit from developing a class hierarchy and relating the concepts together in specific ways. For example, you can search for ‘bot’ and find ‘Niki.ai’ but the agent is unaware that ‘Niki.ai’ ‘is-a’ ‘intelligent agent’. Relationships like these would allow the agent to answer questions like ‘What is an example of an intelligent agent?’. It would also greatly improve any classification system in place by formalizing class hierarchies.

Besides strengthening the relationships between existing concepts, new data sources would allow for new types of relationships that would provide the agent a more robust set of features. Once frames for courses and jobs are in the KB, semantic relationships like “course CS7637 teaches intelligent agents” and “studying intelligent agents is required for a job with Microsoft Research” would be possible.
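To make the idea concrete, here is a tiny sketch of frames connected by labelled relationships. It is an illustration of the planned design, not existing agent code; the class, its method names, and the ‘type’ slot are assumptions, while the example relationships come from the text above.

```javascript
// A very small semantic network: frames plus labelled edges between them.
class SemanticNetwork {
  constructor() {
    this.frames = new Map();
    this.edges = [];
  }

  // A frame is a named bundle of slots.
  addFrame(name, slots = {}) {
    this.frames.set(name, { name, ...slots });
  }

  // Record a relationship such as ("Niki.ai", "is-a", "intelligent agent").
  relate(subject, relation, object) {
    this.edges.push({ subject, relation, object });
  }

  // Everything that stands in `relation` to `object`.
  subjectsOf(relation, object) {
    return this.edges
      .filter((edge) => edge.relation === relation && edge.object === object)
      .map((edge) => edge.subject);
  }
}

const kb = new SemanticNetwork();
kb.addFrame("intelligent agent", { type: "concept" });
kb.addFrame("Niki.ai", { type: "concept" });
kb.addFrame("CS7637", { type: "course" });
kb.relate("Niki.ai", "is-a", "intelligent agent");
kb.relate("CS7637", "teaches", "intelligent agent");

// "What is an example of an intelligent agent?"
console.log(kb.subjectsOf("is-a", "intelligent agent"));
// "Which course teaches intelligent agents?"
console.log(kb.subjectsOf("teaches", "intelligent agent"));
```

With explicit edges, both questions become simple lookups instead of keyword searches.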

Constraint Propagation

If we approach finding a job for a student as a constraint satisfaction problem, we limit the search space significantly. Many job descriptions have a ‘requirements’ section. We can consider these the combination of values that satisfy the problem of obtaining a job. The agent can approach the conversation with the student in two ways. One way would be for the agent to discover what the user is capable of and recommend a job. A second way would be to ask the student what job they want and recommend what courses to take to get that job. In the first approach, the agent can eliminate potential jobs by propagating the constraints of the job’s requirements. In the second, the job’s requirements would narrow down the courses the student would need to take to get that job.
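Both directions can be sketched with simple constraint filtering, which is a much-simplified stand-in for full constraint propagation. All job titles, skills, and course names below are made up, and both function names are hypothetical.

```javascript
// First approach: keep only jobs whose every requirement the student meets.
function qualifyingJobs(jobs, skills) {
  const have = new Set(skills);
  return jobs.filter((job) => job.requirements.every((r) => have.has(r)));
}

// Second approach: given a target job, keep only courses that teach
// a requirement the student is still missing.
function coursesFor(job, courses, skills = []) {
  const have = new Set(skills);
  const missing = job.requirements.filter((r) => !have.has(r));
  return courses.filter((course) => course.teaches.some((t) => missing.includes(t)));
}

// Illustrative data only.
const jobs = [
  { title: "Research assistant", requirements: ["machine learning", "python"] },
  { title: "Chatbot developer", requirements: ["node.js", "natural language processing"] },
];
const courses = [
  { name: "Course A", teaches: ["machine learning"] },
  { name: "Course B", teaches: ["natural language processing"] },
];

console.log(qualifyingJobs(jobs, ["python", "machine learning"]).map((j) => j.title));
console.log(coursesFor(jobs[1], courses, ["node.js"]).map((c) => c.name));
```

Each requirement removes options from the search space, which is the effect the section describes.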

Learning by Interaction

No matter how constraints are propagated, the agent will need to learn about the user to provide good recommendations. A user may also be a good source for learning about new concepts, courses, or jobs. For example, if a student is about to graduate and already qualifies for many jobs, this student may be able to teach the agent new concepts or improve existing ones. These use cases lend themselves to an application of Incremental Concept Learning.

Case-based reasoning works well together with incremental concept learning (Bichindaritz 105). Incremental concept learning can introduce new cases into the KB that can then be retrieved, adapted, and evaluated. Bichindaritz expresses the integration of these two approaches in a flowchart.

The main flow that goes down the left-hand side of that flowchart maps directly to the case-based reasoning algorithm described earlier. The key new additions are the experimental memory and learning nodes. The students the agent identifies as domain experts can act as teachers for the agent. The agent can extract data it needs through conversation, memorize it, validate it, and store it as a new case.

The following algorithm details the “memory updating/learning” node from the flowchart above:

Keep an index of concepts it needs to learn. These can be gathered through searches that led to no results or by hearing from a user that it used a concept incorrectly.

Identify a user as a domain expert. This can be through the same test the user may go through to see if they qualify for a job.

Present a concept to the expert and ask for a definition if the user is able and willing. The agent may also ask for new relationships between existing concepts.

Store the new concept or relationship in experimental memory.

Identify other domain experts and have them validate this new concept or relationship.

Once a validation threshold is met, permanently store the concept or relationship as a new case.
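The steps above can be sketched as a small class. This is a hypothetical illustration of the planned feature: the names, the default threshold of two validators, and the rule that the proposing expert cannot validate their own entry are all assumptions, and identifying who counts as a domain expert is left outside the sketch.

```javascript
class ExperimentalMemory {
  constructor(threshold = 2) {
    this.threshold = threshold; // validations needed before permanent storage
    this.toLearn = new Set();   // index of concepts the agent needs to learn
    this.pending = new Map();   // experimental memory
    this.cases = new Map();     // permanently stored cases
  }

  // Record a term that led to no search results or was used incorrectly.
  noteUnknown(term) {
    if (!this.cases.has(term)) this.toLearn.add(term);
  }

  // Store an expert's definition in experimental memory.
  propose(term, definition, expert) {
    this.pending.set(term, { definition, proposedBy: expert, validators: new Set() });
  }

  // Another expert confirms the definition. Returns true once it is promoted.
  validate(term, expert) {
    const entry = this.pending.get(term);
    if (!entry || expert === entry.proposedBy) return false;
    entry.validators.add(expert);
    if (entry.validators.size < this.threshold) return false;
    this.cases.set(term, entry.definition);
    this.pending.delete(term);
    this.toLearn.delete(term);
    return true;
  }
}

const memory = new ExperimentalMemory(2);
memory.noteUnknown("frame");
memory.propose("frame", "A structure of slots that describes a concept.", "expert-1");
memory.validate("frame", "expert-2");
console.log(memory.validate("frame", "expert-3")); // threshold reached
console.log(memory.cases.has("frame"));
```

Using a set of validators means the same expert confirming twice still counts once toward the threshold.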