TensorFlow Chatbot with Twitter and Reddit Datasets

In 1950, Alan Turing proposed a test of natural conversation, now known as the Turing Test. Since then, many methods have been used to build programs that try to pass it. The Loebner Prize is an annual competition built on the premise that "if the responses from the computer were indistinguishable from that of a human, the computer could be said to be thinking" [1]. This year, a chatbot named Mitsuku earned the top prize [2]. This project applies a new method to create a chatbot that can potentially compete for the prize.

There have been successful attempts at building chatbots with TensorFlow, such as SpeakEasy AI [3]. The project can draw on the publicly available source code [4] and tutorials [5] to create and expand functionality. SpeakEasy AI's implementation learns from a large Reddit dataset, whereas other implementations have used movie scripts [6]. The project can recreate SpeakEasy AI and train it on different datasets.

Deliverables are:

[a] TensorFlow Chatbot trained on Twitter and Reddit datasets

[b] Web Interface to use the chatbot

[c] Software to automate data mining of Twitter and Reddit datasets

[d] Step by Step Tutorial on implementing a chatbot with automated datasets on a server with a web interface.

Here we provide further details on the deliverables:

[Chatbot]: Software able to compete for the Loebner Prize. Examples include Mitsuku [2].

[Twitter Dataset]: A dataset of the public tweets collected over a time window `t`, so that its size grows with `t`. See Twitter Search API [7]. The dataset should be kept small, yet large enough to train the TensorFlow Chatbot. The Twitter Search API may yield a sufficiently large dataset in a few hours.

[Reddit Dataset]: A dataset of the public Reddit comments collected over a time window `t`, so that its size grows with `t`. See dataset from SpeakEasy AI [8]. The dataset should be kept small, yet large enough to train the TensorFlow Chatbot.

[Datamining Automation]: A Python script or similar to automate data mining of the Twitter and Reddit datasets defined above.

[Step by Step Tutorial]: List of all inputs needed to produce an automated chatbot using TensorFlow with Twitter and Reddit datasets. Inputs include keystrokes and commands. The tutorial can be based on a Linux environment.
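
As a rough illustration of the data mining automation deliverable, the sketch below collects text for a fixed time window `t` and writes it out as JSON lines. It is only a minimal sketch under stated assumptions: `collect_for_duration`, `fetch_batch`, and the `{"text": ...}` record layout are hypothetical names chosen for this example, not part of SpeakEasy AI, the Twitter Search API, or any Reddit tooling. The source-specific request (tweets or Reddit comments) is deliberately left behind the `fetch_batch` callable.

```python
import json
import time
from typing import Callable, Iterable, TextIO


def collect_for_duration(
    fetch_batch: Callable[[], Iterable[str]],
    duration_s: float,
    out: TextIO,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write unique, non-empty texts to `out` as JSON lines for `duration_s` seconds.

    `fetch_batch` is a hypothetical callable that returns the next batch of raw
    texts from a source (for example, one page of search results).
    Returns the number of records written.
    """
    seen = set()          # texts already written, to drop duplicates
    written = 0
    deadline = clock() + duration_s
    while clock() < deadline:
        for text in fetch_batch():
            text = text.strip()
            if text and text not in seen:
                seen.add(text)
                out.write(json.dumps({"text": text}) + "\n")
                written += 1
        sleep(poll_interval_s)  # pause between polls to stay within rate limits
    return written
```

In practice, one `fetch_batch` would wrap a request to the Twitter Search API [7] and another would read from a Reddit comment source such as the one used by SpeakEasy AI [8]; both are omitted here because they depend on credentials and endpoints not specified in this proposal. Passing `clock` and `sleep` as parameters keeps the time window easy to test without waiting in real time.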