Exploring "Exploring Robotic Minds" - Terms and Vocabulary used in my research

One of the difficulties I faced when I joined the Cognitive Neurorobotics lab was that I wasn’t familiar with the terms used in the lab. Some terms come from the field of dynamics, and others were “coined” by my advisor (Prof. Tani). So it took me quite a while to understand them. I guess it might be even more difficult for people outside the lab.

So, I’d like to briefly explain the terms that frequently appear in my studies on “cognitive neurorobotics” or in Tani’s book (“Exploring Robotic Minds: Actions, Symbols, and Consciousness as Self-Organizing Dynamic Phenomena”). This post is aimed at a general audience (someone like me five years ago), so the terms won’t be explained in great detail. Instead, I’ll just try to give a general idea of each one.

Note that the definitions/explanations introduced here are not general ones. There are many far better blogs and websites for the general definitions and usages. I’ll only deal with the terms as I used them in my publications (or as they are used in Tani’s studies). I hope this post helps you understand how we study “cognitive neurorobotics” and what we mean by those terms.

This document will be updated from time to time. Please leave a comment if you find any vague or incorrect explanations. Also, if you have any questions about the words in our publications, please leave a comment. I’ll try to add them here :)

C

Closed Loop Generation

Closed loop generation refers to an input/output setting used in our studies. It is sometimes called “mental simulation”. In closed loop generation, the model’s output at the previous time step is fed back as the model’s input at the current time step (x(t) = y(t-1), where x(t) and y(t) are the input and output of the model at time t).

Then, why do we do this? What do we expect to get from closed loop generation? One benefit is that the model can generate output without any input from the external environment. Just imagine that you set the initial values of the neural network. Based on those initial values, the model generates a certain output. Then, we feed that output back into the input of the model. In this way, the model can generate a sequence of outputs without any actual input from the environment. So it is “mental simulation”, and the initial values can be considered “a goal” - telling the model “what” to imagine.
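To make this concrete, here is a minimal sketch of closed loop generation in Python. The single-neuron model, the function names (`step`, `closed_loop`) and the weight values are all made up for illustration and are not taken from our actual models. The only part that comes from the definition above is the feedback x(t) = y(t-1).

```python
import math


def step(x, u_prev, w_in=1.5, w_rec=0.5):
    """One time step of a toy single-neuron recurrent model (hypothetical)."""
    u = w_in * x + w_rec * u_prev  # internal state at this time step
    y = math.tanh(u)               # output at this time step
    return y, u


def closed_loop(u0, x0, n_steps):
    """Generate n_steps outputs with x(t) = y(t-1); no sensor is ever read."""
    outputs = []
    x, u = x0, u0
    for _ in range(n_steps):
        y, u = step(x, u)
        outputs.append(y)
        x = y  # feed the output back as the next input
    return outputs


# u0 plays the role of the initial values; x0 is the very first input.
trajectory = closed_loop(u0=0.3, x0=0.0, n_steps=5)
print(trajectory)
```

Nothing inside `closed_loop` comes from the environment: once `u0` and the first input are set, the whole sequence is determined by the model itself.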

Consciousness

I

Initial States

Initial states are the neurons’ internal states set at the onset of computation (t = 0). Recurrent neural network models need the values from the previous time step to generate the output, so we have to provide those values ourselves at the onset of computation.

Note that these values can be obtained from training. That is, just like the weights and biases, they can be optimized during training in the direction of minimizing the error. But there is one big difference between the initial states and the other parameters (weights, biases). Let’s assume we have 10 patterns in our training data. The other parameters are about the model itself (such as connection strength), so we obtain a single set of weights and biases shared by all 10 patterns. The initial states, however, are different for each training pattern. In other words, we obtain 10 different initial states for the 10 training patterns. So the initial states are about the training data, not the model itself. Ideally, if some of the training patterns are similar to each other, their initial states will also be similar. If the training patterns are really different, their initial states will also be very different.

Then, what do we do with these initial states? One cool thing we can do with them is closed loop generation. That is, we set the initial states to the ones we obtained from training, and then we let the model generate the output without any external input. If the training was successful, the model will generate the pattern that corresponds to those initial states. In this sense, we can tell the model “what” to generate by setting different initial states at the onset of computation. And this is why we sometimes call them “intention” states - intention to generate a specific action.
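Here is a toy sketch of that idea, assuming a single neuron whose output is fed back to itself. The weight `W_SELF`, the pattern names and the initial state values are all hypothetical; nothing here is actually trained. It only shows how one shared weight plus two different initial states can produce two different closed loop outputs.

```python
import math

W_SELF = 2.0  # shared by every pattern (hypothetical value, part of "the model")

# One initial state per training pattern (hypothetical values, written by hand
# here as if they had been obtained from training).
initial_states = {"pattern_A": 0.1, "pattern_B": -0.1}


def generate(u0, n_steps=50):
    """Closed loop generation starting from the initial internal state u0."""
    u = u0
    outputs = []
    for _ in range(n_steps):
        y = math.tanh(u)  # output computed from the internal state
        outputs.append(y)
        u = W_SELF * y    # the output is fed back as the next input
    return outputs


final = {name: generate(u0)[-1] for name, u0 in initial_states.items()}
print(final)
```

With this particular weight the toy neuron is bistable, so a slightly positive initial state settles at a positive output and a slightly negative one settles at a negative output. The weight is the same in both cases; only the initial state tells the model “what” to generate.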

Intention (or Intention States)

Of course, this is not about intention in general. In our work, intention (sometimes “intention state”) refers to the internal cause that enables proactive generation of patterns. Intention or intention states are specified as internal states.

For example, let’s consider closed loop generation. At the beginning of computation, we need to tell the model “what” action it should generate. This “what” can be considered a goal or intention. And this “what” is expressed as the initial states (which are themselves internal states).

Of course, the initial states are not the only “intention” states. For example, when we have a prediction error minimization mechanism, it can change the neuron’s internal states in an on-line manner (e.g., the internal states are optimized (updated) while the robot is interacting with a human). Then, we can interpret our model in a way that the model updates its intention to interact with the dynamic environment.

Internal States

We often use the term “internal states” to refer to the neuron’s value before we apply the activation function. It is often denoted as u, such that neuron activation v = f(u) where f is a non-linear function (sigmoid, tanh, etc.).
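As a small sketch of the difference between u and v: the code below computes an internal state and then applies two possible activation functions to it. The weighted-sum form of u, the function names and all the numbers are my assumptions for illustration; the only thing taken from the definition above is v = f(u).

```python
import math


def sigmoid(u):
    """Logistic sigmoid, one possible choice for f."""
    return 1.0 / (1.0 + math.exp(-u))


def internal_state(inputs, weights, bias):
    """u: weighted sum of the inputs plus a bias, before any activation."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


u = internal_state(inputs=[0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
v_tanh = math.tanh(u)   # activation with f = tanh
v_sigmoid = sigmoid(u)  # activation with f = sigmoid
print(u, v_tanh, v_sigmoid)
```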

So we have “Initial” states, “Intention” states and “Internal” states. Let’s make it clear here. “Internal” states are more of a neural network term: they are the values before the activation function. When we talk about the internal states at the onset of computation (t = 0), we call them “Initial” states. And these initial states can tell the model “what” to generate - the goal/intention to generate a specific action. That’s why they can also be called “Intention” states.

K

Kinesthetic Teaching

O

Open Loop Generation

Open loop generation refers to an input/output setting used in our studies. In open loop generation, the input to the model comes from the external environment. For instance, if the robot’s camera captures an image at every time step and feeds it to the input channel of the model, the model is being operated in the open loop manner.

So, for some people, this might be the typical way of using a neural network in a robotic experiment - getting the sensor data, feeding it to the model and generating an action. In our studies, open loop generation is often compared with another method called “closed loop generation”. In closed loop generation, the input to the model is generated by the model itself and we don’t use any sensory information from the external environment.

When open loop generation is employed in our robotic studies, we often refer to it as “sensory entrainment”, which means that the neural dynamics of the model are driven by the external sensory input. On the other hand, we expect that the “error minimization mechanism” will drive the dynamics of the model in closed loop generation.
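Here is a minimal sketch of open loop generation. The toy single-neuron model, the function names (`step`, `open_loop`) and the sensor values are all made up for illustration. The point is only that every input comes from the (here simulated) sensor readings and never from the model’s own output.

```python
import math


def step(x, u_prev, w_in=1.5, w_rec=0.5):
    """One time step of a toy single-neuron recurrent model (hypothetical)."""
    u = w_in * x + w_rec * u_prev  # internal state at this time step
    return math.tanh(u), u


def open_loop(sensor_readings, u0=0.0):
    """Drive the model with external input: x(t) is the sensor value at time t."""
    outputs = []
    u = u0
    for x in sensor_readings:  # the input always comes from the environment
        y, u = step(x, u)
        outputs.append(y)
    return outputs


sensor_readings = [0.0, 0.5, 1.0, 0.5, 0.0]  # made-up sensor values
outputs = open_loop(sensor_readings)
print(outputs)
```

If you change `sensor_readings`, the outputs change with it, which is what “driven by the external sensory input” means here.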

P

Proprioception

In general, proprioception means the sense of the relative position of the parts of one’s own body and of the strength of effort being employed in movement. [Ref: Wikipedia] In my studies, I often use “proprioception” to emphasize that my neural network model (ref) predicts/processes “the perceptual outcome of the robot’s actions”, not the actual action. For example, the P-VMDNN model predicts the “desired” joint position. Then, the motor controller actually drives the motors to that joint position. In other words, the actual movement is done by the lower-level motor controller, not by the neural network model. In this sense, the proprioceptive output can be considered a kinematic-level representation of the action, which describes the trajectories of the movement in space and time (Ref.1).

S

Sensory Entrainment

In our experiments, if the model was tested under the “sensory entrainment” condition, it means that the dynamics of the model were driven by sensory information from the external environment. For example, at each time step, we obtain data from the sensors (e.g., an image from the camera). Then, we feed this data to the input channel of the model. The neurons’ activation values are computed accordingly.

Softmax Transform

This is a method of representing data in a sparse form. Let’s say the model outputs the joint position of its right elbow (ranging from 0 (stretched) to 150 (bent)). We could configure the model to output a single 1-dimensional value ranging from 0 to 150. But by using the softmax transform, we can configure the model to output N values (N is called the softmax dimension), where those N values together represent the joint angle.

Then, why do we do it? Sparse encoding is known to make training easier. The detailed explanation can be found here.
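To give a rough idea, here is a sketch of one way to do such an encoding, using the elbow example (0 to 150). The evenly spaced reference points, the width parameter `sigma`, the function names and decoding by a weighted average are my assumptions for this illustration, not necessarily the exact procedure used in our models.

```python
import math


def softmax_encode(value, lo, hi, n_dims, sigma):
    """Encode one scalar as n_dims values that sum to 1 (hypothetical scheme)."""
    # Reference points spread evenly over the value range.
    refs = [lo + i * (hi - lo) / (n_dims - 1) for i in range(n_dims)]
    # The closer a reference point is to the value, the larger its logit.
    logits = [-((value - r) ** 2) / sigma for r in refs]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps], refs


def softmax_decode(probs, refs):
    """Map the N values back to one scalar as a weighted average of the refs."""
    return sum(p * r for p, r in zip(probs, refs))


# Elbow angle of 75 encoded with softmax dimension N = 11.
probs, refs = softmax_encode(75.0, lo=0.0, hi=150.0, n_dims=11, sigma=50.0)
print([round(p, 3) for p in probs])
print(softmax_decode(probs, refs))
```

Most of the 11 values end up close to zero and only the few around the reference point nearest to the angle are noticeably active, which is the sense in which the representation is sparse.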

T

Tutoring

Tutoring in our studies often refers to the process of obtaining the training data, especially in robotic experiments. During the tutoring process, a robot is generally operated by the experimenter, not by the neural network. That is, there is no artificial intelligence during the tutoring process.

For instance, when I was working on imitation between a human and a robot, I needed training data for my neural network model. I wanted my robot (neural network) to imitate human gestures. So the first thing I did was tutoring - telling the robot how to imitate the gestures. I showed the human gestures to the robot and, at the same time, I physically guided the robot’s limbs to demonstrate how to imitate the observed gestures. This is also called kinesthetic teaching. It can be considered a sort of scaffolding, or learning by demonstration.

+) Usually, it is not possible to teach the robot everything (and we don’t want to do that anyway!). So we sometimes teach the robot a few things and see how well it performs the task. If the robot works well in the tutored task, we consider that learning was successful. If the robot also works well in a novel situation, we say that the robot generalizes the skill to novel situations. And yes, that’s one of the goals of our studies - achieving both high competency and performance.

Last September, I attended the ICDL-EPIROB 2017 which was held in Lisbon, Portugal. I really enjoyed the conference and the beautiful city and I also had a great time with my old & new friends :)

This time, I was lucky to have the opportunity to give an oral presentation. I presented my recent study on visuomotor learning, particularly on P-VMDNN (Predictive Visuo-Motor Deep Dynamic Neural Network).

The manuscript, the presentation slides (pdf) and the supplementary videos that I used during my talk can be downloaded from below:

Basic Information

When I train my neural network models, I often use a method called “Softmax Transformation”. It is a method of representing the training data in a sparse form. When I first learned how to do it, I had some trouble ‘cause there weren’t enough examples of how to do it. And I still can’t find a nice explanation of the softmax transformation with examples. So here I have a brief explanation and some sample code for the softmax transformation. Let’s see how the softmax transformation works step by step.
