This tutorial will take you through all the stages involved in creating a simple learning agent that can interact with a small simulated world that we’ve provided. We assume some familiarity with Python programming and the GLib framework, but all PSchema-related concepts will be introduced from scratch.

The sandbox

Figure 1: The PSchema tutorial sandbox in its initial state.

To make this tutorial more interactive we have created a simple application called the “Sandbox”. This is an interactive world consisting of three buttons which have different effects on the environment. Inhabiting this world is our learning agent, a disembodied hand capable of moving around the world and manipulating objects.

The application is distributed with PSchema in the doc/tutorial/sandbox/ directory. Simply move to that directory and execute ./sandbox.py to start the application.

The agent in the sandbox communicates with us via a network socket, so it can be controlled from any programming language; in this tutorial we’ll be using Python. However, before we start writing a controller for the agent we can get a feel for its commands and behaviour by directing it ourselves.

By default the sandbox listens for connections on port 6000; first we connect to it via telnet:
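A session looks something like the following. The sandbox’s reply to GET_SENSORS (a listing of sensor values) is omitted here, since its exact format is covered later in the tutorial:

```
$ telnet localhost 6000
Trying 127.0.0.1...
Connected to localhost.
MOVE 100
GET_SENSORS
```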

There are two further commands available to us when controlling the hand: GRASP and RELEASE. These are used for manipulating objects; some objects in the world can be picked up while others cannot. The buttons themselves cannot be picked up; instead, attempting to grasp a button presses it, causing a different type of food to be dispensed into the world depending on which button was pressed. As before, we can test this functionality by first sending a command to move the hand over the red button (position 188), then sending a grasp command.

MOVE 188
GRASP
GET_SENSORS

Figure 3. Upon pressing the red button a strawberry appears in the world.

And in the response we can now see a strawberry positioned above the button we just pressed. This can be seen visually in figure 3.

Finally we can move the hand, still holding the strawberry, to a new location, release the strawberry and move our hand away:

MOVE 133
RELEASE
MOVE 150
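As the transcripts above show, the wire protocol is simply plain text: one command per line, optionally followed by arguments, terminated by a newline. A small helper for building such command strings might look like this (the helper name `make_command` is ours, not part of the sandbox or PSchema):

```python
def make_command(name, *args):
    """Build a newline-terminated sandbox command string, e.g. "MOVE 188\n"."""
    parts = [name] + [str(arg) for arg in args]
    return " ".join(parts) + "\n"

# Examples matching the transcript above:
print(repr(make_command("MOVE", 133)))  # 'MOVE 133\n'
print(repr(make_command("RELEASE")))    # 'RELEASE\n'
```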

At this point we know roughly how the simulated world works and what actions are available to our agent. Next we can begin writing our control software, so that the agent can start to act independently of us and learn about the world for itself.

Creating an application that uses PSchema

To start off with, we create a very simple skeleton of an application that initialises the PSchema library and sets up a GTK window, which we’ll use later for showing a few buttons. Because GTK and PSchema are both based around GLib they can share the same main loop. However, it is not necessary to use GTK when writing the control software: a plain GLib main loop can be created without it. Later on we’ll also be communicating with the simulator over a socket connection.
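A minimal sketch of such a skeleton is shown below. It assumes the PSchema bindings are exposed through GObject introspection and that GTK 3 is in use; the class name and window title are ours, so treat this as illustrative rather than as the tutorial’s exact source:

```python
import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk, PSchema  # assumes introspection bindings

class Controller(object):
    def __init__(self):
        # Note: .new() rather than PSchema.Memory() -- see the note below
        self.memory = PSchema.Memory.new()
        self.window = Gtk.Window(title="PSchema tutorial controller")
        self.window.connect("destroy", Gtk.main_quit)
        self.window.show_all()

if __name__ == "__main__":
    Controller()
    Gtk.main()
```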

Note that when we create our schema memory object we instantiate it by calling PSchema.Memory.new() rather than the more traditional PSchema.Memory(); this is a product of the way the dynamic bindings are created. It is important to remember this difference: PSchema.Memory() will appear to work, in that it instantiates a Memory object, but it won’t run the class’s initialisation code, which can result in unexpected behaviour later on.

Debugging

In addition to using a full-fledged debugger such as GDB to inspect the inner workings of the framework, it’s possible to request varying levels of debug information on different topics through a couple of environment variables:

PSCHEMA_DEBUG – This sets the amount of debugging information required, as an integer between 1 and 5: a value of 1 gives only a few high-level messages, while 5 gives very detailed information.

PSCHEMA_DEBUG_FILTER – This specifies which subsystem debugging information should be output for. If unspecified, debugging output from all subsystems is displayed. For example, it can be set to “excitation” to show only messages relating to the excitation calculations.
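For example, to request fairly detailed output restricted to the excitation subsystem, the variables can be set in the shell when launching the controller, or from Python itself before the library is loaded (the controller filename in the comment is illustrative):

```python
import os

# Equivalent to running: PSCHEMA_DEBUG=4 PSCHEMA_DEBUG_FILTER=excitation ./controller.py
os.environ["PSCHEMA_DEBUG"] = "4"                # 1 = terse ... 5 = very detailed
os.environ["PSCHEMA_DEBUG_FILTER"] = "excitation"  # only excitation messages
```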

Communicating with the simulator

Now that we have all the basics set up we can start interacting with the simulator. In this example we communicate with our simulated agent through a simple TCP socket. This could easily be replaced with a custom agent control library, or robotics middleware such as Player or YARP, to allow for communication with robots in the real world or agents in other virtual environments.

First we set up our connection to the simulator in our general initialisation code.

For now this function just prints out any data it receives from the simulator, later on we’ll modify this so it can create a representation of the world state out of the sensor values sent to it from the simulator.
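The connection and receive logic can be sketched as follows. This is a simplified, blocking version using only the Python standard library; the tutorial’s real controller instead attaches the socket to the shared GLib main loop, and the class and method names other than data_received() are ours:

```python
import socket

class SandboxLink(object):
    """Blocking connection to the sandbox (sketch; the real controller
    drives the socket from the GLib main loop instead)."""

    def __init__(self, host="localhost", port=6000):
        self.sock = socket.create_connection((host, port))
        self._buffer = b""
        self.lines = []

    def send_command(self, line):
        # Every sandbox command is a single newline-terminated line of text.
        self.sock.sendall((line + "\n").encode("ascii"))

    def data_received(self, data):
        # Accumulate raw bytes and split off complete lines; for now we just
        # print each line, as the tutorial's first data_received() does.
        self._buffer += data
        while b"\n" in self._buffer:
            raw, self._buffer = self._buffer.split(b"\n", 1)
            line = raw.decode("ascii")
            self.lines.append(line)
            print(line)

    def poll(self):
        # Blocking read; a GLib-based version would use an IO watch instead.
        self.data_received(self.sock.recv(4096))
```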

Now that we can receive messages from the simulator we can try sending some commands. First we’ll implement a move_hand() function:

def move_hand(self, position):
    self.sock.send("MOVE %d\n" % position)

Finally we need to implement our function for requesting sensor data. This will be very similar to the move_hand() function, as it just sends the GET_SENSORS message over the network. All actual processing of the sensor information that the simulator returns will be handled in the data_received() function.

def get_sensors(self):
    self.sock.send("GET_SENSORS\n")

To test our new functions we can try calling them immediately before starting the GTK main loop:

This should move the hand to position 100 then request the sensor data. When the simulator responds to the sensor request this will be processed by our data_received() function, which for now will simply print the information to stdout:
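The test calls themselves are just the two methods defined above, invoked in sequence before entering the main loop. The sketch below wraps them in a self-contained form so they can be exercised without the simulator running; the FakeSocket stand-in is ours, purely for illustration:

```python
class FakeSocket(object):
    """Stand-in for the real socket, recording what would be sent."""
    def __init__(self):
        self.sent = []
    def send(self, data):
        self.sent.append(data)

class Controller(object):
    def __init__(self, sock):
        self.sock = sock
    def move_hand(self, position):
        self.sock.send("MOVE %d\n" % position)
    def get_sensors(self):
        self.sock.send("GET_SENSORS\n")

controller = Controller(FakeSocket())
# The two test calls made just before starting the GTK main loop:
controller.move_hand(100)
controller.get_sensors()
```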

Observations

Now that we’re receiving sensor information from the simulator we need to convert it into a form that the PSchema framework can reason about. We do this by first creating some custom observation classes, inheriting from the abstract Observation class defined by PSchema. While there are already a few basic observation types built into the framework, for this tutorial we will create our own custom class to more precisely represent the information we’re being supplied with.

There is a certain amount of boilerplate code involved in defining a new GLib-based class. It’s not necessary to understand this in detail; instead, the main focus will be placed on the aspects relating specifically to PSchema observations.

Since the sensor information we’re receiving at the moment concerns the positions of objects, we’ll create a new observation class called MyObjectObservation which can store this information. To begin with we need to create a Python file, MyObjectObservation.py, to hold our new class. We’ll begin by writing our initialisation functions:

Note that when overriding methods from the Observation class (or any other PSchema class) it is necessary to prepend their method name with do_.
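As a rough illustration of the shape of the class, a sketch is given below. This is illustrative only: the exact GObject boilerplate, property handling and the body of do_get_properties() are guesses at the convention, and should be checked against the reference implementation in doc/tutorial/python/controller/:

```python
from gi.repository import PSchema

class MyObjectObservation(PSchema.Observation):
    __gtype_name__ = "MyObjectObservation"

    def __init__(self, name=None, position=None):
        PSchema.Observation.__init__(self)
        self.name = name          # concrete: the object's identity
        self.position = position  # may hold a generalised value

    # Overridden Observation methods are prefixed with do_
    def do_get_properties(self):
        # The ordering here establishes the order of preference used
        # when deciding which properties are generalised.
        return ["position", "name"]
```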

When an observation is loaded from a previously saved XML file, its XML node is passed to the relevant class so that any properties specific to that observation class can be parsed. The Observation base class can handle all of the XML parsing for us, but it needs a way of setting the concrete and generalised properties in our class; it does this via the set_concrete_var() and set_property_var() methods:

There are a few things to notice here. Firstly, it’s not necessary to instantiate our custom class via MyObjectObservation.new(), because it’s not part of the dynamically generated bindings. We then assign a concrete value to the name, but a generalised value to the position. When we run the to_xml() method, the underlying Observation class that we inherited from creates an XML representation for us and correctly identifies that the position property holds a generalised value, thanks to the order of preference in our do_get_properties() method.

This completes the MyObjectObservation class. However, to fully represent the information we get back about the world we’ll need two other Observation types: MyTouchObservation and MyHoldingObservation. Using MyObjectObservation as an example, try writing these two Observations. MyTouchObservation should have one boolean property, “touching”, and MyHoldingObservation should have one string property, “object”. If you have difficulty implementing these you can find a reference implementation in the doc/tutorial/python/controller/ directory.

Representing the world

With our new Observation classes we can now create world states representing the robot’s current view of the world. First we need to include our new observations in our controller.
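Assuming each class lives in a file named after it, as above, the imports at the top of our controller would look something like:

```python
from MyObjectObservation import MyObjectObservation
from MyTouchObservation import MyTouchObservation
from MyHoldingObservation import MyHoldingObservation
```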

Actions

Using the Observation classes we wrote earlier we’re now able to represent everything our sensors tell us about the world. However, we’re not yet able to represent the actions our agent itself takes; to do this we need some Action classes. This should be very familiar to you after creating the Observation classes, so we’ll simply provide the full source for the MyMovementAction class here.

The main difference from the Observation classes that deserves additional attention is the introduction of signals. Rather than directly implementing the code used to make our agent carry out these actions within our Action class, we send a signal back to the controller software. This means that our action types can be portable across many different agents, all with different controllers. We’ll look at this in more detail when we connect the signal to the move() function within our main Controller.py file.
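Connecting the signal in the controller is then a one-liner. The signal name “execute” used below is purely illustrative; check MyMovementAction for the name it actually emits:

```python
# Hypothetical signal name; see MyMovementAction for the real one.
action = MyMovementAction()
action.connect("execute", self.move)
```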

It will now be necessary for you to create a couple of additional Action classes based on this: MyGraspAction and MyReleaseAction. Again, if you have difficulties there are reference implementations in doc/tutorial/python/controller/.

Bootstrapping

Saving

Loading

Source code

The final version of the controller developed throughout this tutorial can be found in the doc/tutorial/python/controller/ directory, however it is recommended that you construct the controller yourself whilst following the tutorial so that each concept can be studied in isolation and then built upon.