It seems to me that this device might be quite useful for
robot projects. It wasn't very long ago that such a device
would cost a couple of thousand dollars or more.

In addition to the feature-based stereo I may also try
implementing a dense stereo algorithm. My thoughts on using
this as a replacement for the cameras on GROK2 are that the
baseline is probably a little on the short side, but that it
probably would work.
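As a rough sanity check on the baseline question, the
pinhole stereo relation depth = focal length x baseline /
disparity shows how a short baseline reduces range
resolution. The numbers below (a 500 pixel focal length and
a 6cm baseline) are just illustrative guesses, not measured
values for the Minoru:

```python
# Rough sanity check on stereo baseline vs depth resolution.
# Pinhole stereo model: depth = focal_length_px * baseline / disparity.
# The focal length and baseline below are illustrative guesses only.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a given disparity, pinhole stereo model."""
    return focal_px * baseline_m / disparity_px

focal_px = 500.0    # assumed focal length in pixels
baseline_m = 0.06   # assumed baseline of about 6cm

# Disparity observed for a target at 2 metres
d = focal_px * baseline_m / 2.0
# Depth error caused by a one pixel disparity mistake at that range
error = depth_from_disparity(focal_px, baseline_m, d - 1.0) - 2.0
print("disparity at 2m: %.1f pixels" % d)
print("depth error per pixel: %.2f metres" % error)
```

At these assumed values a single pixel of disparity error at
two metres range corresponds to roughly 14cm of depth error,
which is probably tolerable for navigation purposes.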

At the moment I'm still undecided as to whether I'll use the
Minoru to replace the existing cameras on GROK2. I've
ordered some wide angle lenses which I'll try using. If the
new lenses fit then this device might be ideal.

Have done more testing this weekend and fixed some important
bugs. It transpired that there was still a "glitch" issue
with the direction of one of the encoders reversing
seemingly at random, but I've managed to work around that.
Also the motion control and joystick servers are now talking
together as they should.

I need to write a new utility program which will allow me to
visualise the occupancy grids after doing an initial
joystick guided training run. This should allow me to check
that things look as I expect them to.

I've also ordered one of the Minoru 3D webcams. Apparently
these are UVC compliant and will work on Linux. If I can
get a pair of images out of the device this would make an
excellent replacement for the existing stereo cameras,
whilst also solving my V4L1 issues. However, even if the
Minoru does look like a usable stereo camera I'll need to
assess image quality and field of view compared to the
existing cameras which I'm using.

Wheel odometry is now calibrated, and repeatability looks
good over short distances, such that the rate of increase in
pose uncertainty should be manageably small.

I'm getting closer to having a working robot, although there
has been a recent setback in that the new version of Ubuntu
(9.04) doesn't seem to support the webcams which I'm using.
This is odd, because they worked without a hitch in
previous versions. A simple solution would be to downgrade
the OS, although I'm reluctant to do that. I suspect that
there has been some change in the kernel, perhaps related to
gspca and V4L1 devices.

As a workaround I've continued development on Windows. This
is ok, because the Windows version has not received too much
attention and so was lagging behind in some features. My
current strategy is to ensure that the robot's software
works both on Windows and on Linux in order to maximise the
possible range of use cases.

There's some extra work to be done on path integration, and
no doubt there will be additional bugs to fix once I start
testing on the robot in earnest (as opposed to
simulation/unit testing). Both stereo cameras are working
and calibrated, and are returning reasonable stereo
features. The camera images seem to suffer from occasional
glitches, and this might be something to do with their age
(most of them were bought 4 years ago) or more likely it
could be electrical interference with the USB cables from
nearby pan and tilt servos. Either way, the glitches are
not sufficiently serious to cause major concern at this point.

I now have both of the stereo cameras calibrated, and am
fairly confident that I'm getting good quality disparities,
which should at least suffice for navigation purposes. I
wrote an extra program which allows me to visualise the
stereo disparity and manually alter calibration parameters
to observe the effects. Hence, if there are problems I can
check the camera calibration more thoroughly than was
possible previously.

The next step is to do integration testing with all of the
systems running - stereo vision server, motion control
server, servo control server, steersman and ultrasonics
server. With luck I should be able to create some real maps
soon.

One of the problems with 3D occupancy grids is that they
can occupy a lot of memory or disk storage. Imagine a space
the size of an average house divided into small 1cm cubes,
and that's quite a lot of cubes to keep track of.
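To put a rough number on it, here's a quick estimate. The
house dimensions (10 x 10 x 2.5 metres) and the one byte of
occupancy data per cube are my own assumptions:

```python
# Quick estimate of naive 3D occupancy grid storage. The house
# dimensions and one byte per cube are my own assumptions.
width_m, depth_m, height_m = 10.0, 10.0, 2.5
cell_size_m = 0.01  # 1cm cubes

cells = int(round((width_m / cell_size_m)
                  * (depth_m / cell_size_m)
                  * (height_m / cell_size_m)))
storage_mb = cells * 1 / 1e6  # one occupancy byte per cube
print("cubes: %d" % cells)              # 250 million cubes
print("storage: %.0f MB" % storage_mb)  # about 250 MB
```

A quarter of a gigabyte for a single byte per cube, before
storing any colour or probability information, shows why a
naive dense grid quickly becomes unwieldy.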

Much of the space inside homes is actually empty, or rather
filled with air, but from the robot's point of view knowing
about probably empty space is just as important as (maybe
even more important than!) knowing about what is occupied,
and thereby a potential obstacle. Some savings can be made by
not storing information about terra incognita - areas of the
map which have so far not been explored, but assuming that
we want the robot to have a good understanding of an entire
house this still leaves us with quite a heap of data.

At this point the unimaginative can simply appeal to Gordon
Moore and his famous "law". The capacity of storage devices,
such as hard disk drives, is always increasing and it does
look as if even the smallest storage devices around today
would be able to handle the number of cubes that we would
like to deal with. Even though this is the case, loading from
and saving to the storage device is still going to be
relatively slow, and the robot needs to be able to access
the data more or less in real time if it's going to be
useful. We could also be lazy and just load the whole lot
into a large amount of RAM, but ideally it would be good if
low cost devices could be used, such as netbooks, which only
have modest memory and local storage capacity. This would
help robotics to continue becoming more economical and
therefore marketable.

So what to do? Since the occupancy data in this case is
being produced from stereo vision, one way to get better
storage economy might be to store only a random sample of
the stereo disparities observed from a dense disparity
image. If we know the location and pose from which the
observation was originally made, based upon the results of
SLAM, then a local 3D occupancy grid can be regenerated
dynamically from a fairly small amount of data as the robot
moves around the house. This means that storage access times
are going to be much shorter, and potentially a lot of
stereo disparity data could be buffered in memory.
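A minimal sketch of the regeneration step might look
something like the following. The camera model and pose
handling here are simplified assumptions (a pinhole camera,
a planar x, y, heading pose and a dict-based sparse grid),
not the actual robot code:

```python
import math

# Sketch of regenerating a local occupancy grid from stored stereo
# samples. Assumes a pinhole camera, a planar (x, y, heading) pose
# and a sparse dict-of-cells grid; all parameter values are guesses.

def sample_to_world(u, v, disparity, pose, focal_px, baseline_m, cx, cy):
    """Reproject one stored (u, v, disparity) sample into world coords."""
    z = focal_px * baseline_m / disparity   # depth along the camera axis
    x_cam = (u - cx) * z / focal_px         # lateral offset in camera frame
    y_cam = (v - cy) * z / focal_px         # vertical offset
    rx, ry, heading = pose
    # rotate the camera-frame (x_cam, z) point into the world frame
    wx = rx + z * math.cos(heading) - x_cam * math.sin(heading)
    wy = ry + z * math.sin(heading) + x_cam * math.cos(heading)
    return wx, wy, y_cam

def regenerate_grid(samples, pose, cell_size=0.01, focal_px=500.0,
                    baseline_m=0.06, cx=160.0, cy=120.0):
    """Build a sparse occupancy hit count keyed by grid cell index."""
    grid = {}
    for u, v, disparity in samples:
        if disparity <= 0:
            continue  # no valid range for this sample
        wx, wy, wz = sample_to_world(u, v, disparity, pose,
                                     focal_px, baseline_m, cx, cy)
        cell = (int(wx // cell_size), int(wy // cell_size),
                int(wz // cell_size))
        grid[cell] = grid.get(cell, 0) + 1
    return grid
```

In a real implementation the ray from the camera to each
sample would also be used to update the probably-empty cells
along the line of sight, rather than only the end points.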

Some back-of-an-envelope calculations go as follows:

If we randomly sample 300 stereo disparities from a dense
disparity image, and represent the image coordinates and
disparity as floating point values (sub-pixel accuracy),
this translates into

300 stereo features x 3 values (x,y,disparity) x 4 bytes per
value

= 3600 bytes per observation, or 3.5K

If we also want to store colour information, so that
coloured 3D occupancy grids can be produced, this increases
to 4500 bytes or 4.4K. There is also the robot's pose
information to store, but this is only a small number of
bytes, so doesn't make a big overall difference. This seems
quite tractable. Potentially the robot could make several
thousand observations as it maps the house, and this only
translates into a few tens of megabytes which is well within
the limitations of what a netbook could handle. Even if the
number of observations rises into the tens of thousands this
still looks feasible.
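The arithmetic above is easy to check; the three extra
bytes per feature for colour assume one byte per R, G and B
channel:

```python
# Checking the storage arithmetic from the post. The three extra
# bytes per feature for colour assume one byte per R, G and B channel.
features = 300
bytes_xyz = features * 3 * 4                  # x, y, disparity as 4-byte floats
bytes_with_colour = bytes_xyz + features * 3  # plus one byte per colour channel

print(bytes_xyz)          # 3600 bytes, about 3.5K
print(bytes_with_colour)  # 4500 bytes, about 4.4K

# Ten thousand observations, with colour:
total_mb = 10000 * bytes_with_colour / (1024.0 * 1024.0)
print("%.1f MB" % total_mb)  # roughly 43 MB
```

Even at tens of thousands of observations this stays in the
tens of megabytes, comfortably within a netbook's capacity.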