Abstract

We consider the following problem: a team of robots is deployed in an unknown environment and
it has to collaboratively build a map of the area without a reliable infrastructure for communication.
The backbone for modern mapping techniques is pose graph optimization, which estimates the
trajectory of the robots, from which the map can be easily built. The first contribution of this paper
is a set of distributed
algorithms for pose graph optimization: rather than sending all sensor data to a remote sensor fusion server,
the robots exchange very partial and noisy information to reach an agreement on the pose graph configuration.
Our approach can be considered as a distributed implementation of the two-stage approach of Carlone et al. (2015b), where
we use the Successive Over-Relaxation (SOR) and the Jacobi Over-Relaxation (JOR)
as workhorses to split the computation among the robots.
We also provide conditions under which
the proposed distributed protocols converge to the solution of the centralized two-stage approach.
As a second contribution, we extend the proposed distributed
algorithms to work with object-based
map models. The use of object-based models avoids
the exchange of raw sensor measurements (e.g., point clouds or RGB-D data) further reducing the communication
burden.
Our third contribution is an extensive experimental evaluation of the proposed
techniques, including tests in realistic Gazebo simulations and field experiments in a military test facility.
Abundant experimental evidence suggests that
one of the proposed algorithms (the Distributed Gauss-Seidel method or DGS) has excellent
performance. The DGS requires minimal information exchange, has an anytime flavor,
scales well to large teams (we demonstrate mapping with a team of 50 robots),
is robust to noise, and is easy to implement.
Our field tests show that
the combined use of our distributed algorithms and object-based models
reduces the communication requirements by several orders of magnitude
and enables distributed mapping with large teams of robots in real-world problems.

The deployment of large teams of cooperative autonomous robots has the potential to enable fast information gathering,
and more efficient coverage and monitoring of vast areas. For military applications such as surveillance, reconnaissance, and battle damage assessment,
multi-robot systems promise more efficient operation and improved robustness in contested spaces.
In civil applications (e.g., pollution monitoring, precision agriculture, search and rescue, disaster response),
the use of several inexpensive, heterogeneous, agile platforms
is an appealing alternative to monolithic single robot systems.

The deployment of multi robot systems in the real world poses many technical challenges,
ranging from coordination and formation control, to task allocation and
distributed sensor fusion. In this paper we tackle a specific instance of the sensor fusion problem.
We consider the case in which a team of robots explores an unknown environment
and each robot has to estimate its trajectory from its own sensor data and by leveraging
information exchange with the teammates.
Trajectory estimation, also called pose graph optimization, is relevant as it constitutes the backbone for many
estimation and control tasks (e.g., geo-tagging sensor data, mapping, position-aware task allocation, formation control).
Indeed, in our application, trajectory estimation enables distributed 3D mapping and
localization (Fig. 1).

We consider a realistic scenario, in which
the robots can only communicate when they are within a given distance.
Moreover, also during a rendezvous (i.e., when the robots are close enough to communicate)
they cannot exchange a large amount of information, due to bandwidth constraints.
Our goal is to design a technique that allows each robot
to estimate its own trajectory, while asking for minimal knowledge of the trajectory
of the teammates. This “privacy constraint” has a clear motivation in a military application:
in case one robot is captured, it cannot provide sensitive information
about the areas covered by the other robots in the team. Similarly, in civilian applications,
one may want to improve the localization of a device (e.g., a smart
phone) by exchanging information with other devices, while respecting users’ privacy.
Ideally, we want our distributed mapping approach to scale to very large teams of robots.
Our ultimate vision is to deploy a swarm of agile robots (e.g., micro aerial vehicles) that
can coordinate by sharing minimal information and using on-board sensing and computation.
The present paper takes a step in this direction and presents distributed mapping
techniques that are shown to be extremely effective in large simulations (with up to 50 robots)
and in real-world problem (with up to 4 robots).

Figure 1: In our field experiments, distributed trajectory estimation enables
3D reconstruction of an entire building using two robots (red, blue). Each column of the figure shows the
reconstructed point cloud of a floor (top), and the estimated trajectories overlaid on an occupancy grid map (bottom).

Contribution.
We consider a distributed maximum-likelihood (ML) trajectory estimation problem in which the robots have to collaboratively estimate
their trajectories while minimizing the amount of exchanged information.
We focus on a fully 3D case, as this setup is of great interest in many robotics applications (e.g., navigation on uneven terrain,
underwater and aerial vehicles).
We also consider a fully distributed setup, in which the robots communicate and acquire relative
measurements only during rendezvous.

We present two key contributions to solve the distributed mapping problem.
The first contribution are a set of distributed algorithms that
enable distributed inference at the estimation back-end.
This contribution is presented
in Section 3.
Our approach can be understood as a distributed implementation of the
chordal initialization
discussed in Carlone et al. (2015b). The chordal initialization
(recalled in Section 3.2) consists in approximating the ML trajectory estimate
by solving two quadratic optimization subproblems. The insight of the present work is that
these quadratic subproblems can be solved in a distributed fashion, leveraging distributed linear system solvers.
In particular, we investigate distributed implementations of the
Jacobi Over-Relaxation and the Successive Over-Relaxation.
These distributed solvers imply a communication burden that is linear in the
number of rendezvous among the robots. Moreover, they do not rely on the availability of an
accurate initial guess as in related work (see Section 2).
In Section 3.3 we discuss conditions under which the distributed algorithms
converge to the same estimate of the chordal initialization (Carlone et al., 2015b),
which has been extensively shown to be accurate and resilient to measurement noise.

The second contribution is the use of high-level object-based models at the
estimation front-end and as map representation.
This contribution is presented
in Section 4.
Traditional approach for multi robot mapping rely on feature-based maps which are composed of low level primitives like points and lines (Davison et al., 2007).
These maps become memory-intensive for long-term operation,
contain a lot of redundant information (e.g., it is unnecessary
to represent a planar surface with thousands of points), and lack the semantic information necessary for
performing wider range of tasks (e.g., manipulation tasks, human-robot interaction). To solve these issues,
we present an approach for multi robot SLAM which uses
object landmarks (Salas-Moreno et al., 2013) in a multi robot mapping setup. We show that
this approach further reduces the information exchange among robots, results in compact human-understandable maps,
and has lower computational complexity as compared to low-level feature-based mapping.

The third contribution is an extensive experimental evaluation including realistic
simulations in Gazebo and field tests in a military facility.
This contribution is presented
in Section 5.
The experiments demonstrate that one of the proposed algorithms, namely the
Distributed Gauss-Seidel method, provides accurate trajectory estimates,
is more parsimonious, communication-wise, than
related techniques, scales to large tea, and is robust to noise.
Finally, our field tests show that
the combined use of our distribute algorithms and object-based models
reduces the communication requirements by several orders of magnitude
and enables distributed mapping with large teams of robots.

Section 6 concludes the paper and discusses current and future work towards
real-world robust mapping with large swarms of flying robots.

In many applications, however, it is not practical to collect all measurements at a single inference engine.
When operating in hostile environment, a single attack to the centralized inference engine (e.g., one of the
robot) may threaten the operation of the entire team. Moreover, the centralized approach requires massive communication
and large bandwidth. Furthermore, solving trajectory estimation over a large team of robots can be too demanding for a
single computational unit. Finally, the centralized approach poses privacy concerns as it
requires to collect all information at a single robot; if an enemy robot is able to
deceive the other robots and convince them that it is part of the team, it can easily gather sensitive
information (e.g., trajectory covered and places observed by every robot).
These reasons triggered interest towards distributed trajectory estimation, in which the robots
only exploit local communication, in order to reach a consensus on the trajectory estimate.
Nerurkar et al. (2009) propose an algorithm for cooperative localization
based on distributed conjugate gradient.
Franceschelli and Gasparri (2010) propose a gossip-based algorithm for distributed
pose estimation and investigate its convergence in a noiseless setup.
Aragues et al. (2011) use a distributed
Jacobi approach to estimate a set of 2D poses, or the centroid of a formation (Aragues et al., 2012a).
Aragues et al. (2012b) investigate consensus-based approaches for map merging.
Knuth and Barooah (2013) estimate 3D poses using distributed gradient descent.
Cunningham et al. (2010) use Gaussian elimination, and develop an approach, called DDF-SAM, in which
each robot exchange a Gaussian marginal over the separators (i.e., the variables shared by
multiple robots); the approach is further extended in Cunningham et al. (2013), to avoid
storage of redundant data.

Another related body of work is the literature on
parallel and hierarchical approaches for mapping. Also in this case, Gaussian elimination and Schur complement have been used as a key component for
hierarchical approaches for large-scale mapping (Ni and Dellaert, 2010; Grisetti et al., 2010; Suger et al., 2014).
Decoupled stochastic mapping was one of the earliest approach for
submapping proposed by Leonard and Feder (2001). Leonard and Newman (2003) propose a constant-time
SLAM solution which achieves near-optimal results under the assumption that the
robot makes repeated visits to all regions of the environment. Frese et al. (2005) use
multi-level relaxations resulting
in a linear time update. Frese (2006) propose the TreeMap
algorithm which divides the environment
according to a balanced binary tree. Estrada et al. (2005) present a hierarchical SLAM
approach which consist of a set of local maps
and enforces loop consistency when calculating the optimal estimate
at the global level.
Ni et al. (2007) present an exact submapping
approach within a ML framework, and propose to
cache the factorization of the submaps to speed-up computation.
Grisetti et al. (2010) propose hierarchical updates on the map:
whenever
an observation is acquired, the highest level of the hierarchy is modified and only
the areas which are substantially modified are changed at lower levels.
Ni and Dellaert (2010) extend their previous approach (Ni et al., 2007)
to include multiple levels and use nested dissection to minimize the dependence
between two subtrees. Grisetti et al. (2012) compute a good initial estimate for global alignment
through a submapping approach.
Zhao et al. (2013) propose an approximation for
large-scale SLAM by solving for a sequence of submaps and joining them
in a divide-and-conquer manner using linear least squares. Suger et al. (2014) present an approximate SLAM approach based on hierarchical
decomposition to reduce the memory consumption for pose graph optimization.

While Gaussian elimination has become a popular approach
it has two major shortcomings. First, the marginals to be exchanged among the robots are dense, and
the communication cost is quadratic in the number of separators.
This motivated the use of sparsification techniques to reduce the communication cost (Paull et al., 2015).
The second reason is that Gaussian elimination is performed on a linearized version of the
problem, hence these approaches require good linearization points and
complex bookkeeping to ensure consistency of the linearization points across the robots (Cunningham et al., 2013).
The need of a linearization point also characterizes gradient-based techniques (Knuth and Barooah, 2013).
In many practical problems, however, no initial guess is available,
and one has to develop ad-hoc initialization techniques, e.g., (Indelman et al., 2014).

Related Work in Other Communities.
Distributed position and orientation estimation is a fertile research area in other communities,
including sensor networks, computer vision, and multi agent systems.
In these works, the goal is to estimate the state (e.g. position or orientation) of an agent
(e.g., a sensor or a camera) from relative measurements among the agents. A large body of literature
deals with distributed localization from distance measurements,
see Anderson et al. (2010); Calafiore et al. (2012); Simonetto and Leus (2014); Wei et al. (2015) and
the references therein. The case of position estimation from linear measurements is considered
in Barooah and Hespanha (2005, 2007); Russell et al. (2011); Carron et al. (2014); Todescato et al. (2015); Freris and Zouzias (2015);
the related problem of centroid estimation is tackled in Aragues et al. (2012a).
Distributed rotation estimation has been studied in the context of
attitude synchronization (Thunberg et al., 2011; Hatanaka et al., 2010; Olfati-Saber, 2006),
camera network calibration (Tron and Vidal, 2009; Tron et al., 2012a), sensor network
localization (Piovan et al., 2013), and distributed consensus on
manifolds (Sarlette and Sepulchre, 2009; Tron et al., 2012b).

High-Level Map Representations.Semantic mapping using high-level
object-based representation has gathered a large amount of interest from
the robotics community. Kuipers (2000) model the
environment as a spatial semantic hierarchy, where each level expresses
states of partial knowledge corresponding to different level of representations.
Ranganathan and Dellaert (2007) present a 3D generative
model for representing places using objects. The object models are
learned in a supervised manner. Civera et al. (2011) propose a semantic SLAM
algorithm that annotates the low-level 3D point based maps with precomputed object models.
Rogers et al. (2011) recognize door signs and
read their text labels (e.g., room numbers) which are used as landmarks
in SLAM. Trevor et al. (2012) use planar surfaces corresponding
to walls and tables as landmarks in a mapping system.
Bao et al. (2012) model semantic structure from motion as a joint inference
problem where they jointly recognize and estimate the location of high-level
semantic scene components such as regions and objects in 3D.
SLAM++, proposed by Salas-Moreno et al. (2013),
train domain-specific object detectors corresponding to repeated
objects like tables and chairs. The learned detectors are integrated
inside the SLAM framework to recognize and track those objects resulting
in a semantic map. Similarly, Kim et al. (2012) use
learned object models to reconstruct dense 3D models from single scan
of the indoor scene. Choudhary et al. (2014) proposed an approach for for online
object discovery and object modeling, and extend a SLAM system to utilize these discovered
and modeled objects as landmarks to help localize the robot in an online manner.
Pillai and Leonard (2015) develop a SLAM-aware object recognition system which result in a
considerably stronger recognition performance as compared to related techniques. Gálvez-López et al. (2016) present a
real-time monocular object-based SLAM using a large database of 500 3D objects and
show exploiting object rigidity both improve the map and find its real scale. Another body of related work is in the area of dense semantic mapping
where the goal is to categorize each voxel or 3D point with a category label.
Related work in dense semantic mapping include Nüchter and Hertzberg (2008); Koppula et al. (2011); Pronobis and Jensfelt (2012); Finman et al. (2013); Kundu et al. (2014); Vineet et al. (2015); Valentin et al. (2015); McCormac et al. (2016) and the references therein.

The first contribution of this paper is to devise distributed algorithms that
the robots can implement to reach consensus on a globally optimal trajectory estimate
using minimal communication.
Section 3.1 introduces the mathematical notation and formalizes the problem.
Section 3.2 presents a centralized algorithm, while
Section 3.3 presents the corresponding distributed implementations.

Figure 2:
An instance of multi robot trajectory estimation: two robots (α in blue, and β in dark green)
traverse and unknown environment, collecting intra-robot measurements (solid black lines).
During rendezvous,
each robot can observe the pose of the other robot (dotted red lines). These are called inter-robot measurements and
relate two separators (e.g., xαi,xβj).
The goal of the two robots is to compute the ML estimate of their trajectories.

3.1 Problem Formulation: Distributed Pose Graph Optimization

We consider a multi robot system and we denote each robot with a Greek letter, such that the
set of robots is Ω={α,β,γ,…}. The goal of each robot is to estimate
its own trajectory using the available measurements, and
leveraging occasional communication with other robots.
The trajectory estimation problem and the nature of the available measurements
are made formal in the rest of this section.

We model each trajectory as a finite set of poses (triangles in Fig. 2);
the pose assumed by robot α at time i is denoted with xαi
(we use Roman letters to denote time indices). We are interested in a 3D setup, i.e., xαi∈SE(3),
where SE(3) is the Special Euclidean group of 3D rigid transformations; when convenient, we write xαi=(Rαi,tαi), making explicit that each pose includes a
rotation Rαi∈SO(3), and a position tαi∈R3.
The trajectory of robot α is then denoted as xα=[xα1,xα2,…].

Measurements. We assume that each robot acquires relative pose measurements. In practice these are obtained
by post-processing raw sensor data (e.g., scan matching on 3D laser scans).
We consider two types of measurements: intra-robot
and inter-robot measurements. The intra-robot measurements involve the poses of a single robot
at different time instants; common examples of intra-robot measurements are odometry measurements
(which constrain consecutive robot poses, e.g., xαi and xαi+1 in Fig. 2)
or loop closures (which constrain non-consecutive poses, e.g., xαi−1 and xαi+1 in Fig. 2).
The inter-robot measurements are the ones relating the poses of different robots.
For instance, during a rendezvous,
robot α (whose local time is i), observes a second robot β (whose local time is j)
and uses on-board sensors to measure the relative pose of the observed robot in its own reference frame.
Therefore, robot α acquires an inter-robot measurement, describing the relative pose between xαi and xβj (red links in Fig. 2). We use the term
separators to refer to the poses involved in an inter-robot measurement.

While our classification of the measurements (inter vs intra) is based on the robots involved in the
measurement process, all relative measurements can be framed within the same measurement model.
Since all measurements correspond to noisy observation of the relative pose between a pair of poses, say
xαi and xβj, a general measurement model is:

(1)

where the relative pose measurement ¯zαiβj includes the relative rotation measurements
¯Rαiβj, which describes the attitude Rβj with respect to the
reference frame of robot α at time i,
“plus” a random rotation Rϵ (measurement noise), and the relative position measurement ¯tαiβj, which describes the position
tβj in the reference frame of robot α at time i, plus random
noise tϵ.
According to our previous definition,
intra robot measurements are in the form ¯zαiαk, for some robot α
and for two time instants i≠k; inter-robot measurements, instead, are in the form
¯zαiβj for two robots α≠β.

In the following, we denote with EαI the set of intra-robot
measurements for robot α, while we call EI the set of intra-robot
measurements for all robots in the team, i.e., EI=∪α∈ΩEαI.
The set of inter-robot measurements involving robot α is denoted with EαS
(S is the mnemonic for “separator”).
The set of all inter-robot measurements is denoted with ES.
The set of all available measurements is then E=EI∪ES.
Note that each robot only has access to its own intra and inter-robot measurements
EαI and EαS.

ML trajectory estimation.
Let us collect all robot trajectories in a single
(to-be-estimated) set of poses x=[xα,xβ,xγ,…].
The ML estimate for x is defined as the maximum of the measurement likelihood:

x⋆=argmaxx∏(αi,βj)∈EL(¯zαiβj|x)

(2)

where we took the standard assumption of independent measurements.
The expression of the likelihood function depends on the distribution
of the measurements noise, i.e., Rϵ,tϵ in (1).
We follow the path of Carlone et al. (2015a) and
assume that translation noise is distributed according to a zero-mean Gaussian with information matrix
ω2tI3, while the rotation noise follows a Von-Mises distribution with
concentration parameter ω2R.

Under these assumptions, it is possible to demonstrate (Carlone et al., 2015a)
that the ML estimate x≐{(Rαi,tαi),∀α∈Ω,∀i}
can be computed as solution of the following optimization problem:

mintαi∈R3,Rαi∈SO(3)∀α∈Ω,∀i∑(αi,βj)∈E

ω2t∥∥tβj−tαi−Rαi¯tαiβj∥∥2+

(3)

ω2R2∥∥Rβj−Rαi¯Rαiβj∥∥2F

The peculiarity of (3) is the use of the chordal distance∥Rβj−Rαi¯Rαiβj∥F to quantify
rotation errors, while the majority of related works in robotics uses
the geodesic distance, see (Carlone et al., 2015b) for details.

In this paper we consider the more interesting case in which it is not possible to
collect all measurements at a centralized estimator, and the problem has to be solved in a distributed
fashion. More formally, the problem we solve is the following.

Problem 1 (Distributed Trajectory Estimation).

Design an algorithm that each robot α can execute
during a rendezvous with a subset of other robots Ωr⊆Ω∖{α}, and that

takes as input: (i) the intra-robot measurements EαI
and (ii) the subset of inter-robot measurements EαS,
(iii) partial estimates of the trajectory of robots β∈Ωr;

returns as output: the ML estimate x⋆α, which is such that
x⋆=[x⋆α,x⋆β,x⋆γ,…]
is a minimizer of (3).

While the measurements EαI and EαS
are known by robot α, gathering the estimates from robots β∈Ωr requires
communication, hence we want our distributed algorithm to exchange a very small portion of the trajectory estimates.

The next sections present our solution to Problem 1.
To help readability, we start with a centralized description of the approach, which is an adaptation
of the chordal initialization of Carlone et al. (2015b) to the multi robot case. Then we tailor the discussion to
the distributed setup in Section 3.3.

3.2 Two-Stage Pose Graph Optimization: Centralized Description

The present work is based on two key observations.
The first one is that the optimization problem (3) has a quadratic objective;
what makes (3) hard is the presence of non-convex constraints,
i.e., Rαi∈SO(3).
Therefore, as already proposed in Carlone et al. (2015b) (for the single robot, centralized case),
we use a two-stage approach: we first solve a relaxed version of (3) and get an estimate for the rotations
Rαi of all robots, and then we recover the full poses and top-off the result with a Gauss-Newton (GN) iteration. The second key observation is that each of the two stages can be solved in distributed fashion,
exploiting existing distributed linear system solvers.
In the rest of this section we review the two-stage approach of Carlone et al. (2015b),
while we discuss the use of distributed
solvers in Section 3.3.

The two-stage approach of Carlone et al. (2015b)
first solves for the unknown rotations, and then recovers the full poses via a
single GN iteration. The two stages are detailed in the following.

Stage 1: rotation initialization via relaxation and projection.
The first stage computes a good estimate of the rotations of all robots by solving the following rotation subproblem:

minRαi∈SO(3)∀α∈Ω,∀i∑(αi,βj)∈Eω2R∥∥Rβj−Rαi¯Rαiβj∥∥2F

(4)

which amounts to estimating the rotations of all robots in the team by considering only the
relative rotation measurements (the second summand in (3)).

While problem (4) is nonconvex (due to the nonconvex constraints Rαi∈SO(3)),
many algorithms to approximate its solution are available in literature.
Here we use the approach proposed in Martinec and Pajdla (2007) and reviewed in Carlone et al. (2015b).
The approach first solves the quadratic relaxation obtained by
dropping the constraints Rαi∈SO(3), and then projects the relaxed solution to SO(3).
In formulas, the quadratic relaxation is:

minRαi,∀α∈Ω,∀i∑(αi,βj)∈Eω2R∥∥Rβj−Rαi¯Rαiβj∥∥2F

(5)

which simply rewrites (4) without the constraints.
Since (5) is quadratic in the unknown
rotations Rαi,∀α∈Ω,∀i, we
can rewrite it as:

minr∥Arr−br∥2

(6)

where we stacked all the entries of the unknown rotation matrices
Rαi,∀α∈Ω,∀i into a single
vector r, and we built the (known) matrix Ar and
(known) vector br accordingly (the presence of a nonzero vector br
follows from setting one of the rotations to be the reference frame, e.g., Rα1=I3).

Since (5) is a linear least-squares problem, its solution can be found by
solving the normal equations:

(ATrAr)r=ATrbr

(7)

Let us denote with ˘r the solution of (7).
Rewriting ˘r in matrix form, we obtain the matrices ˘Rαi,
∀α∈Ω,∀i.
Since these rotations were obtained from a relaxation of (4), they
are not guaranteed to satisfy the constraints Rαi∈SO(3); therefore
the approach (Martinec and Pajdla, 2007)
projects them to SO(3), and gets the rotation estimate ^Rαi=project(˘Rαi), ∀α∈Ω,∀i.
The projection only requires to perform
an SVD of ˘Rαi and can be performed independently for each rotation (Carlone et al., 2015b).

Stage 2: full pose recovery via single GN iteration.
In the previous stage we obtained an estimate for the rotations ^Rαi,∀α∈Ω,∀i. In this stage we use this estimate to reparametrize
problem (3). In particular, we rewrite each unknown rotation Rαi
as the known estimate ^Rαi “plus” an unknown perturbation;
in formulas, we rewrite each rotation as Rαi=^RαiExp(θαi),
where Exp(⋅) is the exponential map for SO(3), and θαi∈R3
(this is our new parametrization for the rotations).
With this parametrization, eq. (3) becomes:

(8)

+ω2R2∥∥^RβjExp(θβj)−^RαiExp(θαi)¯Rαiβj∥∥2F

Note that the reparametrization allowed to drop the constraints (we are now trying to estimate
vectors in R3), but moved the nonconvexity to the objective (Exp(⋅) is nonlinear in its argument).
In order to solve (8), we take a quadratic approximation of the cost function.
For this purpose we use the following first-order approximation of the exponential map:

Exp(θαi)≃I3+S(θαi)

(9)

where S(θαi) is a skew symmetric matrix whose entries are
defined by the vector θαi.
Substituting (9) into (8) we get the
desired quadratic approximation:

Rearranging the unknown tαi,θαi of all robots into a single vector p,
we rewrite (10) as a linear least-squares problem:

minp∥App−bp∥2

(11)

whose solution can be found by solving the linear system:

(ATpAp)p=ATpbp

(12)

From the solution of (12) we can build our trajectory
estimate: the entries of p directly define the
positions tαi, ∀α∈Ω,∀i; moreover, p includes the rotational corrections
θαi, from which we get our rotation estimate as:
Rαi=^RαiExp(θαi).

Remark 1 (Advantage of Centralized Two-Stage Approach).

The approach reviewed in this section
has three advantages. First, as shown in Carlone et al. (2015b), in common problem instances
(i.e., for reasonable levels of measurement noise) it
returns a solution that is very close to the ML estimate.
Second, the approach only requires to solve two linear
systems (the cost of projecting the rotations is negligible), hence it is computationally efficient.
Finally, the approach does not require an initial guess, therefore, it is able to converge even when the
initial trajectory estimate is inaccurate (in those instances, iterative optimization
tends to fail (Carlone et al., 2015b)) or is unavailable.
■

3.3 Distributed Pose Graph Optimization

In this section we show that the two-stage approach described in Section 3.2
can be implemented in a distributed fashion. Since the approach only requires solving
two linear systems, every distributed linear system solver can be used as workhorse to
split the computation among the robots. For instance, one could adapt the Gaussian elimination
approach of Cunningham et al. (2010) to solve the linear systems (7) and (12).
In this section we propose alternative approaches, based on the Distributed Jacobi Over-Relaxation and
the Distributed Successive Over-Relaxation algorithms,
and we discuss their advantages.

In both (7) and (12) we need to solve a linear system where
the unknown vector can be partitioned into subvectors, such that each subvector contains the
variables associated to a single robot in the team. For instance, we can partition the vector r in (7),
as r=[rα,rβ,…], such that rα describes the rotations of robot α.
Similarly, we can partition p=[pα,pβ,…] in (12), such
that pα describes the trajectory of robot α.
Therefore, (7) and (12) can be framed in the general form:

Hy=g⇔⎡⎢
⎢
⎢⎣HααHαβ…HβαHββ…⋮⋮⋱⎤⎥
⎥
⎥⎦⎡⎢
⎢⎣yαyβ⋮⎤⎥
⎥⎦=⎡⎢
⎢⎣gαgβ⋮⎤⎥
⎥⎦

(13)

where we want to compute the vector y=[yα,yβ,…] given
the (known) block matrix H and the (known) block vector g; on the right of (13)
we partitioned the square matrix H and the vector g according to the block-structure of y.

In order to introduce the distributed algorithms, we first observe that
the linear system (13) can be rewritten as:

∑δ∈ΩHαδyδ=gα∀α∈Ω

Taking the contribution of
yα out of the sum, we get:

Hααyα=−∑δ∈Ω∖{α}Hαδyδ+gα∀α∈Ω

(14)

The set of equations (14) is the same as the original system (13),
but clearly exposes the contribution of the variables associated to each robot.
The equations (14) constitute the basis for
the Successive Over-Relaxation (SOR) and the Jacobi Over-Relaxation (JOR)
methods that we describe in the following sections.

Distributed Jacobi Over-Relaxation (JOR):

The distributed JOR algorithm (Bertsekas and Tsitsiklis, 1989) starts at an arbitrary initial estimate
y\and0=[yα\and0,yβ\and0,…] and
solves the linear system (13) by repeating the following iterations:

y\andk+1α=(1−γ)y\andkα+(γ)H−1αα⎛⎝−∑δ∈Ω∖{α}Hαδy\andkδ+gα⎞⎠∀α∈Ω

(15)

where γ is the relaxation factor.
Intuitively, at each iteration robot α attempts to solve eq. (14)
(the second
summand in (15) is the solution of (14)
with the estimates of the other robots kept fixed), while remaining close to the previous estimate
y\andkα (first summand in (15)).
If the iterations (15) converge to a fixed point,
say yα∀α, then the resulting estimate solves
the linear system (14) exactly (Bertsekas and Tsitsiklis, 1989, page 131).
To prove this fact we only need to rewrite (15) after convergence:

In our multi robot problem, the distributed JOR algorithm can be understood in a simple way: at each iteration, each robot
estimates its own variables (y\andk+1α) by assuming that the ones of the other robots are constant
(y\andkδ); iterating this procedure, the robots reach an agreement on the estimates,
and converge to the solution of eq. (13). Using the distributed JOR approach, the
robots solve (7) and (12) in a distributed manner. When γ=1, the
distributed JOR method is also known as the distributed Jacobi (DJ) method.

We already mentioned that when the iterations (15) converge, then they return the exact solution of the linear system. So a natural question is: when do the Jacobi iteration converge?
A general answer is given by the following proposition.

where I is the identity matrix of suitable size.
Then, the JOR iterations (15) converge from any initial estimate
if and only if ρ(M)<1, where ρ(⋅) denotes the spectral radius
(maximum of absolute value of the eigenvalues) of a matrix.

The proposition is the same of Proposition 6.1 in (Bertsekas and Tsitsiklis, 1989)
(the condition that I−M is invertible is guaranteed to hold as
noted in the footnote on page 144 of (Bertsekas and Tsitsiklis, 1989)).

It is non-trivial to establish whether our
linear systems (7) and (12) satisfy the condition
of Proposition 2.
In the experimental section, we empirically observe that the Jacobi iterations indeed converge
whenever γ≤1. For the SOR algorithm, presented in the next section, instead,
we can provide stronger theoretical convergence guarantees.

Distributed Successive Over-Relaxation (SOR)

The distributed SOR algorithm (Bertsekas and Tsitsiklis, 1989) starts at an arbitrary initial estimate
y\and0=[yα\and0,yβ\and0,…] and, at iteration k,
applies the following update rule, for each α∈Ω:

where γ is the relaxation factor, Ω+α is the set of robots that already computed the (k+1)-th estimate, while
Ω−α is the set of robots that still have to perform the update (17),
excluding node α (intuitively: each robot uses the latest estimate).
As for the JOR algorithm, by comparing (17) and (14), we see that
if the sequence produced by the iterations (17) converges to a fixed point,
then such point satisfies (14), and indeed solves the original linear
system (13). When γ=1, the distributed SOR method is known
as the distributed Gauss-Seidel (DGS) method.

The following proposition, whose proof trivially follows from (Bertsekas and Tsitsiklis, 1989, Proposition 6.10, p. 154)
(and the fact that the involved matrices are positive definite),
establishes when the distributed SOR algorithm converges to the desired solution.

Proposition 3 (Convergence of SOR).

The SOR iterations (17) applied to the linear systems (7) and (12)
converge to the solution of the corresponding linear system
(from any initial estimate) whenever γ∈(0,2), while the iterations do no
converge to the correct solution whenever γ∉(0,2).

According to (Bertsekas and Tsitsiklis, 1989, Proposition 6.10, p. 154),
for γ∉(0,2),
the SOR iterations (17) do not converge to the
solution of the linear system in general, hence also in practice, we restrict
the choice of γ in the open interval (0,2). In the experimental section, we
show that the choice γ=1 ensures the fastest convergence.

Communication Requirements for JOR and SOR

In this section we observe that to execute the JOR and SOR iterations (15)(17),
robot α only needs its intra and inter-robot measurements EαI and EαS,
and an estimate of the separators, involved in the inter-robot measurements in EαS.
For instance, in the graph of Fig. 3 robot α only needs the
estimates of yβ1 and yβ3, while does not require any
knowledge about the other poses of β.

To understand this fact, we note that both (7) and (12) model
an estimation problem from pairwise relative measurements. It is well known that the
matrix H (sometimes called the Hessian(Dellaert, 2005)) underlying these problems has a block structure defined by the Laplacian matrix of the
underlying graph (Barooah and Hespanha, 2007). For instance, Fig. 3 (right) shows the
block sparsity of the matrix H describing the graph on the left: off-diagonal block-elements
in position (αi,βj) are non zero if and only if there is an edge (i.e., a measurement) between αi
and βj.

By exploiting the block sparsity of H, we can further simplify the JOR (15) iterations
as:

where we removed the contributions of the zero blocks from the sum in (17);
the sets EαS+ and EαS− satisfy EαS+∪EαS−=EαS,
and are such that EαS+ includes the inter-robot measurements involving
robots which already performed the (k+1)-th iteration, while EαS−
is the set of measurements involving robots that have not performed the iteration yet (as before:
each robot simply uses its latest estimate).

Eqs. (18) and (19) highlight that the JOR and SOR iterations (at robot α) only require the estimates
for poses involved in its inter-robot measurements EαS.
Therefore both JOR and SOR involve almost no “privacy violation”: every other robot
β in the team does not need to communicate any other information about its own
trajectory, but only sends an estimate of its rendezvous poses.

Flagged Initialization

As we will see in the experimental section and according to
Proposition 3, the JOR and SOR approaches converge
from any initial condition when γ is chosen appropriately. However, starting from a “good” initial
condition can reduce the number of iterations to converge, and in turns reduces the communication
burden (each iteration (18) or (19) requires the robots to exchange
their estimate of the separators).

In this work, we follow the path of Barooah and Hespanha (2005) and adopt a flagged initialization.
A flagged initialization scheme only alters the first JOR or SOR iteration as follows.
Before the first iteration, all robots are marked as “uninitialized”.
Robot α performs its iteration (18) or (19) without
considering the inter-robot measurements, i.e., eqs. (18)-(19)
become y\andk+1α=H−1ααgα; then the robot α
marks itself as “initialized”. When the robot β performs its iteration, it includes
only the separators from the robots that are initialized; after performing the JOR or SOR iteration,
also β marks itself as initialized. Repeating this procedure, all robots
become initialized after performing the first iteration. The following iterations then proceed
according to the standard JOR (18) or SOR (19) update.
Barooah and Hespanha (2005) show a significant improvement in convergence
using flagged initialization. As discussed in the experiments, flagged initialization is
also advantageous in our distributed pose graph optimization problem.

The second contribution of this paper is the use of high-level object-based models at the
estimation front-end and as a map representation for multi robot SLAM.
Object-based abstractions are crucial to further reduce the memory storage and the
information exchange among the robots.

Previous approaches for multi robot mapping rely on feature-based maps which become memory-intensive for long-term operation, contain a
large amount of redundant information, and lack the semantic understanding necessary to
perform a wider range of tasks (e.g., manipulation, human-robot interaction). To solve these issues,
we present an approach for multi robot SLAM which uses
object landmarks (Salas-Moreno et al., 2013) in a multi robot mapping setup.

Figure 4: Factor graph representation of Multi-Robot Object based SLAM. xαi and xβi
denote the poses assumed by robot α and β at time i respectively. The
pose of the kth object in the coordinate frame of
robot α and β is denoted with oαk and oβk respectively. Green dots shows inter-robot factors
whereas orange and purple dots shows intra-robot factors.

4.1 Distributed Object-based SLAM

We consider a multi robot system as defined in Section 3.1.
Each robot, in addition to estimating its own trajectory using local measurements and occasional communication with other robots, also estimates the pose of a set of objects in the environment.
We model each trajectory as a finite set of poses; the trajectory of robot α is xα=[xα1,xα2,…].
In addition, we denote with oαk∈SE(3) the pose of the kth object in the coordinate frame of robot α (Fig. 4).

Measurements. Similar to distributed pose graph optimization (Section 3.1), we assume that each robot acquires two types of relative pose measurements: intra-robot and inter-robot measurements. The intra-robot measurements consist of the odometry measurements, which constrain consecutive robot poses (e.g., xαi and xαi+1 in Fig. 4), and object measurements which constrains robot poses with the corresponding visible object landmarks (e.g., xαi and oαk in Fig. 4).
Contrarily to Section 3.1,
the inter-robot measurements relate the object poses observed by different robots.
During a rendezvous between robot α and robot β, each robot shares the label and pose of detected object landmarks with the other robot. Then, for each object observed by both robots,
the teammates add an inter-robot measurements,
enforcing the object pose estimate to be consistent across the teammates.
For instance, if oβk and oαk in Fig. 4
model the pose of the same object, then the two poses should be identical. For this reason, intra-robot measurement between a pair of associated object poses is zero.

The intra-robot object measurements follow the same measurements model of eq. (1).
For instance, if the robot α at time i and at pose xαi
observes an object at pose oαk, then the corresponding measurement
¯zxαioαk measures the relative pose between xαi and oαk.
Consistently with our previous notation, we denote intra-robot object measurement between xαi and oαk as ¯zxαioαk,
and inter-robot measurement between object poses oαk and oβk as ¯zoαkoβk.

In the following, we denote with EαI the set of intra-robot odometry for robot α, while we call EI the set of intra-robot odometry measurements for all robots in the team, i.e., EI=∪α∈ΩEαI.
Similarly the set of intra-robot object measurements for robot α is denoted as Eαo, whereas the set of all intra-robot object measurements is denoted as Eo.
Similar to Section 3.1, the set of inter-robot measurements involving robot α is denoted with EαS.
The set of all inter-robot measurements is denoted with ES. The set of all available measurements is then E=EI∪EO∪ES.
Note that each robot only has access to its own intra and inter-robot measurements
EαI, EαO and EαS.

ML trajectory and objects estimation.
Let us collect all robot trajectories and object poses in a
(to-be-estimated) set of robot poses x=[xα,xβ,xγ,…] and set of object poses o=[oα,oβ,oγ,…].
The ML estimate for x and o is defined as the maximum of the measurement likelihood:

where we used the same assumptions on measurement noise as in Section 3.1.
Defining X=x∪o, we rewrite eq. (20) as:

X⋆=argmaxX∏(αi,βj)∈EL(¯zαiβj|X)

(21)

Since the optimization problem in eq. (21) has the same structure of the one in
eq. (2), we follow the same steps to solve
it in a distributed manner using the Distributed Gauss-Seidel method.

The next section presents the implementation details of our distributed object-based SLAM system.

Figure 5: Flowchart of Object based SLAM

4.2 Object-based SLAM Implementation

Object detection and pose estimation.
Each robot collects RGBD data using a depth camera, and measures its ego-motion
through wheel odometry. In our approach, each RGB frame (from RGBD) is passed to the YOLO object detector (Redmon et al., 2015),
which detects objects at 45 frames per second.
Compared to object-proposal-based detectors, YOLO is fast, since it avoids the computation burden of extracting object proposals, and is less likely to produce false positives in the background. We fine-tune the YOLO detector on a subset of objects from the BigBird dataset (Singh et al. (2014)). The training dataset contains the object images in a clean background taken from different viewpoints and labeled images of the same objects taken by a robot in an indoor environment. During testing, we use a probability threshold of 0.3 to avoid false detections.

Each detected object bounding box is segmented using the organized point cloud segmentation(Trevor et al., 2013). The segmented object is then matched to the 3D template of the detected object class to estimate its pose. We extract PFHRGB features (Rusu et al., 2008) in the source (object segment) and target (object model) point clouds and register the two point clouds in a Sample Consensus Initial Alignment framework (Rusu, 2009). If we have at least 12 inlier correspondences, GICP (generalized iterative closest
point Segal et al. (2009) is performed to further refine the registration and the final transformation is used as the object pose estimate. If less than 12 inlier correspondences are found, the detection is considered to be a false positive and the
corresponding measurement is discarded. In hindsight, this approach verifies the detection both semantically and geometrically.

Data Association.
If object pose estimation is successful, it is data-associated with other instances already present in the map by finding the object landmark having the same category label within 2σ distance of the newly detected object. If there are multiple objects with the same label within that distance, the newly detected object is matched to the nearest object instance. If there exists no object having the same label, a new object landmark is created.

Before the first rendezvous event, each robot performs
standard single-robot SLAM using OmniMapper (Trevor et al., 2012).
Both wheel odometry and relative pose measurements to the observed objects
are fed to the SLAM back-end.
A flowchart of the approach is given in Fig. 5.

Robot Communication.
During a rendezvous between robots α and β, robot α communicates the category labels (class) and poses (in robot α’s frame) of all the detected objects to robot β.
We assume that the initial pose of each robot is known to all the robots, hence,
given the initial pose of robot α, robot β is able to transform the communicated object
poses from robot α’s frame to its own frame.1 For each object in the list communicated by
robot α,
robot β finds the nearest object in its map, having the
same category label and within 2σ distance.
If such an object exists, it is added to the list of shared objects: this is the set of objects seen by both robots.
The list of shared objects contains pairs (oαk,oβl) and informs the
robots that the poses oαk and oβl correspond to the same physical object,
observed by the two robots.
For this reason, in the optimization we enforce the relative pose between
oαk and oβl to be zero.

We remark that, while before the first rendezvous the robots α and β have
different reference frames, the object-object factors enforce both robots to have a single
shared reference frame, facilitating future data association.

Next we show the experimental evaluation which includes realistic Gazebo simulations and field experiments in a military test facility.

We evaluate the distributed JOR and SOR along with DJ and DGS approaches
(with and without using objects) in large-scale simulations (Section 5.1 and 5.2)
and field tests (Section 5.3 and 5.4).
The results demonstrate that (i) the DGS dominates the other algorithms
considered in this paper in terms of convergence speed,
(ii) the DGS algorithm is accurate, scalable, and robust to noise,
(iii) the DGS requires less communication than techniques from related work (i.e., DDF-SAM),
and (iv) in real applications, the combination of DGS and object-based mapping reduces the
communication requirements by several orders of magnitude compared to approaches
exchanging raw measurements.

5.1 Simulation Results: Multi Robot Pose Graph Optimization

In this section, we characterize the performance of the proposed approaches in terms of
convergence, scalability (in the number of robots and separators),
and sensitivity to noise.

Simulation setup and performance metrics.
For our tests, we created simulation datasets in six different configurations
with increasing number of robots: 4, 9, 16, 25, 36 and 49 robots. The robots are arranged
in a 3D grid with each robot moving on a cube, as shown in Fig. 6.
When the robots are at contiguous corners, they can communicate (gray links).
Unless specified otherwise, we generate measurement noise from a zero-mean Gaussian distribution with
standard deviation σR=5∘ for the rotations and σt=0.2m for the translations.
Results are averaged over 10 Monte Carlo runs.

In our problem, JOR or SOR are used to sequentially solve two linear systems, (7)
and (12), which return the
minimizers of (6) and (11), respectively.
Defining, mr≐minr∥Arr−br∥2, we use
the following metric, named the rotation estimation error, to quantify the error in solving (7):

er(k)=∥Arr\andk−br∥2−mr

(22)

er(k) quantifies how far is the current estimate r\andk (at the k-th iteration)
from the minimum of the quadratic cost. Similarly, we define the pose estimation error as:

ep(k)=∥App\andk−bp∥2−mp

(23)

with mp≐minp∥App−bp∥2.
Ideally, we want er(k) and ep(k) to quickly converge to zero for increasing k.

Ultimately, the accuracy of the proposed approach depends on the number of iterations,
hence we need to set a stopping condition for the JOR or SOR iterations.
We use the following criterion: we stop the iterations if the change in the estimate is sufficiently small.
More formally, the iterations stop when ∥r\andk+1−r\andk∥≤ηr (similarly, for the
second linear system ∥p\andk+1−p\andk∥≤ηp). We use ηr=ηp=10−1 as
stopping condition unless specified otherwise.

(a) Rotation Estimation(b) Pose EstimationFigure 7:
JOR: convergence of (a) rotation estimation and (b) pose estimation
for different values of γ (grid scenario, 49 robots).
In the case of pose estimation, the gap between the initial values
of γ>1 and γ≤1 is due to the bad initialization provided by the rotation estimation for γ>1.
(a) Rotation Error(b) Pose ErrorFigure 8: SOR: convergence of (a) rotation estimation and (b) pose estimation
for different values of γ (grid scenario, 49 robots).

Comparisons among the distributed algorithms.
In this section we consider the scenario with 49 robots.
We start by studying the convergence properties of the JOR and SOR algorithms in
isolation. Then we compare the two algorithms in terms of convergence speed.
Fig. 7 shows the rotation and the pose error versus the number of iterations
for different choices of the parameter γ for the JOR algorithm.
Fig. 7a confirms the result of Proposition 2:
JOR applied to the rotation subproblem converges as long as γ≤1.
Fig. 7a shows that for any γ>1 the estimate diverges,
while the critical value γ=1 (corresponding to the DJ method) ensures the fastest convergence rate.
Fig. 8 shows the rotation and the pose error versus the number of iterations
for different choices of the parameter γ∈(0,2) for the SOR algorithm.
The figure confirms the result of Proposition 3: the SOR algorithm converges
for any choice of γ∈(0,2).
Fig. 8a shows that choices of γ close to 1 ensure fast
convergence rates, while Fig. 8b established
γ=1 (corresponding to the DGS method) as the parameter selection with
faster convergence.
In summary, both JOR and SOR have top performance when γ=1.
Later in this section we show that γ=1 is the best choice independently
on the number of robots and the measurement noise.

(a) Rotation Error(b) Pose ErrorFigure 9: JOR vs SOR:
convergence of (a) rotation estimation and (b) pose estimation for
the JOR and SOR algorithms with γ=1 (grid scenario, 49 robots).
(a) Rotation Estimation(b) Pose EstimationFigure 10: JOR vs SOR: number of iterations
required for (a) rotation estimation and (b) pose estimation for
the JOR and SOR algorithms with γ=1 (grid scenario, 49 robots).
The average number of iterations is shown as a solid line, while the 1-sigma standard deviation
is shown as a shaded area.

Let us now compare JOR and SOR in terms of convergence.
Fig. 9 compares the convergence rate of SOR and JOR for both the rotation subproblem (Fig. 9a)
and the pose subproblem (Fig. 9b).
We set γ=1 in JOR and SOR since we already observed that
this choice ensures the best performance.
The figure confirms that SOR dominates JOR in both subproblems.
Fig. 10 shows the number of iterations for
convergence (according to our stopping conditions) and for different choices of the parameter γ.
Once again, the figure confirms that the SOR with γ=1 is able to
converge in the smallest number of iterations, requiring only few tens of iterations
in both the rotation and the pose subproblem.

(a) Rotation Error(b) Pose ErrorFigure 11:
SOR: number of iterations required for (a) rotation estimation and (b) pose estimation in the SOR algorithm
for different choices of γ and increasing number of robots.
(a) Rotation Error(b) Pose ErrorFigure 12:
SOR: number of iterations required for (a) rotation estimation and (b) pose estimation in the SOR algorithm
for different choices of γ and increasing measurement noise.

We conclude this section by showing that setting γ=1 in SOR ensure faster
convergence regardless the number of robots and the measurement noise.
Fig. 11 compares
the number of iterations required to converge for increasing number of robots for varying γ values. Similarly Fig. 12 compares
the number of iterations required to converge for increasing noise for varying γ value. We can see that in both the cases γ=1 has the fastest convergence (required the least number of iterations)
irrespective of the number of robots and measurement noise.
Since SOR with γ=1, i.e., the DGS method, is the
top performer in all test conditions, in the rest of the paper we restrict our analysis to this algorithm.

Flagged initialization. In this paragraph we discuss the advantages of the flagged initialization.
We compare the DGS method with flagged initialization against a naive initialization in which the variables
(r\and0 and p\and0, respectively) are initialized to zero. The results, for the dataset with
49 robots, are shown in Fig. 13.
In both cases the estimation errors go to zero, but the convergence is faster when using the flagged initialization.
The speed-up is significant for the second linear system (Fig. 13b).
We noticed a similar advantage across all tested scenarios.
Therefore, in the rest of the paper we always adopt the flagged initialization.

Stopping conditions and anytime flavor.
This section provides extra insights on the convergence of the DGS method.
Fig. 14a-b show the evolution of the rotation and pose error for each robot in the 49-robot grid:
the error associated to each robot (i.e., to each subgraph corresponding to a robot trajectory)
is not monotonically decreasing
and the error for some robot can increase to bring down the overall error.
Fig. 14c-d report the change in the
rotation and pose estimate for individual robots. Estimate changes become negligible within few tens of iterations. As mentioned at the
beginning of the section, we stop the DGS iterations when the estimate change is sufficiently small
(below the thresholds ηr and ηp).

(a) Initial(b) 10 iterations(c) 1000 iterationsFigure 15: DGS: Trajectory estimates for the scenario with 49 robots.
(a) Odometric estimate (not used in our approach and only given for visualization purposes), (b)-(c) DGS estimates after given number of iterations.

Fig. 15 shows the estimated trajectory after 10 and 1000 iterations
of the DGS algorithm for the 49-robot grid. The odometric estimate (Fig. 15a) is shown for visualization purposes, while
it is not used in our algorithm. We can see that
the estimate after 10 iterations is already visually close to the estimate after 1000 iterations.
The DGS algorithm has an any-time flavor: the trajectory estimates are already accurate
after few iterations and asymptotically converge to the centralized estimate.

Scalability in the number of robots.
Fig. 16 shows the average rotation and pose errors for all the simulated datasets
(4, 9, 16, 25, 36 and 49 robots). In all cases the errors quickly converge to zero.
For large number or robots the convergence rate becomes slightly slower, while in all cases
the errors is negligible in few tens of iterations.

#Robots

Distributed Gauss-Seidel

Centralized

ηr=ηp=10−1

ηr=ηp=10−2

Two-Stage

GN

#Iter

Cost

#Iter

Cost

Cost

Cost

4

10

1.9

65

1.9

1.9

1.9

9

14

5.3

90

5.2

5.2

5.2

16

16

8.9

163

8.8

8.8

8.7

25

17

16.2

147

16.0

16.0

15.9

36

28

22.9

155

22.7

22.6

22.5

49

26

35.1

337

32.9

32.7

32.5

Table 1:
Number of iterations and cost attained in problem (3) by the DGS algorithm
(for two choices of the stopping conditions), versus a centralized two-stage approach and a GN method.
Results are shown for scenarios with increasing number of robots.

While so far we considered the errors for each subproblem (er(k) and ep(k)),
we now investigate the overall accuracy of the DGS algorithm to solve our original problem (3).
We compare the proposed approach against the centralized two-stage approach of Carlone et al. (2015b)
and against a standard (centralized) Gauss-Newton (GN) method,
available in gtsam (Dellaert (2012)). We use the cost
attained in problem (3) by each technique as accuracy metric (the lower the better).
Table 1 reports the number of iterations and the cost
attained in problem (3), for the compared techniques.
The number of iterations is the sum of the number of iterations required to solve (7)
and (12).
The cost of the DGS approach is given for two choices of the thresholds ηr and ηp.
As already reported in Carlone et al. (2015b), the last two columns of the table confirm that
the centralized two-stage approach is practically as accurate as a GN method.
When using a strict stopping condition (ηr=ηp=10−2), the DGS approach
produces the same error as the centralized counterpart (difference smaller than 1%).
Relaxing the stopping conditions to ηr=ηp=10−1 implies a consistent reduction
in the number of iterations, at a small loss in accuracy (cost increase
is only significant for the scenario with 49 robots). In summary,
the DGS algorithm (with ηr=ηp=10−1) ensures accurate estimation within
few iterations, even for large teams.

Sensitivity to measurement noise.
Fig. 17 shows the average rotation and pose errors
for increasing levels of noise
in the scenario with 49 robots.
Also in this case, while larger noise seems to imply longer convergence tails,
the error becomes sufficiently small after few tens of iterations.

Table 2 evaluates the performance of the DGS method in
solving problem (3) for increasing levels of noise, comparing it against
the centralized two-stage approach of Carlone et al. (2015b)
and the Gauss-Newton method.
The DGS approach is
able to replicate the accuracy of the centralized two-stage approach, regardless the
noise level, while the choice of thresholds ηr=ηp=10−1 ensures accurate estimation within
few tens of iterations.

Measurement

Distributed Gauss-Seidel

Centralized

noise

ηr=ηp=10−1

ηr=ηp=10−2

Two-Stage

GN

σr(∘)

σt(m)

#Iter

Cost

#Iter

Cost

Cost

Cost

1

0.05

8.5

2.1

51.0

1.8

1.8

1.8

5

0.1

21.8

14.8

197.8

14.0

14.0

13.9

10

0.2

35.6

58.4

277.7

56.6

56.6

56.0

15

0.3

39.8

130.5

236.8

128.4

129.3

126.0

Table 2:
Number of iterations and cost attained in problem (3) by the DGS algorithm
(for two choices of the stopping conditions), versus a centralized two-stage approach and a GN method.
Results are shown for increasing measurement noise.
(a)(b)Figure 18:
DGS vs DDF-SAM:
(a) average number of iterations versus number of separators for the DGS algorithm.
(b) communication burden (bytes of exchanged information) for DGS and DDF-SAM, for increasing
number of separators.

Scalability in the number of separators.
In order to evaluate the impact of the number of separators on convergence,
we simulated two robots moving along parallel tracks for 10 time stamps.
The number of communication links were varied from 1 (single communication) to 10 (communication at every time),
hence the number of separators (for each robot) ranges from 1 to 10.
Fig. 18a shows the number of iterations
required by the DGS algorithm (ηr=ηp=10−1), for increasing
number of communication links.
The number of iterations is fairly insensitive to the number of communication links.

Fig. 18b compares the
information exchanged in the DJ algorithm against a state-of-the-art algorithm,
DDF-SAM (Cunningham et al. (2010)). In DDF-SAM, each robot sends
KGN[sBp+(sBp)2] bytes,
where KGN is the number of iterations required by a GN method applied to problem (3)
(we consider the best case KGN=1),
s is the number of separators and Bp is the size of a pose in bytes.
In the DGS algorithm, each robots sends
KrDGS(sBr)+KpDGS(sBp)
bytes,
where KrDGS and KpDGS are the number of iterations required by the
DGS algorithm to solve the linear systems (7)
and (12), respectively, and Br is the size of a rotation (in bytes).
We assume Br=9 doubles (72 bytes)2 and Bp=6 doubles (48 bytes).
The number of iterations KrDGS and KpDGS are the one plotted in
Fig. 18a. From Fig. 18b we see that
the communication burden of DDF-SAM quickly becomes unsustainable, while the
linear increase in communication of the DGS algorithm implies large communication saving.

Realistic simulations in Gazebo.
We tested our DGS-based approach in two scenarios in Gazebo simulations as shown in Fig. 19.
The robots start at fixed locations and explore the environment by moving according to a random walk.
Each robot is equipped with a 3D laser range finder, which is used to intra-robot and inter-robot measurements
via scan matching.
In both scenarios, two robots communicate only when they are within close proximity of each other (0.5m in our tests).
Results are average over 100 Monte-Carlo runs.

Fig. 19 shows the aggregated point cloud corresponding to the DGS trajectory estimate, for one of the runs. The point cloud closely resembles the ground truth environment shown in the same figure. Fig. 20a shows that number of steps required to explore the whole environment quickly decreases with increasing number of robots. This intuitive observation motivates our interest towards mapping techniques that can
scale to large teams of robots. Fig. 20b reports trajectory samples
for different robots in our Monte Carlo analysis.

5.2 Simulation Results: Multi Robot Object based SLAM

In this section we characterize the performance of the DGS algorithms associated with our object-based model
described in Section 4.
We test the resulting
multi robot object-based SLAM approach in terms of scalability in the number of robots and sensitivity to noise.

Simulation setup and performance metrics.
We consider two scenarios, the 25 Chairs and the House, which we simulated in Gazebo. In the 25 Chairs scenario, we placed 25 chairs as objects on a grid, with each chair placed at a random angle. In the House scenario, we placed furniture as objects in order to simulate a living room environment. Fig. 21 shows the two scenarios.
Unless specified otherwise, we generate measurement noise from a zero-mean Gaussian distribution with standard deviation σR=5∘ for the rotations and σt=0.2m for the translations. Six robots are used by default.
Results are averaged over 10 Monte Carlo runs.

We use the absolute translation error (ATE*) and absolute rotation error (ARE*) of the robot and landmark poses with respect to the centralized estimate as error metric.
These two metrics are formally defined below.

Absolute Translation Error (ATE*). Similar to the formulation by Sturm et al. Sturm et al. (2012), the average translation error measures the absolute distance between the trajectory and object poses estimated by our approach versus the centralized GN method.
The ATE* is defined as:

ATE∗=(1∑α∈Ωnα∑α∈Ωnα∑i=1∥tαi−t∗αi∥2)12

(24)

where tαi is the position estimate for robot α at time i,
t∗αi is the corresponding estimate from GN, and nα is the number of
poses in the trajectory of α. A similar definition holds for the object positions.

Absolute Rotation Error (ARE*). The average rotation error is computed by
evaluating the angular mismatch between the (trajectory and objects) rotations produced by the proposed
approach versus a centralized GN method:

ARE∗=(1∑α∈Ωnα∑α∈Ωnα∑i=1∥Log((R∗αi)TRαi)∥2)12

(25)

where Rαi is the rotation estimate for robot α at time i,
R∗αi is the corresponding estimate from GN.
A similar definition holds for the object rotations.

CentralizedDistributedFigure 22:
Trajectories of the six robots and object locations (shows as red dots) estimated using the centralized GN method and
the proposed DGS method for the 25 Chairs (top) and House scenarios (bottom).