"... The success of mobile robots, and particularly of those interfacing with humans in daily environments (e.g., assistant robots), relies on the ability to manipulate information beyond simple spatial relations. We are interested in semantic information, which gives meaning to spatial information li ..."

The success of mobile robots, and particularly of those interfacing with humans in daily environments (e.g., assistant robots), relies on the ability to manipulate information beyond simple spatial relations. We are interested in semantic information, which gives meaning to spatial information like images or geometric maps. We present a multi-hierarchical approach to enable a mobile robot to acquire semantic information from its sensors, and to use it for navigation tasks. In our approach, the link between spatial and semantic information is established via anchoring. We show experiments on a real mobile robot that demonstrate its ability to use and infer new semantic information from its environment, improving its operation.
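As a rough illustration of how the anchoring link between the two hierarchies could be organized, the Python sketch below pairs symbols from a conceptual hierarchy with regions from a spatial one; the class names (SpatialNode, ConceptNode, Anchor, SemanticMap) and the lookup method are assumptions made for this sketch, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class SpatialNode:
    """A node in the spatial hierarchy, e.g. a grid-map region or a room polygon."""
    name: str
    children: list = field(default_factory=list)

@dataclass
class ConceptNode:
    """A node in the semantic (conceptual) hierarchy, e.g. 'Room' -> 'Kitchen'."""
    name: str
    children: list = field(default_factory=list)

@dataclass
class Anchor:
    """Links a symbol in the conceptual hierarchy to percept-level spatial data."""
    concept: ConceptNode
    spatial: SpatialNode

class SemanticMap:
    def __init__(self):
        self.anchors = []

    def anchor(self, concept, spatial):
        a = Anchor(concept, spatial)
        self.anchors.append(a)
        return a

    def spatial_for(self, concept_name):
        """Resolve a semantic label (e.g. 'kitchen') to its anchored spatial regions."""
        return [a.spatial for a in self.anchors if a.concept.name == concept_name]

# Usage: navigate to "the kitchen" by resolving the symbol to an anchored region.
m = SemanticMap()
m.anchor(ConceptNode("kitchen"), SpatialNode("region_7"))
print(m.spatial_for("kitchen"))  # -> [SpatialNode(name='region_7', children=[])]
```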

...sensors, semantic information is most often hand-coded into the system. Recently, a few authors have reported systems in which the robot can acquire and use semantic information [7], [8]. In most cases, however, the acquisition is done via a linguistic interaction with a human and not using the robot’s own sensors. An interesting exception is [9], in which the robot extracts semantic...

"... This paper describes a new model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Previous approaches have used models with fixed structure to infer the likelihood of a sequence of actions given t ..."

This paper describes a new model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Previous approaches have used models with fixed structure to infer the likelihood of a sequence of actions given the environment and the command. In contrast, our framework, called Generalized Grounding Graphs (G³), dynamically instantiates a probabilistic graphical model for a particular natural language command according to the command’s hierarchical and compositional semantic structure. Our system performs inference in the model to successfully find and execute plans corresponding to natural language commands such as “Put the tire pallet on the truck.” The model is trained using a corpus of commands collected using crowdsourcing. We pair each command with robot actions and use the corpus to learn the parameters of the model. We evaluate the robot’s performance by inferring plans from natural language commands, executing each plan in a realistic robot simulator, and asking users to evaluate the system’s performance. We demonstrate that our system can successfully follow many natural language commands from the corpus.
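A toy sketch of the central idea described above, instantiating a grounding graph from the command's compositional parse and running inference over it, is given below; the parse representation, the hand-written factor function, and the brute-force inference are illustrative assumptions, not the published G³ system.

```python
import itertools

# A toy parse of "Put the tire pallet on the truck": each phrase becomes a
# grounding variable; the compositional structure decides which factors exist.
parse = {
    "put":             {"children": ["the tire pallet", "on the truck"]},
    "the tire pallet": {"children": []},
    "on the truck":    {"children": ["the truck"]},
    "the truck":       {"children": []},
}

world_objects = ["pallet_1", "pallet_2", "truck_1", "place_near_truck"]

def factor(phrase, grounding, child_groundings):
    """Stand-in for a learned factor scoring phrase/grounding compatibility.
    A real factor would also condition on the child groundings."""
    score = 1.0
    if "pallet" in phrase and "pallet" in grounding:
        score *= 5.0
    if "truck" in phrase and "truck" in grounding:
        score *= 5.0
    return score

def best_groundings(parse, candidates):
    """Brute-force inference over the dynamically instantiated graph (toy-sized)."""
    phrases = list(parse)
    best, best_score = None, 0.0
    for assignment in itertools.product(candidates, repeat=len(phrases)):
        g = dict(zip(phrases, assignment))
        score = 1.0
        for phrase, node in parse.items():
            score *= factor(phrase, g[phrase], [g[c] for c in node["children"]])
        if score > best_score:
            best, best_score = g, score
    return best

print(best_groundings(parse, world_objects))
```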

"... Abstract—Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped into structures that the robot can understand, and elements in those ..."

Abstract—Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped into structures that the robot can understand, and elements in those structures must be grounded in an uncertain environment. We present a system that follows natural language directions by extracting a sequence of spatial description clauses from the linguistic input and then infers the most probable path through the environment given only information about the environmental geometry and detected visible objects. We use a probabilistic graphical model that factors into three key components. The first component grounds landmark phrases such as “the computers” in the perceptual frame of the robot by exploiting co-occurrence statistics from a database of tagged images such as Flickr. Second, a spatial reasoning component judges how well spatial relations such as “past the computers” describe a path. Finally, verb phrases such as “turn right” are modeled according to the amount of change in orientation in the path. Our system follows 60% of the directions in our corpus to within 15 meters of the true destination, significantly outperforming other approaches. Index Terms—spatial language, direction understanding, route instructions
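The three-factor path score described in the abstract might be pictured as in the following sketch, where simple geometric checks and keyword matches stand in for the learned landmark, spatial-relation, and verb components; all thresholds and weights are assumptions made for illustration.

```python
import math

def landmark_factor(landmark, detected_objects):
    """Co-occurrence proxy: is the landmark word among the detected object labels?"""
    return 0.9 if landmark in detected_objects else 0.1

def spatial_relation_factor(relation, path_xy, landmark_xy):
    """Toy geometric check for 'past': the path should come close to the landmark."""
    min_d = min(math.dist(p, landmark_xy) for p in path_xy)
    return 0.9 if relation == "past" and min_d < 2.0 else 0.2

def verb_factor(verb, path_xy):
    """'turn right' scored by the net change in heading along the path."""
    headings = [math.atan2(b[1] - a[1], b[0] - a[0])
                for a, b in zip(path_xy, path_xy[1:])]
    net_turn = headings[-1] - headings[0]
    if verb == "turn right":
        return 0.9 if net_turn < -math.pi / 4 else 0.1
    return 0.5

def path_score(clause, path_xy, detected_objects, landmark_xy):
    """Product of the three factors for one spatial description clause."""
    return (landmark_factor(clause["landmark"], detected_objects)
            * spatial_relation_factor(clause["relation"], path_xy, landmark_xy)
            * verb_factor(clause["verb"], path_xy))

clause = {"verb": "turn right", "relation": "past", "landmark": "computers"}
path = [(0, 0), (1, 0), (2, -0.5), (2.5, -1.5)]   # curves right, passing near (2, 0)
print(path_score(clause, path, {"computers", "door"}, landmark_xy=(2.0, 0.0)))
```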

...3]). Others have created language understanding systems that follow natural language commands, but without using a corpus-based evaluation to enable untrained users to interact with the system (e.g., [5, 6]). Bauer et al. [7] built a robot that can find its way through an urban environment by interacting with pedestrians using a touch screen and gesture recognition system. The structure of the spatial d...

by
Matt MacMahon
- In Proc. of the Nat. Conf. on Artificial Intelligence (AAAI), 2006

"... Following verbal route instructions requires knowledge of language, space, action and perception. We present MARCO, an agent that follows free-form, natural language route instructions by representing and executing a sequence of compound action specifications that model which actions to take under w ..."

Following verbal route instructions requires knowledge of language, space, action and perception. We present MARCO, an agent that follows free-form, natural language route instructions by representing and executing a sequence of compound action specifications that model which actions to take under which conditions. MARCO infers implicit actions from knowledge of linguistic conditional phrases and of spatial actions and local configurations. Thus, MARCO performs explicit actions, implicit actions necessary to achieve the stated conditions, and exploratory actions to learn about the world. We gathered a corpus of 786 route instructions from six people in three large-scale virtual indoor environments. Thirty-six other people followed these instructions and rated them for quality. These human participants finished at the intended destination on 69% of the trials. MARCO followed the same instructions in the same environments, with a success rate of 61%. We measured the efficacy of action inference using MARCO variants lacking it: executing only explicit actions, MARCO succeeded on just 28% of the trials. For this task, inferring implicit actions is essential to follow poor instructions, but is also crucial for many highly-rated route instructions.
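A minimal sketch of the compound action-specification idea, take an action when its condition holds and insert implicit moves until it does, might look like the following; the ActionSpec class, the toy world, and the step function are hypothetical stand-ins, not MARCO's code.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ActionSpec:
    """A compound action: perform `action` once `condition` holds."""
    action: str                         # e.g. "move_forward", "turn_left"
    condition: Callable[[dict], bool]   # predicate over the agent's current view

def follow(specs, world_view, step, max_steps=50):
    """Execute explicit actions; insert implicit moves until each condition holds."""
    trace = []
    for spec in specs:
        for _ in range(max_steps):
            if spec.condition(world_view):
                break
            # Implicit action: keep moving toward the stated condition.
            world_view = step(world_view, "move_forward")
            trace.append("move_forward (implicit)")
        world_view = step(world_view, spec.action)
        trace.append(spec.action)
    return trace

# Toy world: the view is just the agent's corridor position; the chair is at cell 3.
def step(view, action):
    if action == "move_forward":
        view = {**view, "pos": view["pos"] + 1}
    return view

# "Turn left at the chair" becomes one compound specification.
specs = [ActionSpec("turn_left", condition=lambda v: v["pos"] == 3)]
print(follow(specs, {"pos": 0}, step))
# -> three implicit forward moves, then 'turn_left'
```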

"... Abstract—Mobile robots that interact with humans in an intuitive way must be able to follow directions provided by humans in unconstrained natural language. In this work we investigate how statistical machine translation techniques can be used to bridge the gap between natural language route instruc ..."

Abstract—Mobile robots that interact with humans in an intuitive way must be able to follow directions provided by humans in unconstrained natural language. In this work we investigate how statistical machine translation techniques can be used to bridge the gap between natural language route instructions and a map of an environment built by a robot. Our approach uses training data to learn to translate from natural language instructions to an automatically-labeled map. The complexity of the translation process is controlled by taking advantage of physical constraints imposed by the map. As a result, our technique can efficiently handle uncertainty in both map labeling and parsing. Our experiments demonstrate the promising capabilities achieved by our approach. Index Terms—Human-robot interaction; instruction following; navigation; statistical machine translation; natural language
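One way to picture translation constrained by the map is a Viterbi-style decode in which word-to-label scores stand in for a learned translation model and the map's adjacency graph restricts label transitions; the labels, probabilities, and adjacency below are illustrative assumptions, not the authors' model.

```python
# Instruction words are "translated" into a sequence of map labels (rooms/corridors);
# the map's adjacency constrains which label can follow which.
adjacency = {
    "hall":    {"hall", "kitchen", "office"},
    "kitchen": {"kitchen", "hall"},
    "office":  {"office", "hall"},
}

# Toy translation scores p(word | label); a real model would be learned from data.
emission = {
    "hall":    {"hallway": 0.8, "corridor": 0.7, "kitchen": 0.05, "office": 0.05},
    "kitchen": {"kitchen": 0.9, "hallway": 0.05, "corridor": 0.05, "office": 0.05},
    "office":  {"office": 0.9, "hallway": 0.05, "corridor": 0.05, "kitchen": 0.05},
}

def decode(words, start_label):
    """Viterbi over map labels; transitions outside the adjacency graph are pruned."""
    scores = {start_label: 1.0}
    back = []
    for w in words:
        new_scores, pointers = {}, {}
        for nxt in adjacency:
            best_prev, best = None, 0.0
            for prev, s in scores.items():
                if nxt in adjacency[prev] and s > best:
                    best_prev, best = prev, s
            if best_prev is not None:
                new_scores[nxt] = best * emission[nxt].get(w, 0.01)
                pointers[nxt] = best_prev
        back.append(pointers)
        scores = new_scores
    # Trace back the best label sequence.
    label = max(scores, key=scores.get)
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return list(reversed(path))

print(decode(["hallway", "kitchen"], start_label="hall"))
# -> ['hall', 'hall', 'kitchen']  (start label, then one label per word)
```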

...Navigation is a critical and widely-studied task in mobile robotics, and following natural-language instructions is a key component of natural, multi-modal human/robot interaction [19]. There has been substantial work on mapping and localization [20], segmenting and describing a map from sensor data [7], [8], and navigating through such an environment [15], [25]. Our work fits into...

...nomy of the robot. There are many approaches for interacting with a robot, including gestures [30], [31], haptics [32]–[34], web-based controls [35], [36], and personal digital assistants (PDAs) [37]–[39]. Fong and Murphy addressed the idea of using dialog to reason between an operator and a robot when the human or robot needs more information about a situation [40], [41]. Most of these approaches ten...

"... 1 Language in the World How does language relate to the non-linguistic world? If an agent is able to communicate linguistically and is also able to directly perceive and/or act on the world, how do perception, action, and language interact with and influence each other? Such questions are surely amo ..."

1 Language in the World How does language relate to the non-linguistic world? If an agent is able to communicate linguistically and is also able to directly perceive and/or act on the world, how do perception, action, and language interact with and influence each other? Such questions are surely amongst the most important in Cognitive Science and Artificial Intelligence (AI). Language, after all, is a central aspect of the human mind – indeed it may be what distinguishes us from other species. There is sometimes a tendency in the academic world to study language in isolation, as a formal system with rules for well-constructed sentences; or to focus on how language relates to formal notations such as symbolic logic. But language did not evolve as an isolated system or as a way of communicating symbolic logic; it presumably evolved as a mechanism for exchanging information about the world, ultimately providing the medium for cultural transmission across generations. Motivated by these observations, the goal of this special issue is to bring together research in AI that focuses on relating language to the physical world. Language is of course also used to communicate about non-physical referents, but the ubiquity of physical metaphor in language [21] suggests that grounding in the physical world provides the foundations of semantics.

...nterfaces to robots (many everyday objects and environments such as cars and houses may be treated as robots in the sense that they have sensors, actuators, bodies, and control systems) (for example, [43, 19] and Roy’s paper in this special issue); • Natural language interfaces to virtual reality systems and games (for example, [16] and Kelleher et al.’s paper in this special issue); • Situated NLP for mob...

"... We have developed a simulation model that accepts instructions in unconstrained natural language, and then guides a robot to the correct destination. The instructions are segmented on the basis of the actions to be taken, and each segment is labeled with the required action. This flat formulation re ..."

We have developed a simulation model that accepts instructions in unconstrained natural language, and then guides a robot to the correct destination. The instructions are segmented on the basis of the actions to be taken, and each segment is labeled with the required action. This flat formulation reduces the problem to a sequential labeling task, to which machine learning methods are applied. We propose an innovative machine learning method for explicitly modeling the actions described in instructions and integrating learning and inference about the physical environment. We obtained a corpus of 840 route instructions, given by people in building-navigation situations, that experimenters verified as followable. Using four-fold cross-validation, our experiments showed that the simulated robot reached the correct destination 88% of the time.
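The flat sequential-labeling formulation could be sketched roughly as below, with keyword features and a greedy decoder standing in for the learned sequence model; the label set and cue words are assumptions for illustration only.

```python
# Each instruction segment gets an action label; a toy transition score discourages
# implausible action sequences, standing in for a learned sequential model.
LABELS = ["go_forward", "turn_left", "turn_right", "stop"]

def segment_score(segment, label):
    """Keyword features standing in for a learned per-segment classifier."""
    cues = {"go_forward": ["go", "walk", "straight", "down"],
            "turn_left": ["left"], "turn_right": ["right"], "stop": ["stop", "there"]}
    return 1.0 + sum(w in segment.lower() for w in cues[label])

def transition_score(prev, nxt):
    """Discourage two consecutive turns, otherwise stay neutral."""
    return 0.3 if prev.startswith("turn") and nxt.startswith("turn") else 1.0

def label_instruction(segments):
    """Greedy left-to-right labeling (a stand-in for full Viterbi decoding)."""
    labels, prev = [], "go_forward"
    for seg in segments:
        best = max(LABELS,
                   key=lambda l: segment_score(seg, l) * transition_score(prev, l))
        labels.append(best)
        prev = best
    return labels

segments = ["walk straight down the hall", "turn left at the lounge", "stop there"]
print(label_instruction(segments))  # -> ['go_forward', 'turn_left', 'stop']
```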

"... www.elsevier.com/locate/robot Robots are rapidly evolving from factory work-horses to robot-companions. The future of robots, as our companions, is highly dependent on their abilities to understand, interpret and represent the environment in an efficient and consistent fashion, in a way that is comp ..."

Robots are rapidly evolving from factory work-horses to robot-companions. The future of robots, as our companions, is highly dependent on their abilities to understand, interpret and represent the environment in an efficient and consistent fashion, in a way that is comprehensible to humans. The work presented here is oriented in this direction. It suggests a hierarchical probabilistic representation of space that is based on objects. A global topological representation of places with object graphs serving as local maps is proposed. The work also details the first efforts towards conceptualizing space on the basis of the human compatible representation so formed. Such a representation and the resulting conceptualization would be useful for enabling robots to be cognizant of their surroundings. Experiments on place classification and place recognition are reported in order to demonstrate the applicability of such a representation towards understanding space and thereby performing spatial cognition. Further, relevant results from user studies validating the proposed representation are also reported. Thus, the theme of the work is representation for spatial cognition. © 2007 Elsevier B.V. All rights reserved.
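The object-based hierarchy described above might be organized roughly as in the sketch below, with object graphs serving as local place maps attached to a global topological graph; the classes and the prototype-overlap place classifier are simplifying assumptions, not the authors' system.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class ObjectGraph:
    """Local map of a place: detected objects plus pairwise spatial relations."""
    objects: list
    relations: list = field(default_factory=list)   # e.g. ("mug", "on", "counter")

@dataclass
class Place:
    name: str
    local_map: ObjectGraph

class TopologicalMap:
    """Global graph of places; edges are traversable connections between them."""
    def __init__(self):
        self.places, self.edges = {}, []

    def add_place(self, place):
        self.places[place.name] = place

    def connect(self, a, b):
        self.edges.append((a, b))

def classify_place(local_map, prototypes):
    """Toy place classification: pick the category whose prototype objects overlap most."""
    counts = Counter(local_map.objects)
    return max(prototypes, key=lambda c: sum(counts[o] for o in prototypes[c]))

tmap = TopologicalMap()
tmap.add_place(Place("p1", ObjectGraph(["mug", "kettle", "fridge"],
                                       [("mug", "on", "counter")])))
tmap.add_place(Place("p2", ObjectGraph(["monitor", "keyboard", "chair"])))
tmap.connect("p1", "p2")

prototypes = {"kitchen": ["mug", "kettle", "fridge"], "office": ["monitor", "keyboard"]}
print(classify_place(tmap.places["p1"].local_map, prototypes))  # -> 'kitchen'
```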

...) will form a critical component of the interaction between a human and a robot. This in turn will be the deciding factor towards the compatibility and acceptability of robots in our homes. The works [22,23] are examples of some recent efforts towards integrating dialogue and NLP in robotics. Most works in mobile robotics, however, have until now restricted themselves to navigation related problems. Thus...