Form Approved, OMB No. 0704-0188

4. Title and subtitle: Imagination and Situated Cognition
5. Funding numbers: N00014-85-K-0124, N00014-89-J-3202
6. Author(s): Lynn Andrea Stein
7. Performing organization name and address: Artificial Intelligence Laboratory, 545 Technology Square, Cambridge, Massachusetts 02139
8. Performing organization report number: AIM 1277
9. Sponsoring/monitoring agency name and address: Office of Naval Research, Information Systems, Arlington, Virginia 22217
11. Supplementary notes: None
12a. Distribution/availability statement: Distribution of this document is unlimited
15. Number of pages: 8
17-19. Security classification of report, of this page, of abstract: UNCLASSIFIED
20. Limitation of abstract: UNCLASSIFIED

NSN 7540-01-280-5500. Standard Form 298 (Rev. 2-89). Prescribed by ANSI Std. Z39-18, 298-102.

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
ARTIFICIAL INTELLIGENCE LABORATORY

A.I. Memo No. 1277, February 1991

Imagination and Situated Cognition

Lynn Andrea Stein

Abstract

A subsumption-based mobile robot is extended to perform cognitive tasks. Following directions, the robot navigates directly to previously unexplored goals.
This robot exploits a novel architecture based on the idea that cognition uses the underlying machinery of interaction, imagining sensations and actions.

Copyright © Massachusetts Institute of Technology, 1991

This report describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contracts N00014-85-K-0124 and N00014-89-J-3202 and in part by the System Development Foundation.

1 Introduction

This paper is concerned with a concrete example of the integration of higher-level cognitive AI and lower-level robotics. Robotic systems are embodied: their central tasks concern interaction with the immediately present world. In contrast, cognition is concerned with objects that are remote: in distance, in time, or in some other dimension. We exploit the architecture of a particular robotic system to perform a cognitive task, by imagining the subjects of our cognition.

We suggest that much of the abstract information that forms the meat of cognition is used not as a central model of the world, but as virtual reality. The self-same processes that robots use to explore and interact with the world form the interface to this information. The only difference between interaction with the actual world and with the imagined one is the set of sensors and actuators providing the lowest-level interface.

Consider, for example, the following tasks. In the first, a pitcher and bowl sit on a table before you. You lift the pitcher and pour its contents into the bowl.

Now consider your actions in reading the preceding example. In all likelihood, you formed a picture in your mind's eye of the tabletop, pitcher, and bowl. You simulated the pouring. In the virtual world that you created for yourself, you sensed and acted.
Indeed, there is evidence in the psychology literature that such "imagings" are accompanied by activity patterns in the visual cortex, resembling those observed during actual vision.

This virtual reality, your imagination, is precisely the goal of our programme.

2 A Robot that Explores

Toto [Mataric, 1990] is a mobile robot capable of goal-directed navigation. It is implemented on a Real World Interface base augmented with a ring of twelve Polaroid ultrasonic ranging sensors and a flux-gate compass. Its primary computational resource is a CMOS 68000. Its software simulates a subsumption architecture [Brooks, 1986].

Toto's most basic level consists of routines to explore its world. Independent collections of finite state machines implement such basic competencies as obstacle-avoidance and random walking. Wall-following ("maze exploration") emerges as the result of this collection of lowest-level behaviors.

A second layer, above the wall-following routines, implements a fully distributed "world modeler." This behavior is implemented as a dynamic graph of landmark recognizers. Landmarks correspond to gross sonar configurations (e.g., wall left) augmented with compass readings. Rough odometry is used to aid in recognition of previously visited landmarks. Each time a novel landmark is recognized, a new graph node allocates itself, making graph connections as appropriate. The resulting behaviors form an internal representation of the environment.

[Figure 1: Toto.]

[Figure 2: Traditional architecture (boxes labeled "cognition" and "robotics").]

Finally, Toto accepts commands (by means of three buttons) to return to previously recognized landmarks.
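
The landmark graph described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Toto's implementation: Toto's graph is fully distributed, with each node acting as its own recognizer, whereas this sketch uses a single centralized class for brevity. The names `Landmark`, `LandmarkGraph`, `observe`, and `match_radius`, and the quantized heading, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Landmark:
    kind: str        # gross sonar configuration, e.g. "wall-left"
    heading: int     # quantized compass reading (hypothetical encoding)
    position: tuple  # rough odometry estimate (x, y)
    neighbors: set = field(default_factory=set)  # indices of adjacent nodes


class LandmarkGraph:
    """Centralized stand-in for Toto's distributed landmark graph."""

    def __init__(self, match_radius=1.0):
        self.nodes = []
        self.current = None               # index of the last landmark seen
        self.match_radius = match_radius  # assumed odometry tolerance

    def _find(self, kind, heading, position):
        # A landmark is "previously visited" if type and compass heading
        # agree and rough odometry places the robot close to it.
        for i, node in enumerate(self.nodes):
            dx = node.position[0] - position[0]
            dy = node.position[1] - position[1]
            close = (dx * dx + dy * dy) ** 0.5 <= self.match_radius
            if node.kind == kind and node.heading == heading and close:
                return i
        return None

    def observe(self, kind, heading, position):
        """Recognize a landmark, allocating a new node if it is novel."""
        idx = self._find(kind, heading, position)
        if idx is None:
            self.nodes.append(Landmark(kind, heading, position))
            idx = len(self.nodes) - 1
        # Connect consecutive landmarks along the path travelled.
        if self.current is not None and self.current != idx:
            self.nodes[self.current].neighbors.add(idx)
            self.nodes[idx].neighbors.add(self.current)
        self.current = idx
        return idx


graph = LandmarkGraph()
first = graph.observe("wall-left", 0, (0.0, 0.0))
second = graph.observe("corridor", 0, (5.0, 0.0))
again = graph.observe("wall-left", 0, (0.3, 0.2))  # matched to the first node
```

In this sketch, returning near the starting point with the same sonar configuration and heading reuses the existing node rather than allocating a third one.
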
When a goal location is specified, Toto's landmark graph uses spreading activation to determine the appropriate direction in which to head. Activation persists until Toto has returned to the requested location. Throughout, Toto's lowest level behaviors enforce obstacle avoidance and corridor traversal, and Toto's intermediate layer processes landmarks as they are encountered.

Toto's landmark representation and goal-driven navigation are cognitive tasks, involving internal representation of the external environment. This represents a qualitative advance in the capabilities of subsumption-based robots. Nonetheless, this internal representation is accessible only through interaction with the world. Toto cannot reason about things unless it has previously encountered them. In the next section, we describe a simple modification to Toto's architecture that allows Toto to represent previously unvisited landmarks.

3 Exploring the Unknown

Previous approaches to cognition in robotic systems have implemented more intelligent behaviors as higher levels of control. In the MetaToto project, we have taken a different approach. The existing machinery that implements Toto's core provides a strong base for cognitive tasks. It is limited, however, in being able to conceptualize only what has been physically encountered.

MetaToto is an extension of Toto's core behavior that accepts directions to navigate to a goal not previously encountered. Toto's goal-directed navigation routines are implemented in terms of its existing internal representation, and it is impossible even to ask that Toto visit an unexplored location: Toto has no concept corresponding to locations it has not encountered. The primary task for MetaToto, then, is the representation of landmarks that have simply been described.

[Figure 3: Proposed architecture (boxes labeled "cognition" and "robotics").]

Our approach to architecture is to reuse Toto's existing mechanisms in adding this new skill to MetaToto.
Where Toto must encounter a landmark, MetaToto merely envisions that landmark. That is, MetaToto takes the landmark description and imagines what that landmark would "feel" like: what sonar readings it might evoke, what MetaToto's compass might indicate, etc. We claim that cognition is often simply imagined sensation and action.

In the traditional architecture, cognition rests on top of robotics: robotics provides an intermediary between the external world and a central "cognition box." This approach has led to widespread belief that the two problems can be studied independently, and that technology and research will ultimately meet at the interface between cognition and robotics. Unfortunately, there is little agreement even as to what constitutes this interface.

In contrast, our view suggests that cognition is simply the robotic architecture applied to imagined stimuli. That is, the interface between robotics and the immediate world is multiplexed to provide a second, low-level interface between robotics and imagination. The robot senses and acts in this imagined world precisely as it does in the actual world.

4 Implementing Imagination

If cognition is largely imagined sensation and action, then the difficult tasks for implementing cognition are simulating sensors and actuators, and modeling the appropriate feedback through the imagined world. Both tasks have been attempted in other contexts. The relative success of the approach here relies on some critical assumptions about the nature of the robot's interface with the world and hence with imagination.

4.1 Sensing and Acting

Toto relies on qualitative, rather than quantitative, information about the world. In part, this means that it does not matter if Toto has an occasional anomalous sonar reading. More significantly, it means that moderate inaccuracies in the physical sensors and actuators are not merely tolerated, but expected.
Toto's decisions are based on gross judgements (e.g., dangerously close) and measurements averaged over time.

Second, Toto relies on constant feedback from the world, and constant interaction with the world. In contrast to traditional planners, which decide on a course of action and then pass control to an executer, Toto "continually redecides what to do" [Agre and Chapman, 1987]. This serves as a form of protection from major errors: any incorrect actions will be recognized and corrected before they can become disastrous. As a result, Toto need not worry about plans gone awry.

Both of these properties mean that MetaToto's simulation of the sensors and actuators need not be accurate. Sonars are simulated using simple ray projection. Angles are approximated. Still, the inaccuracies of MetaToto's imagination are little worse than the variance between two runs of the actual robot, and close enough to allow construction of the appropriate landmark graph.

4.2 Imagination vs. World Models

A second aspect of the architecture bears on the simulation of feedback through imagination, rather than through the world. Feedback through the world has been a strength of reactive systems, and imagination removes that aspect of the architecture. In this sense, it represents a step towards the more traditional world models of classical planning systems.

Imagination differs from classical world models, however. Imagination is ephemeral. MetaToto need only know the sensations that occur now. Where Toto "continually redecides what to do," MetaToto continually re-imagines the world. Thus, while world models persist and require maintenance, imagination can be reconstructed on the fly.

In addition, cognition requires imagining only the relevant details.
That is, only those aspects that bear on things immediately sense-able must be imagined. Because the interface between robotics and imagination is at the level of sensation, rather than in terms of higher-level predicates, we do not need a model of the global properties of the world. Only that which is imagined to be immediately accessible must be simulated.

A floor plan, as seen by MetaToto's camera, is shown in figure 4. The use of a geometric communication language facilitates certain of the simulation aspects of MetaToto's imagination. In section 6, we discuss a more verbal communication language.

MetaToto is implemented on the same hardware as Toto, using largely the same software. The modifications to Toto's software involve only the creation and integration of an imagination system. The entire system allows the robot to perform all tasks of which Toto was previously capable, plus the additional cognitive exploration of physically unseen environments.

MetaToto's imagination uses a photographed floor plan of the environment it is to explore. Rather than looking at the plan from above, however, MetaToto imagines that it is located in a particular place in the plan. Virtual sensors describe what it "feels" like to be at that location: what sonar and compass readings MetaToto might receive if physically present. MetaToto imagines sensing and acting in the floor plan much as Toto would sense and act in the actual world, with much the same effect. The routines that sense and act in the imagined world are precisely the same as those that would sense and act in the actual world; they differ only by calling the imagined sonar rather than the real. In this manner, MetaToto explores the floor plan, building the same internal representation of landmarks as Toto would create in its explorations of the environment.

Once MetaToto has completed its exploration of the floor plan, it is capable of goal-directed navigation in the world.
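
The virtual sonar described above can be sketched as simple ray projection over a grid. This is a minimal sketch under stated assumptions, not MetaToto's code: the character grid `ROOM` stands in for the photographed floor plan, and `imagined_sonar`, `classify`, `sense_and_classify`, `MAX_RANGE`, `STEP`, and the `near` threshold are hypothetical names and values. Only the ring of twelve beams and the use of ray projection come from the text.

```python
import math

# Hypothetical floor plan: '#' is wall, '.' is free space, one cell = one unit.
# Rows are indexed by y, so "left" below simply means the beam at +90 degrees
# from the heading in that frame.
ROOM = [
    "#########",
    "#.......#",
    "#.......#",
    "#.......#",
    "#.......#",
    "#########",
]

MAX_RANGE = 10.0  # assumed sonar range limit, in cells
STEP = 0.05       # ray-marching step, in cells


def imagined_sonar(plan, x, y, heading_deg, n_beams=12):
    """Return n_beams approximate ranges; beam 0 points along heading_deg."""
    ranges = []
    for k in range(n_beams):
        angle = math.radians(heading_deg + k * 360.0 / n_beams)
        dx, dy = math.cos(angle), math.sin(angle)
        dist = 0.0
        while dist < MAX_RANGE:
            col = math.floor(x + dx * dist)
            row = math.floor(y + dy * dist)
            outside = not (0 <= row < len(plan) and 0 <= col < len(plan[row]))
            if outside or plan[row][col] == "#":
                break  # the ray has reached a wall or left the plan
            dist += STEP
        ranges.append(min(dist, MAX_RANGE))
    return ranges


def classify(ranges, near=1.5):
    """Gross, qualitative judgement in the spirit of 'wall left'."""
    n = len(ranges)
    left = ranges[n // 4] < near       # beam at +90 degrees
    right = ranges[3 * n // 4] < near  # beam at +270 degrees
    if left and right:
        return "corridor"
    if left:
        return "wall-left"
    if right:
        return "wall-right"
    return "open"


def sense_and_classify(read_sonar):
    """Behavior code: identical whether read_sonar is real or imagined."""
    return classify(read_sonar())


# The same routine runs against the imagined sonar or any other sonar source.
imagined = sense_and_classify(lambda: imagined_sonar(ROOM, 4.5, 4.5, 0))
stubbed = sense_and_classify(lambda: [5.0] * 12)  # stand-in for real hardware
```

Passing the sonar source in as a callable mirrors the point made in the text: the sensing routine itself does not change, only the sonar it calls. The coarse step and thresholded ranges are deliberate, since only gross judgements are needed.
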
However, unlike Toto, MetaToto can go to places that it has only imagined, and not actually encountered. Because the landmark graph has been created by the same mechanisms that are used in exploring the world, MetaToto cannot distinguish landmarks generated by its imagination from those actually encountered. Should the floor plan prove to have been incomplete or inaccurate, MetaToto will simply augment its internal representation as it explores the uncharted area of the actual world.

6 Following Directions

MetaToto's use of a geometric representation for communication facilitates the simulation aspects of imagination. Humans, however, are capable of understanding verbally imparted directions. While this is in some senses an unfair task for MetaToto, it is nonetheless achievable.

Giving MetaToto directions is "unfair" in the sense that humans give humans directions in anthropocentric terms. We speak of "the second left" or "the corner" because these are the landmarks in terms of which we represent the world. MetaToto has no notion of left turns or corners; instead, it represents the world in terms of sonar and compass readings. Thus, to make this task fair in MetaToto's terms, we ought to speak of such landmarks as "the second extended short sonar reading on left and right simultaneously."

Nonetheless, MetaToto could understand the anthropocentric landmarks in much the same way as it uses the floor plan. What, after all, does it "feel" like to explore these landmarks? The simulation aspect may be more complicated, but the task is essentially the same. For example, the landmark "the second left" corresponds to the following (imagined) sensations:

    short sonar left
    long sonar left
    short sonar left
    long sonar left

By imagining this sequence, MetaToto could construct an internal representation corresponding to that which would be encountered while seeking the second left.
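
The correspondence between a direction such as "the second left" and a sequence of imagined sensations can be made concrete with a short sketch. The function names `nth_left_sensations` and `count_left_openings` are hypothetical, and generalizing from the second left to the n-th left is an assumption made for illustration; only the four-step sequence for "the second left" comes from the text.

```python
SHORT_LEFT = "short sonar left"  # wall alongside on the left
LONG_LEFT = "long sonar left"    # opening on the left


def nth_left_sensations(n):
    """Imagined left-sonar sequence for 'the n-th left' (assumed pattern)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    sensations = []
    for _ in range(n):
        sensations.append(SHORT_LEFT)
        sensations.append(LONG_LEFT)
    return sensations


def count_left_openings(sensations):
    """Count short-to-long transitions, i.e. left openings passed so far."""
    openings = 0
    previous = None
    for sensation in sensations:
        if sensation == LONG_LEFT and previous == SHORT_LEFT:
            openings += 1
        previous = sensation
    return openings


second_left = nth_left_sensations(2)
```

Feeding `second_left` to `count_left_openings` recovers the ordinal in the original direction, which suggests how a stream of imagined readings could be consumed in the same way as a stream of real ones.
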
Directions, although more remote than geometric representation, still have a natural analog in terms of imagined sensation.

7 Conclusion

Unlike previous "cognition boxes," MetaToto is distinguished only by the set of sensors and actuators in which the behaviors ground out: when imagining, MetaToto seizes control of the sensor and actuator control signals, and substitutes interaction with the floor plan. Rather than a "higher level reasoning module," MetaToto is a lowest level interface to an alternate (imagined) reality.

MetaToto achieves by embodied imagination the cognition-intensive task of reading, understanding, and acting on the knowledge contained in a floor plan, and MetaToto does this using entirely Toto's existing architecture, with the sole addition of the virtual sensors and actuators required for navigation of the floor plan. Although MetaToto is only a simple example of imagination, we are hopeful that experiences with MetaToto will lead to more sophisticated use of imagination and virtual sensing, and to the development of truly embodied forms of cognition.

Acknowledgements

This paper could not have been written without the help of Ian Horswill, Maja Mataric, and Rod Brooks.

References

[Agre and Chapman, 1987] Philip E. Agre and David Chapman. Pengi: An implementation of a theory of activity. In Proceedings of the Sixth National Conference on Artificial Intelligence, pages 196-201, Seattle, Washington, July 1987. Morgan Kaufmann Publishers, Inc.

[Brooks, 1986] Rodney A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1):14-23, April 1986.

[Mataric, 1990] Maja Mataric. A distributed model for mobile robot environment learning. Technical Report 1228, Massachusetts Institute of Technology Artificial Intelligence Laboratory, Cambridge, Massachusetts, May 1990.