Omnidirectional stereo systems for robot navigation

Giovanni Adorni(*), Stefano Cagnoni(+), Monica Mordonini(+), Antonio Sgorbissa(*)
(+) Dept. of Computer Engineering, University of Parma, Parma, Italy 43100
(*) DIST, University of Genoa, Genoa, Italy 16145

Abstract

This paper discusses how stereo vision achieved through the use of omnidirectional sensors can help mobile robot navigation, providing advantages, in terms of both versatility and performance, with respect to the classical stereo system based on two horizontally-displaced traditional cameras. The paper also describes an automatic calibration strategy for catadioptric omnidirectional sensors and results obtained using a stereo obstacle detection algorithm devised within a general framework in which, with some limitations, many existing algorithms designed for traditional cameras can be adapted for use with omnidirectional sensors.

1. Introduction

The need for robotic sensory systems that provide a global description of the surrounding environment is increasing. In mobile robotics applications, autonomous robots are required to react to visual stimuli that may come from any direction at any moment of their activity, and to plan their behaviour accordingly. This has stimulated growing interest in omnidirectional vision systems [1]. Such systems provide the widest possible field of view and obviate the need for active cameras that require complex control strategies, at the cost of reduced resolution with respect to traditional cameras, which distribute a smaller field of view over the same sensor surface.

Recent robotics and surveillance applications in which omnidirectional sensors have been used effectively, either as the only vision sensor or jointly with other higher-resolution non-omnidirectional ones, are described in [2, 3, 4].

Applications of mobile robotics in which robots rely on vision for safe and efficient navigation share a set of features and requirements, often conflicting with one another, that strongly influence application design criteria. Among
these:robots are immersed in a dynamic environment thatmay change quite rapidly,within and beyond their eldof action;robots require high-resolution vision for accurate op-eration within their eld of action;robots need wide-angle vision to be aware of what hap-pens beyond their eld of action and to react/plan ac-cordingly.Regarding the typical environment where virtually allindoor robotics and most outdoor robotics take place,fur-ther considerations can be made about the natural partialstructuration of the space in which mobile robots operate.Such a space is usually inferiorly delimited by the plane(oor/ground) on which robots move and extends verticallyup to where robots can see or physically reach.The oorcan therefore be assigned the role of reference plane in themain tasks in which mobile robots are routinely engagedduring navigation,namely self-localization,obstacle detec-tion,free-space computation.We could therefore call it a2D augmented environment,to underline that the two di-mensions along which the oor extends are privileged withrespect to the third dimension.Even within these limita-tions,robots actually operate in a 3Denvironment and theiroperation can take advantage of 3D information.Stereo vi-sion is therefore appealing to several navigation tasks.However,traditional stereo vision setups,made upof two traditional cameras displaced horizontally,hardlysatisfy the above-mentioned requirements of autonomousrobotics applications.The use of omnidirectional sensors,besides providing the robot with obvious advantages interms of self-localization capabilities,can be extremely use-ful also to extract 3Dinformation fromthe environment us-ing stereo algorithms.In section 2 we introduce a sensor model,based on thejoint use of an omnidirectional sensor and a traditional one,with which powerful stereo algorithms can be implemented.We then briey compare such a model with traditional andfully-omnidirectional stereo setups.In section 3 we proposea framework 
within which a particular class of algorithms for omnidirectional sensors can be easily developed, as an extension of traditional stereo algorithms, with almost no extra overhead. Such a class of algorithms, applicable to 2D augmented environments (which include many if not most real-world applications), can be termed the quasi-3D (q3D) class. More precisely, it comprises algorithms that can exploit the presence, in the environment, of a reference plane for which a transform (the Inverse Perspective Transform) exists, which allows for the recovery of visual information through a remapping operation. A fast and simple auto-calibration process that allows for such a mapping is described in section 4. In section 5, as an example, we describe the basics of an efficient obstacle detection algorithm developed within this framework.

2. Hybrid and fully-omnidirectional stereo vision sensors

Using traditional stereo systems, typically made up of two traditional cameras aligned and displaced along the horizontal axis, has several drawbacks in mobile robot applications. Among them:

- the constraints imposed by the configuration of the two traditional cameras needed to obtain sufficient disparity often conflict with the general requirements of the applications for which the stereo system is used;
- the resulting field of view of the stereo system is much smaller than the, already limited, field of view of each of the two cameras.

The first drawback mainly affects robot design, since it requires that a front and a rear side of the robot be clearly defined. This can be a severe limitation when holonomic robots are used. With a traditional stereo setup, any reconfiguration of the (strongly asymmetric) vision system requires that both cameras be repositioned and might possibly call for structural modifications.

The second drawback is particularly relevant in dynamic environments. If one considers that robot motion should ideally serve exclusively the task of interest, it is immediately evident how
penalizing it is for the robot to have to move just to satisfy its own perceptual needs.

Using omnidirectional sensors is beneficial with regard to both problems. Here, we consider two models: a hybrid omnidirectional/pin-hole system and a fully-omnidirectional one. In particular, it is clear that a symmetric coaxial fully-omnidirectional model such as the one briefly discussed in section 2.1 can solve both problems. However, the solution comes at the cost of a lower resolution in the far field and of the loss of horizontal disparity between the two views, which may also be unacceptable in some applications.

Figure 1: A fully-omnidirectional sensor model (above) and the Inverse Perspective Transform (see section 3) images of a simulated RoboCup field with four robots and a ball obtained with such a mirror configuration (upper sensor below on the left, lower one below on the right).

A way to obtain stereo images, providing the robot with both low-resolution omnidirectional vision in the far field and high-resolution vision in the near field while keeping the field of view as wide as possible, is to use a sensor made up of both an omnidirectional camera and a traditional one. In the following, after showing the results of a simulation of a fully-omnidirectional system to give a feeling of how images acquired by such systems may look, we describe in detail HOPS (Hybrid Omnidirectional/Pin-hole System), a stereo model that tries to achieve a good trade-off, with particular attention to mobile robot applications, between the features provided by omnidirectional and traditional systems.

2.1. Fully omnidirectional model

A fully-omnidirectional stereo model uses two omnidirectional sensors for stereo-disparity computation. In figure 1 we show preliminary results of a simulated vision system made up of two catadioptric omnidirectional sensors. We have taken into consideration a configuration in which the vision sensors are placed one above the other and share a common axis (figure 1, above on the right)
perpendicular to the reference plane.

Figure 2: The two hybrid sensor prototypes: HOPS1 and HOPS2.

The main drawback of such a coaxial configuration is that it provides no lateral stereo disparity (see section 5), which makes obstacles recognizable only by exploiting vertical stereo disparity. On the other hand, dealing with a stereo sensor having two sensors with parallel axes is more complicated, in terms of construction, size and calibration.

2.2. Hybrid omnidirectional/pin-hole model

HOPS (of which two prototypes are shown in figure 2) is a hybrid vision sensor that integrates omnidirectional vision with traditional pin-hole vision, to overcome the limitations of the two approaches. If a certain height is needed by the traditional camera to achieve a reasonable field of view, the top of the omnidirectional sensor may provide a base on which the traditional CCD-camera based sensor can lean, as shown in figure 2. In the prototype shown in figure 2a the traditional camera is fixed and looks down, tilted with respect to the ground plane, with a limited field of view. To obtain both horizontal and vertical disparity between the two images, it is positioned off the center of the device. The 'blind sector' caused by the upper camera cable on the lower sensor is placed at an angle, with respect to a conventional 'front view', that relegates it to the back of the device. If a lower point of view is acceptable for the traditional camera, it can also be placed below the omnidirectional sensor, provided it is out of the field of view of the latter. The top of the device is easily accessible, allowing for easy substitution of the catadioptric mirror. Consequently, the camera holder on which the upwards-pointing camera is placed can also be moved upwards or downwards, to adjust its distance from the mirror. In the prototype in figure 2b, the traditional camera is positioned laterally above the omnidirectional sensor, on a holder that can be manually rotated.

Figure 3: Example of images that can be acquired through the omnidirectional sensor (left) and through the CCD camera (right) of the HOPS1 prototype.

An example of the images that can be acquired through the two sensors of the first prototype is provided in figure 3.

The aims with which HOPS was designed are accuracy, efficiency and versatility. The joint use of a standard CCD camera and of an omnidirectional sensor provides HOPS with different and complementary features: while the CCD camera can be used to acquire detailed information about a limited region of interest, the omnidirectional sensor provides wide-range, but less detailed, information about the surroundings of the system. HOPS, therefore, suits several kinds of applications, for example self-localization or obstacle detection, and makes it possible to implement peripheral/foveal active vision strategies: the wide-range sensor is used to acquire a rough representation of a large area around the system and to localize the objects or areas of interest, while the traditional camera is used to enhance the resolution with which these areas are then analysed. The different features of the two sensors can be exploited both in a stand-alone way and in combination. In particular, as discussed in section 5, HOPS can be used as a stereo sensor to extract three-dimensional information about the scene being observed.

3. General framework for stereo algorithm development

Images acquired by the cameras on board the robots are affected by two kinds of distortion: perspective effects and deformations that derive from the shape of the lens through which the scene is observed. Given an arbitrarily chosen reference plane (typically, the floor/ground on which robots move), it is possible to find a function P that maps each pixel (u, v) in the image onto the corresponding point (x, y) of a new image that represents a bird's-eye view of the reference plane. Limiting one's interest to the reference plane, it is possible to reason on the scene observing
it with no distortions. The most appealing feature, in this case, is that a direct correspondence between distances on the reconstructed image and distances in the real world can be obtained, which is a fundamental requirement for geometrical reasoning. This transformation is often referred to as the Inverse Perspective Transform (IPT) [5, 6, 7], since perspective-effect removal is the most common aim with which it is performed, even if it actually represents only a part of the problem for which it provides a solution.

If all parameters related to the geometry of the acquisition systems and to the distortions introduced by the camera were known, the derivation of P could be straightforward. However, this is not always the case, most often because of the lack of an exact model of camera distortion. Nevertheless, it is often possible to effectively (and efficiently) derive P empirically using proper calibration algorithms, as shown in the next section.

The IPT plays an important role in several robotics applications in which finding a relevant reference plane is easy. This is true for most indoor Mobile Service Robotics applications (such as surveillance of banks and warehouses, transportation of goods, escorting people at exhibitions and museums, etc.), since most objects which the robot observes and with which it interacts lie in fact on the same plane surface, the floor on which the robot is moving. Since our system has been mainly tested within the RoboCup environment (visit http://www.robocup.org for more information), in the following we will take it as a case study. In RoboCup everything lies on the playing field and hardly anything rises significantly above it, as happens, for example, with the ball. Therefore, the playing field can be taken as a natural reference plane.

In the rest of the paper we will show how a general empirical IPT mapping can be applied, even more effectively, also to catadioptric omnidirectional sensors. The intrinsic distortion of such sensors, especially with respect to the typical images with which humans are used to dealing, makes direct image interpretation
difcult,since a different refer-ence system(polar coordinates) is implicitly'embedded'inthe images thus produced.However,their circular symme-try allows for a simplication of the IPT computation.Exploiting this feature in implementing the IPT for cata-dioptric omnidirectional sensors,we have devised an ef-cient automatic calibration algorithmthat will be describedin the next section.4.Omnidirectional sensor calibrationIn computing,the generalization of the IPT for a cata-dioptric omnidirectional sensor,the problemis complicatedby the non-planar prole of the mirror;on the other hand,the circular simmetry of the device provides the opportunityof dramatically simplifying such a procedure.If the reecting surface were perfectly manufactured,itwould be sufcient to compute just the restriction of1visit http://www.robocup.org for more information.along one radius of the mirror projection on the image planeto compute the whole function.However,possible man-ufacturing aws may affect both the mirror shape and thesmoothness of its surface.In addition to singularities that donot affect sensor symmetry and can be included in the radialmodel of the mirror (caused,for example,by the joint be-tween two differently shaped surfaces required by the spec-ications for a particular application,as in [8]),a few otherminor isolated aws can be found scattered over the sur-face.Similar considerations can be made regarding the lensthrough which the image reected on the mirror is capturedby the camera.To account for all sorts of distorsions an empiricalderivation ofbased on an appropriate sampling of thefunction in the image space can be made.Choosing sucha procedure to computepermits to include also the lensmodel into the mapping function.The basic principle by whichcan be derived empiri-cally is to consider a set of equally-spaced radii,along eachof which values ofare computed for a set of uniformly-sampled points for which the relative position with respectto the 
sensor is known exactly. This produces a polar grid of points at which the values of P are known.

To compute the function at a generic point located anywhere in the field of view of the sensor, a bi-linear interpolation is made between the four points of the uniformly-sampled polar grid among which the point is located. This makes reconstruction accuracy better in proximity of the robot, as the actual area of the cells used for interpolation increases with radial distance while, correspondingly, image resolution decreases. The number of data points (interpolation nodes) needed to achieve sufficient accuracy depends mainly on the mirror profile (the smoother the profile, the fewer the points) and on the quality of the mirror surface (the fewer the flaws, the fewer the points).

This calibration process can be automated, especially in the presence of well-manufactured mirrors, by automatically detecting relevant points. To do so, a simple pattern consisting of a white stripe with a set of aligned black squares superimposed on it can be used, as shown in figure 4. The reference data points, to be used as nodes for the grid, are extracted by automatically detecting the squares in a set of one or more images grabbed while turning the robot around the vertical axis of the sensor. In this way the reference pattern is reflected by a different mirror portion in each image.

Using different shapes instead of squares, e.g., circles or ellipses, is obviously possible: using appropriate ellipses at points located far from the center of the mirror could even be advantageous, because they would appear approximately as circles in the grabbed images, simplifying their recognition. In any case, if the distances between the shapes forming the pattern are known exactly, the only requirement is that one of the shapes, at a known distance, be distinguishable (e.g., by its color) from the others. The shape should preferably be located within the highest-resolution area of the sensor. This permits the reference shape to be used as a landmark to automatically measure the distance from the camera of every shape on the reference plane. This also removes the need to accurately position the robot at a predefined distance from the pattern, which could be a further source of calibration errors.

Figure 4: The pattern used for calibrating a catadioptric omnidirectional sensor (above). The fourth square from the center has a different color, to act as a landmark in automatically computing distances; below it, the IPT image obtained after calibration is shown. The black circle hides the expansion of the area, roughly corresponding to the robot footprint, whose reflection is removed in the original image by providing the mirror with a discontinuity in its center.

Operatively, in the first step of the automatic calibration process, the white stripe, as well as the center of every reference shape, is easily detected. These reference points are inserted into the set of samples on which interpolation is then performed. The actual position of each such point can be simply derived from the knowledge of the relative position, with respect to the differently-colored reference shape, of the square to which it belongs. The process can be repeated for different headings of the robot, simply by turning the robot around its central symmetry axis.

In the second step, interpolation is performed to compute the function P from the point set extracted as described. A look-up table that associates each pair of coordinates in the IPT-transformed image with a pair of coordinates in the original image can thus be computed.

This calibration process is fast, can be completely automated and provides good results, as shown in figure 4.

5. Experiments with an IPT-based obstacle detection algorithm for omnidirectional sensors

As an example of algorithm porting from traditional stereo systems to omnidirectional ones using the generalized IPT, we report some sample results, obtained in a robot soccer environment, of a stereo algorithm for obstacle detection developed for traditional stereo systems [5]
and adapted for use with HOPS. The algorithm is described in detail elsewhere [9]: here we mainly aim at showing its potential and at highlighting the role played by the generalized IPT.

Besides removing the distortion introduced by the omnidirectional sensor using the IPT, the algorithm exploits an intrinsic limitation of the IPT: it can provide undistorted views only of the objects that lie on one reference plane. Everything that is above the plane is distorted differently as a function of its height and of the point of view from which it is observed. Therefore, two IPT-transformed images of the same scene will differ only in those regions that represent obstacles, i.e., any object located above the reference plane. In mobile robotics applications, the reference plane is chosen to be the floor on which the robots are moving.

Given two images of the same spatial region that includes the floor on which a robot is moving, the obstacle detection algorithm can be roughly summarised as follows:

1. compute the IPT of both images with respect to the plane identified by the floor;
2. apply an edge extraction algorithm to the IPT-transformed images;
3. skeletonize and binarize the contours using a ridge-following algorithm;
4. compute the difference between the two images obtained in the previous step.

When the chromatic features of the two images obtained from the two sensors are virtually identical, steps 2 and 3 of the algorithm can also be replaced by a thresholding algorithm by which objects that clearly stand out with respect to the background are highlighted. It is worth noting that obtaining identical chromatic features is not easy in hybrid systems, where one image is acquired directly while the other is acquired as a reflection on a surface that may alter colors to some extent.

Figure 5: Obstacle detection: (a) images acquired by the hybrid vision sensor; (b) the IPT of the spatial region in (a) common to both images; (c) results of edge detection applied to (b); (d) result of the ridge extraction from (c); (e) difference between the two images in (d).

The white regions that can be observed in the difference image, which represent areas where an obstacle may be present, derive from two kinds of disparity that can be found in stereo image pairs. If they derive from a lateral displacement of the two cameras, they are located to the left and/or right of obstacle projections in the IPT-transformed images; because of this, both approaches to obtaining binary difference images considered above provide very similar results. When a vertical displacement of the two cameras occurs instead, such regions are located above and/or below the obstacle projections.

Figure 6: Above: simulated results obtained by a coaxial fully-omnidirectional system. The two IPT images (upper sensor on the left, lower on the right) of a simulated RoboCup environment. Below: the difference image that can be obtained with the coaxial configuration. The virtually null lateral disparity can be clearly noticed.

From these considerations, and using other kinds of information (e.g. color), it is possible to tell regions that are certainly free from regions that may be occupied by obstacles. Figure 5 shows the results that can be obtained at the end of each step.

To give a flavor of the potential of the algorithm when applied to a fully-omnidirectional stereo device, figure 6 shows the difference image obtained by IPT-transforming the (simulated) images taken from the two sensors, and subsequently computing and pre-processing the difference between the two images. In particular, the results of the difference between the self-reflections of the robot on the two mirrors have been removed.

6. Discussion

In this paper we have described a Hybrid Omnidirectional/Pin-hole Sensor (HOPS) and a general framework within which the IPT is used to allow porting of the quasi-3D (q3D) class of stereo algorithms from traditional stereo systems to omnidirectional or partially-omnidirectional ones.

The joint use of
a standard CCD camera and of an omnidirectional sensor provides HOPS with their different and complementary features: while the CCD camera can be used to acquire detailed information about a limited region of interest (foveal vision), the omnidirectional sensor provides wide-range, but less detailed, information about the surroundings of the system (peripheral vision). HOPS, therefore, suits several kinds of applications, for example self-localization or obstacle detection, and makes it possible to implement peripheral/foveal active vision strategies: the wide-range sensor is used to acquire a rough representation of a large area around the system and to localize the objects or areas of interest, while the traditional camera is used to enhance the resolution with which these areas can then be analysed. The different features of the two sensors are very useful for a combined exploitation in which information gathered from both sensors is fused, allowing extraction of 2D augmented information from the observed scene by means of the IPT.

The IPT implementation that has been proposed allows for a fully-automatic calibration of the sensor, and for a very efficient derivation and subsequent use of the mapping function, implemented through a look-up table. An algorithm for obstacle detection based on such an implementation of the IPT has been briefly presented to show the effectiveness of the approach. One of the most noticeable features of this approach is the cancellation of false obstacles lying on the IPT reference plane: shadows projected on the floor, spots, drawings or two-dimensional objects lying on the floor, which can appear in the acquired images and can be mistaken for obstacles by a monocular vision system because of their texture, color, etc., can be easily removed by the IPT.

The application of the look-up table is the only overhead imposed on the algorithms by the use of the IPT, with respect to their 'standard' implementation. This, along with an MMX-optimization of the code, has made it possible to
achieve real-time or 'just-in-time' performance, allowing the algorithm to track objects moving at considerable relative speeds on recent mid-top class PCs.

Acknowledgements

This work has been partially supported by ASI under the Hybrid Vision System for Long Range Rovering grant and by ENEA under the Intelligent Sensors grant.

References

[1] Benosman, R. and Kang, S.B., editors. Panoramic Vision: Sensors, Theory and Applications. Monographs in Computer Science. Springer-Verlag, New York (2001).

[2] Gutmann, J.S., Weigel, T., and Nebel, B. Fast, accurate, and robust self-localization in polygonal environments. Proc. 1999 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (1999) 1412-1419.

[3] Clérentin, A., Delahoche, L., Pégard, C., and Brassart-Gracsy, E. A localization method based on two omnidirectional perception systems cooperation. In Proc. 2000 ICRA Millennium Conference, vol. 2 (2000) 1219-1224.

[4] Sogo, T., Ishiguro, H., and Trivedi, M. N-ocular stereo for real-time human tracking. In Benosman, R. and Kang, S.B., editors, Panoramic Vision: Sensors, Theory and Applications, Monographs in Computer Science, chapter 18. Springer-Verlag, New York (2001) 359-376.

[5] Mallot, H.A., Bülthoff, H.H., Little, J.J., and Bohrer, S. Inverse perspective mapping simplifies optical flow computation and obstacle detection. Biological Cybernetics, vol. 64 (1991) 177-185.

[6] Onoguchi, K., Takeda, N., and Watanabe, M. Planar projection stereopsis method for road extraction. IEICE Trans. Inf. & Syst., vol. E81-D n. 9 (1998) 1006-1018.

[7] Adorni, G., Cagnoni, S., and Mordonini, M. An efficient perspective effect removal technique for scene interpretation. Proc. Asian Conf. on Computer Vision (2000) 601-605.

[8] Adorni, G., Cagnoni, S., Carletti, M., Mordonini, M., and Sgorbissa, A. Designing omnidirectional vision sensors. AI*IA Notizie, vol. 15 n. 1 (2002) 27-30.

[9] Adorni, G., Bolognini, L., Cagnoni, S., and Mordonini, M. A non-traditional omnidirectional vision system with stereo capabilities for autonomous robots. In F. Esposito (ed.)
AI*IA 2001: Advances in Artificial Intelligence. 7th Congress of the Italian Association for AI, Bari, Italy, September 2001: Proceedings, Springer, LNAI 2175 (2001) 344-355.
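As a closing illustrative aside, the generalized IPT remapping through a precomputed look-up table (section 4) and the difference-based obstacle test of section 5 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are made up, a crude gradient-magnitude edge detector stands in for the paper's edge extraction and ridge-following steps, and the look-up tables below would in practice come from the calibration procedure of section 4.

```python
# Illustrative sketch only: LUT-based IPT remap + difference-based obstacle
# detection (steps 1-4 of section 5). Names and the edge detector are
# assumptions, not the paper's actual code.
import numpy as np

def apply_ipt(image, lut_u, lut_v):
    # Step 1: remap into the bird's-eye view. Output pixel (y, x) takes the
    # value of input pixel (lut_v[y, x], lut_u[y, x]), i.e. the look-up
    # table produced by calibration drives a simple fancy-indexing gather.
    return image[lut_v, lut_u]

def edge_map(img, thresh=10.0):
    # Steps 2-3 (stand-in): binary edge map from the gradient magnitude,
    # in place of the paper's edge extraction + ridge-following.
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > thresh).astype(np.uint8)

def obstacle_mask(img_a, lut_a, img_b, lut_b):
    # Step 4: pixels where the two binarized IPT views disagree mark
    # regions possibly occupied by obstacles (objects above the floor).
    ea = edge_map(apply_ipt(img_a, *lut_a))
    eb = edge_map(apply_ipt(img_b, *lut_b))
    return (ea != eb).astype(np.uint8)

# Toy usage: an identity mapping stands in for real calibration tables, and
# a bright block, laterally displaced between the two views as an obstacle
# would be after the IPT, lights up the mask; a uniform floor does not.
h, w = 32, 32
vv, uu = np.mgrid[0:h, 0:w]
lut = (uu, vv)                      # placeholder for a calibrated LUT
a = np.zeros((h, w), np.uint8)
b = np.zeros((h, w), np.uint8)
a[8:16, 8:16] = 200                 # obstacle projection in view A
b[8:16, 12:20] = 200                # displaced projection in view B
mask = obstacle_mask(a, lut, b, lut)
```

Note that, as in the paper, anything lying flat on the reference plane (shadows, floor markings) produces identical IPT views and therefore cancels out of `mask`; only the displaced projections of raised objects survive the difference.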