
Proceedings of the 1996 IEEE International Conference on Robotics and Automation, Minneapolis, Minnesota - April 1996

MOVING-OBJECT RECOGNITION USING PREMARKING AND ACTIVE VISION

D. He, D. Hujic, J.K. Mills and B. Benhabib*

Computer Integrated Manufacturing Laboratory (CIMLab), Department of Mechanical Engineering, University of Toronto, 5 King's College Road, Toronto, Ontario, M5S 3G8; *email: [email protected]

Laboratory for Non-Linear Systems Control, Department of Mechanical Engineering, University of Toronto

0-7803-2988-4/96 $4.00 © 1996 IEEE

ABSTRACT

This paper presents an active-vision system for the recognition of 3D objects moving along predictable trajectories. The novelty of this system lies in its unique approach to the problem of moving-object recognition: it integrates the object pre-marking, object-trajectory-prediction and time-optimal robot-motion techniques developed in our laboratory. The recognition technique is an extension of our earlier work on static-object recognition. Therein, objects were pre-marked optimally using circular markers, which are utilized during run time to guide a robot-mounted camera to acquire 2D standard-views for efficient matching. The Kalman-filter-based prediction of the object trajectory and the time-optimal movement of the mobile camera for image acquisition are likewise based on our earlier research results on moving-object interception. The discussion of the various implementation issues and the experimental results presented in this paper should provide researchers with useful tools and ideas.

1. INTRODUCTION

The development and implementation of an active-vision system for moving-object recognition involves numerous issues that currently constitute individual research endeavors in different laboratories. The two primary issues, however, are (i) tracking and (ii) recognition of moving objects. The latter area attempts to obtain information, for the identification of the objects, from consecutive motion images. Two common approaches to the solution of motion-image processing and recognition problems have been the optical-flow approach and the feature-based approach [1]. Feature-based approaches require that correspondence be established between a set of features extracted from one image and those from the next image [2,3]. Optical-flow approaches, on the other hand, rely on local spatial and temporal derivatives of image-brightness values; no correspondence between successive images is required. Although many methods have been developed based on both approaches, they are only partial solutions, suitable for simplified environments, sensitive to noise, and computationally expensive [1].

Integration of tracking and recognition techniques to form robust active-vision systems appears to be a logical answer that greatly simplifies the above-mentioned problems. Feddema and Lee [4] proposed such an adaptive system for visually tracking a moving object with a single mobile camera. This system uses the concept of active sensing to reduce the time for feature searching and extraction. The vision system accurately predicts the object location at the next sampling time. When a tracking task begins, the vision system uses all a priori knowledge to recognize and locate the object; after the initial recognition stage, the image processing is reduced to a simple verification process. Hwang, Ooi and Ozawa [5] developed an alternate adaptive-sensing system with the capability of tracking and zooming onto moving objects using a fuzzy-logic-controlled camera. Another active-vision system, which uses an active camera mounted on a pan/tilt platform, was proposed by Murray and Basu for real-time motion detection [6]. This system successfully extracted moving edges from dynamic images. The camera motion is compensated by the tracking algorithm, which allows static techniques to be applied to active image sequences.

In contrast to the above methods, the primary advantage of the active-vision system designed and successfully implemented in the CIMLab is the recognition of 3D moving objects using a 2D modeling and matching process [7]. Based on an integration of active-sensing and object-pre-marking principles, the system is capable of successfully tracking and recognizing moving objects defined a priori in a given object library. In the next sections, we first briefly describe the individual components of the system, leading to a discussion of the implementation issues and experimental results.

2. THE ACTIVE-VISION SYSTEM: AN OVERVIEW



The task of moving-object recognition is broken herein into two sub-tasks: (i) tracking and prediction of the object's trajectory, and (ii) object recognition from consecutive images. The motivation behind this approach is twofold: to gain speed via parallel computing and to simplify the implementation of the system. Based on this rationale, the active-vision system was designed as a two-module architecture, comprising the object-recognizer module and the object-trajectory-predictor and robot-motion-planner module, Figure 1. The former module functions as an image processor and pattern recognizer. The latter module deals with the prediction of the object motion and the planning of the robot trajectory, so as to guide the robot-mounted camera to desired locations for the acquisition of consecutive images.

Figure 1. Structure of the Active-Vision System.

The object-recognizer module is based on the presumption that a 3D object can be modeled by a pre-defined set of its 2D views, referred to as standard-views [7].

Run-time recognition is then initiated by acquiring one of these standard-view images, followed by the matching process. Since markers placed on an object define a limited set of unique standard-views, by pre-marking objects the 3D-recognition process is simplified into a fast 2D-matching process.

Based on the above principle, the recognition process is carried out in two stages. During Stage 1, a marker's motion is observed via a fixed camera and its trajectory is predicted by Module 2. In parallel, a robot-mounted camera is utilized by Module 1 to acquire two sequential images of the same viewed marker. The 3D pose of the circular feature is determined through the use of these two images; the correspondence between the poses of the same marker in the two consecutive images identifies the true orientation of the marker. During Stage 2, a time-optimal robot trajectory is planned to position the camera for the acquisition of a standard-view of the moving object at the right instant. Once a standard-view is acquired, recognition is conducted by matching the object's shape signature with those in the standard-view library.

3. THE OBJECT-RECOGNIZER MODULE

An optimal number of circular markers of known size are used for pre-marking the objects; their normals define the necessary standard-viewing axes. A standard-view is acquired by aligning the optical axis of the camera with one of the standard-viewing axes of the object. Visible markers, however, undergo perspective projection and are perceived as elliptical shapes in arbitrarily acquired images. Thus, the parameters of these elliptical shapes must be used to determine the 3D position and orientation (pose) of the marker.

3.1 Calculation of the Elliptical Shape Parameters [8]

The parameters of the elliptical shape of a circular marker's acquired image can be calculated as follows. Let

$$Q(X, Y) = aX^2 + bXY + cY^2 + dX + eY + f = 0 \qquad (1)$$

be the general equation of an ellipse, and let

$$\{(X_i, Y_i), \; i = 1, \ldots, N\} \qquad (2)$$

be a set of boundary points on the marker's image to be fitted. The (five) elliptical parameters can then be computed by minimizing the following error function:

$$E = \sum_{i=1}^{N} w_i \left[ Q(X_i, Y_i) \right]^2 \qquad (3)$$

where w_i are the weighting factors that take into account the non-uniformity of the data points along the ellipse's boundary.
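As an illustration of this weighted fit, the following sketch (Python/NumPy, which is not the original implementation; the unit-norm constraint on the coefficient vector is our assumption, since the paper does not state how it excludes the trivial all-zero solution) minimizes E over the conic coefficients:

```python
import numpy as np

def fit_ellipse_weighted(points, weights):
    """Fit Q(X,Y) = aX^2 + bXY + cY^2 + dX + eY + f = 0 to N boundary
    points by minimizing E = sum_i w_i [Q(X_i, Y_i)]^2, subject to
    ||(a, b, c, d, e, f)|| = 1 (one common normalization choice)."""
    X, Y = points[:, 0], points[:, 1]
    D = np.column_stack([X*X, X*Y, Y*Y, X, Y, np.ones_like(X)])
    D *= np.sqrt(weights)[:, None]            # apply the weights w_i
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    return Vt[-1]                             # (a, b, c, d, e, f)
```

The weighted design matrix makes the residual of each boundary point count according to w_i, compensating for the non-uniform sampling noted above.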

3.2 Estimation of a Circular Marker's 3D-Pose

In order to move the robot-mounted camera to a standard-viewing position, the 3D pose of a circular marker has to be determined first. Circular-marker pose estimation is equivalent to the solution of the following problem: given a 3D conic surface, defined by an elliptical base (the perspective projection of a circular feature onto the image plane) and a vertex (the center of the camera's lens) with respect to a reference frame, determine the pose of a plane (with respect to the same reference frame) which intersects the cone and generates a circular curve.


The general form of the equation of a cone with respect to the image frame is as follows:

$$ax^2 + by^2 + cz^2 + 2fxy + 2gzx + 2hzy + 2ux + 2vy + 2wz = 0 \qquad (4)$$

An intersection plane can be defined by lx + my + nz = 0. Therefore, the problem of finding the coefficients of the equation of a plane for which the intersection is circular can be expressed mathematically as: determine l, m and n, with l^2 + m^2 + n^2 = 1, such that the intersection of the conical surface with the plane lx + my + nz = 0 is a circle. To solve the problem, the equation of the conical surface is first reduced to a more compact form:

$$\lambda_1 X^2 + \lambda_2 Y^2 + \lambda_3 Z^2 = 0$$

where the XYZ-frame is called the canonical frame of conicoids. It can be shown that the reduction of the general equation of a cone to the above form yields a closed-form analytical solution. There exist two possible solutions to the problem; to obtain a unique acceptable solution as part of the moving-object recognition process, an extra geometrical constraint, such as the change of eccentricity in a second image, has to be used.

To obtain a unique solution for a marker's position, the radius of the circular feature has to be known. There exist two solutions: one on the positive Z-axis and one on the negative Z-axis. Since only the positive one is acceptable in our case (being located in front of the camera), the coordinates of the center of the circle, (X0, Y0, Z0), with respect to the XYZ-frame are obtained in closed form in terms of constants A, B, C and D, which are defined by the elements of the transformation between the image frame and the canonical XYZ-frame, λ_i (i = 1, 2, 3), and the known radius r (see [9] for the full derivation).
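For illustration, the reduction above can be sketched as an eigendecomposition of the cone's quadratic form. The code below (Python/NumPy; the function name and sign conventions are ours, and the full closed-form treatment, including the circle-center equations, is given in [9]) returns the two admissible unit normals whose duality is resolved in Section 3.3:

```python
import numpy as np

def candidate_marker_normals(a, b, c, d, e, g, f_len):
    """Two candidate plane normals of a circular marker from its elliptical
    image aX^2 + bXY + cY^2 + dX + eY + g = 0 on the image plane z = f_len,
    with the camera's lens center at the origin (a sketch; see [9])."""
    # Quadratic form of the cone p^T Q p = 0 through the lens center.
    Q = np.array([[a,               b / 2,           d / (2 * f_len)],
                  [b / 2,           c,               e / (2 * f_len)],
                  [d / (2 * f_len), e / (2 * f_len), g / f_len**2]])
    lam, V = np.linalg.eigh(Q)                 # ascending eigenvalues
    if np.sum(lam > 0) != 2:                   # fix the overall sign: (+, +, -)
        lam, V = np.linalg.eigh(-Q)
    l3, l2, l1 = lam                           # l1 >= l2 > 0 > l3
    e1, e3 = V[:, 2], V[:, 0]
    alpha = np.sqrt((l1 - l2) / (l1 - l3))
    beta = np.sqrt((l2 - l3) / (l1 - l3))
    return alpha * e1 + beta * e3, -alpha * e1 + beta * e3
```

Since alpha^2 + beta^2 = 1 and the eigenvectors are orthonormal, both returned normals are already unit vectors.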

3.3 Solving the 3D-Orientation Duality in a Single-Marker Scene

As mentioned in Section 3.2, in order to obtain a unique solution for the orientation of a circular marker, acquisition of multiple images of the same marker would be necessary. In general, obtaining a unique solution for the 3D pose of a circular marker in unknown motion requires two consecutive images with at least three visible circles [3]. In our proposed system, however, a unique solution can be calculated with one circular marker, given that the object is subject to certain motion constraints and the size of the marker is known. For example, when the object is moving in translation or planar motion, the orientation can be uniquely determined. Figure 2 depicts a situation where the marker undergoes 3D translation; the camera's focal center and the image plane (Plane I) are indicated. From the first image, using the technique presented in Section 3.2, two possible surface normals of the circular marker, denoted as unit vectors n1 and n2, can be computed. As the marker moves to the second position, another two possible solutions, n1' and n2', are obtained. By the definition of translation, the surface normal of a translating plane remains constant. Therefore, the true surface orientation can be distinguished as the one that remains unchanged. In Figure 2, since n1 = n1' while n2 ≠ n2', n1 and n1' are found to be the true solution.

If the object motion is limited to planar motion, a similar concept can be applied. When a vector in space undergoes planar motion, its z-component remains constant. Therefore, defining m1, m2 and m1', m2' as the z-components of n1, n2 and n1', n2', respectively: if m1 = m1' while m2 ≠ m2', then n1 and n1' are the true orientations of the circular marker.

Figure 2. Elimination of the false solution in the case of translation.
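A minimal sketch of this elimination step for the translation case follows (Python/NumPy; pair_1 and pair_2 are the candidate-normal pairs computed from the first and second images, and the tolerance is an assumption, since image noise prevents exact equality):

```python
import numpy as np

def true_normal_under_translation(pair_1, pair_2, tol=1e-3):
    """Return the candidate normal that stays constant between the two
    images; normals are compared up to sign, since a plane normal is
    only defined up to orientation."""
    for n in pair_1:
        for n_prime in pair_2:
            if 1.0 - abs(float(np.dot(n, n_prime))) < tol:
                return n
    return None                                # no consistent pair found
```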

3.4 Standard-View Matching [11]

When a standard-view image is acquired and the silhouette of the object is extracted, the resulting chain-coded contour of the object is used to compute the global eccentricity measure and the shape signature. The object is then identified by matching its feature vector, consisting of the global eccentricity measure and the shape signature, against the feature information of the standard views in the database. The identity of the shape is determined by the minimum-distance rule.


However, if the measures of dissimilarity between the acquired shape and multiple standard views are below a certain threshold, the object cannot be uniquely recognized. In this case, the system identifies the several candidates in the database and passes control back to the active-vision system for the acquisition of an additional standard view to resolve the ambiguity at hand.
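The minimum-distance rule with the ambiguity test can be sketched as follows (Python/NumPy; the paper does not specify its distance measure, so the Euclidean norm and the dictionary-based library are our assumptions):

```python
import numpy as np

def match_standard_view(feature, library, ambiguity_threshold):
    """library maps object/view names to stored feature vectors (global
    eccentricity + shape signature). Returns a single identity, or a
    list of candidates when recognition is ambiguous."""
    dists = {name: np.linalg.norm(feature - ref) for name, ref in library.items()}
    candidates = [name for name, d in dists.items() if d < ambiguity_threshold]
    if len(candidates) > 1:
        return candidates                  # ambiguous: acquire another standard view
    return min(dists, key=dists.get)       # minimum-distance rule
```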

4. THE OBJECT-TRAJECTORY PREDICTOR AND ROBOT-MOTION PLANNER MODULE

4.1 Determination of Optimal Initial Camera Orientation [12]

Optimal camera placement is necessary to maximize marker detectability, since this consequently allows us to minimize the number of markers placed on objects. Several optimization problems were formulated and solved in [12]. The specific problem pertinent to this work is finding the minimum number of markers to be placed on a given set of objects such that the visibility of at least one marker (on a randomly appearing object) is guaranteed in a single-camera environment. The outcome of the optimization is the minimum number and locations of the markers on the object, as well as the optimal initial viewing angle of the camera.

4.2 Object-Trajectory Prediction [13]

A recursive Kalman filter (KF) was proposed in [13] to obtain optimal estimates of the object's present two-dimensional position, as well as predictions of its future trajectory. The recursive KF is a computationally efficient algorithm which yields an optimal least-squares estimate of a state from a series of noisy observations; it produces a new optimal estimate from an additional observation without having to reprocess past data. The KF can also be used to obtain multiple-step-ahead predictions by propagating the KF extrapolation equations (i.e., the one-step-ahead predictor). As discussed in the next sub-section, these multiple-step-ahead predictions are used in our system to guide the mobile camera to optimal viewing locations.
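A minimal constant-velocity KF for the marker centroid is sketched below (Python/NumPy; the actual state model and noise covariances of [13] are not reproduced here, so the values shown are placeholders):

```python
import numpy as np

class CentroidKF:
    """State [x, y, vx, vy]; measurement [x, y] (the marker centroid)."""
    def __init__(self, dt, q=1e-2, r=1.0):
        self.F = np.eye(4)                     # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q, self.R = q * np.eye(4), r * np.eye(2)
        self.x, self.P = np.zeros(4), 1e3 * np.eye(4)

    def update(self, z):
        """One-step predict, then correct with a new centroid measurement."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        K = self.P @ self.H.T @ np.linalg.inv(self.H @ self.P @ self.H.T + self.R)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P

    def predict_ahead(self, k):
        """k-step-ahead position by propagating the extrapolation equations."""
        x = self.x.copy()
        for _ in range(k):
            x = self.F @ x
        return x[:2]
```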

4.3 Motion Planning for the Mobile Camera

The problem addressed here is two-fold: (i) finding an optimal viewing point within the robot's workspace, and (ii) generating a time-optimal robot trajectory to this viewing point. As previously shown in [14] for a moving-object interception problem, these two issues are strongly coupled and should be addressed simultaneously to achieve time-optimal results.

(i) First Marker-Viewing Location

Given the predicted trajectory of an object's travel through the robot's workspace, a corresponding camera-placement trajectory, {G(t)}, can be determined via a constant transformation to represent potential viewing points. Using this camera trajectory and the robot's latest position, our objective is to find a time-optimal initial viewing point on {G(t)}. A solution of this problem was provided in [14] and will not be repeated here. It should be noted, however, that while the current robot trajectory is executing, the planning module continuously re-plans the unexecuted portion of the robot-mounted camera's motion in response to new information generated by the object-motion prediction algorithm.
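The coupled rendezvous criterion can be illustrated with a simple search over the predicted camera-placement trajectory (a sketch under our own naming; G is the predicted camera-pose trajectory, robot_time_to estimates the robot's travel time to a pose, and the grid search stands in for the time-optimal solution of [14]):

```python
def earliest_viewing_point(G, robot_time_to, t_grid):
    """Return the first time t at which the robot can arrive at G(t)
    no later than the object does, i.e. robot_time_to(G(t)) <= t."""
    for t in t_grid:
        pose = G(t)
        if robot_time_to(pose) <= t:
            return t, pose
    return None                                # not interceptable on this grid
```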

(ii) Standard-Viewing Location

The robot-mounted mobile camera has to be re-located from its first viewing location in order to acquire a standard-view of the object. As with the solution utilized above, this standard-view acquisition location can be optimally determined for the mobile camera, based on the information provided by the object-trajectory prediction module and the calculated standard-viewing-axis orientation, Section 3.3. In order to allow real-time applicability, time-optimal task-space quintic polynomials are used in our work for the robot trajectories.
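For reference, the boundary-value form of such a task-space quintic is sketched below (Python/NumPy, per task-space coordinate; the choice of the traverse time T by the time-optimal search of [14] is outside this sketch):

```python
import numpy as np

def quintic_coefficients(p0, v0, a0, pf, vf, af, T):
    """Coefficients c0..c5 of p(t) = sum_k c_k t^k matching position,
    velocity and acceleration at t = 0 and t = T."""
    M = np.array([[1, 0, 0,    0,       0,        0],
                  [0, 1, 0,    0,       0,        0],
                  [0, 0, 2,    0,       0,        0],
                  [1, T, T**2, T**3,    T**4,     T**5],
                  [0, 1, 2*T,  3*T**2,  4*T**3,   5*T**4],
                  [0, 0, 2,    6*T,     12*T**2,  20*T**3]])
    return np.linalg.solve(M, np.array([p0, v0, a0, pf, vf, af]))
```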

5. EXPERIMENTAL SYSTEM

5.1 Experimental Setup

The experimental system is an integration of the following, Figure 3:

(a) The object-trajectory predictor and robot-motion planner module:
- Host computer 1: 80486 DX4 100 MHz PC.
- Imaging subsystem: a Hitachi 30 Hz CCD camera fixed above the object motion plane; a PC-based Matrox PIP digitizer board with 640x480 resolution.
- Software: a KF-based object-motion-prediction algorithm; a camera-viewing-location-planner and robot-motion-planner algorithm.

(b) The object-recognizer module:
- Host computer 2: 80486 DX 33 MHz PC.
- Imaging subsystem: a JVC 30 Hz CCD camera mounted on the wrist (fifth link) of a six-degree-of-freedom GMF S-100 robot; a PC-based Matrox PIP digitizer board with 640x480 resolution.
- Software: object-recognition algorithms.
- Object-motion simulator: an NC X-Y table, controllable from an RS-232 port, used to produce the object motion.

(c) Other auxiliary subsystems:


- Communication interface: implemented on a 9600-bps RS-232 serial communication line, to provide exchange of commands and data amongst Host 1, Host 2, the robot controller and the NC-table controller.

Figure 3. Experimental setup.

5.2 System Implementation

The Hitachi CCD camera, with a 25 mm lens, is placed 1.8 m above the surface of the X-Y table. This setup yields a 600x400 mm2 field of view with approximately 0.93 mm/pixel resolution. The JVC CCD camera, mounted near the robot's end-effector, also had a 25 mm lens. Both cameras were calibrated using the mono-view non-coplanar-point technique proposed in [15]; the error in the X and Y directions was less than 0.5% for both. Different object-motion trajectories were induced via programmed movement of the NC table at speeds from 4 to 12 /s (limited by the travel length and field of view in our experiment, though these can be increased in a different setting).

Since the objects were pre-marked using red markers, a red-color filter was used to threshold the analog camera signals such that only one feature, the circle's centroid, is tracked. After locating the marker in the camera's field of view, the marker's centroid is determined and used to update the KF. One-step-ahead KF prediction is used to follow the marker across the field of view. At present, the entire process (i.e., grabbing an image, finding the object's centroid, and updating the KF) takes approximately 65 ms.

Once the object's trajectory has been predicted and the robot has reached its initial viewing position, two consecutive images are taken. Markers on the object are detected by the mobile camera. Based on the algorithm presented earlier, a marker's pose, represented by the 3D coordinates of the marker center and the surface normal vector of the marker, is determined. The marker's pose is passed to the motion-planning system, which in turn sends the mobile camera to its standard-view-acquisition location. Subsequently, the recognition system performs the matching of the 2D signature.
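The per-cycle processing (threshold the red marker, take its centroid, update the KF) can be sketched digitally as follows (Python/NumPy; the original system filtered the analog camera signal, so red_mask and its thresholds are purely illustrative stand-ins):

```python
import numpy as np

def red_mask(frame, threshold=0.6):
    """Keep pixels whose red channel dominates (frame: HxWx3, floats in [0, 1])."""
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    return (r > threshold) & (r > g + 0.2) & (r > b + 0.2)

def marker_centroid(mask):
    """Centroid (in pixels) of the single tracked feature, the marker blob."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])
```

Each ~65 ms cycle then amounts to grabbing a frame, computing marker_centroid(red_mask(frame)), and feeding the result to the KF's update step.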

5.3 Results

The recognition system was implemented and tested successfully for a set of seven different objects, shown in Figure 4. Their sizes ranged from 40x40x35 mm3 (Object 2) to 90x90x70 mm3 (Object 5). All the objects were distinguished and classified successfully in our experiments. Figures 5(a)-(d) show images taken by the robot-mounted camera at different stages. Initially (Time 0), the camera is placed at a home position. When the overhead camera detects the object, the trajectory planner guides the robot to the initial viewing position, approximately 1 m above the X-Y table, aiming at the randomly posed object at an angle of 63 degrees. At Time 1, the first image is obtained and a unique solution for the position of the marker is calculated, Fig. 5(a) (the elliptical image of the marker is highlighted, and its major and minor axes are shown). Two possible surface normal vectors are calculated and stored. At Time 2, when the X-Y table has moved to a second position at an arbitrary speed, a second image is taken, Fig. 5(b), and the true surface normal is identified. The surface normal is sent to the trajectory planner for the planning of the camera path. Subsequently, the robot automatically moves to the standard-viewing position. Without any delay, the mobile camera takes a snapshot of the standard-view as soon as it arrives at the standard-viewing position, Fig. 5(c). From this view, the object is successfully recognized as Object #7, Fig. 5(d).

    Figure 4. Tested objects.

6. CONCLUSIONS

As existing general approaches to moving-object recognition have proven difficult to implement, active-vision systems have shown their potential in simplifying the problem and thus expediting the process. The active-vision system presented in this paper demonstrated a collection of novel techniques used in tracking, trajectory planning, recognition and camera


placement. This system, which features the principles of active sensing and object pre-marking, is capable of recognizing pre-marked objects moving along predictable paths. Our system is only one potential implementation example and should not be viewed as globally optimal; a variety of issues, especially in real-time imaging, remain to be addressed.


Figure 5. (a) Image of the object taken at Time 1; (b) Time 2; (c) Time 3, a standard-view of the object; (d) Object is recognized as Object #7.

REFERENCES:

[1] J.K. Aggarwal and N. Nandhakumar, "On the Computation of Motion from Sequences of Images: A Review," Proceedings of the IEEE, Vol. 76, No. 8, August 1988, pp. 917-935.
[2] T.S. Huang and A.N. Netravali, "Motion and Structure from Feature Correspondence: A Review," Proceedings of the IEEE, Vol. 82, No. 2, Feb. 1994, pp. 252-268.
[3] S.D. Ma, "Conics-Based Stereo, Motion Estimation, and Pose Determination," International Journal of Computer Vision, Vol. 10, No. 1, 1993, pp. 7-25.
[4] J.T. Feddema and C.S.G. Lee, "Adaptive Image Feature Prediction and Control for Visual Tracking with a Hand-Eye Coordinated Camera," IEEE Transactions on Systems, Man and Cybernetics, Vol. 20, No. 5, September/October 1990, pp. 1172-1183.
[5] J. Hwang, Y. Ooi, and S. Ozawa, "An Adaptive Sensing System with Tracking and Zooming a Moving Object," IEICE Transactions on Information & Systems, Vol. E76-D, No. 8, Aug. 1993, pp. 926-934.
[6] D. Murray and A. Basu, "Active Tracking," International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 1993, pp. 1021-1028.
[7] R. Safaee-Rad, I. Tchoukanov, X. He, K.C. Smith, and B. Benhabib, "An Active-Vision System for Recognition of Pre-Marked Objects in Robotic Assembly Workcells," IEEE Conference on Computer Vision and Pattern Recognition, New York City, U.S.A., June 1993, pp. 722-723.
[8] R. Safaee-Rad, K.C. Smith, B. Benhabib and I. Tchoukanov, "Application of Moment and Fourier Descriptors to the Accurate Estimation of Elliptical Shape Parameters," Pattern Recognition Letters, Vol. 13, July 1992, pp. 479-508.
[9] R. Safaee-Rad, K.C. Smith, B. Benhabib, and I. Tchoukanov, "3D-Location Estimation of Circular Features for Machine Vision," IEEE Transactions on Robotics and Automation, Vol. 8, No. 5, 1992, pp. 624-640.
[10] D. He, M. Tordon and B. Benhabib, "An Active-Vision System for the Recognition of Moving Objects," IEEE International Conference on Systems, Man and Cybernetics, Vancouver, B.C., Oct. 1995, pp. 2274-2279.
[11] I. Tchoukanov, R. Safaee-Rad, K.C. Smith, and B. Benhabib, "The Angle-of-Sight Signature for 2D-Shape Analysis of Manufactured Objects," Pattern Recognition, Vol. 25, No. 11, Dec. 1992, pp. 1289-1305.
[12] X. He, B. Benhabib, and K.C. Smith, "CAD-Based Off-line Planning for Active Vision," Journal of Robotic Systems, Vol. 12, October 1995.
[13] D. Hujic, G. Zak, E.A. Croft, R.G. Fenton, J.K. Mills and B. Benhabib, "An Active Prediction, Planning and Execution System for Interception of Moving Objects," IEEE International Symposium on Assembly and Task Planning, Pittsburgh, PA, Aug. 1995, pp. 347-352.
[14] E.A. Croft, R.G. Fenton, and B. Benhabib, "Time-Optimal Interception of Objects Moving Along Predictable Paths," IEEE International Symposium on Assembly and Task Planning, Pittsburgh, PA, Aug. 1995, pp. 419-425.
[15] R.Y. Tsai, "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses," IEEE Journal of Robotics and Automation, Vol. 3, No. 4, Aug. 1987, pp. 323-344.
