Hindawi Publishing Corporation, Journal of Electrical and Computer Engineering, Volume 2013, Article ID 374165, 15 pages, http://dx.doi.org/10.1155/2013/374165
Research Article
Monocular Vision SLAM for Indoor Aerial Vehicles
Koray Çelik and Arun K. Somani
Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50010, USA
Correspondence should be addressed to Koray Çelik; koray@iastate.edu
Received 14 October 2012; Accepted 23 January 2013
Academic Editor: Jorge Dias
Copyright © 2013 K. Çelik and A. K. Somani. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper presents a novel indoor navigation and ranging strategy via monocular camera. By exploiting the architectural orthogonality of indoor environments, we introduce a new method to estimate range and vehicle states from a monocular camera for vision-based SLAM. The navigation strategy assumes an indoor or indoor-like manmade environment whose layout is previously unknown, GPS-denied, and representable via energy-based feature points and straight architectural lines. We experimentally validate the proposed algorithms on a fully self-contained microaerial vehicle (MAV) with sophisticated on-board image processing and SLAM capabilities. Building and enabling such a small aerial vehicle to fly in tight corridors is a significant technological challenge, especially in the absence of GPS signals and with limited sensing options. Experimental results show that the system is only limited by the capabilities of the camera and environmental entropy.
1. Introduction
The critical advantage of vision over active proximity sensors, such as laser range finders, is the information-to-weight ratio. Nevertheless, since the surroundings are captured indirectly through photometric effects, extracting absolute depth information from a single monocular image alone is an ill-posed problem. In this paper, we aim to address this problem with as minimal use of additional information as possible for the specific case of a rotorcraft MAV, where size, weight, and power (SWaP) constraints are severe, and investigate the feasibility of a low-weight, low-power monocular vision-based navigation solution. Although we emphasize MAV use in this paper, our approach has been tested and proved compatible with ground-based mobile robots as well as wearable cameras such as helmet- or tactical-vest-mounted devices; further, it can be used to augment the reliability of several other types of sensors. Considering that the foreseeable future of intelligence, surveillance, and reconnaissance missions will involve GPS-denied environments, portable vision-SLAM capabilities can pave the way for GPS-free navigation systems.
Our approach is inspired by how intelligent animals such as cats and bats interpret depth via monocular visual cues, such as relative height, texture gradient, and motion parallax [1], by subconsciously tracking dense elements such as foliage. We integrate this ranging technique with SLAM to achieve autonomous indoor navigation of an MAV.
1.1. Related Work on Vision-Based SLAM. Addressing the depth problem, the literature has resorted to various methods, such as the Scheimpflug principle, structure from motion, optical flow, and stereo vision. The use of moving lenses for monocular depth extraction [2] is not practical for SLAM, since this method cannot focus at multiple depths at once. The dependence of stereo vision on ocular separation [3] limits its useful range, and image patches obtained via optical flow sensors [4, 5] are too ambiguous for the landmark association procedure in SLAM. Efforts to retrieve depth information from a still image by using machine learning, such as the Markov Random Field learning algorithm [6, 7], are shown to be effective; however, a priori information about the environment must be obtained from a training set of images, which disqualifies them for an online-SLAM algorithm in an unknown environment. Structure from Motion (SFM) [3, 8, 9] may be suitable for the offline-SLAM problem; however, an automatic analysis of the recorded footage from a completed mission cannot scale to a consistent localization over arbitrarily long sequences in real time.

Figure 1: A three-dimensional representation of the corridor with respect to the MAV. Note that the width of the hallway is not provided to the algorithm, and the MAV does not have any sensors that can detect walls.

Methods such as monoSLAM [10, 11], which depend on movement for depth estimation and offer a relative recovered scale, may not provide reliable object avoidance for an agile MAV in an indoor environment. A rotorcraft MAV needs to bank to move the camera sideways, a movement severely limited in a hallway by helicopter dynamics; it has to be able to perform depth measurement from a still, or nearly still, platform.
In SLAM, Extended Kalman Filter (EKF) based approaches with full covariance have a limitation on the size of a manageable map in real time, considering the quadratic nature of the algorithm versus the computational resources of an MAV. Global localization techniques such as Condensation SLAM [12] require a full map to be provided to the robot a priori. Azimuth-learning-based techniques such as Cognitive SLAM [13] are parametric, and locations are centered on the robot, which naturally becomes incompatible with ambiguous landmarks, such as the landmarks our MAV has to work with. Image registration based methods, such as [14], propose a different formulation of the vision-based SLAM problem based on motion, structure, and illumination parameters without first having to find feature correspondences. For a real-time implementation, however, a local optimization procedure is required, and there is a possibility of getting trapped in a local minimum. Further, without merging regions with a similar structure, the method becomes computationally intensive for an MAV. Structure extraction methods [15] have some limitations, since an incorrect incorporation of points into higher-level features will have an adverse effect on consistency. Further, these systems depend on a successful selection of thresholds.
1.2. Comparison with Prior Work and Organization. This paper addresses the above shortcomings using an unmodified consumer-grade monocular web camera. By exploiting the architectural orthogonality of indoor and urban outdoor environments, we introduce a novel method for monocular vision-based SLAM by computing absolute range and bearing information without using active ranging sensors. More thorough algorithm formulations and newer experimental results with a unique indoor-flying helicopter are discussed in this paper than in our prior conference articles [16–19]. Section 2 explains the procedures for perception of world geometry as prerequisites for SLAM, such as range measurement methods, as well as performance evaluations of the proposed methods. A visual turn-sensing algorithm is introduced in Section 3, and SLAM formulations are provided in Section 4. Results of experimental validation, as well as a description of the MAV hardware platform, are presented in Section 5. Figure 2 can be used as a guide to the sections as well as to the process flow of our proposed method.
2. Problem and Algorithm Formulation
We propose a novel method to estimate the absolute depth of features using a monocular camera as a sole means of navigation. The camera is mounted on the platform with a slight downward tilt. Landmarks are assumed to be stationary; moving targets are also detected, but they are not considered as landmarks and are therefore ignored by the map. Altitude is measured in real time via the on-board ultrasonic altimeter on our MAV, or, in the case of a ground robot, it can be provided to the system via various methods depending on where the camera is installed. It is acceptable that the camera translates or tilts with respect to the robot, such as when mounted on a robotic arm, as long as the mount is properly encoded to indicate altitude. We validate our results with a time-varying altitude. The ground is assumed to be relatively flat (no more than 5 degrees of inclination within a 10-meter perimeter). Our algorithm has the capability to adapt to inclines if the camera tilt can be controlled; we have equipped some of our test platforms with this capability.
2.1. Landmark Extraction Step I: Feature Extraction. A landmark in the SLAM context is a conspicuous, distinguishing landscape feature marking a location. A minimal landmark can consist of two measurements with respect to robot position: range and bearing. Our landmark extraction strategy is a three-step automatic process. All three steps are performed on a frame I_t before moving on to the next frame I_{t+1}. The first step involves finding prominent parts of I_t that tend to be more attractive than other parts in terms of texture, dissimilarity, and convergence. These parts tend to be immune to rotation, scale, illumination, and image noise, and we refer to them as features, which have the form f_n(u, v).
Figure 2: Block diagram illustrating the operational steps of the monocular vision navigation and ranging at a high level, and its relations with the flight systems. The scheme is directly applicable to other mobile platforms.

We utilize two algorithms for this procedure. For flying platforms, considering the limited computational resources available, we prefer the algorithm proposed by Shi and Tomasi [20], in which sections of I with large eigenvalues are extracted into a set Ψ such that Ψ = {f_1, f_2, ..., f_n}. Although there is virtually no limit for n, it is impossible at this point in the procedure to make an educated distinction between a useless feature for the map (i.e., one that cannot be used for ranging and bearing) and a potential landmark (i.e., one that provides reliable range and bearing information and thus can be included in the map). For ground-based platforms, we prefer the SURF algorithm (Figure 3) due to the directionality its detected features offer [21]. Directional features are particularly useful where the platform dynamics are diverse, such as human-body or MAV applications in gusty environments; directional features are more robust in terms of associating them with architectural lines, where, instead of a single distance threshold, the direction of the feature itself also becomes a metric. They are also useful when ceilings are used, where lines are usually segmented and more difficult to detect. SURF being an expensive algorithm, we consider faster implementations such as ASURF.
In the following steps, we describe how to extract a sparse set of reliable landmarks from a populated set of questionable features.
2.2. Landmark Extraction Step II: Line and Slope Extraction. Conceptually, landmarks exist in the 3D inertial frame and they are distinctive, whereas features in Ψ = {f_1, f_2, ..., f_n} exist on a 2D image plane and contain ambiguity. In other words, our knowledge of their range and bearing information with respect to the camera is uniformly distributed across I_t. Considering the limited mobility of our platform in the particular environment, parallax among the features is very limited. Thus, we attempt to correlate the contents of Ψ with the real world via their relationship with the perspective lines.
On a well-lit, well-contrasting, noncluttered hallway, perspective lines are obvious. Practical hallways have random objects that segment or even falsely mimic these lines. Moreover, on a monocular camera, objects are aliased with distance, making it more difficult to find consistent ends of perspective lines, as they tend to be considerably far from the camera. For these reasons, the construction of those lines should be an adaptive approach.
We begin the adaptive procedure by edge filtering the image I through a discrete differentiation operator with more weight on the horizontal convolution, such as

I'_x = F_h ∗ I,   I'_y = F_v ∗ I,   (1)
where ∗ denotes the convolution operator and F is a 3 × 3 kernel for the horizontal and vertical derivative approximations. I'_x and I'_y are combined with weights whose ratio determines the range of angles through which edges will be filtered. This, in effect, returns a binary image plane I' with potential edges that are more horizontal than vertical. It is possible to reverse this effect to detect other edges of interest, such as ceiling lines or door frames. At this point, edges will disintegrate the more vertical they get (see Figure 3 for an illustration). Application of the Hough transform to I' will return all possible lines, automatically excluding discrete point sets, out of which it is possible to sort out lines with a finite slope φ ≠ 0 and curvature κ = 0. This is a significantly expensive operation (i.e., considering the limited computational resources of an MAV) to perform on a real-time video feed, since the transform has to run over the entire frame, including the redundant parts.
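As a concrete sketch of the weighted edge filter in (1), the following assumes Sobel-style 3 × 3 kernels for F_h and F_v, an illustrative 4:1 weight ratio, and an arbitrary threshold; these particular values are our assumptions, not the paper's calibrated parameters.

```python
import numpy as np

# Sobel-style derivative kernels: F_H responds to horizontal structures,
# F_V (its transpose) to vertical ones.
F_H = np.array([[-1, -2, -1],
                [ 0,  0,  0],
                [ 1,  2,  1]], dtype=float)
F_V = F_H.T

def conv3x3(image, kernel):
    """Valid-mode 3x3 convolution implemented with plain NumPy slicing."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[2 - i, 2 - j] * image[i:h - 2 + i, j:w - 2 + j]
    return out

def edge_filter(image, w_h=4.0, w_v=1.0, thresh=2.0):
    """Binary edge map I' biased toward horizontal edges (ratio w_h : w_v)."""
    ix = conv3x3(image, F_H)                    # horizontal-edge response
    iy = conv3x3(image, F_V)                    # vertical-edge response
    combined = w_h * np.abs(ix) - w_v * np.abs(iy)
    return (combined > thresh).astype(np.uint8)

# A synthetic frame: a bright horizontal stripe on a dark background.
frame = np.zeros((9, 9))
frame[4, :] = 1.0
edges = edge_filter(frame)
print(edges.sum() > 0)  # → True: the horizontal stripe survives the filter
```

Reversing the weight ratio would favor vertical edges instead, as the text notes for ceiling lines or door frames.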
To improve the overall performance in terms of efficiency, we have investigated replacing the Hough transform with an algorithm that only runs on the parts of I' that contain data. This approach begins by dividing I' into square blocks, B_{x,y}. The optimal block size is the smallest block that can still capture the texture elements in I'. Camera resolution and the filtering methods used to obtain I' affect the resulting texture element structure. The blocks are sorted to bring the highest number of data points with the lowest entropy (2) first, as such a block is most likely to contain lines. Blocks that are empty, or have only a few scattered points in them, are excluded from further analysis. Entropy is the characteristic of an image patch that makes it more ambiguous by means of disorder in a closed system. This assumes that disorder is more probable than order, and thereby lower disorder has a higher likelihood of containing an architectural feature, such as a line. Entropy can be expressed as

−Σ_{x,y} B_{x,y} log B_{x,y}.   (2)
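The block-sorting step above can be sketched as follows. Since (2) is ambiguous for strictly binary pixels, this sketch computes the binary entropy of each block's fill ratio and ranks lowest-entropy blocks first, breaking ties by point count; this reading of the joint criterion, the 8 × 8 block size, and the sparsity cutoff are all our assumptions.

```python
import numpy as np

def block_entropy(block):
    """Binary entropy of a block's fill ratio (one plausible reading of (2))."""
    p = float(block.mean())                 # fraction of edge pixels
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def sort_blocks(edge_img, size=8, min_points=4):
    """Split I' into size x size blocks, drop empty/scattered blocks, and
    order the rest lowest-entropy first (ties broken by more data points)."""
    h, w = edge_img.shape
    ranked = []
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            b = edge_img[y:y + size, x:x + size]
            n = int(b.sum())
            if n < min_points:              # empty or scattered: excluded
                continue
            ranked.append(((x, y), n, block_entropy(b)))
    ranked.sort(key=lambda t: (t[2], -t[1]))
    return ranked

img = np.zeros((16, 16), dtype=np.uint8)
img[3, 0:8] = 1                                       # a clean line segment
img[8:16, 8:16] = np.indices((8, 8)).sum(axis=0) % 2  # checkerboard "noise"
ranked = sort_blocks(img)
print(ranked[0][0])  # → (0, 0): the structured (line) block ranks first
```

The structured block wins because its fill ratio is far from 1/2, giving it lower binary entropy than the half-filled noise block.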
The set of candidate blocks resulting at this point are to be searched for lines. Although a block B_n is a binary matrix, it can be thought of as a coordinate system which contains a set of points (i.e., pixels) with (x, y) coordinates, such that positive x is right and positive y is down. Since we are more interested in lines that are more horizontal than vertical, it is safe to assume that the errors in the y values outweigh those in the x values. The equation for a ground line is of the form y = mx + b, and the deviations of the data points in the block from this line are d_i = y_i − (m x_i + b). Therefore, the most likely line is the one that is composed of data points that minimize the deviation, such that d_i² = (y_i − m x_i − b)². Using determinants, the deviation can be obtained as in (3):

d = | Σ(x_i²)   Σx_i |
    | Σx_i      n    |,

m × d = | Σ(x_i · y_i)   Σx_i |
        | Σy_i           n    |,

b × d = | Σ(x_i²)   Σ(x_i · y_i) |
        | Σx_i      Σy_i         |.   (3)
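The Cramer's-rule line fit in (3) can be sketched directly; the pixel coordinates below are illustrative.

```python
import numpy as np

def fit_line(xs, ys):
    """Fit y = m*x + b to block pixels via the determinants of the
    least-squares normal equations, as in (3)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    n = len(xs)
    # denominator determinant: | sum(x^2)  sum(x) ; sum(x)  n |
    d = np.linalg.det(np.array([[np.sum(xs**2), np.sum(xs)],
                                [np.sum(xs),    n]]))
    # m*d = | sum(x*y)  sum(x) ; sum(y)  n |
    m = np.linalg.det(np.array([[np.sum(xs*ys), np.sum(xs)],
                                [np.sum(ys),    n]])) / d
    # b*d = | sum(x^2)  sum(x*y) ; sum(x)  sum(y) |
    b = np.linalg.det(np.array([[np.sum(xs**2), np.sum(xs*ys)],
                                [np.sum(xs),    np.sum(ys)]])) / d
    return m, b

# Pixels of a noiseless ground line y = 0.5 x + 2
xs = np.array([0, 2, 4, 6, 8])
ys = 0.5 * xs + 2
m, b = fit_line(xs, ys)
print(round(m, 6), round(b, 6))  # → 0.5 2.0
```

Because errors are assumed to lie mostly in y, an ordinary (vertical-residual) least-squares fit is the appropriate model here, rather than a total least-squares fit.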
Since our range measurement methods depend on these lines, the overall line-slope accuracy is affected by the reliability in detecting and measuring the hallway lines (or road lines, sidewalk lines, depending on context). High measurement noise in the slopes has adverse effects on SLAM and should be minimized to prevent inflating the uncertainty in L_1 = tan φ_1 and L_2 = tan φ_2, or the infinity point (P_x, P_y). To reduce this noise, lines are cross-validated for the longest collinearity via pixel-neighborhood-based line extraction, in which the results obtained rely only on a local analysis. Their coherence is further improved using a postprocessing step that exploits the texture gradient. With an assumption of the orthogonality of the environment, lines from the ground edges are extracted. Note that this is also applicable to ceiling lines. Although ground lines (and ceiling lines, if applicable) are virtually parallel in the real world, on the image plane they intersect. The horizontal coordinate of this intersection point is later used as a heading guide for the MAV, as illustrated in Figure 5. Features that happen to coincide with these lines are potential landmark candidates. When this step is complete, a set of features cross-validated with the perspective lines, Ψ′, which is a subset of Ψ with the nonuseful features removed, is passed to the third step.
2.3. Landmark Extraction Step III: Range Measurement by the Infinity-Point Method. This step accurately measures the absolute distance to features in Ψ′ by integrating local patches of the ground information into a global surface reference frame. This new method significantly differs from optical flow in that the depth measurement does not require a successive history of images.
Our strategy here assumes that the height of the camera from the ground, H, is known a priori (see Figure 1); the MAV provides real-time altitude information to the camera. We also assume that the camera is initially pointed in the general direction of the far end of the corridor. This latter assumption is not a requirement; if the camera is pointed at a wall, the system will switch to visual steering mode and attempt to recover the camera path without mapping until a hallway structure becomes available.
Figure 3: Initial stages after filtering for line extraction, in which the line segments are being formed. Note that the horizontal lines across the image denote the artificial horizon for the MAV; these are not architectural detections but the on-screen display provided by the MAV. This procedure is robust to transient disturbances such as people walking by or trees occluding the architecture.

The camera is tilted down (or up, depending on preference) with an angle β to facilitate continuous capture of feature movement across the perspective lines. The infinity point (P_x, P_y) is an imaginary concept where the projections of the two parallel perspective lines appear to intersect on the image plane. Since this intersection point is, in theory, infinitely far from the camera, it should present no parallax in response to translations of the camera. It does, however, effectively represent the yaw and the pitch of the camera (note the crosshair in Figure 5). Assume that the end points of the perspective lines are E_H1 = (l, d, −H)ᵀ and E_H2 = (l, d − w, −H)ᵀ, where l is the length and w is the width of the hallway, d is the horizontal displacement of the camera from the left wall, and H is the MAV altitude (see Figure 4 for a visual description). The Euler rotation matrix to convert
from the camera frame to the hallway frame is given in (4):

A = [ cψcβ                cβsψ                −sβ
      cψsφsβ − cφsψ       cφcψ + sφsψsβ       cβsφ
      sφsψ + cφcψsβ       cφsψsβ − cψsφ       cφcβ ],   (4)

where c and s are abbreviations for the cosine and sine functions, respectively. The vehicle yaw angle is denoted by ψ, the pitch by β, and the roll by φ. Since the roll angle is controlled by the onboard autopilot system, it can be set to zero.
The points E_H1 and E_H2 are transformed into the camera frame via multiplication with the transpose of A in (4):

E_C1 = Aᵀ · (l, d, −H)ᵀ,   E_C2 = Aᵀ · (l, d − w, −H)ᵀ.   (5)
This 3D system is then transformed into the 2D image plane via

u = yf / x,   v = zf / x,   (6)
where u is the pixel horizontal position from center (right is positive), v is the pixel vertical position from center (up is positive), and f is the focal length (3.7 mm for the particular camera we have used). The end points of the perspective lines have now transformed from E_H1 and E_H2 to (P_x1, P_y1)ᵀ and (P_x2, P_y2)ᵀ, respectively. An infinitely long hallway can be represented by
lim_{l→∞} P_x1 = lim_{l→∞} P_x2 = f tan ψ,

lim_{l→∞} P_y1 = lim_{l→∞} P_y2 = −(f tan β) / cos ψ,   (7)
which is conceptually the same as extending the perspective lines to infinity. The fact that P_x1 = P_x2 and P_y1 = P_y2 indicates that the intersection of the lines in the image plane is the end of such an infinitely long hallway. Solving the resulting equations for ψ and β yields the camera yaw and pitch, respectively:
ψ = tan⁻¹(P_x / f),   β = −tan⁻¹((P_y cos ψ) / f).   (8)
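Equations (7) and (8) form a round trip that is easy to verify numerically: project a known yaw and pitch to the infinity point via (7), then recover them via (8). The angle values below are illustrative; the units of P_x, P_y must match those of f.

```python
import numpy as np

def yaw_pitch_from_infinity_point(px, py, f=3.7):
    """Recover camera yaw and pitch from the infinity point, as in (8)."""
    psi = np.arctan2(px, f)                   # psi = atan(Px / f)
    beta = -np.arctan2(py * np.cos(psi), f)   # beta = -atan(Py cos(psi) / f)
    return psi, beta

f = 3.7
psi_true, beta_true = 0.15, -0.1              # radians, chosen arbitrarily
px = f * np.tan(psi_true)                     # infinity point from (7)
py = -f * np.tan(beta_true) / np.cos(psi_true)
psi, beta = yaw_pitch_from_infinity_point(px, py, f)
print(round(psi, 6), round(beta, 6))  # → 0.15 -0.1
```

This is what lets the infinity point act as an attitude reference: its horizontal translation encodes yaw, its vertical translation pitch, independent of the vehicle's position in the hallway.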
A generic form of the transformation from the pixel position (u, v) to (x, y, z) can be derived in a similar fashion [3]. The equations for u and v also provide the general coordinates in the camera frame as (z_c f / v, u z_c / v, z_c), where z_c is the z position of the object in the camera frame. Multiplying with (4) transforms the hallway frame coordinates (x, y, z) into functions of u, v, and z_c. Solving the new z equation for z_c and substituting into the equations for x and y yields
x = ((a_12 u + a_13 v + a_11 f) / (a_32 u + a_33 v + a_31 f)) z,

y = ((a_22 u + a_23 v + a_21 f) / (a_32 u + a_33 v + a_31 f)) z,   (9)
where a_ij denotes the elements of the matrix in (4). See Figure 1 for the descriptions of x and y.

Figure 4: A visual description of the environment as perceived by the infinity-point method.

For objects likely to be on the floor, the height of the camera above the ground is the z position of the object. Also, if the platform roll can be measured or assumed negligible, then the combination of the infinity point with the height can be used to obtain the range to any object on the floor of the hallway. This same concept applies to objects which are likely to be on the same wall or the ceiling. By exploiting the geometry of the corners present in the corridor, our
method computes the absolute range and bearing of the features, effectively turning them into landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
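The full pipeline of (4), (8), and (9) can be exercised end to end: place a point on the floor (z = −H), project it through (5) and (6), and recover its hallway-frame position with (9). Roll is assumed zero, and all numeric values are illustrative.

```python
import numpy as np

def euler_matrix(psi, beta, phi=0.0):
    """The Euler rotation matrix A of (4)."""
    c, s = np.cos, np.sin
    return np.array([
        [c(psi)*c(beta), c(beta)*s(psi), -s(beta)],
        [c(psi)*s(phi)*s(beta) - c(phi)*s(psi),
         c(phi)*c(psi) + s(phi)*s(psi)*s(beta), c(beta)*s(phi)],
        [s(phi)*s(psi) + c(phi)*c(psi)*s(beta),
         c(phi)*s(psi)*s(beta) - c(psi)*s(phi), c(phi)*c(beta)],
    ])

def floor_position(u, v, psi, beta, H, f=3.7):
    """Hallway-frame position of a floor pixel via (9), with z = -H."""
    a = euler_matrix(psi, beta)
    denom = a[2, 1] * u + a[2, 2] * v + a[2, 0] * f
    z = -H
    x = (a[0, 1] * u + a[0, 2] * v + a[0, 0] * f) / denom * z
    y = (a[1, 1] * u + a[1, 2] * v + a[1, 0] * f) / denom * z
    return x, y, z

psi_c, beta_c, H, f = 0.05, -0.15, 1.2, 3.7
A = euler_matrix(psi_c, beta_c)
p_h = np.array([6.0, 0.8, -H])     # a floor feature 6 m down the hallway
p_c = A.T @ p_h                    # hallway -> camera frame, as in (5)
u = p_c[1] * f / p_c[0]            # projection to the image plane, (6)
v = p_c[2] * f / p_c[0]
x, y, z = floor_position(u, v, psi_c, beta_c, H, f)
print(round(x, 6), round(y, 6))  # → 6.0 0.8
```

The recovery is algebraically exact for noiseless inputs; in practice the accuracy is bounded by the line extraction noise discussed above.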
The graph in Figure 6 illustrates the disagreement between the line-perspectives and the infinity-point method (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise. Therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake, the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway-line anomaly from the line extraction process, independent of ranging. In this particular case, both hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, which is one of its main advantages. Refer to Figure 7 for a demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of these conditions readily occur on an MAV (and most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is broad and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead end, both ground lines tend to disappear simultaneously. Consequently, the range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section, we propose a turn-sensing algorithm to estimate ψ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV: a yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate ψ accurately enough to update the SLAM map correctly. This procedure combines machine vision with the data matching and dynamic estimation problem. For instance, if the MAV approaches a left turn after exploring one leg of an "L"-shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way, the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity (u, v) of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt) using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of a previous frame as I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt). By applying the Taylor series expansion,

I(x, y, z, t) + (∂I/∂x)δx + (∂I/∂y)δy + (∂I/∂z)δz + (∂I/∂t)δt,   (10)
then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step k.
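A minimal single-level Lucas-Kanade step (the text uses the pyramidal variant) can be sketched as follows: for a patch around a feature, solve the normal equations built from the spatial gradients and the temporal difference for the per-frame pixel velocity (u, v). Patch size, blob shape, and the test scene are our illustrative assumptions.

```python
import numpy as np

def lucas_kanade(frame0, frame1, center, half=4):
    """Estimate the (u, v) pixel velocity of the patch around `center`
    by least-squares on the brightness-constancy constraint."""
    r, c = center
    p0 = frame0[r - half:r + half + 1, c - half:c + half + 1].astype(float)
    p1 = frame1[r - half:r + half + 1, c - half:c + half + 1].astype(float)
    iy, ix = np.gradient(p0)            # spatial gradients of the patch
    it = p1 - p0                        # temporal derivative
    a = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()                     # solve Ix*u + Iy*v = -It
    (u, v), *_ = np.linalg.lstsq(a, b, rcond=None)
    return u, v                         # pixels per time step k

# Synthetic pair: a smooth blob translated by (dx, dy) = (1, 0)
yy, xx = np.mgrid[0:32, 0:32]
blob = lambda cx, cy: np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / 18.0)
f0, f1 = blob(15, 15), blob(16, 15)
u, v = lucas_kanade(f0, f1, center=(15, 15))
print(u, v)  # u close to 1, v close to 0
```

The pyramidal variant the text refers to repeats this estimate from coarse to fine image scales, so that displacements larger than the patch's gradient support can still be recovered.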
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane, and each helix is associated with a velocity vector V_i = (v, φ)ᵀ, where φ is the angular displacement of the velocity direction from the north of the image plane; π/2 is east, π is south, and 3π/2 is west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
(1) Start from level L(0) = 0 and sequence m = 0.
(2) Find d = min(‖h_a − h_b‖) in M, where h_a ≠ h_b.
(3) m = m + 1; Ψ‴(k) = merge([h_a, h_b]); L(m) = d.
(4) Delete from M the rows and columns corresponding to Ψ‴(k).
(5) Add to M a row and a column representing Ψ‴(k).
(6) if (∀h_i ∈ Ψ‴(k)) stop,
(7) else go to (2).

Algorithm 1: Disjoint cluster identification from heat map M.

Figure 5: On-the-fly range measurements. Note the crosshair indicating that the algorithm is currently using the infinity point for heading.
Figure 6: (a) The accuracy of the two range measurement methods with respect to ground truth (flat line); range in meters versus sample number for the infinity-point method. (b) Residuals for the top figure, in meters.
instantaneous velocities is likely to belong to the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple h_i = (V_i, u_i, v_i)ᵀ, where (u_i, v_i) represents the position of this particle on the image plane. At any given time (k), let Ψ be a set containing all these features on the image plane, such that Ψ(k) = {h_1, h_2, ..., h_n}. The z component of velocity as obtained in (10) is the determining factor for φ. Since we are most interested in the set of helices in which this component is minimized, Ψ(k) is resampled such that

Ψ′(k) = {∀h_i : φ ≈ π/2 ∪ φ ≈ 3π/2},   (11)
sorted in increasing velocity order. Ψ′(k) is then processed through histogram sorting to reveal the modal helix set, such that

Ψ″(k) = max { Σ_{i=0}^{n} i,  if (h_i = h_{i+1});  0, otherwise }.   (12)
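The resampling in (11) can be sketched as a direction filter over the helix set; the triple layout (speed, direction, image position), the angular tolerance, and the sample values are simplifying assumptions for illustration.

```python
import numpy as np

def resample_helices(helices, tol=np.pi / 8):
    """Keep helices moving approximately east (pi/2) or west (3*pi/2),
    sorted by increasing speed, as in (11). Each helix is a
    (speed, phi, (u, v)) triple; phi is measured from image north."""
    keep = [h for h in helices
            if min(abs(h[1] - np.pi / 2), abs(h[1] - 3 * np.pi / 2)) < tol]
    return sorted(keep, key=lambda h: h[0])

helices = [
    (2.0, np.pi / 2,     (40, 80)),    # moving east
    (0.5, 3 * np.pi / 2, (200, 90)),   # moving west, slow
    (1.0, np.pi,         (120, 60)),   # moving south -> rejected by (11)
]
kept = resample_helices(helices)
print([h[2] for h in kept])  # → [(200, 90), (40, 80)]
```

Histogram sorting as in (12) would then be applied over this ordered set to isolate the modal (most frequently recurring) velocity bin.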
Ψ10158401015840(119896) is likely to contain clusters that tend to be distributed
with respect to objects in the scene whereas the rest of theinitial helix set fromΨ(119896)may not fit this model An agglom-erative hierarchical tree 119879 is used to identify the clustersTo construct the tree Ψ10158401015840(119896) is heat mapped represented asa symmetric matrix 119872 with respect to Manhattan distancebetween each individual helixes
\[ M = \begin{bmatrix} \|h_0 - h_0\| & \cdots & \|h_0 - h_n\| \\ \vdots & \ddots & \vdots \\ \|h_n - h_0\| & \cdots & \|h_n - h_n\| \end{bmatrix} \tag{13} \]
The algorithm to construct the tree from M is given in Algorithm 1.
The tree should be cut at the sequence m such that m + 1 does not provide significant benefit in terms of modeling
8 Journal of Electrical and Computer Engineering
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in Ψ‴(k) represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that Ψ(k + 1) = Ψ(k) + μ(Ψ‴(k)), as the best-effort estimate, as shown in Figure 8.
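The resampling and heat-mapping steps of (11)–(13) can be sketched compactly; the angular tolerance `tol` and the encoding of a helix as a row (V_i, u_i, v_i) are assumptions of this sketch, not values from the paper:

```python
import numpy as np

def helix_heatmap(H, phi, tol=0.2):
    """Sketch of (11)-(13). H is an (n, 3) array of helices h_i = (V_i, u_i, v_i)
    and phi holds each helix's flow angle. Helices whose angle is near pi/2 or
    3*pi/2 are kept (11), sorted by increasing velocity magnitude, and the
    symmetric Manhattan-distance heat map M (13) is built over the survivors."""
    H = np.asarray(H, float)
    phi = np.asarray(phi, float)
    keep = (np.abs(phi - np.pi / 2) < tol) | (np.abs(phi - 3 * np.pi / 2) < tol)
    Hs = H[keep]
    Hs = Hs[np.argsort(np.abs(Hs[:, 0]))]                # Psi'(k): increasing velocity order
    M = np.abs(Hs[:, None, :] - Hs[None, :, :]).sum(-1)  # pairwise L1 (Manhattan) distances
    return Hs, M
```

The resulting M is what Algorithm 1 consumes to identify the disjoint clusters.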
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, with known world dimensions dim = (x, y)^T, and a cluster Ψ‴(k) sufficiently coincides with it, cluster depth can be measured via dim(f/dim′), where dim is the actual object dimensions, f is the focal length, and dim′ represents the object dimensions on the image plane.
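This depth cue is the standard pinhole relation; as a minimal sketch (the focal length and pixel span below are illustrative values, not from the paper):

```python
def depth_from_known_size(f_px, real_dim_m, image_dim_px):
    """Pinhole relation dim * (f / dim'): an object of known real size
    real_dim_m that spans image_dim_px pixels under a focal length of
    f_px pixels lies at depth f * dim / dim' (in meters)."""
    return real_dim_m * f_px / image_dim_px
```

For example, a 1 m wide door imaged 100 pixels wide by a camera with a 700-pixel focal length lies about 7 m away.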
4 SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm, estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor) performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements, along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide estimation for the angular rate of the MAV during the turn.
the new measurement and old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled at every iteration such that the lower-weight particles are removed and the higher-weight particles are replicated. This results in a cloud of random particles that tracks toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
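The weight-and-resample cycle described above can be sketched as follows; the Gaussian form of the weight and the use of systematic resampling are our assumptions, since the paper does not pin down either choice:

```python
import numpy as np

def particle_weight(innovation, sigma):
    """Gaussian weighting (assumed noise model): measures how well a
    particle's predicted landmark position matches the new measurement.
    innovation = predicted - measured, per measurement dimension."""
    innovation = np.asarray(innovation, float)
    return float(np.exp(-0.5 * (innovation / sigma) ** 2).prod())

def resample(particles, weights, rng):
    """Systematic resampling: low-weight particles are removed and
    high-weight particles are replicated, as described in the text."""
    n = len(particles)
    w = np.asarray(weights, float)
    w = w / w.sum()
    positions = (rng.random() + np.arange(n)) / n       # one stratified draw per slot
    idx = np.searchsorted(np.cumsum(w), positions)
    return [particles[i] for i in idx]
```

A particle whose predicted landmark exactly matches the measurement receives weight 1, and a particle carrying all of the weight is replicated into every slot.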
The positions of landmarks are stored by the particles, such as Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2 × 2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps of (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3 × 3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:

\[ x_v(k+1) = \begin{pmatrix} R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\ \Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\ v_B(k) + V_B \\ \omega(k) + \Omega \end{pmatrix} \tag{14} \]
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can reduce the 6DOF system dynamics to simplified 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by
\[ x_v(k+1) = \begin{pmatrix} \cos\theta_r(k)\,u_1(k) + x_r(k) \\ \sin\theta_r(k)\,u_1(k) + y_r(k) \\ u_2(k) + \theta_r(k) \end{pmatrix} + \gamma(k) \tag{15} \]

where γ(k) is the linearized input signal noise, u_1(k) is the forward speed, and u_2(k) the angular velocity. Let us consider
one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index i) as follows:

\[ y_i = h(x) = \left( \sqrt{x_i^2 + y_i^2},\; \tan^{-1}\!\left[\pm\frac{y_i}{x_i}\right],\; \psi \right)^T \tag{16} \]

where the ψ measurement is provided by the infinity-point method.
This measurement equation can be related with the states of the vehicle and the ith corner (landmark) at each time stamp (k), as shown in (17), where x_v(k) = (x_r(k), y_r(k), θ_r(k))^T is the vehicle state vector of the 2D vehicle kinematic model:

\[ h_i(x(k)) = \begin{pmatrix} \sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\[4pt] \tan^{-1}\!\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\[4pt] \theta_r \end{pmatrix} \tag{17} \]

where x_ci and y_ci denote the position of the ith landmark.
4.1 Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution, which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane and move along the ground lines as the MAV moves. Assume that p^k_{(x,y)}, k = 0, 1, 2, 3, ..., n, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity V_{B,max} is known, a pixel is expected to appear at

\[ p^{k+1}_{(x,y)} = f\!\left( p^{k}_{(x,y)} + (v_B + V_B)\,\Delta t \right) \tag{18} \]
where

\[ \sqrt{\left(p^{k+1}_{(x)} - p^{k}_{(x)}\right)^2 + \left(p^{k+1}_{(y)} - p^{k}_{(y)}\right)^2} \tag{19} \]

cannot be larger than V_{B,max} Δt, while f(·) is a function that
converts a landmark range to a position on the image plane. A landmark appearing at time k + 1 is to be associated with a landmark that has appeared at time k if and only if their pixel locations are within the association threshold; in other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
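The gate of (18)-(19) reduces to a displacement test on the image plane; in this sketch, `px_per_meter` stands in for the paper's range-dependent mapping f(·) and is an assumption, as are the numbers in the usage below:

```python
import numpy as np

def same_landmark(p_k, p_k1, VB_max, dt, px_per_meter):
    """Association gate from (18)-(19): associate the pixel observed at
    time k+1 with the landmark seen at time k only if its displacement
    stays within what the maximum expected vehicle velocity VB_max could
    produce on the image plane over dt seconds. px_per_meter is a
    simplified stand-in for the range-dependent mapping f(.)."""
    disp = np.hypot(p_k1[0] - p_k[0], p_k1[1] - p_k[1])
    return bool(disp <= VB_max * dt * px_per_meter)
```

With VB_max = 1 m/s, dt = 0.1 s, and 100 px per meter at the landmark's range, the gate is 10 pixels: a 2-pixel move associates, a 30-pixel jump registers as a new landmark.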
Although representing the map as a tree-based data structure would, in theory, yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick a region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor D_0 is collected, which helps the MAV in recognizing the starting location if it is revisited.
Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_xy of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

\[ R(x, y) = \frac{\sum_{x', y'} \left( T(x', y') - I(x + x', y + y') \right)^2}{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x + x', y + y')^2}} \tag{20} \]
where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
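Equation (20) is a normalized squared-difference score between a template T and an image patch; for a single candidate offset, it can be sketched as:

```python
import numpy as np

def match_score(T, I_patch):
    """Normalized squared-difference score R of (20) for one offset:
    0 for a perfect match, growing as the template T and the image
    patch I_patch (same shape) diverge."""
    T = np.asarray(T, float)
    I_patch = np.asarray(I_patch, float)
    num = ((T - I_patch) ** 2).sum()                      # sum of squared differences
    den = np.sqrt((T ** 2).sum() * (I_patch ** 2).sum())  # energy normalization
    return num / den
```

Sliding `I_patch` over all offsets (x, y) of the image and taking the minimum score yields the best descriptor match; identical patches score exactly 0.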
5 Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks with the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1 The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm, with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting, and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the z-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera, visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2 Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per one video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods highlighted with a dagger (†) are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms shown here. Picture courtesy of Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes, but it is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6 Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations, performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow the development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable, high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of a problematic light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move for a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize, or eliminate, most if not all reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), Information Infrastructure Institute (I³), the Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardós, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop on Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
2 Journal of Electrical and Computer Engineering
[Figure 1 labels: right hallway line; range A; range B; angles β, ψ, θ; hallway length l; altitude H; and the (x, y, z) axes.]
Figure 1: A three-dimensional representation of the corridor with respect to the MAV. Note that the width of the hallway is not provided to the algorithm, and the MAV does not have any sensors that can detect walls.
as monoSLAM [10, 11], which depend on movement for depth estimation and offer a relative recovered scale, may not provide reliable object avoidance for an agile MAV in an indoor environment. A rotorcraft MAV needs to bank to move the camera sideways, a movement severely limited in a hallway; due to helicopter dynamics, it has to be able to perform depth measurement from a still or nearly still platform.
In SLAM, Extended Kalman Filter (EKF) based approaches with full covariance have a limitation on the size of a manageable map in real time, considering the quadratic complexity of the algorithm versus the computational resources of an MAV. Global localization techniques such as Condensation SLAM [12] require a full map to be provided to the robot a priori. Azimuth-learning-based techniques such as Cognitive SLAM [13] are parametric, and locations are centered on the robot, which naturally becomes incompatible with ambiguous landmarks, such as the landmarks our MAV has to work with. Image-registration-based methods such as [14] propose a different formulation of the vision-based SLAM problem based on motion, structure, and illumination parameters without first having to find feature correspondences. For a real-time implementation, however, a local optimization procedure is required, and there is a possibility of getting trapped in a local minimum. Further, without merging regions with a similar structure, the method becomes computationally intensive for an MAV. Structure-extraction methods [15] have some limitations, since an incorrect incorporation of points into higher-level features will have an adverse effect on consistency. Further, these systems depend on a successful selection of thresholds.
1.2. Comparison with Prior Work and Organization. This paper addresses the above shortcomings using an unmodified consumer-grade monocular web camera. By exploiting the architectural orthogonality of indoor and urban outdoor environments, we introduce a novel method for monocular vision-based SLAM by computing absolute range and bearing information without using active ranging sensors. More thorough algorithm formulations and newer experimental results with a unique indoor-flying helicopter are discussed in this paper than in our prior conference articles [16–19]. Section 2 explains the procedures for perception of world geometry as prerequisites for SLAM, such as range measurement methods, as well as performance evaluations of the proposed methods. A visual turn-sensing algorithm is introduced in Section 3, and SLAM formulations are provided in Section 4. Results of experimental validation, as well as a description of the MAV hardware platform, are presented in Section 5. Figure 2 can be used as a guide to the sections as well as to the process flow of our proposed method.
2. Problem and Algorithm Formulation
We propose a novel method to estimate the absolute depth of features using a monocular camera as a sole means of navigation. The camera is mounted on the platform with a slight downward tilt. Landmarks are assumed to be stationary. Moving targets are also detected; however, they are not considered as landmarks and are therefore ignored by the map. Altitude is measured in real time via the onboard ultrasonic altimeter on our MAV, or, in the case of a ground robot, it can be provided to the system via various methods depending on where the camera is installed. It is acceptable for the camera to translate or tilt with respect to the robot, such as when mounted on a robotic arm, as long as the mount is properly encoded to indicate altitude. We validate our results with a time-varying altitude. The ground is assumed to be relatively flat (no more than 5 degrees of inclination within a 10-meter perimeter). Our algorithm has the capability to adapt to inclines if the camera tilt can be controlled; we have equipped some of our test platforms with this capability.
2.1. Landmark Extraction, Step I: Feature Extraction. A landmark in the SLAM context is a conspicuous, distinguishing landscape feature marking a location. A minimal landmark can consist of two measurements with respect to robot position: range and bearing. Our landmark extraction strategy is a three-step automatic process; all three steps are performed on a frame $I_t$ before moving on to the next frame $I_{t+1}$. The first step involves finding prominent parts of $I_t$ that tend to be more attractive than other parts in terms of texture, dissimilarity, and convergence. These parts tend to be immune to rotation, scale, illumination, and image noise, and we refer to them as features, which have the form $f_n(u, v)$.
We utilize two algorithms for this procedure. For flying
[Figure 2 block labels: 2 MP monocular image; acquisition and edge filtering; line-slope extraction; landmark extraction; range-bearing measurements; SLAM / helix bearing algorithm, selected by whether orthogonality is present; navigation computer with custom Linux kernel, RAM, mass storage, USB 2.0, VGA video, IEEE 802.11, wireless RS232; flight systems including autopilot, inertial measurement unit, compass, GPS, altimeter, airspeed, yaw gyroscope, battery, manual override, flight surfaces, power plant, fuselage; mission control and mission planning over communications.]
Figure 2: Block diagram illustrating the operational steps of the monocular vision navigation and ranging at a high level and its relations with the flight systems. The scheme is directly applicable to other mobile platforms.
platforms, considering the limited computational resources available, we prefer the algorithm proposed by Shi and Tomasi [20], in which sections of $I$ with large eigenvalues are extracted into a set $\Psi$ such that $\Psi = \{f_1, f_2, \ldots, f_n\}$. Although there is virtually no limit for $n$, it is impossible at this point in the procedure to make an educated distinction between a useless feature for the map (i.e., one that cannot be used for ranging and bearing) and a potential landmark (i.e., one that provides reliable range and bearing information and thus can be included in the map). For ground-based platforms we prefer the SURF algorithm (Figure 3) due to the directionality its detected features offer [21]. Directional features are particularly useful where the platform dynamics are diverse, such as human-body or MAV applications in gusty environments; directional features are more robust in terms of associating them with architectural lines, where instead of a single distance threshold the direction of the feature itself also becomes a metric. Directionality is also useful when ceilings are used, where lines are usually segmented and more difficult to detect. This being an expensive algorithm, we consider faster implementations such as ASURF.
In the following steps, we describe how to extract a sparse set of reliable landmarks from a populated set of questionable features.
2.2. Landmark Extraction, Step II: Line and Slope Extraction. Conceptually, landmarks exist in the 3D inertial frame and are distinctive, whereas features in $\Psi = \{f_1, f_2, \ldots, f_n\}$ exist on a 2D image plane and contain ambiguity. In other words, our knowledge of their range and bearing with respect to the camera is uniformly distributed across $I_t$. Considering the limited mobility of our platform in the particular environment, parallax among the features is very limited. Thus, we attempt to correlate the contents of $\Psi$ with the real world via their relationship with the perspective lines.
On a well-lit, well-contrasting, noncluttered hallway, perspective lines are obvious. Practical hallways have random objects that segment or even falsely mimic these lines. Moreover, on a monocular camera, objects are aliased with distance, making it more difficult to find consistent ends of perspective lines, as they tend to be considerably far from the camera. For these reasons, the construction of those lines should be an adaptive approach.
We begin the adaptive procedure by edge filtering the image $I$ through a discrete differentiation operator with more weight on the horizontal convolution, such as

$$I'_x = F_h * I, \qquad I'_y = F_v * I \quad (1)$$
where $*$ denotes the convolution operator and $F$ is a $3 \times 3$ kernel for horizontal and vertical derivative approximations. $I'_x$ and $I'_y$ are combined with weights whose ratio determines the range of angles through which edges will be filtered. This, in effect, returns a binary image plane $I'$ with potential edges that are more horizontal than vertical. It is possible to reverse this effect to detect other edges of interest, such as ceiling lines or door frames. At this point, edges disintegrate the more vertical they get (see Figure 3 for an illustration). Application of the Hough transform to $I'$ will return all possible lines, automatically excluding discrete point sets, out of which it is possible to sort out lines with a finite slope $\phi \neq 0$ and curvature $\kappa = 0$. This is a significantly expensive operation (i.e., considering the limited computational resources of an MAV) to perform on a real-time video feed, since the transform has to run over the entire frame, including the redundant parts.
To improve the overall performance in terms of efficiency, we have investigated replacing the Hough transform with an algorithm that only runs on the parts of $I'$ that contain data. This approach begins by dividing $I'$ into square blocks, $B_{xy}$. The optimal block size is the smallest block that can still capture the texture elements in $I'$; camera resolution and the filtering methods used to obtain $I'$ affect the resulting texture element structure. The blocks are sorted to bring those with the highest number of data points and the lowest entropy (2) first, as such a block is most likely to contain lines. Blocks that are empty, or have only a few scattered points in them, are excluded from further analysis. Entropy is the characteristic of an image patch that makes it more ambiguous by means of disorder in a closed system. This assumes that disorder is more probable than order, and thereby lower disorder has a higher likelihood of containing an architectural feature such as a line. Entropy can be expressed as

$$-\sum_{x,y} B_{xy} \log B_{xy} \quad (2)$$
The set of candidate blocks resulting at this point is to be searched for lines. Although a block $B_n$ is a binary matrix, it can be thought of as a coordinate system containing a set of points (i.e., pixels) with $(x, y)$ coordinates, such that positive $x$ is right and positive $y$ is down. Since we are more interested in lines that are more horizontal than vertical, it is safe to assume that the errors in the $y$ values outweigh those in the $x$ values. The equation for a ground line has the form $y = mx + b$, and the deviation of a data point in the block from this line is $d_i = y_i - (mx_i + b)$. Therefore, the most likely line is the one composed of data points that minimize the squared deviations $d_i^2 = (y_i - mx_i - b)^2$. Using determinants, the solution can be obtained as in (3):

$$d = \begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix}, \qquad
m \times d = \begin{vmatrix} \sum (x_i \cdot y_i) & \sum x_i \\ \sum y_i & n \end{vmatrix}, \qquad
b \times d = \begin{vmatrix} \sum x_i^2 & \sum (x_i \cdot y_i) \\ \sum x_i & \sum y_i \end{vmatrix} \quad (3)$$
Since our range measurement methods depend on these lines, the overall line-slope accuracy is affected by the reliability of detecting and measuring the hallway lines (or road lines, sidewalk lines, depending on context). High measurement noise in the slopes has adverse effects on SLAM and should be minimized to prevent inflating the uncertainty in $L_1 = \tan\phi_1$ and $L_2 = \tan\phi_2$, or the infinity point $(P_x, P_y)$. To reduce this noise, lines are cross-validated for the longest collinearity via pixel-neighborhood-based line extraction, in which the results rely only on a local analysis. Their coherence is further improved in a postprocessing step by exploiting the texture gradient. Under the assumption of an orthogonal environment, lines from the ground edges are extracted; note that this is also applicable to ceiling lines. Although ground lines (and ceiling lines, if applicable) are virtually parallel in the real world, on the image plane they intersect. The horizontal coordinate of this intersection point is later used as a heading guide for the MAV, as illustrated in Figure 5. Features that happen to coincide with these lines are potential landmark candidates. When this step is complete, a set of features cross-validated with the perspective lines, $\Psi'$, which is a subset of $\Psi$ with the nonuseful features removed, is passed to the third step.
2.3. Landmark Extraction, Step III: Range Measurement by the Infinity-Point Method. This step accurately measures the absolute distance to features in $\Psi'$ by integrating local patches of the ground information into a global surface reference frame. This new method significantly differs from optical flow in that the depth measurement does not require a successive history of images.
Our strategy here assumes that the height of the camera from the ground, $H$, is known a priori (see Figure 1); the MAV provides real-time altitude information to the camera. We also assume that the camera is initially pointed in the general direction of the far end of the corridor. This latter assumption is not a requirement: if the camera is pointed at a wall, the system switches to visual steering mode and attempts to recover the camera path without mapping until a hallway structure becomes available.
The camera is tilted down (or up, depending on preference) by an angle $\beta$ to facilitate continuous capture of feature movement across the perspective lines. The infinity point $(P_x, P_y)$ is an imaginary concept where the projections of the two parallel perspective lines appear to intersect on the image plane. Since this intersection point is, in theory, infinitely far from the camera, it presents no parallax in response to translations of the camera. It does, however, effectively represent the yaw and pitch of the camera (note the crosshair in Figure 5). Assume that the end points of the perspective lines are $E_{H1} = (l, d, -H)^T$ and $E_{H2} = (l, d - w, -H)^T$, where $l$ is the length and $w$ is the width of the hallway, $d$ is the horizontal displacement of the camera from the left wall, and $H$ is the MAV altitude (see Figure 4 for a visual description). The Euler rotation matrix to convert
Figure 3: Initial stages after filtering for line extraction, in which the line segments are being formed. Note that the horizontal lines across the image denote the artificial horizon for the MAV; these are not architectural detections but the on-screen display provided by the MAV. This procedure is robust to transient disturbances such as people walking by or trees occluding the architecture.
from the camera frame to the hallway frame is given in (4):

$$A = \begin{bmatrix}
c\psi c\beta & c\beta s\psi & -s\beta \\
c\psi s\phi s\beta - c\phi s\psi & c\phi c\psi + s\phi s\psi s\beta & c\beta s\phi \\
s\phi s\psi + c\phi c\psi s\beta & c\phi s\psi s\beta - c\psi s\phi & c\phi c\beta
\end{bmatrix} \quad (4)$$
where $c$ and $s$ are abbreviations for the cosine and sine functions, respectively. The vehicle yaw angle is denoted by $\psi$, the pitch by $\beta$, and the roll by $\phi$. Since the roll angle is controlled by the onboard autopilot system, it can be set to zero.
The points $E_{H1}$ and $E_{H2}$ are transformed into the camera frame via multiplication with the transpose of $A$ in (4):

$$E_{C1} = A^T \cdot (l, d, -H)^T, \qquad E_{C2} = A^T \cdot (l, d - w, -H)^T \quad (5)$$
This 3D system is then transformed into the 2D image plane via

$$u = \frac{yf}{x}, \qquad v = \frac{zf}{x} \quad (6)$$
where $u$ is the pixel horizontal position from center (right is positive), $v$ is the pixel vertical position from center (up is positive), and $f$ is the focal length (3.7 mm for the particular camera we used). The end points of the perspective lines have now been transformed from $E_{H1}$ and $E_{H2}$ to $(P_{x1}, P_{y1})^T$ and $(P_{x2}, P_{y2})^T$, respectively. An infinitely long hallway can be represented by
$$\lim_{l \to \infty} P_{x1} = \lim_{l \to \infty} P_{x2} = f \tan\psi, \qquad
\lim_{l \to \infty} P_{y1} = \lim_{l \to \infty} P_{y2} = -\frac{f \tan\beta}{\cos\psi} \quad (7)$$
which is conceptually the same as extending the perspective lines to infinity. The fact that $P_{x1} = P_{x2}$ and $P_{y1} = P_{y2}$ indicates that the intersection of the lines in the image plane is the end of such an infinitely long hallway. Solving the resulting equations for $\psi$ and $\beta$ yields the camera yaw and pitch, respectively:

$$\psi = \tan^{-1}\left(\frac{P_x}{f}\right), \qquad \beta = -\tan^{-1}\left(\frac{P_y \cos\psi}{f}\right) \quad (8)$$
A generic form of the transformation from the pixel position $(u, v)$ to $(x, y, z)$ can be derived in a similar fashion [3]. The equations for $u$ and $v$ also provide general coordinates in the camera frame as $(z_c f / v, \ u z_c / v, \ z_c)$, where $z_c$ is the $z$ position of the object in the camera frame. Multiplying by (4) transforms the hallway frame coordinates $(x, y, z)$ into functions of $u$, $v$, and $z_c$. Solving the new $z$ equation for $z_c$ and substituting into the equations for $x$ and $y$ yields

$$x = \frac{a_{12} u + a_{13} v + a_{11} f}{a_{32} u + a_{33} v + a_{31} f} \, z, \qquad
y = \frac{a_{22} u + a_{23} v + a_{21} f}{a_{32} u + a_{33} v + a_{31} f} \, z \quad (9)$$
where $a_{ij}$ denotes the elements of the matrix in (4). See Figure 1 for the descriptions of $x$ and $y$. For objects likely to be on the floor, the height of the camera above the ground is the $z$ position of the object. Also, if the platform roll can be measured or assumed negligible, then the combination of the infinity point with the height can be used to obtain the range to any object on the floor of the hallway. The same concept applies to objects that are likely to be on the same wall or on the ceiling. By exploiting the geometry of the corners present in the corridor, our
Figure 4: A visual description of the environment as perceived by the infinity-point method, showing the camera at the origin $(0, 0, 0)$ with angles $\psi$, $\beta$, $\phi$, the hallway width $w$, length $l$, displacement $d$, altitude $H$, and the end points $E_{H1} = [l, d, -H]$ and $E_{H2} = [l, d - w, -H]$ with their camera-frame counterparts $E_{C1} = A^T \cdot [l, d, -H]$ and $E_{C2} = A^T \cdot [l, d - w, -H]$.
method computes the absolute range and bearing of the features, effectively turning them into the landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
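As a worked sketch of the floor-ranging step, the snippet below builds the rotation matrix (4) and applies (9) with $z = -H$ for a pixel on the ground plane. The sign conventions follow (8), and the numeric values are illustrative, not the paper's calibration:

```python
import math

def rotation(psi, beta, phi=0.0):
    """Euler rotation matrix A of (4); roll phi is held at zero by the autopilot."""
    c, s = math.cos, math.sin
    return [
        [c(psi) * c(beta), c(beta) * s(psi), -s(beta)],
        [c(psi) * s(phi) * s(beta) - c(phi) * s(psi),
         c(phi) * c(psi) + s(phi) * s(psi) * s(beta), c(beta) * s(phi)],
        [s(phi) * s(psi) + c(phi) * c(psi) * s(beta),
         c(phi) * s(psi) * s(beta) - c(psi) * s(phi), c(phi) * c(beta)],
    ]

def floor_range(u, v, f, H, psi, beta):
    """Hallway-frame (x, y) of a floor pixel via (9), taking z = -H for a
    point on the ground plane below a camera at altitude H."""
    a = rotation(psi, beta)
    z = -H
    den = a[2][1] * u + a[2][2] * v + a[2][0] * f
    x = (a[0][1] * u + a[0][2] * v + a[0][0] * f) / den * z
    y = (a[1][1] * u + a[1][2] * v + a[1][0] * f) / den * z
    return x, y

# Camera 1 m above the floor, pitched 0.3 rad toward it, no yaw: the point on
# the optical axis (u = v = 0) lies 1/tan(0.3) m ahead, centered laterally.
x, y = floor_range(0.0, 0.0, 500.0, 1.0, 0.0, -0.3)
```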
The graph in Figure 6 illustrates the disagreement between the line-perspective and the infinity-point methods (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise; therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake, the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway-line anomaly from the line extraction process, independent of ranging; in this particular case, both hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, which is one of its main advantages. Refer to Figure 7 for a demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of these conditions readily occur on an MAV (and on most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is broad and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead end, both ground lines tend to disappear simultaneously, and consequently the range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section, we propose a turn-sensing algorithm to estimate $\psi$ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV, in which a yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate $\psi$ accurately enough to update the SLAM map correctly. This procedure combines machine vision with the data matching and dynamic estimation problem. For instance, if the MAV approaches a left turn after exploring one leg of an "L"-shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way, the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity $(u, v)$ of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover $V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt)$ using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of the previous frame as $I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt)$. Applying the Taylor series expansion,

$$I(x, y, z, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \frac{\partial I}{\partial t}\delta t \quad (10)$$
and then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step $k$.
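A minimal, single-window (nonpyramidal) variant of the Lucas-Kanade estimate used here can be written as a least-squares solve of the brightness-constancy constraint over a pixel window; the synthetic quadratic image pair below is our own test input, not data from the paper:

```python
def lucas_kanade_window(I1, I2, pts):
    """Single-window Lucas-Kanade: least-squares solve of the brightness
    constancy constraint Ix*u + Iy*v = -It over a set of pixels, yielding
    one (u, v) image-plane velocity for the whole window."""
    gxx = gxy = gyy = bx = by = 0.0
    for (x, y) in pts:
        ix = (I1[(x + 1, y)] - I1[(x - 1, y)]) / 2.0   # central differences
        iy = (I1[(x, y + 1)] - I1[(x, y - 1)]) / 2.0
        it = I2[(x, y)] - I1[(x, y)]                   # temporal difference
        gxx += ix * ix; gxy += ix * iy; gyy += iy * iy
        bx -= ix * it;  by -= iy * it
    det = gxx * gyy - gxy * gxy            # structure tensor must be invertible
    return ((gyy * bx - gxy * by) / det, (gxx * by - gxy * bx) / det)

# Synthetic frame pair: a smooth intensity surface translated by (0.3, -0.2).
f = lambda x, y: x * x + 3.0 * y * y
I1 = {(x, y): f(x, y) for x in range(-4, 5) for y in range(-4, 5)}
I2 = {(x, y): f(x - 0.3, y + 0.2) for x in range(-4, 5) for y in range(-4, 5)}
u, v = lucas_kanade_window(I1, I2, [(x, y) for x in range(-3, 4) for y in range(-3, 4)])
```

The pyramidal version the paper refers to repeats this solve from coarse to fine image scales so that larger displacements stay within the linearization's validity.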
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane, and each helix is associated with a velocity vector $V_i = (v, \varphi)^T$, where $\varphi$ is the angular displacement of the velocity direction from the north of the image plane, with $\pi/2$ being east, $\pi$ south, and $3\pi/2$ west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
(1) Start from level $L(0) = 0$ and sequence $m = 0$.
(2) Find $d = \min(h_a - h_b)$ in $M$, where $h_a \neq h_b$.
(3) $m = m + 1$; $\Psi'''(k) = \mathrm{merge}([h_a, h_b])$; $L(m) = d$.
(4) Delete from $M$ the rows and columns corresponding to $\Psi'''(k)$.
(5) Add to $M$ a row and a column representing $\Psi'''(k)$.
(6) If $(\forall h_i \in \Psi'''(k))$, stop.
(7) Else, go to (2).
Algorithm 1: Disjoint cluster identification from heat map $M$.
Figure 5: On-the-fly range measurements. Note the crosshair, indicating that the algorithm is currently using the infinity point for heading.
[Figure 6 axes: range (m) versus sample number (0–140) in (a); difference (m) versus sample number in (b).]
Figure 6: (a) Illustrates the accuracy of the two range measurement methods with respect to the ground truth (flat line). (b) Residuals for the top figure.
instantaneous velocities is likely to belong to the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple $h_i = (V_i, u_i, v_i)^T$, where $(u_i, v_i)$ represents the position of this particle on the image plane. At any given time $k$, let $\Psi$ be a set containing all these features on the image plane, such that $\Psi(k) = \{h_1, h_2, \ldots, h_n\}$. The $z$ component of velocity, as obtained in (10), is the determining factor for $\varphi$. Since we are most interested in the set of helices in which this component is minimized, $\Psi(k)$ is resampled such that

$$\Psi'(k) = \left\{\forall h_i : \varphi \approx \frac{\pi}{2} \cup \varphi \approx \frac{3\pi}{2}\right\} \quad (11)$$
sorted in increasing velocity order. $\Psi'(k)$ is then processed through histogram sorting to reveal the modal helix set, such that

$$\Psi''(k) = \max \begin{cases} \sum_{i=0}^{n} i & \text{if } (h_i = h_{i+1}) \\ 0 & \text{else} \end{cases} \quad (12)$$
$\Psi''(k)$ is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from $\Psi(k)$ may not fit this model. An agglomerative hierarchical tree $T$ is used to identify the clusters. To construct the tree, $\Psi''(k)$ is heat mapped, represented as a symmetric matrix $M$, with respect to the Manhattan distance between each pair of individual helices:

$$M = \begin{bmatrix} \lVert h_0 - h_0 \rVert & \cdots & \lVert h_0 - h_n \rVert \\ \vdots & \ddots & \vdots \\ \lVert h_n - h_0 \rVert & \cdots & \lVert h_n - h_n \rVert \end{bmatrix} \quad (13)$$

The algorithm to construct the tree from $M$ is given in Algorithm 1.
The tree should be cut at the sequence $m$ such that $m + 1$ does not provide a significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, wherever orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in $\Psi'''(k)$ represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that $\Psi(k + 1) = \Psi(k) + \mu(\Psi'''(k))$ as the best-effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, and that world object of known dimensions $\dim = (x, y)^T$ and a cluster $\Psi'''(k)$ sufficiently coincide, cluster depth can be measured via $\dim(f/\dim')$, where $\dim$ is the actual object dimensions, $f$ is the focal length, and $\dim'$ represents the object dimensions on the image plane.
4. SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
[Figure 8 axes: estimated turn angle in degrees (80–100) versus sample number (0–200).]
Figure 8: This graph illustrates the accuracyy of the helix bearing algorithm estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor) performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide estimation for the angular rate of the MAV during the turn.
the new measurement and old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration such that the lower-weight particles are removed and higher-weight particles are replicated. This results in a cloud of random particles that tracks toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
The positions of landmarks are stored by the particles, such as Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2 × 2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps of (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3 × 3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:

    x_v(k + 1) = ( R(k) + L_EB(φ, θ, ψ)(v_B + V_B)Δt
                   Γ(k) + T(φ, θ, ψ)(ω + Ω)Δt
                   v_B(k) + V_B
                   ω(k) + Ω )                                (14)
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by

    x_v(k + 1) = ( cos θ_r(k) u_1(k) + x_r(k)
                   sin θ_r(k) u_1(k) + y_r(k)
                   u_2(k) + θ_r(k) ) + γ(k)                  (15)
where γ(k) is the linearized input signal noise, u_1(k) is the forward speed, and u_2(k) is the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index i) as follows:

    y_i = h(x) = ( √(x_i² + y_i²), tan⁻¹[± y_i / x_i], ψ )^T        (16)

where the ψ measurement is provided by the infinity-point method.
This measurement equation can be related with the states of the vehicle and the i-th corner (landmark) at each time stamp (k), as given in (17), where x_v(k) = (x_r(k), y_r(k), θ_r(k))^T is the vehicle state vector of the 2D vehicle kinematic model:

    h_i(x(k)) = ( √((x_r(k) − x_ci(k))² + (y_r(k) − y_ci(k))²)
                  tan⁻¹((y_r(k) − y_ci(k)) / (x_r(k) − x_ci(k))) − θ_r(k)
                  θ_r )                                       (17)

where x_ci and y_ci denote the position of the i-th landmark.
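A minimal sketch of the expected-measurement computation in (17) follows, using the conventional landmark-minus-vehicle ordering inside the arctangent (the function name is illustrative):

```python
import numpy as np

# Expected range and relative bearing to the i-th landmark (x_ci, y_ci)
# as seen from the 2D vehicle pose (x_r, y_r, theta_r).

def expected_measurement(pose, landmark):
    x_r, y_r, th_r = pose
    x_c, y_c = landmark
    dx, dy = x_c - x_r, y_c - y_r
    rng = np.hypot(dx, dy)                  # range term of (17)
    bearing = np.arctan2(dy, dx) - th_r     # bearing relative to heading
    return np.array([rng, bearing, th_r])

z = expected_measurement((0.0, 0.0, 0.0), (3.0, 4.0))
print(z[0])  # 5.0 : a 3-4-5 triangle
```

The innovation used for particle weighting is then the difference between this prediction and the actual range-bearing measurement from the camera.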
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (see Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution, which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane and move along the ground lines as the MAV moves. Assume that p^k_(x,y), k = 0, 1, 2, 3, ..., n, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity V_Bmax is known, a pixel is expected to appear at

    p^(k+1)_(x,y) = f(p^k_(x,y) + (v_B + V_B)Δt)              (18)
where

    √((p^(k+1)_x − p^k_x)² + (p^(k+1)_y − p^k_y)²)            (19)

cannot be larger than V_Bmax Δt, while f(·) is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time k + 1 is to be associated with a landmark that has appeared at time k if and only if their pixel locations are within the association threshold; in other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration: the farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
Although representing the map as a tree-based data structure would, in theory, yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick the region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor D_0 is collected; this helps the MAV recognize the starting location if it is revisited.

Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_xy of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check whether it happened before. For instance, if it indeed has happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

    R(x, y) = Σ_{x′,y′} (T(x′, y′) − I(x + x′, y + y′))²
              / √( Σ_{x′,y′} T(x′, y′)² · Σ_{x′,y′} I(x + x′, y + y′)² )        (20)

where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
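The normalized squared-difference score of (20) can be sketched for a single alignment as follows (a full search would evaluate this over all offsets (x, y)):

```python
import numpy as np

# Normalized squared difference between template T and an image patch I:
# 0 is a perfect match; values grow toward 1 for poor matches.

def match_score(T: np.ndarray, I: np.ndarray) -> float:
    T, I = T.astype(float), I.astype(float)
    diff = np.sum((T - I) ** 2)
    norm = np.sqrt(np.sum(T ** 2) * np.sum(I ** 2))
    return diff / norm if norm > 0 else 0.0

patch = np.arange(16.0).reshape(4, 4) + 1.0
print(match_score(patch, patch))          # 0.0 for identical regions
print(match_score(patch, patch * 2) > 0)  # True : mismatch scores higher
```

Thresholding this score against the stored descriptor decides whether the current turn corresponds to a previously visited location.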
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance. (Axes in meters.)
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot, with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting, and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the z-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters. (Axes: hallway length and width, and altitude, all in meters.)
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods marked with a dagger (†) are mutually exclusive; for example, the Helix bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80-90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

    Image acquisition and edge filtering:  10%
    Line and slope extraction:             2%
    Landmark extraction:                   20%†
    Helix bearing:                         20%†
    Ranging algorithms:                    below 1%
    Rao-Blackwellized particle filter:     50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations, while successfully performing the transition between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problem light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize, or even eliminate, most if not all reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green-light region, and traditional digital imaging sensors have twice as many green receptors as red and blue; the Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I³), the Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106-154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29-31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339-2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846-849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593-600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197-2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311-326, June 1998.
[10] A. Davison, M. Nicholas, and S. Olivier, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052-1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588-594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969-979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980-990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343-348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593-600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593-598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51-64, 2008.
Figure 2: Block diagram illustrating the operational steps of the monocular vision navigation and ranging at a high level, and its relations with the flight systems (blocks include the navigation computer, autopilot, flight surfaces, sensors, image acquisition and edge filtering, line-slope extraction, landmark extraction, the Helix bearing algorithm, SLAM, and range-bearing measurements). The scheme is directly applicable to other mobile platforms.
platforms considering the limited computational resourcesavailable we prefer the the algorithm proposed by Shi andTomasi [20] in which sections of 119868 with large eigenvaluesare extracted into a set Ψ such that Ψ = 119891
1 1198912 119891
119899
Although there is virtually no limit for 119899 it is impossible atthis time in the procedure to make an educated distinctionbetween a useless feature for the map (ie one that cannotbe used for ranging and bearing) and a potential landmark(ie one that provides reliable range and bearing informationand thus can be included in the map) For ground basedplatforms we prefer the SURF algorithm (Figure 3) due tothe directionality its detected features offer [21] Directionalfeatures are particularly useful where the platform dynamicsare diverse such as human body or MAV applications ingusty environments directional features are more robust interms of associating them with architectural lines whereinstead of a single distance threshold the direction of featureitself also becomes a metric It is also useful when ceilings areused where lines are usually segmented and more difficult todetect This being an expensive algorithm we consider fasterimplementations such as ASURF
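The Shi-Tomasi criterion named above scores a pixel by the smaller eigenvalue of its windowed structure tensor. The sketch below is our own NumPy-only illustration of that criterion, not the authors' implementation; the window size, gradient operator, and feature count are arbitrary choices:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def shi_tomasi_scores(img, win=5):
    """Per-pixel Shi-Tomasi score: the smaller eigenvalue of the 2x2
    structure tensor, with gradient products summed over a win x win
    box window. Output is cropped by win-1 on each axis."""
    f = img.astype(float)
    Ix = np.zeros_like(f)
    Iy = np.zeros_like(f)
    Ix[:, 1:-1] = (f[:, 2:] - f[:, :-2]) / 2.0   # horizontal central difference
    Iy[1:-1, :] = (f[2:, :] - f[:-2, :]) / 2.0   # vertical central difference

    def box(a):  # windowed sum over all win x win patches
        return sliding_window_view(a, (win, win)).sum(axis=(2, 3))

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    trace = Sxx + Syy
    disc = np.sqrt((Sxx - Syy) ** 2 + 4.0 * Sxy ** 2)
    return (trace - disc) / 2.0                  # minimum eigenvalue

def top_features(img, n=10, win=5):
    """The n highest-scoring (row, col) positions: the set Psi."""
    s = shi_tomasi_scores(img, win)
    idx = np.argsort(s.ravel())[::-1][:n]
    off = win // 2
    return [(int(i) // s.shape[1] + off, int(i) % s.shape[1] + off) for i in idx]
```

Flat regions score zero and straight edges score near zero (one large, one tiny eigenvalue); only corner-like patches survive, which is exactly why the set Ψ is dominated by trackable features.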
In the following steps, we describe how to extract a sparse set of reliable landmarks from a populated set of questionable features.
2.2. Landmark Extraction Step II: Line and Slope Extraction. Conceptually, landmarks exist in the 3D inertial frame and they are distinctive, whereas features in Ψ = {f_1, f_2, ..., f_n} exist on a 2D image plane and they contain ambiguity. In other words, our knowledge of their range and bearing information with respect to the camera is uniformly distributed across I_t. Considering the limited mobility of our platform in the particular environment, parallax among the features is very limited. Thus we attempt to correlate the contents of Ψ with the real world via their relationship with the perspective lines.
On a well-lit, well-contrasting, noncluttered hallway, perspective lines are obvious. Practical hallways have random objects that segment or even falsely mimic these lines. Moreover, on a monocular camera, objects are aliased with distance, making it more difficult to find consistent ends of perspective lines as they tend to be considerably far from the camera. For these reasons, the construction of those lines should be an adaptive approach.
We begin the adaptive procedure by edge filtering the image I through a discrete differentiation operator with more weight on the horizontal convolution, such as

$$I'_x = F_h * I, \qquad I'_y = F_v * I \quad (1)$$
where * denotes the convolution operator and F is a 3 × 3 kernel for horizontal and vertical derivative approximations. I'_x and I'_y are combined with weights whose ratio determines the range of angles through which edges will be filtered. This, in effect, returns a binary image plane I' with potential edges that are more horizontal than vertical. It is possible to reverse this effect to detect other edges of interest, such as ceiling lines or door frames. At this point, edges will disintegrate the more vertical they get (see Figure 3 for an illustration). Application of the Hough transform to I' will return all possible lines, automatically excluding discrete point sets, out of which it is possible to sort out lines with a finite slope φ ≠ 0 and curvature κ = 0. This is a significantly expensive operation (i.e., considering the limited computational resources of an MAV) to perform on a real-time video feed, since the transform has to run over the entire frame, including the redundant parts.
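A minimal sketch of the weighted directional edge filtering of (1), under our own assumed kernels, weights, and threshold (the paper does not give numeric values); the ratio w_h/w_v widens or narrows the band of near-horizontal edge angles that survive:

```python
import numpy as np

def directional_edges(img, w_h=1.0, w_v=0.25, thresh=0.5):
    """Binary edge map I' favouring near-horizontal edges. Responses of
    the horizontal and vertical derivative kernels are combined with
    weights whose ratio (w_h / w_v) sets the band of surviving edge
    angles. Kernels, weights, and threshold are illustrative."""
    f = img.astype(float)
    Fh = np.array([[-1, -2, -1],
                   [ 0,  0,  0],
                   [ 1,  2,  1]], dtype=float)  # vertical derivative -> horizontal edges
    Fv = Fh.T                                   # horizontal derivative -> vertical edges
    H, W = f.shape
    resp_h = np.zeros_like(f)
    resp_v = np.zeros_like(f)
    for r in range(1, H - 1):                   # plain convolution loop for clarity
        for c in range(1, W - 1):
            patch = f[r - 1:r + 2, c - 1:c + 2]
            resp_h[r, c] = np.sum(Fh * patch)
            resp_v[r, c] = np.sum(Fv * patch)
    # horizontal edges score positively; vertical edges are penalised
    return (w_h * np.abs(resp_h) - w_v * np.abs(resp_v)) > thresh
```

Swapping the weights reverses the effect, which is the ceiling-line and door-frame variant mentioned in the text.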
To improve the overall performance in terms of efficiency, we have investigated replacing the Hough transform with an algorithm that only runs on parts of I' that contain data. This approach begins by dividing I' into square blocks B_xy. The optimal block size is the smallest block that can still capture the texture elements in I'. Camera resolution and the filtering methods used to obtain I' affect the resulting texture element structure. The blocks are sorted to bring those with the highest number of data points and the lowest entropy (2) first, as these are the blocks most likely to contain lines. Blocks that are empty or have a few scattered points in them are excluded from further analysis. Entropy is the characteristic of an image patch that makes it more ambiguous by means of disorder in a closed system. This assumes that disorder is more probable than order, and thereby lower disorder has a higher likelihood of containing an architectural feature such as a line. Entropy can be expressed as
$$-\sum_{x,y} B_{xy} \log B_{xy} \quad (2)$$
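One plausible reading of the block-selection step is sketched below; the paper does not spell out how (2) is normalised over a binary block, so the occupancy fraction is treated as a two-outcome distribution here, and the block size and sparsity threshold are illustrative:

```python
import numpy as np

def block_entropy(block):
    """-sum B log B over a binary block, reading the 'on' fraction as a
    two-outcome distribution (one interpretation of (2))."""
    p = float(block.mean())
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def candidate_blocks(edges, bs=8, min_pts=4):
    """Split the binary edge image I' into bs x bs blocks, drop empty or
    sparse blocks, and sort the rest: most data points first, ties
    broken by lowest entropy."""
    H, W = edges.shape
    out = []
    for r in range(0, H - bs + 1, bs):
        for c in range(0, W - bs + 1, bs):
            b = edges[r:r + bs, c:c + bs]
            n = int(b.sum())
            if n >= min_pts:                 # exclude empty/scattered blocks
                out.append(((r, c), n, block_entropy(b)))
    out.sort(key=lambda t: (-t[1], t[2]))
    return out
```

A block holding a clean line segment has many points arranged with low disorder, so it sorts ahead of a block with the same number of scattered points.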
The set of candidate blocks resulting at this point are to be searched for lines. Although a block B_n is a binary matrix, it can be thought of as a coordinate system which contains a set of points (i.e., pixels) with (x, y) coordinates, such that positive x is right and positive y is down. Since we are more interested in lines that are more horizontal than vertical, it is safe to assume that the errors in the y values outweigh those in the x values. The equation for a ground line is of the form y = mx + b, and the deviations of the data points in the block from this line are d_i = y_i − (m x_i + b). Therefore, the most likely line is the one composed of data points that minimize the deviation, such that d_i^2 = (y_i − m x_i − b)^2. Using determinants, the deviation can be obtained as in (3):
$$d_i = \begin{vmatrix} \sum (x_i^2) & \sum x_i \\ \sum x_i & n \end{vmatrix}, \qquad
m \times d_i = \begin{vmatrix} \sum (x_i \cdot y_i) & \sum x_i \\ \sum y_i & n \end{vmatrix}, \qquad
b \times d_i = \begin{vmatrix} \sum (x_i^2) & \sum (x_i \cdot y_i) \\ \sum x_i & \sum y_i \end{vmatrix} \quad (3)$$
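Equation (3) is Cramer's rule applied to the least-squares normal equations for y = mx + b; a direct transcription (function name ours):

```python
import numpy as np

def fit_line(xs, ys):
    """Least-squares line y = m x + b via the determinants of (3):
    Cramer's rule on the normal equations, n being the point count."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    n = len(xs)
    Sx, Sy = xs.sum(), ys.sum()
    Sxx, Sxy = (xs * xs).sum(), (xs * ys).sum()
    d = Sxx * n - Sx * Sx              # | Sxx Sx ; Sx n |
    m = (Sxy * n - Sx * Sy) / d        # | Sxy Sx ; Sy n | / d
    b = (Sxx * Sy - Sxy * Sx) / d      # | Sxx Sxy ; Sx Sy | / d
    return m, b
```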
Since our range measurement methods depend on these lines, the overall line-slope accuracy is affected by the reliability in detecting and measuring the hallway lines (or road lines, sidewalk lines, depending on context). High measurement noise in the slopes has adverse effects on SLAM and should be minimized to prevent inflating the uncertainty in L_1 = tan φ_1 and L_2 = tan φ_2, or the infinity point (P_x, P_y). To reduce this noise, lines are cross-validated for the longest collinearity via pixel-neighborhood based line extraction, in which the results obtained rely only on a local analysis. Their coherence is further improved using a postprocessing step that exploits the texture gradient. With an assumption of the orthogonality of the environment, lines from the ground edges are extracted. Note that this is also applicable to ceiling lines. Although ground lines (and ceiling lines, if applicable) are virtually parallel in the real world, on the image plane they intersect. The horizontal coordinate of this intersection point is later used as a heading guide for the MAV, as illustrated in Figure 5. Features that happen to coincide with these lines are potential landmark candidates. When this step is complete, a set of features cross-validated with the perspective lines, Ψ′, which is a subset of Ψ with the nonuseful features removed, is passed to the third step.
2.3. Landmark Extraction Step III: Range Measurement by the Infinity-Point Method. This step accurately measures the absolute distance to features in Ψ′ by integrating local patches of the ground information into a global surface reference frame. This new method significantly differs from optical flows in that the depth measurement does not require a successive history of images.
Our strategy here assumes that the height of the camera from the ground, H, is known a priori (see Figure 1); the MAV provides real-time altitude information to the camera. We also assume that the camera is initially pointed in the general direction of the far end of the corridor. This latter assumption is not a requirement: if the camera is pointed at a wall, the system will switch to visual steering mode and attempt to recover the camera path without mapping until hallway structure becomes available.
The camera is tilted down (or up, depending on preference) with an angle β to facilitate continuous capture of feature movement across perspective lines. The infinity point (P_x, P_y) is an imaginary concept where the projections of the two parallel perspective lines appear to intersect on the image plane. Since this intersection point is, in theory, infinitely far from the camera, it should present no parallax in response to the translations of the camera. It does, however, effectively represent the yaw and the pitch of the camera (note the crosshair in Figure 5). Assume that the end points of the perspective lines are E_H1 = (l, d, −H)^T and E_H2 = (l, d − w, −H)^T, where l is the length and w is the width of the hallway, d is the horizontal displacement of the camera from the left wall, and H is the MAV altitude (see Figure 4 for a visual description). The Euler rotation matrix to convert
Figure 3: Initial stages after filtering for line extraction, in which the line segments are being formed. Note that the horizontal lines across the image denote the artificial horizon for the MAV; these are not architectural detections but the on-screen display provided by the MAV. This procedure is robust to transient disturbances, such as people walking by or trees occluding the architecture.
from the camera frame to the hallway frame is given in (4):

$$A = \begin{bmatrix}
c\psi\, c\beta & c\beta\, s\psi & -s\beta \\
c\psi\, s\phi\, s\beta - c\phi\, s\psi & c\phi\, c\psi + s\phi\, s\psi\, s\beta & c\beta\, s\phi \\
s\phi\, s\psi + c\phi\, c\psi\, s\beta & c\phi\, s\psi\, s\beta - c\psi\, s\phi & c\phi\, c\beta
\end{bmatrix} \quad (4)$$

where c and s are abbreviations for the cos and sin functions, respectively. The vehicle yaw angle is denoted by ψ, the pitch by β, and the roll by φ. Since the roll angle is controlled by the onboard autopilot system, it can be set to zero.
The points E_H1 and E_H2 are transformed into the camera frame via multiplication with the transpose of A in (4):

$$E_{C1} = A^T \cdot (l, d, -H)^T, \qquad E_{C2} = A^T \cdot (l, d - w, -H)^T \quad (5)$$
This 3D system is then transformed into the 2D image plane via

$$u = \frac{yf}{x}, \qquad v = \frac{zf}{x} \quad (6)$$
where u is the pixel horizontal position from the center (right is positive), v is the pixel vertical position from the center (up is positive), and f is the focal length (3.7 mm for the particular camera we have used). The end points of the perspective lines have now been transformed from E_H1 and E_H2 to (P_{x1}, P_{y1})^T and (P_{x2}, P_{y2})^T, respectively. An infinitely long hallway can be represented by
$$\lim_{l \to \infty} P_{x1} = \lim_{l \to \infty} P_{x2} = f \tan\psi, \qquad
\lim_{l \to \infty} P_{y1} = \lim_{l \to \infty} P_{y2} = -\frac{f \tan\beta}{\cos\psi} \quad (7)$$
which is conceptually the same as extending the perspective lines to infinity. The fact that P_{x1} = P_{x2} and P_{y1} = P_{y2} indicates that the intersection of the lines in the image plane is the end of such an infinitely long hallway. Solving the resulting equations for ψ and β yields the camera yaw and pitch, respectively:

$$\psi = \tan^{-1}\left(\frac{P_x}{f}\right), \qquad \beta = -\tan^{-1}\left(\frac{P_y \cos\psi}{f}\right) \quad (8)$$
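Equations (7) and (8) invert cleanly: given the infinity point, yaw and pitch follow in two lines. A sketch (function name ours):

```python
import math

def yaw_pitch_from_infinity_point(Px, Py, f):
    """Camera yaw and pitch from the perspective-line intersection, per (8).
    Px, Py are infinity-point coordinates measured from the image centre,
    f the focal length, all in the same units (e.g. pixels)."""
    psi = math.atan2(Px, f)                      # yaw
    beta = -math.atan2(Py * math.cos(psi), f)    # pitch
    return psi, beta
```

Round-tripping through (7) and back through (8) recovers the original angles, which is a convenient sanity check on any implementation.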
A generic form of the transformation from the pixel position (u, v) to (x, y, z) can be derived in a similar fashion [3]. The equations for u and v also provide general coordinates in the camera frame as (z_c f / v, u z_c / v, z_c), where z_c is the z position of the object in the camera frame. Multiplying with (4) transforms the hallway frame coordinates (x, y, z) into functions of u, v, and z_c. Solving the new z equation for z_c and substituting into the equations for x and y yields
$$x = \frac{a_{12} u + a_{13} v + a_{11} f}{a_{32} u + a_{33} v + a_{31} f}\, z, \qquad
y = \frac{a_{22} u + a_{23} v + a_{21} f}{a_{32} u + a_{33} v + a_{31} f}\, z \quad (9)$$

where a_ij denotes the elements of the matrix in (4). See Figure 1 for the descriptions of x and y.

For objects likely to be on the floor, the height of the camera above the ground is the z position of the object. Also, if the platform roll can be measured or assumed negligible, then the combination of the infinity point with the height can be used to obtain the range to any object on the floor of the hallway. The same concept applies to objects which are likely to be on the same wall or on the ceiling. By exploiting the geometry of the corners present in the corridor, our
Figure 4: A visual description of the environment as perceived by the infinity-point method: hallway width w and length l, camera displacement d from the left wall, altitude H, pitch β, and yaw ψ, with end points E_H1 = [l, d, −H] and E_H2 = [l, d − w, −H] mapping to E_C1 = A^T · [l, d, −H] and E_C2 = A^T · [l, d − w, −H] in the camera frame.
method computes the absolute range and bearing of the features, effectively turning them into landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
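Combining (4), (8), and (9) with the floor constraint z = −H gives a range to any floor pixel. The sketch below is our own illustration of that chain, with roll held at zero as the text assumes (plain-Python matrices and names for clarity):

```python
import math

def euler_A(psi, beta, phi=0.0):
    """Rotation matrix of (4): yaw psi, pitch beta, roll phi (held at
    zero by the autopilot, as noted in the text)."""
    c, s = math.cos, math.sin
    return [
        [c(psi) * c(beta), c(beta) * s(psi), -s(beta)],
        [c(psi) * s(phi) * s(beta) - c(phi) * s(psi),
         c(phi) * c(psi) + s(phi) * s(psi) * s(beta),
         c(beta) * s(phi)],
        [s(phi) * s(psi) + c(phi) * c(psi) * s(beta),
         c(phi) * s(psi) * s(beta) - c(psi) * s(phi),
         c(phi) * c(beta)],
    ]

def floor_point(u, v, f, H, psi, beta):
    """Hallway-frame (x, y) of a pixel assumed to lie on the floor,
    via (9) with the object's z position fixed at -H."""
    a = euler_A(psi, beta)
    den = a[2][1] * u + a[2][2] * v + a[2][0] * f
    x = (a[0][1] * u + a[0][2] * v + a[0][0] * f) / den * (-H)
    y = (a[1][1] * u + a[1][2] * v + a[1][0] * f) / den * (-H)
    return x, y
```

Projecting a known floor point through (5) and (6) and then inverting it with `floor_point` recovers the original hallway coordinates, so the forward and inverse models can be checked against each other.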
The graph in Figure 6 illustrates the disagreement between the line-perspectives and the infinity-point method (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded a 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise. Therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake, the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway-line anomaly from the line extraction process, independent of ranging. In this particular case, both hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, which is one of its main advantages. Refer to Figure 7 for a demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of those conditions readily occur on an MAV (and most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is broad and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead end, both ground lines tend to disappear simultaneously. Consequently, the range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section, we propose a turn-sensing algorithm to estimate ψ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV: a yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate ψ accurately enough to update the SLAM map correctly. This procedure combines machine vision with the data matching and dynamic estimation problem. For instance, if the MAV approaches a left turn after exploring one leg of an "L" shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way, the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity (u, v) of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt) using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of a previous frame as I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt). By applying the Taylor series expansion,

$$I(x, y, z, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \frac{\partial I}{\partial t}\delta t \quad (10)$$
then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step k.
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane, and each helix is associated with a velocity vector V_i = (v, φ)^T, where φ is the angular displacement of the velocity direction from the north of the image plane, such that π/2 is east, π is south, and 3π/2 is west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
(1) Start from level L(0) = 0 and sequence m = 0.
(2) Find d = min(h_a − h_b) in M, where h_a ≠ h_b.
(3) m = m + 1; Ψ‴(k) = merge([h_a, h_b]); L(m) = d.
(4) Delete from M the rows and columns corresponding to Ψ‴(k).
(5) Add to M a row and a column representing Ψ‴(k).
(6) If (∀h_i ∈ Ψ‴(k)), stop;
(7) else go to (2).

Algorithm 1: Disjoint cluster identification from heat map M.
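Algorithm 1 is an agglomerative merge over the heat map. The sketch below is a straightforward single-linkage reading of it, with the helix distance left as a caller-supplied function (the text uses the Manhattan distance of (13)); names and structure are ours:

```python
def cluster_tree(helices, dist):
    """Agglomerative construction per Algorithm 1: repeatedly merge the
    two closest distinct clusters in M, recording each merge level L(m)."""
    clusters = [[h] for h in helices]       # start: every helix is its own cluster
    levels = []                             # L(m), the recorded merge distances
    while len(clusters) > 1:
        best = None                         # closest pair found so far: (d, i, j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]  # merge([h_a, h_b])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)             # replaces the deleted rows/columns of M
        levels.append(d)
    return levels
```

A sharp jump in the returned levels marks the natural cut point m described in the text: merges before the jump join helices on the same object, merges after it join unrelated objects.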
Figure 5: On-the-fly range measurements. Note the crosshair indicating that the algorithm is currently using the infinity point for heading.
Figure 6: (a) The accuracy of the two range-measurement methods (range in meters versus sample number; infinity-point method shown) with respect to the ground truth (flat line). (b) Residuals (difference in meters) for the top figure.
instantaneous velocities is likely to belong to the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple h_i = (V_i, u_i, v_i)^T, where (u_i, v_i) represents the position of this particle on the image plane. At any given time (k), let Ψ be a set containing all these features on the image plane, such that Ψ(k) = {h_1, h_2, ..., h_n}. The z component of velocity, as obtained in (10), is the determining factor for φ. Since we are most interested in the set of helices in which this component is minimized, Ψ(k) is resampled such that

$$\Psi'(k) = \left\{ \forall h_i :\; \varphi \approx \frac{\pi}{2} \,\cup\, \varphi \approx \frac{3\pi}{2} \right\} \quad (11)$$
sorted in increasing velocity order. Ψ′(k) is then processed through histogram sorting to reveal the modal helix set, such that

$$\Psi''(k) = \max \begin{cases} \displaystyle\sum_{i=0}^{n} i & \text{if } (h_i = h_{i+1}) \\ 0 & \text{else} \end{cases} \quad (12)$$
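The resampling of (11) keeps the helices whose image-plane flow direction is approximately east (π/2) or west (3π/2) and sorts them by increasing speed. The sketch below is our own illustration; the tuple layout and angular tolerance are arbitrary choices, and the modal-set extraction of (12) is omitted:

```python
import math

def resample_psi(helices, tol=0.3):
    """Keep helices whose flow direction phi is near pi/2 (east) or
    3*pi/2 (west), per (11), sorted by increasing speed. Each helix is
    (speed, phi, u, v); tol is an illustrative angular tolerance."""
    keep = [h for h in helices
            if min(abs(h[1] - math.pi / 2),
                   abs(h[1] - 3 * math.pi / 2)) < tol]
    return sorted(keep, key=lambda h: h[0])
```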
Ψ″(k) is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from Ψ(k) may not fit this model. An agglomerative hierarchical tree T is used to identify the clusters. To construct the tree, Ψ″(k) is heat mapped, represented as a symmetric matrix M, with respect to the Manhattan distance between the individual helices:

$$M = \begin{bmatrix}
\|h_0 - h_0\| & \cdots & \|h_0 - h_n\| \\
\vdots & \ddots & \vdots \\
\|h_n - h_0\| & \cdots & \|h_n - h_n\|
\end{bmatrix} \quad (13)$$
The algorithm to construct the tree from M is given in Algorithm 1. The tree should be cut at the sequence m such that m + 1 does not provide a significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in Ψ‴(k) represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that Ψ(k + 1) = Ψ(k) + μ(Ψ‴(k)) as the best-effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest can be identified by an arbitrary object detection algorithm, and that world object of known dimensions dim = (x, y)^T sufficiently coincides with a cluster Ψ‴(k), then the cluster depth can be measured via dim(f/dim′), where dim is the actual object dimensions, f is the focal length, and dim′ represents the object dimensions on the image plane.
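The door example reduces to similar triangles under the pinhole model, Z = f · dim / dim′; a one-line sketch (function name ours, lens distortion ignored):

```python
def depth_from_known_object(f_px, real_width_m, image_width_px):
    """Depth of a recognised object of known physical size: similar
    triangles give Z = f * dim / dim' (pinhole model, no distortion).
    f_px is the focal length expressed in pixels."""
    return f_px * real_width_m / image_width_px
```

For example, with an assumed focal length of 500 pixels, a meter-wide door imaged at 100 pixels wide lies about 5 meters away.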
4. SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor) performed at various locations, with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements, along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines (annotated with ω = (d/dt)θ and the hallway lines; a reduced helix association set is shown for clarity). Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide an estimation of the angular rate of the MAV during the turn.
the new measurement and the old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration, such that the lower-weight particles are removed and the higher-weight particles are replicated. This results in a cloud of random particles tracking toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
The positions of landmarks are stored by the particles, such as Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2 × 2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3 × 3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:
$$x_v(k+1) = \begin{pmatrix}
R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\
\Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\
v_B(k) + V_B \\
\omega(k) + \Omega
\end{pmatrix} \quad (14)$$
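One discrete step of (14) can be sketched as follows. The paper does not write out L_EB and T, so the standard ZYX Euler-angle forms are assumed here; treat this as an illustration under those assumptions, not the authors' code:

```python
import math

def L_EB(phi, theta, psi):
    """Body-to-inertial rotation for ZYX Euler angles (roll, pitch, yaw);
    standard form, assumed since the paper only names the matrix."""
    c, s = math.cos, math.sin
    return [
        [c(theta)*c(psi), s(phi)*s(theta)*c(psi) - c(phi)*s(psi),
         c(phi)*s(theta)*c(psi) + s(phi)*s(psi)],
        [c(theta)*s(psi), s(phi)*s(theta)*s(psi) + c(phi)*c(psi),
         c(phi)*s(theta)*s(psi) - s(phi)*c(psi)],
        [-s(theta), s(phi)*c(theta), c(phi)*c(theta)],
    ]

def T_euler(phi, theta):
    """Body rates (p, q, r) -> Euler-angle rates; standard kinematic form."""
    c, s, t = math.cos, math.sin, math.tan
    return [
        [1.0, s(phi) * t(theta), c(phi) * t(theta)],
        [0.0, c(phi), -s(phi)],
        [0.0, s(phi) / c(theta), c(phi) / c(theta)],
    ]

def step(R, Gamma, vB, omega, VB, Omega, dt):
    """One discrete propagation step of (14): position, Euler angles,
    body velocity, and body rate, with accelerations VB and Omega."""
    phi, theta, psi = Gamma
    A, Tm = L_EB(phi, theta, psi), T_euler(phi, theta)
    v = [vB[i] + VB[i] for i in range(3)]
    w = [omega[i] + Omega[i] for i in range(3)]
    R2 = [R[i] + sum(A[i][j] * v[j] for j in range(3)) * dt for i in range(3)]
    G2 = [Gamma[i] + sum(Tm[i][j] * w[j] for j in range(3)) * dt for i in range(3)]
    return R2, G2, v, w
```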
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by
$$\mathbf{x}_v(k+1) = \begin{pmatrix}
\cos\theta_r(k)\, u_1(k) + x_r(k) \\
\sin\theta_r(k)\, u_1(k) + y_r(k) \\
u_2(k) + \theta_r(k)
\end{pmatrix} + \gamma(k) \quad (15)$$
where γ(k) is the linearized input signal noise, u_1(k) is the forward speed, and u_2(k) the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index i) as follows:
$$\mathbf{y}_i = h(\mathbf{x}) = \left( \sqrt{x_i^2 + y_i^2},\; \tan^{-1}\!\left[\pm\frac{y_i}{x_i}\right],\; \psi \right)^T \quad (16)$$

where the ψ measurement is provided by the infinity-point method.
This measurement equation can be related to the states of the vehicle and the ith landmark at each time stamp (k), as shown in (17), where x_v(k) = (x_r(k), y_r(k), θ_r(k))^T is the vehicle state vector of the 2D vehicle kinematic model. The measurement equation h_i(x(k)) relates the states of the vehicle and the ith corner (landmark) as given in (17):
$$h_i(\mathbf{x}(k)) = \begin{pmatrix}
\sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\[4pt]
\tan^{-1}\!\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\[4pt]
\theta_r
\end{pmatrix} \quad (17)$$

where x_ci and y_ci denote the position of the ith landmark.
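The measurement model (17) transcribes directly; a sketch (function signature ours):

```python
import math

def h_i(xr, yr, theta_r, xc, yc):
    """Range, bearing, and heading measurement of landmark i, per (17):
    (xr, yr, theta_r) is the vehicle state, (xc, yc) the landmark."""
    rng = math.hypot(xr - xc, yr - yc)
    brg = math.atan2(yr - yc, xr - xc) - theta_r
    return rng, brg, theta_r
```

Note that the bearing uses the vehicle-minus-landmark differences, following the sign convention written in (17).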
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane, moving along the ground lines as the MAV moves. Assume that p^k_{(x,y)}, k = 0, 1, 2, 3, ..., n, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity V_{Bmax} is known, a pixel is expected to appear at

$$p^{k+1}_{(x,y)} = f\!\left(p^{k}_{(x,y)} + (v_B + V_B)\,\Delta t\right) \quad (18)$$
where
radic(119901119896+1
(119909)minus 119901119896
(119909))2
+ (119901119896+1
(119910)minus 119901119896
(119910))
2
(19)
cannot be larger than 119881119861maxΔ119905 while 119891(sdot) is a function that
converts a landmark range to a position on the image planeA landmark appearing at time 119896 + 1 is to be associated
with a landmark that has appeared at time k if and only if their pixel locations are within the association threshold; in other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster; thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
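A minimal sketch of this association rule (our own illustrative code, not the flight implementation): each new detection is matched to the nearest predicted pixel location from frame k and accepted only if the displacement is within the V_{B max} Δt threshold of (19); otherwise it registers as a new landmark.

```python
import math

def associate(detection, predictions, v_b_max, dt):
    """Match a detected landmark pixel (x, y) against predicted pixel
    positions carried over from frame k. `predictions` maps landmark
    id -> (x, y). Returns the matched id, or None for a
    not-before-seen landmark."""
    threshold = v_b_max * dt              # maximum expected pixel motion, per (19)
    best_id, best_dist = None, float("inf")
    for lid, (px, py) in predictions.items():
        dist = math.hypot(detection[0] - px, detection[1] - py)
        if dist < best_dist:
            best_id, best_dist = lid, dist
    return best_id if best_dist <= threshold else None

# a detection about 2.2 px from landmark 7's prediction, with a 4 px threshold:
print(associate((101.0, 52.0), {7: (100.0, 50.0)}, v_b_max=20.0, dt=0.2))  # 7
```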
Although representing the map as a tree-based data structure would, in theory, yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any given time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system picks the region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and is triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor, D_0, is collected; this helps the MAV recognize the starting location if it is revisited.
Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_{x,y} of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

\[
R(x, y) =
\frac{\sum_{x', y'} \bigl(T(x', y') - I(x + x', y + y')\bigr)^2}
{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x + x', y + y')^2}},
\tag{20}
\]
where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
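Equation (20) is a normalized squared-difference measure; a minimal NumPy sketch (our own code, with hypothetical names) evaluates it for a template T at an offset (x, y) inside an image I:

```python
import numpy as np

def sqdiff_normed(T, I, x, y):
    """Normalized squared-difference score of (20) for a template T placed
    at offset (x, y) inside image I: 0 is a perfect match and larger
    values indicate poorer matches."""
    h, w = T.shape
    patch = I[y:y + h, x:x + w].astype(float)
    T = T.astype(float)
    num = np.sum((T - patch) ** 2)
    den = np.sqrt(np.sum(T ** 2) * np.sum(patch ** 2))
    return float(num / den) if den > 0 else 1.0

# a descriptor region matched against the frame it was cut from:
img = np.arange(1600.0).reshape(40, 40) % 7 + 1
tpl = img[10:20, 10:20].copy()
print(sqdiff_normed(tpl, img, 10, 10))  # 0.0
```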
5. Experimental Results

As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or the lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.

In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot, with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
[Figure 16 plot: axes show hallway length (m), hallway width (m), and altitude (m); the helicopter path is plotted.]
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the z-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera, visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.

The numbers in Table 1 are gathered after the map has matured. Methods marked with a dagger (†) are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Pictures courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work

In this paper, we investigated the performance of monocular camera based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following-flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of reaching high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations and perform the transitions between them (e.g., turns, the presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable, high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problematic light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize, or eliminate, most if not all reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue; the Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I3), the Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardós, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
where ∗ denotes the convolution operator and F is a 3 × 3 kernel for horizontal and vertical derivative approximations. I′_x and I′_y are combined with weights whose ratio determines the range of angles through which edges will be filtered. This in effect returns a binary image plane, I′, with potential edges that are more horizontal than vertical. It is possible to reverse this effect to detect other edges of interest, such as ceiling lines or door frames. At this point, edges will disintegrate the more vertical they get (see Figure 3 for an illustration). Application of the Hough Transform to I′ will return all possible lines, automatically excluding discrete point sets, out of which it is possible to sort out lines with a finite slope, φ ≠ 0, and curvature κ = 0. This is a significantly expensive operation (i.e., considering the limited computational resources of an MAV) to perform on a real-time video feed, since the transform has to run over the entire frame, including the redundant parts.
To improve the overall performance in terms of efficiency, we have investigated replacing the Hough Transform with an algorithm that only runs on parts of I′ that contain data. This approach begins by dividing I′ into square blocks, B_{x,y}. The optimal block size is the smallest block that can still capture the texture elements in I′. Camera resolution and the filtering methods used to obtain I′ affect the resulting texture element structure. The blocks are sorted to bring those with the highest number of data points and the lowest entropy (2) first, as such a block is most likely to contain lines. Blocks that are empty, or have a few scattered points in them, are excluded from further analysis. Entropy is the characteristic of an image patch that makes it more ambiguous by means of disorder in a closed system. This assumes that disorder is more probable than order, and thereby lower disorder has a higher likelihood of containing an architectural feature, such as a line. Entropy can be expressed as

\[
-\sum_{x,y} B_{x,y} \log B_{x,y}.
\tag{2}
\]
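There is more than one way to put (2) to work on a binary block; as one plausible reading (our own interpretation, with illustrative names), the sketch below scores a block by its number of data points and by the entropy of their distribution across rows, so that pixels concentrated along a near-horizontal line score low disorder while scattered pixels score high:

```python
import numpy as np

def block_score(block):
    """Score a binary edge block as (count of data points, entropy).
    Entropy follows the spirit of (2), computed here over the normalized
    row histogram of on-pixels (an assumption; the paper leaves the
    distribution implicit). Returns None for blocks to be excluded."""
    n = int(block.sum())
    if n < 2:
        return None                       # empty / near-empty blocks are excluded
    p = block.sum(axis=1) / n             # fraction of on-pixels per row
    p = p[p > 0]
    return n, float(-np.sum(p * np.log(p)))

def sort_blocks(image, size):
    """Tile a binary image into size x size blocks, drop excluded ones,
    and sort: most data points first, ties broken by lowest entropy."""
    scored = []
    for r in range(0, image.shape[0] - size + 1, size):
        for c in range(0, image.shape[1] - size + 1, size):
            s = block_score(image[r:r + size, c:c + size])
            if s is not None:
                scored.append(((r, c), s))
    return sorted(scored, key=lambda t: (-t[1][0], t[1][1]))

# a horizontal line concentrates in one row (zero entropy); a diagonal scatters:
line, diag = np.zeros((8, 8), int), np.eye(8, dtype=int)
line[4, :] = 1
print(block_score(line)[1] < block_score(diag)[1])  # True
```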
The set of candidate blocks resulting at this point are to be searched for lines. Although a block B_n is a binary matrix, it can be thought of as a coordinate system which contains a set of points (i.e., pixels) with (x, y) coordinates, such that positive x is right and positive y is down. Since we are more interested in lines that are more horizontal than vertical, it is safe to assume that the errors in the y values outweigh those in the x values. The equation for a ground line is of the form y = mx + b, and the deviations of the data points in the block from this line are d_i = y_i − (mx_i + b). Therefore, the most likely line is the one composed of data points that minimize the squared deviations d_i^2 = (y_i − mx_i − b)^2. Using determinants, the line parameters that minimize these deviations can be obtained as in (3):

\[
d = \begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix},
\qquad
m \times d = \begin{vmatrix} \sum (x_i \cdot y_i) & \sum x_i \\ \sum y_i & n \end{vmatrix},
\qquad
b \times d = \begin{vmatrix} \sum x_i^2 & \sum (x_i \cdot y_i) \\ \sum x_i & \sum y_i \end{vmatrix}.
\tag{3}
\]
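The determinants in (3) are the standard Cramer's-rule solution of the least-squares normal equations; a short sketch (our own code, not the flight implementation) is:

```python
import numpy as np

def fit_ground_line(xs, ys):
    """Fit y = m*x + b to block pixels using the 2x2 determinants of (3),
    which minimize the vertical deviations d_i = y_i - (m*x_i + b)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    n = len(xs)
    sx, sy = xs.sum(), ys.sum()
    sxx, sxy = (xs * xs).sum(), (xs * ys).sum()
    d = sxx * n - sx * sx          # | Σx²  Σx ; Σx  n |
    m = (sxy * n - sx * sy) / d    # | Σxy  Σx ; Σy  n | / d
    b = (sxx * sy - sx * sxy) / d  # | Σx²  Σxy ; Σx  Σy | / d
    return m, b

m, b = fit_ground_line([0, 1, 2, 3], [1, 3, 5, 7])  # points on y = 2x + 1
print(m, b)  # 2.0 1.0
```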
Since our range measurement methods depend on these lines, the overall line-slope accuracy is affected by the reliability in detecting and measuring the hallway lines (or road lines, sidewalk lines, depending on context). The high measurement noise in slopes has adverse effects on SLAM and should be minimized to prevent inflating the uncertainty in L_1 = tan φ_1 and L_2 = tan φ_2, or the infinity point (P_x, P_y). To reduce this noise, lines are cross-validated for the longest collinearity via pixel-neighborhood-based line extraction, in which the results obtained rely only on a local analysis. Their coherence is further improved using a postprocessing step that exploits the texture gradient. With an assumption of the orthogonality of the environment, lines from the ground edges are extracted. Note that this is also applicable to ceiling lines. Although ground lines (and ceiling lines, if applicable) are virtually parallel in the real world, on the image plane they intersect. The horizontal coordinate of this intersection point is later used as a heading guide for the MAV, as illustrated in Figure 5. Features that happen to coincide with these lines are potential landmark candidates. When this step is complete, a set of features cross-validated with the perspective lines, Ψ′, which is a subset of Ψ with the nonuseful features removed, is passed to the third step.
2.3. Landmark Extraction Step III: Range Measurement by the Infinity-Point Method. This step accurately measures the absolute distance to features in Ψ′ by integrating local patches of the ground information into a global surface reference frame. This new method significantly differs from optical flows in that the depth measurement does not require a successive history of images.

Our strategy here assumes that the height of the camera from the ground, H, is known a priori (see Figure 1); the MAV provides real-time altitude information to the camera. We also assume that the camera is initially pointed in the general direction of the far end of the corridor. This latter assumption is not a requirement; if the camera is pointed at a wall, the system will switch to visual steering mode and attempt to recover the camera path without mapping until hallway structure becomes available.
The camera is tilted down (or up, depending on preference) with an angle β to facilitate continuous capture of feature movement across the perspective lines. The infinity point, (P_x, P_y), is an imaginary concept where the projections of the two parallel perspective lines appear to intersect on the image plane. Since this intersection point is, in theory, infinitely far from the camera, it should present no parallax in response to the translations of the camera. It does, however, effectively represent the yaw and the pitch of the camera (note the crosshair in Figure 5). Assume that the end points of the perspective lines are E_{H1} = (l, d, −H)^T and E_{H2} = (l, d − w, −H)^T, where l is the length and w is the width of the hallway, d is the horizontal displacement of the camera from the left wall, and H is the MAV altitude (see Figure 4 for a visual description). The Euler rotation matrix to convert
Figure 3: Initial stages after filtering for line extraction, in which the line segments are being formed. Note that the horizontal lines across the image denote the artificial horizon for the MAV; these are not architectural detections but the on-screen display provided by the MAV. This procedure is robust to transient disturbances, such as people walking by or trees occluding the architecture.
from the camera frame to the hallway frame is given in (4):

\[
A = \begin{bmatrix}
c\psi \, c\beta & c\beta \, s\psi & -s\beta \\
c\psi \, s\phi \, s\beta - c\phi \, s\psi & c\phi \, c\psi + s\phi \, s\psi \, s\beta & c\beta \, s\phi \\
s\phi \, s\psi + c\phi \, c\psi \, s\beta & c\phi \, s\psi \, s\beta - c\psi \, s\phi & c\phi \, c\beta
\end{bmatrix},
\tag{4}
\]
where c and s are abbreviations for the cosine and sine functions, respectively. The vehicle yaw angle is denoted by ψ, the pitch by β, and the roll by φ. Since the roll angle is controlled by the onboard autopilot system, it can be set to zero.
The points $E_{H1}$ and $E_{H2}$ are transformed into the camera frame via multiplication with the transpose of $A$ in (4):

$$E_{C1} = A^T \cdot (l, d, -H)^T, \qquad E_{C2} = A^T \cdot (l, d - w, -H)^T \quad (5)$$
This 3D system is then transformed into the 2D image plane via

$$u = \frac{yf}{x}, \qquad v = \frac{zf}{x} \quad (6)$$

where $u$ is the pixel horizontal position from center (right is positive), $v$ is the pixel vertical position from center (up is positive), and $f$ is the focal length (3.7 mm for the particular camera we have used). The end points of the perspective lines have now transformed from $E_{H1}$ and $E_{H2}$ to $(P_{x1}, P_{y1})^T$ and $(P_{x2}, P_{y2})^T$, respectively. An infinitely long hallway can be represented by
$$\lim_{l \to \infty} P_{x1} = \lim_{l \to \infty} P_{x2} = f \tan\psi, \qquad \lim_{l \to \infty} P_{y1} = \lim_{l \to \infty} P_{y2} = -\frac{f \tan\beta}{\cos\psi} \quad (7)$$

which is conceptually the same as extending the perspective lines to infinity. The fact that $P_{x1} = P_{x2}$ and $P_{y1} = P_{y2}$ indicates that the intersection of the lines in the image plane is the end of such an infinitely long hallway. Solving the resulting equations for $\psi$ and $\beta$ yields the camera yaw and pitch, respectively:
$$\psi = \tan^{-1}\left(\frac{P_x}{f}\right), \qquad \beta = -\tan^{-1}\left(\frac{P_y \cos\psi}{f}\right) \quad (8)$$
A generic form of the transformation from the pixel position $(u, v)$ to $(x, y, z)$ can be derived in a similar fashion [3]. The equations for $u$ and $v$ also provide general coordinates in the camera frame as $(z_c f / v,\; u z_c / v,\; z_c)$, where $z_c$ is the $z$ position of the object in the camera frame. Multiplying with (4) transforms the hallway frame coordinates $(x, y, z)$ into functions of $u$, $v$, and $z_c$. Solving the new $z$ equation for $z_c$ and substituting into the equations for $x$ and $y$ yields

$$x = \frac{a_{12} u + a_{13} v + a_{11} f}{a_{32} u + a_{33} v + a_{31} f}\, z, \qquad y = \frac{a_{22} u + a_{23} v + a_{21} f}{a_{32} u + a_{33} v + a_{31} f}\, z \quad (9)$$

where $a_{ij}$ denotes the elements of the matrix in (4). See Figure 1 for the descriptions of $x$ and $y$.

For objects likely to be on the floor, the height of the camera above the ground is the $z$ position of the object. Also, if the platform roll can be measured or assumed negligible, then the combination of the infinity point with the height can be used to obtain the range to any object on the floor of the hallway. This same concept applies to objects which are likely to be on the same wall or the ceiling. By exploiting the geometry of the corners present in the corridor, our
Figure 4: A visual description of the environment as perceived by the infinity-point method.
method computes the absolute range and bearing of the features, effectively turning them into landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
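The ranging chain of (4), (8), and (9) can be sketched as follows. This is a minimal Python/NumPy sketch under our own naming; the floor is taken at $z = -H$, consistent with $E_{H1} = (l, d, -H)^T$, and it is not the authors' implementation:

```python
import numpy as np

def rotation_matrix(psi, beta, phi=0.0):
    """Euler rotation matrix A of (4); roll phi is held at zero by the autopilot."""
    c, s = np.cos, np.sin
    return np.array([
        [c(psi)*c(beta),                        c(beta)*s(psi),                        -s(beta)],
        [c(psi)*s(phi)*s(beta) - c(phi)*s(psi), c(phi)*c(psi) + s(phi)*s(psi)*s(beta), c(beta)*s(phi)],
        [s(phi)*s(psi) + c(phi)*c(psi)*s(beta), c(phi)*s(psi)*s(beta) - c(psi)*s(phi), c(phi)*c(beta)],
    ])

def yaw_pitch_from_infinity_point(Px, Py, f):
    """Eq. (8): camera yaw and pitch from the perspective-line intersection."""
    psi = np.arctan2(Px, f)
    beta = -np.arctan2(Py * np.cos(psi), f)
    return psi, beta

def floor_point_from_pixel(u, v, f, psi, beta, H):
    """Eq. (9): hallway-frame (x, y) of a pixel assumed to lie on the floor (z = -H)."""
    a = rotation_matrix(psi, beta)
    denom = a[2, 1]*u + a[2, 2]*v + a[2, 0]*f
    z = -H
    x = (a[0, 1]*u + a[0, 2]*v + a[0, 0]*f) / denom * z
    y = (a[1, 1]*u + a[1, 2]*v + a[1, 0]*f) / denom * z
    return x, y
```

Projecting a known floor point into the camera frame with $A^T$ and back through `floor_point_from_pixel` recovers its hallway coordinates, which is a useful sanity check of the sign conventions.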
The graph in Figure 6 illustrates the disagreement between the line-perspectives and the infinity-point method (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise; therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake, the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway line anomaly from the line extraction process, independent of ranging. In this particular case, both hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, which is one of its main advantages. Refer to Figure 7 for a demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of those conditions readily occur on an MAV (and most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is broad and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead-end, both ground lines tend to disappear simultaneously. Consequently, the range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section, we propose a turn-sensing algorithm to estimate $\psi$ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV. A yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate $\psi$ accurately enough to update the SLAM map correctly. This procedure combines machine vision with a data matching and dynamic estimation problem. For instance, if the MAV approaches a left-turn after exploring one leg of an "L" shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way, the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity $(u, v)$ of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover $V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt)$ using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of a previous frame as $I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt)$. By applying the Taylor series expansion,

$$I(x, y, z, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \frac{\partial I}{\partial t}\delta t \quad (10)$$

then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step $k$.
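The flow recovery above can be sketched with a single-window Lucas-Kanade step. This is a minimal, non-pyramidal Python/NumPy sketch of the underlying brightness-constancy least-squares solve; the function name is ours and the paper's actual implementation is pyramidal and per-feature:

```python
import numpy as np

def lucas_kanade_step(I1, I2):
    """Estimate one translational flow vector (u, v) for the whole window by
    solving the least-squares system G d = b built from the brightness
    constancy constraint Ix*u + Iy*v + It = 0 (cf. the Taylor expansion (10))."""
    Iy, Ix = np.gradient(I1)          # spatial gradients (rows -> y, cols -> x)
    It = I2 - I1                      # temporal gradient between frames
    G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(G, b)      # (u, v) in pixels per frame
```

On a smooth image translated by one pixel along $x$, the recovered vector is close to $(1, 0)$, up to the higher-order terms dropped in (10).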
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane. Each helix is associated with a velocity vector $V_i = (v, \varphi)^T$, where $\varphi$ is the angular displacement of the velocity direction from the north of the image plane, such that $\pi/2$ is east, $\pi$ is south, and $3\pi/2$ is west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
(1) Start from level $L(0) = 0$ and sequence $m = 0$
(2) Find $d = \min(\|h_a - h_b\|)$ in $M$, where $h_a \neq h_b$
(3) $m = m + 1$; $\Psi'''(k) = \mathrm{merge}([h_a, h_b])$; $L(m) = d$
(4) Delete from $M$ the rows and columns corresponding to $\Psi'''(k)$
(5) Add to $M$ a row and a column representing $\Psi'''(k)$
(6) if $(\forall h_i \in \Psi'''(k))$ stop
(7) else go to (2)

Algorithm 1: Disjoint cluster identification from heat map $M$.
Figure 5: On-the-fly range measurements. Note the crosshair indicating that the algorithm is currently using the infinity point for heading.
Figure 6: (a) The accuracy of the two range measurement methods with respect to ground truth (flat line). (b) Residuals for the top figure.
instantaneous velocities is likely to belong to the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple $h_i = (V_i, u_i, v_i)^T$, where $(u_i, v_i)$ represents the position of this particle on the image plane. At any given time $(k)$, let $\Psi$ be a set containing all these features on the image plane, such that $\Psi(k) = \{h_1, h_2, \ldots, h_n\}$. The $z$ component of velocity, as obtained in (10), is the determining factor for $\varphi$. Since we are most interested in the set of helixes in which this component is minimized, $\Psi(k)$ is resampled such that

$$\Psi'(k) = \left\{\forall h_i : \varphi \approx \frac{\pi}{2} \,\cup\, \varphi \approx \frac{3\pi}{2}\right\} \quad (11)$$
sorted in increasing velocity order. $\Psi'(k)$ is then processed through histogram sorting to reveal the modal helix set, such that

$$\Psi''(k) = \max \begin{cases} \sum_{i=0}^{n} i & \text{if } (h_i = h_{i+1}) \\ 0 & \text{else} \end{cases} \quad (12)$$
$\Psi''(k)$ is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from $\Psi(k)$ may not fit this model. An agglomerative hierarchical tree $T$ is used to identify the clusters. To construct the tree, $\Psi''(k)$ is heat mapped, represented as a symmetric matrix $M$, with respect to the Manhattan distance between each individual helix:

$$M = \begin{pmatrix} \|h_0 - h_0\| & \cdots & \|h_0 - h_n\| \\ \vdots & \ddots & \vdots \\ \|h_n - h_0\| & \cdots & \|h_n - h_n\| \end{pmatrix} \quad (13)$$
The algorithm to construct the tree from $M$ is given in Algorithm 1. The tree should be cut at the sequence $m$ such that $m + 1$ does not provide a significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in $\Psi'''(k)$ represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that $\Psi(k + 1) = \Psi(k) + \mu(\Psi'''(k))$ as the best-effort estimate, as shown in Figure 8.
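The heat map of (13) and the merging loop of Algorithm 1 can be sketched as a single-linkage agglomeration. This Python sketch simplifies the level bookkeeping: instead of scoring tree levels $L(m)$, it merges until the next merge distance exceeds a cut threshold, which plays the role of cutting the tree; names and the threshold are illustrative:

```python
import numpy as np

def manhattan_heat_map(H):
    """Symmetric pairwise Manhattan-distance matrix M of (13) for helix rows H."""
    return np.abs(H[:, None, :] - H[None, :, :]).sum(axis=2)

def agglomerate(M, cut):
    """Single-linkage agglomeration in the spirit of Algorithm 1: repeatedly
    merge the closest pair of distinct clusters until the merge distance
    exceeds `cut` (the tree cut), then return the surviving clusters."""
    clusters = [[i] for i in range(M.shape[0])]
    while len(clusters) > 1:
        best = (None, None, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: minimum distance between member helixes
                d = min(M[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        if d > cut:                      # cutting the tree: no further benefit
            break
        clusters[a] += clusters.pop(b)   # merge([h_a, h_b])
    return clusters
```

Two well-separated groups of helix velocities collapse into two clusters, matching the intent of identifying one dominant planar object per cluster.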
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, with known world dimensions $\dim = (x, y)^T$, and a cluster $\Psi'''(k)$ sufficiently coincides with it, the cluster depth can be measured via $\dim(f/\dim')$, where $\dim$ is the actual object dimensions, $f$ is the focal length, and $\dim'$ represents the object dimensions on the image plane.
4. SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the helix bearing algorithm estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor) performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements, along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide estimation for the angular rate of the MAV during the turn.
the new measurement and old position of the feature, are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration, such that the lower weight particles are removed and higher weight particles are replicated. This results in a cloud of random particles converging towards the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
The positions of landmarks are stored by the particles, such as $\mathrm{Par}_n = (X_L^T, P)$, where $X_L = (x_{ci}, y_{ci})$ and $P$ is the $2 \times 2$ covariance matrix for the particular Kalman filter contained by $\mathrm{Par}_n$. The 6DOF vehicle state vector $x_v$ can be updated in discrete time steps of $(k)$ as shown in (14), where $R = (x_r, y_r, H)^T$ is the position in the inertial frame, from which the velocity in the inertial frame can be derived as $\dot{R} = v_E$. The vector $v_B = (v_x, v_y, v_z)^T$ represents the linear velocity of the body frame, and $\omega = (p, q, r)^T$ represents the body angular rate. $\Gamma = (\phi, \theta, \psi)^T$ is the Euler angle vector, and $L_{EB}$ is the Euler angle transformation matrix for $(\phi, \theta, \psi)$. The $3 \times 3$ matrix $T$ converts $(p, q, r)^T$ to $(\dot{\phi}, \dot{\theta}, \dot{\psi})$. At every step, the MAV is assumed to experience unknown linear and angular accelerations, $V_B = a_B \Delta t$ and $\Omega = \alpha_B \Delta t$, respectively:

$$x_v(k+1) = \begin{pmatrix} R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\ \Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\ v_B(k) + V_B \\ \omega(k) + \Omega \end{pmatrix} \quad (14)$$
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states $x_r$, $y_r$, $\theta_r$, described by

$$x_v(k+1) = \begin{pmatrix} \cos\theta_r(k)\, u_1(k) + x_r(k) \\ \sin\theta_r(k)\, u_1(k) + y_r(k) \\ u_2(k) + \theta_r(k) \end{pmatrix} + \gamma(k) \quad (15)$$
where $\gamma(k)$ is the linearized input signal noise, $u_1(k)$ is the forward speed, and $u_2(k)$ the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index $i$) as follows:

$$y_i = h(x) = \left(\sqrt{x_i^2 + y_i^2},\; \tan^{-1}\left[\pm\frac{y_i}{x_i}\right],\; \psi\right)^T \quad (16)$$
where the $\psi$ measurement is provided by the infinity-point method.
This measurement equation can be related to the states of the vehicle and the $i$th corner (landmark) at each time stamp $(k)$, where $x_v(k) = (x_r(k), y_r(k), \theta_r(k))^T$ is the vehicle state vector of the 2D vehicle kinematic model, as given in (17):

$$h_i(x(k)) = \begin{pmatrix} \sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\ \tan^{-1}\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\ \theta_r \end{pmatrix} \quad (17)$$

where $x_{ci}$ and $y_{ci}$ denote the position of the $i$th landmark.
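The 2D motion model (15) and the range-bearing observation (17) can be sketched together as follows. This is a minimal Python/NumPy sketch under our own naming; the bearing is computed in the conventional robot-to-landmark `atan2` form, and the noise hook stands in for $\gamma(k)$:

```python
import numpy as np

def propagate(state, u1, u2, noise_std=(0.0, 0.0, 0.0), rng=None):
    """2D kinematic update of (15): state = (x_r, y_r, theta_r);
    u1 is the forward speed and u2 the angular velocity, both already
    scaled by the time step."""
    x, y, th = state
    nxt = np.array([np.cos(th) * u1 + x,
                    np.sin(th) * u1 + y,
                    u2 + th])
    if rng is not None:
        nxt += rng.normal(0.0, noise_std)  # gamma(k), linearized input noise
    return nxt

def expected_measurement(state, landmark):
    """Range, relative bearing, and heading of (17) for landmark (x_ci, y_ci)."""
    x, y, th = state
    dx, dy = landmark[0] - x, landmark[1] - y
    rng_to_lm = np.hypot(dx, dy)
    bearing = np.arctan2(dy, dx) - th
    return np.array([rng_to_lm, bearing, th])
```

Inside a Rao-Blackwellized filter, each particle would be propagated with `propagate` and weighted by comparing `expected_measurement` against the vision-derived range and bearing.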
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane and move along the ground lines as the MAV moves. Assume that $p^k_{(x,y)}$, $k = 0, 1, 2, 3, \ldots, n$, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity $v_p$. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to the landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity $V_{B\max}$ is known, a pixel is expected to appear at

$$p^{k+1}_{(x,y)} = f\left(p^k_{(x,y)} + (v_B + V_B)\,\Delta t\right) \quad (18)$$

where

$$\sqrt{\left(p^{k+1}_{(x)} - p^k_{(x)}\right)^2 + \left(p^{k+1}_{(y)} - p^k_{(y)}\right)^2} \quad (19)$$

cannot be larger than $V_{B\max}\Delta t$, while $f(\cdot)$ is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time $k + 1$ is to be associated with a landmark that has appeared at time $k$ if and only if their pixel locations are within the association threshold; in other words, the association information from $k$ is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from $k$ when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not our proposed methods.
Although representing the map as a tree-based data structure would in theory yield an association time of $O(N \log N)$, our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing transformation invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine if two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor $D_t$ consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick a region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor $D_0$ is collected; this helps the MAV in recognizing the starting location if it is revisited.

Every time a descriptor $D_t$ is recorded, it contains the current time $t$ in terms of frame number, the disorderly region $I_{xy}$ of size $x \times y$, and the estimate of the position and orientation of the MAV at frame $t$. Thus, every time a turn is encountered, the system can check if it happened before. For instance, if it indeed has happened at time $t = k$, where $t > k$, $D_k$ is compared with $D_t$ in terms of descriptor and landmarks, and the map positions of the MAV at times $t$ and $k$ are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

$$R(x, y) = \frac{\sum_{x', y'}\left(T(x', y') - I(x + x', y + y')\right)^2}{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x + x', y + y')^2}} \quad (20)$$

where a perfect match is 0 and poor matches are represented by larger values up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
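The score of (20) for a single template placement can be sketched directly. This Python/NumPy sketch mirrors normalized squared-difference template matching; the function name is ours, and scanning the template over all offsets $(x, y)$ is left out for brevity:

```python
import numpy as np

def descriptor_score(T, I_patch):
    """Normalized squared-difference of (20) for one template placement:
    0 indicates a perfect match; larger values indicate poorer matches."""
    num = np.sum((T - I_patch) ** 2)
    den = np.sqrt(np.sum(T ** 2) * np.sum(I_patch ** 2))
    return num / den
```

Comparing a descriptor region against itself returns exactly 0, and any photometric difference pushes the score upward.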
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.

The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed at the positive $x$-axis; therefore, the width of the corridor is represented by the $y$-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm, with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian $(x, y, z)$ position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the $z$-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods marked with a dagger are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper we investigated the performance of monocular camera based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following-flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations (e.g., turns, presence of external objects, and time-varying altitude) while successfully performing the transitions between them.
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet or vest mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of a problem light source. Further reduction in contrast is possible
if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design has been inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), Information Infrastructure Institute (I3), Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. Davison, M. Nicholas, and S. Olivier, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned/Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
Figure 3: Initial stages after filtering for line extraction, in which the line segments are being formed. Note that the horizontal lines across the image denote the artificial horizon for the MAV; these are not architectural detections but the on-screen display provided by the MAV. This procedure is robust to transient disturbances such as people walking by or trees occluding the architecture.
from the camera frame to the hallway frame is given in (4):

$$A = \begin{bmatrix} c\psi\, c\beta & c\beta\, s\psi & -s\beta \\ c\psi\, s\phi\, s\beta - c\phi\, s\psi & c\phi\, c\psi + s\phi\, s\psi\, s\beta & c\beta\, s\phi \\ s\phi\, s\psi + c\phi\, c\psi\, s\beta & c\phi\, s\psi\, s\beta - c\psi\, s\phi & c\phi\, c\beta \end{bmatrix} \quad (4)$$
where c and s are abbreviations for the cos and sin functions, respectively. The vehicle yaw angle is denoted by ψ, the pitch by β, and the roll by φ. Since the roll angle is controlled by the onboard autopilot system, it can be set to be zero.
The points E_H1 and E_H2 are transformed into the camera frame via multiplication with the transpose of A in (4):

$$E_{C1} = A^T \cdot (l, d, -H)^T, \qquad E_{C2} = A^T \cdot (l, d - w, -H)^T \quad (5)$$
This 3D system is then transformed into the 2D image plane via

$$u = \frac{yf}{x}, \qquad v = \frac{zf}{x} \quad (6)$$
where u is the pixel horizontal position from center (right is positive), v is the pixel vertical position from center (up is positive), and f is the focal length (3.7 mm for the particular camera we have used). The end points of the perspective lines have now transformed from E_H1 and E_H2 to (Px_1, Py_1)^T and (Px_2, Py_2)^T, respectively. An infinitely long hallway can be represented by

$$\lim_{l\to\infty} Px_1 = \lim_{l\to\infty} Px_2 = f\tan\psi, \qquad \lim_{l\to\infty} Py_1 = \lim_{l\to\infty} Py_2 = -\frac{f\tan\beta}{\cos\psi} \quad (7)$$

which is conceptually the same as extending the perspective lines to infinity. The fact that Px_1 = Px_2 and Py_1 = Py_2 indicates that the intersection of the lines in the image plane is the end of such an infinitely long hallway. Solving the resulting equations for ψ and β yields the camera yaw and pitch, respectively:

$$\psi = \tan^{-1}\left(\frac{Px}{f}\right), \qquad \beta = -\tan^{-1}\left(\frac{Py\cos\psi}{f}\right) \quad (8)$$
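As a quick numerical illustration of (8), the sketch below (a minimal Python rendition, assuming the infinity-point coordinates Px and Py are already expressed in the same units as the focal length f) recovers yaw and pitch from a located infinity point:

```python
import math

def yaw_pitch_from_infinity_point(px, py, f):
    """Eq. (8): recover camera yaw (psi) and pitch (beta) from the
    hallway infinity point.

    px, py -- infinity-point coordinates measured from the image
              center (right and up positive), same units as f.
    f      -- focal length in those units.
    """
    psi = math.atan(px / f)                    # camera yaw
    beta = -math.atan(py * math.cos(psi) / f)  # camera pitch
    return psi, beta

# An infinity point dead-center means the camera looks straight
# down the hallway: zero yaw and zero pitch.
```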
A generic form of the transformation from the pixel position (u, v) to (x, y, z) can be derived in a similar fashion [3]. The equations for u and v also provide general coordinates in the camera frame as (z_c f/v, u z_c/v, z_c), where z_c is the z position of the object in the camera frame. Multiplying with (4) transforms the hallway frame coordinates (x, y, z) into functions of u, v, and z_c. Solving the new z equation for z_c and substituting into the equations for x and y yields

$$x = \frac{a_{12}u + a_{13}v + a_{11}f}{a_{32}u + a_{33}v + a_{31}f}\, z, \qquad y = \frac{a_{22}u + a_{23}v + a_{21}f}{a_{32}u + a_{33}v + a_{31}f}\, z \quad (9)$$

where a_ij denotes the elements of the matrix in (4). See Figure 1 for the descriptions of x and y.

For objects likely to be on the floor, the height of the camera above the ground is the z position of the object. Also, if the platform roll can be measured or assumed negligible, then the combination of the infinity point with the height can be used to obtain the range to any object on the floor of the hallway. This same concept applies to objects which are likely to be on the same wall or the ceiling. By exploiting the geometry of the corners present in the corridor, our
Figure 4: A visual description of the environment as perceived by the infinity-point method. The figure labels the hallway width w, the camera height H, the angles φ, β, and ψ, the origin (0, 0, 0), and the points E_H1 = [l, d, −H] and E_H2 = [l, d − w, −H] with their camera-frame counterparts E_C1 = A^T · [l, d, −H] and E_C2 = A^T · [l, d − w, −H].
method computes the absolute range and bearing of the features, effectively turning them into landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
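The ranging chain described here, that is, the rotation matrix of (4) combined with the inverse projection of (9), can be sketched in a few lines of Python; this is an illustrative reconstruction under the zero-roll assumption stated earlier, not the flight code:

```python
import math

def rotation_matrix(psi, beta, phi=0.0):
    """Eq. (4): rotation between the camera and hallway frames from
    yaw psi, pitch beta, and roll phi (roll held near zero by the
    autopilot)."""
    c, s = math.cos, math.sin
    return [
        [c(psi) * c(beta), c(beta) * s(psi), -s(beta)],
        [c(psi) * s(phi) * s(beta) - c(phi) * s(psi),
         c(phi) * c(psi) + s(phi) * s(psi) * s(beta),
         c(beta) * s(phi)],
        [s(phi) * s(psi) + c(phi) * c(psi) * s(beta),
         c(phi) * s(psi) * s(beta) - c(psi) * s(phi),
         c(phi) * c(beta)],
    ]

def pixel_to_hallway_xy(u, v, f, a, z):
    """Eq. (9): hallway-frame x, y of a pixel (u, v) whose hallway
    z coordinate is known (e.g. camera height for floor points).
    `a` is the matrix from rotation_matrix(); a[i-1][j-1] corresponds
    to a_ij in the text."""
    denom = a[2][1] * u + a[2][2] * v + a[2][0] * f
    x = (a[0][1] * u + a[0][2] * v + a[0][0] * f) / denom * z
    y = (a[1][1] * u + a[1][2] * v + a[1][0] * f) / denom * z
    return x, y
```

With zero yaw, pitch, and roll, A is the identity and (9) collapses to the camera-frame coordinates (z f/v, u z/v) quoted above, which serves as a sanity check on the index mapping.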
The graph in Figure 6 illustrates the disagreement between the line-perspectives and the infinity-point method (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded a 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise; therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake or the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway line anomaly from the line extraction process, independent of ranging. In this particular case both the hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, this being one of its main advantages. Refer to Figure 7 for the demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters in between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of those conditions readily occur on an MAV (and most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is wide and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead-end, both ground lines tend to disappear simultaneously. Consequently, range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section we propose a turn-sensing algorithm to estimate ψ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV. A yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate ψ accurately enough to update the SLAM map correctly. This procedure combines machine vision with the data matching and dynamic estimation problem. For instance, if the MAV approaches a left-turn after exploring one leg of an "L" shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity (u, v) of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt) using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of a previous frame as I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt). By applying the Taylor series expansion

$$I(x, y, z, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \frac{\partial I}{\partial t}\delta t \quad (10)$$

then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step k.
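In place of the full pyramidal Lucas-Kanade tracker, the finite-difference sketch below conveys the quantity being recovered: each tracked feature's per-frame pixel velocity and its direction angle φ from image north. The convention that the pixel v axis grows downward is our assumption for illustration:

```python
import math

def helix_velocity(p_prev, p_curr, dt=1.0):
    """Per-step image-plane velocity of one tracked feature ('helix')
    and its direction phi from image north (pi/2 = east, pi = south,
    3*pi/2 = west), the quantities used by the helix bearing algorithm.

    p_prev, p_curr -- (u, v) pixel positions in consecutive frames,
                      with v increasing downward (an assumption here).
    """
    du = (p_curr[0] - p_prev[0]) / dt
    dv = (p_curr[1] - p_prev[1]) / dt
    speed = math.hypot(du, dv)
    phi = math.atan2(du, -dv) % (2.0 * math.pi)  # 0 = north, pi/2 = east
    return speed, phi
```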
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane, and each helix is associated with a velocity vector V_i = (v, φ)^T, where φ is the angular displacement of the velocity direction from the north of the image plane, where π/2 is east, π is south, and 3π/2 is west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
(1) Start from level L(0) = 0 and sequence m = 0
(2) Find d = min(h_a − h_b) in M, where h_a ≠ h_b
(3) m = m + 1; Ψ‴(k) = merge([h_a, h_b]); L(m) = d
(4) Delete from M the rows and columns corresponding to Ψ‴(k)
(5) Add to M a row and a column representing Ψ‴(k)
(6) if (∀h_i ∈ Ψ‴(k)) stop
(7) else go to (2)

Algorithm 1: Disjoint cluster identification from heat map M.
Figure 5: On-the-fly range measurements. Note the crosshair indicating the algorithm is currently using the infinity point for heading.
Figure 6: (a) Illustrates the accuracy of the two range measurement methods with respect to ground truth (flat line); range in meters versus sample number. (b) Residuals for the top figure, in meters.
instantaneous velocities is likely to belong on the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple h_i = (V_i, u_i, v_i)^T, where (u_i, v_i) represents the position of this particle on the image plane. At any given time (k), let Ψ be a set containing all these features on the image plane, such that Ψ(k) = {h_1, h_2, ..., h_n}. The z component of velocity as obtained in (10) is the determining factor for φ. Since we are most interested in the set of helices in which this component is minimized, Ψ(k) is resampled such that

$$\Psi'(k) = \left\{\forall h_i,\; \varphi \approx \frac{\pi}{2}\ \cup\ \varphi \approx \frac{3\pi}{2}\right\} \quad (11)$$
sorted in increasing velocity order. Ψ′(k) is then processed through histogram sorting to reveal the modal helix set, such that

$$\Psi''(k) = \max \begin{cases} \sum_{i=0}^{n} i, & \text{if } h_i = h_{i+1} \\ 0, & \text{else} \end{cases} \quad (12)$$
Ψ″(k) is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from Ψ(k) may not fit this model. An agglomerative hierarchical tree T is used to identify the clusters. To construct the tree, Ψ″(k) is heat mapped, represented as a symmetric matrix M, with respect to the Manhattan distance between each individual helix:

$$M = \begin{bmatrix} h_0 - h_0 & \cdots & h_0 - h_n \\ \vdots & \ddots & \vdots \\ h_n - h_0 & \cdots & h_n - h_n \end{bmatrix} \quad (13)$$

The algorithm to construct the tree from M is given in Algorithm 1.
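Algorithm 1 is, in effect, single-linkage agglomerative clustering over the Manhattan-distance heat map M of (13). The following Python sketch is our own illustrative reconstruction, with a distance cutoff standing in for the tree-cut criterion:

```python
def manhattan(a, b):
    # One entry of the heat map M in Eq. (13).
    return sum(abs(p - q) for p, q in zip(a, b))

def disjoint_clusters(helices, cutoff):
    """Single-linkage agglomeration in the spirit of Algorithm 1:
    repeatedly merge the closest pair of clusters until the closest
    remaining pair exceeds `cutoff`, the level at which the tree is
    cut."""
    clusters = [[h] for h in helices]
    while len(clusters) > 1:
        # Step (2): the minimum distance d between distinct clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(manhattan(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:           # cut the tree: merging stops here
            break
        # Steps (3)-(5): merge the pair and update the structure.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters
```

For example, four helix velocity vectors forming two tight pairs collapse into two clusters under a cutoff between the pair spacings.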
The tree should be cut at the sequence m such that m + 1 does not provide significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in Ψ‴(k) represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that Ψ(k + 1) = Ψ(k) + μ(Ψ‴(k)), as the best effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, and that world object of known dimensions dim = (x, y)^T and a cluster Ψ‴(k) may sufficiently coincide, cluster depth can be measured via f(dim/dim′), where dim is the actual object dimensions, f is the focal length, and dim′ represents the object dimensions on the image plane.
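Under the pinhole model, the depth recovery from such a coinciding known object is a one-liner; the door width and pixel values below are illustrative assumptions:

```python
def cluster_depth(f, dim_actual, dim_image):
    """Depth of a cluster coinciding with an object of known size:
    z = f * (dim / dim'), with f and dim_image in pixels and
    dim_actual in meters."""
    return f * dim_actual / dim_image

# A one-meter-wide door spanning 100 pixels under a 500-pixel focal
# length sits about 5 meters away.
```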
4. SLAM Formulation
Our previous experiments [16, 17] showed that due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm, estimating 200 samples of perfect 95 degree turns (calibrated with a digital protractor) performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian-per-second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines (labeled hallway lines; the turn rate is ω = (d/dt)θ). A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide estimation for the angular rate of the MAV during the turn.
the new measurement and old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration, such that the lower weight particles are removed and higher weight particles are replicated. This results in a cloud of random particles that tracks toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
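The weight-and-resample cycle just described can be sketched with multinomial resampling, one standard scheme (the paper does not commit to a particular resampling variant):

```python
import random

def resample(particles, weights):
    """Importance resampling: low-weight particles tend to be dropped
    and high-weight particles replicated, keeping the particle count
    constant."""
    return random.choices(particles, weights=weights, k=len(particles))
```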
The positions of landmarks are stored by the particles, such as Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2 × 2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps of (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3 × 3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:

$$x_v(k+1) = \begin{pmatrix} R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\ \Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\ v_B(k) + V_B \\ \omega(k) + \Omega \end{pmatrix} \quad (14)$$
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by

$$x_v(k+1) = \begin{pmatrix} \cos\theta_r(k)\, u_1(k) + x_r(k) \\ \sin\theta_r(k)\, u_1(k) + y_r(k) \\ u_2(k) + \theta_r(k) \end{pmatrix} + \gamma(k) \quad (15)$$
where $\gamma(k)$ is the linearized input signal noise, $u_1(k)$ is the forward speed, and $u_2(k)$ is the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index $i$) as follows:

$$
y_i = h(x) = \left(\sqrt{x_i^2 + y_i^2},\; \tan^{-1}\!\left[\pm\frac{y_i}{x_i}\right],\; \psi\right)^T
\tag{16}
$$
where the $\psi$ measurement is provided by the infinity-point method.

The measurement equation $h_i(x(k))$ can be related to the states of the vehicle and the $i$th corner (landmark) at each time stamp $(k)$ as given in (17), where $x_v(k) = (x_r(k), y_r(k), \theta_r(k))^T$ is the vehicle state vector of the 2D vehicle kinematic model:

$$
h_i(x(k)) = \begin{pmatrix}
\sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\[4pt]
\tan^{-1}\!\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\[4pt]
\theta_r(k)
\end{pmatrix}
\tag{17}
$$

where $x_{ci}$ and $y_{ci}$ denote the position of the $i$th landmark.
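The range-bearing observation of (17) can be sketched as follows. Note one hedge: the code below uses the conventional two-argument arctangent over landmark-minus-vehicle differences with angle wrapping, which is the standard numerically safe form of the bearing term rather than a literal transcription; the function name is illustrative:

```python
import numpy as np

def corner_measurement(x_r, y_r, theta_r, x_ci, y_ci):
    """Relative range and bearing of the i-th corner landmark as seen
    from the 2D vehicle pose (x_r, y_r, theta_r), per (17)."""
    dx, dy = x_ci - x_r, y_ci - y_r
    rng = np.hypot(dx, dy)                        # Euclidean range
    bearing = np.arctan2(dy, dx) - theta_r        # heading-relative bearing
    bearing = (bearing + np.pi) % (2 * np.pi) - np.pi   # wrap to (-pi, pi]
    return rng, bearing
```

For a vehicle at the origin facing along the $x$-axis, a landmark at $(1, 1)$ yields a range of $\sqrt{2}$ and a bearing of $\pi/4$.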
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (see Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks on the image plane appear to move along the ground lines as the MAV moves. Assume that $p^k_{(x,y)}$, $k = 0, 1, 2, 3, \ldots, n$ represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity $v_p$. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to the landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity $V_{B\max}$ is known, a pixel is expected to appear at

$$
p^{k+1}_{(x,y)} = f\!\left(p^{k}_{(x,y)} + (v_B + V_B)\,\Delta t\right)
\tag{18}
$$

where

$$
\sqrt{\left(p^{k+1}_{(x)} - p^{k}_{(x)}\right)^2 + \left(p^{k+1}_{(y)} - p^{k}_{(y)}\right)^2}
\tag{19}
$$

cannot be larger than $V_{B\max}\Delta t$, while $f(\cdot)$ is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time $k+1$ is to be associated with a landmark that appeared at time $k$ if and only if their pixel locations are within the association threshold; in other words, the association information from $k$ is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from $k$ when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration: the farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
Although representing the map as a tree-based data structure would in theory yield an association time of $O(N \log N)$, our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor $D_t$ consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system picks the region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and is triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor $D_0$ is collected; this helps the MAV recognize the starting location if it is revisited.

Every time a descriptor $D_t$ is recorded, it contains the current time $t$ in terms of frame number, the disorderly region $I_{x,y}$ of size $x \times y$, and the estimate of the position and orientation of the MAV at frame $t$. Thus, every time a turn is encountered, the system can check whether it happened before. For instance, if it indeed has happened at time $t = k$, where $t > k$, $D_k$ is compared with $D_t$ in terms of the descriptor and landmarks, and the map positions of the MAV at times $t$ and $k$ are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

$$
R(x,y) = \frac{\sum_{x',y'} \left(T(x',y') - I(x+x',\, y+y')\right)^2}
{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',\, y+y')^2}}
\tag{20}
$$
where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
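As a sketch, the normalized squared-difference score of (20) evaluated at a single alignment (the sliding over $(x, y)$ offsets is omitted here for brevity; the function name is illustrative):

```python
import numpy as np

def descriptor_distance(T, I_patch):
    """Normalized squared-difference score of (20) for two same-size
    patches: 0 for a perfect match, larger values for poorer matches."""
    T = T.astype(float)
    I_patch = I_patch.astype(float)
    num = np.sum((T - I_patch) ** 2)
    den = np.sqrt(np.sum(T ** 2) * np.sum(I_patch ** 2))
    return num / den if den > 0 else 1.0   # degenerate patch: worst score
```

An identical descriptor pair scores exactly 0, so thresholding this value gives a direct revisit test.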
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive $x$-axis; therefore, the width of the corridor is represented by the $y$-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and a Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian ($x, y, z$) position of the MAV in a hallway as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the $z$-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera, visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods marked with a dagger (†) are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of reaching high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problematic light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters in front of the lenses can minimize, or eliminate, most if not all reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue; the Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I3), the Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardós, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop on Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
Figure 4: A visual description of the environment as perceived by the infinity-point method, with the corner projections $E_{H1} = [l, d, -H]$, $E_{C1} = A^T \cdot [l, d, -H]$, $E_{H2} = [l, d - w, -H]$, and $E_{C2} = A^T \cdot [l, d - w, -H]$, the hallway width $w$, the height $H$, the angles $\phi$, $\beta$, $\psi$, and the camera origin at $(0, 0, 0)$.
method computes the absolute range and bearing of the features, effectively turning them into landmarks needed for the SLAM formulation. See Figure 5, which illustrates the final appearance of the ranging algorithm.
The graph in Figure 6 illustrates the disagreement between the line-perspectives and the infinity-point method (Section 2.3) in an experiment in which both algorithms executed simultaneously on the same video feed. With the particular camera we used in the experiments (Logitech C905), the infinity-point method yielded 93% accuracy. These numbers are functions of camera resolution, camera noise, and the consequent line extraction noise; therefore, disagreements not exceeding 0.5 meters are in its favor with respect to accuracy. Disagreements from the ground truth include all transient measurement errors, such as camera shake, the occasional introduction of moving objects that deceptively mimic the environment, and other anomalies. The divergence between the two ranges that is visible between samples 20 and 40 in Figure 6 is caused by a hallway line anomaly from the line extraction process, independent of ranging. In this particular case, both hallway lines have shifted, causing the infinity point to move left. Horizontal translations of the infinity point have a minimal effect on the measurement performance of the infinity-point method, which is one of its main advantages. Refer to Figure 7 for a demonstration of the performance of these algorithms in a wide variety of environments.
The bias between the two measurements shown in Figure 6 is due to shifts in camera calibration parameters between different experiments. Certain environmental factors have dramatic effects on lens precision, such as acceleration, corrosive atmosphere, acoustic noise, fluid contamination, low pressure, vibration, ballistic shock, electromagnetic radiation, temperature, and humidity. Most of those conditions readily occur on an MAV (and most other platforms, including the human body) due to parts rotating at high speeds, powerful air currents, static electricity, radio interference, and so on. The autocalibration concept is broad and beyond the scope of this paper. We present a novel mathematical procedure that addresses the issue of maintaining monocular camera calibration automatically in hostile environments in another paper of ours, and we encourage the reader to refer to it [22].
3. Helix Bearing Algorithm
When the MAV approaches a turn, an exit, a T-section, or a dead-end, both ground lines tend to disappear simultaneously. Consequently, the range and heading measurement methods cease to function. A set of features might still be detected, and the MAV can make a confident estimate of their spatial pose. However, in the absence of depth information, a one-dimensional probability density over the depth is represented by a two-dimensional particle distribution.
In this section, we propose a turn-sensing algorithm to estimate $\psi$ in the absence of orthogonality cues. This situation automatically triggers the turn-exploration mode in the MAV: a yaw rotation of the body frame is initiated until another passage is found. The challenge is to estimate $\psi$ accurately enough to update the SLAM map correctly. This procedure combines machine vision with the data matching and dynamic estimation problem. For instance, if the MAV approaches a left turn after exploring one leg of an "L"-shaped hallway, turns left 90 degrees, and continues through the next leg, the map is expected to display two hallways joined at a 90-degree angle. Similarly, a 180-degree turn before finding another hallway would indicate a dead end. This way, the MAV can also determine where turns are located the next time they are visited.
The new measurement problem at turns is to compute the instantaneous velocity $(u, v)$ of every helix (moving feature) that the MAV is able to detect, as shown in Figure 9. In other words, an attempt is made to recover $V(x, y, t) = (u(x, y, t), v(x, y, t)) = (dx/dt, dy/dt)$ using a variation of the pyramidal Lucas-Kanade method. This recovery leads to a 2D vector field obtained via perspective projection of the 3D velocity field onto the image plane. At discrete time steps, the next frame is defined as a function of a previous frame as $I_{t+1}(x, y, z, t) = I_t(x + dx, y + dy, z + dz, t + dt)$. Applying the Taylor series expansion,

$$I(x, y, z, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \frac{\partial I}{\partial t}\delta t, \qquad (10)$$

and then differentiating with respect to time, the helix velocity is obtained in terms of pixel distance per time step $k$.
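To illustrate the brightness-constancy step behind (10), a minimal single-window least-squares flow estimate can be sketched as follows. This is a simplification of the pyramidal Lucas-Kanade method used by the paper, not the authors' implementation; the function name and the synthetic frames are ours.

```python
import numpy as np

def lk_flow(prev, curr):
    """Estimate one translational flow vector (u, v) for a whole patch
    by solving the least-squares form of Ix*u + Iy*v = -It."""
    Iy, Ix = np.gradient(prev)   # spatial gradients (rows -> y, cols -> x)
    It = curr - prev             # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# synthetic smooth patch translated one pixel along +x
coords = np.arange(64, dtype=float)
X, Y = np.meshgrid(coords, coords)
frame0 = np.exp(-((X - 32.0) ** 2 + (Y - 32.0) ** 2) / 200.0)
frame1 = np.roll(frame0, 1, axis=1)

u, v = lk_flow(frame0, frame1)
```

A full pyramidal scheme would repeat this estimate per feature window across image scales; here the single global window is enough to recover the one-pixel shift approximately.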
At this point, each helix is assumed to be identically distributed and independently positioned on the image plane, and each helix is associated with a velocity vector $V_i = (v, \varphi)^T$, where $\varphi$ is the angular displacement of the velocity direction from the north of the image plane, where $\pi/2$ is east, $\pi$ is south, and $3\pi/2$ is west. Although the associated depths of the helix set appearing at stochastic points on the image plane are unknown, assuming a constant, there is a relationship between the distance of a helix from the camera and its instantaneous velocity on the image plane. This suggests that a helix cluster with respect to closeness of individual
Journal of Electrical and Computer Engineering 7
(1) Start from level $L(0) = 0$ and sequence $m = 0$.
(2) Find $d = \min(h_a - h_b)$ in $M$, where $h_a \neq h_b$.
(3) $m = m + 1$; $\Psi'''(k) = \mathrm{merge}([h_a, h_b])$; $L(m) = d$.
(4) Delete from $M$ the rows and columns corresponding to $\Psi'''(k)$.
(5) Add to $M$ a row and a column representing $\Psi'''(k)$.
(6) If $(\forall h_i \in \Psi'''(k))$, stop.
(7) Else, go to step (2).

Algorithm 1: Disjoint cluster identification from heat map $M$.
Figure 5: On-the-fly range measurements. Note the crosshair indicating that the algorithm is currently using the infinity point for heading.
Figure 6: (a) Accuracy of the two range-measurement methods with respect to ground truth (flat line); the infinity-point method is shown. (b) Residuals for the top figure.
instantaneous velocities is likely to belong to the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple $h_i = (V_i, u_i, v_i)^T$, where $(u_i, v_i)$ represents the position of this particle on the image plane. At any given time $(k)$, let $\Psi$ be a set containing all these features on the image plane, such that $\Psi(k) = \{h_1, h_2, \ldots, h_n\}$. The $z$ component of velocity, as obtained in (10), is the determining factor for $\varphi$. Since we are most interested in the set of helixes in which this component is minimized, $\Psi(k)$ is resampled such that

$$\Psi'(k) = \left\{\forall h_i, \; \varphi \approx \frac{\pi}{2} \cup \varphi \approx \frac{3\pi}{2}\right\}, \qquad (11)$$
sorted in increasing velocity order. $\Psi'(k)$ is then processed through histogram sorting to reveal the modal helix set, such that

$$\Psi''(k) = \max \begin{cases} \displaystyle\sum_{i=0}^{n} i, & \text{if } (h_i = h_{i+1}) \\ 0, & \text{else.} \end{cases} \qquad (12)$$
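The direction filter of (11) followed by the modal selection of (12) can be sketched as below. This is our simplified illustration: the angular tolerance and histogram bin width are assumed values, not parameters reported in the paper.

```python
import numpy as np

def modal_helix_set(speeds, directions, tol=0.3, bin_width=0.5):
    """Keep helixes moving roughly east/west on the image plane
    (phi near pi/2 or 3*pi/2), then return the modal speed cluster."""
    phi = np.asarray(directions)
    v = np.asarray(speeds, dtype=float)
    keep = (np.abs(phi - np.pi / 2) < tol) | (np.abs(phi - 3 * np.pi / 2) < tol)
    v = np.sort(v[keep])                        # Psi'(k), increasing speed
    if v.size == 0:
        return v
    bins = np.arange(v.min(), v.max() + 2 * bin_width, bin_width)
    counts, edges = np.histogram(v, bins=bins)
    m = int(np.argmax(counts))                  # modal bin -> Psi''(k)
    return v[(v >= edges[m]) & (v < edges[m + 1])]

speeds = [1.0, 1.1, 1.2, 5.0, 1.05, 9.0]
dirs = [np.pi / 2, np.pi / 2, 3 * np.pi / 2, np.pi / 2, np.pi / 2, np.pi]
mode = modal_helix_set(speeds, dirs)  # the last helix (phi = pi) is filtered out
```

The four slow helixes near 1 m/s survive as the modal set, while the outlier speeds fall into sparsely populated bins.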
$\Psi''(k)$ is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from $\Psi(k)$ may not fit this model. An agglomerative hierarchical tree $T$ is used to identify the clusters. To construct the tree, $\Psi''(k)$ is heat mapped, represented as a symmetric matrix $M$, with respect to the Manhattan distance between each individual helix:
$$M = \begin{bmatrix} \|h_0 - h_0\| & \cdots & \|h_0 - h_n\| \\ \vdots & \ddots & \vdots \\ \|h_n - h_0\| & \cdots & \|h_n - h_n\| \end{bmatrix}. \qquad (13)$$
The algorithm to construct the tree from $M$ is given in Algorithm 1.
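A pure-Python sketch in the spirit of Algorithm 1 follows: repeatedly merge the two closest clusters of the distance matrix, recording the merge level $L(m)$ at each sequence $m$. The single-linkage choice, the helper name, and the 1-D velocity inputs are our illustration, not the authors' code.

```python
def build_tree(points):
    """Agglomerative merging over pairwise distances, as in Algorithm 1.
    Returns the list of merge levels L(1), ..., L(m)."""
    clusters = [[i] for i in range(len(points))]
    levels = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single-linkage Manhattan distance between two clusters
                d = min(abs(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]       # merge([h_a, h_b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        levels.append(d)                         # L(m) = d
    return levels

# two well-separated velocity clusters: merges inside each are cheap,
# the final cross-cluster merge is expensive, so the tree is cut there
lv = build_tree([1.0, 1.1, 1.2, 8.0, 8.1])
```

The jump in the last merge level is exactly the cue the text uses for cutting the tree at sequence $m$.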
The tree should be cut at the sequence $m$ such that $m + 1$ does not provide a significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors: office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in $\Psi'''(k)$ represents the largest planar object in the field of view, with the most consistent rate of pixel displacement in time. The system is updated such that $\Psi(k + 1) = \Psi(k) + \mu(\Psi'''(k))$ as the best-effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, and that world object of known dimensions $\dim = (x, y)^T$ and a cluster $\Psi'''(k)$ sufficiently coincide, the cluster depth can be measured via $\dim(f/\dim')$, where $\dim$ is the actual object dimensions, $f$ is the focal length, and $\dim'$ represents the object dimensions on the image plane.
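For a pinhole model, the depth of a cluster coinciding with a known-size object follows directly from the relation above; a one-line sketch, where the focal length and pixel span are illustrative values:

```python
def cluster_depth(real_dim_m, image_dim_px, focal_px):
    """Depth = f * dim / dim' for an object of known real size,
    e.g., a nominally one-meter-wide door."""
    return focal_px * real_dim_m / image_dim_px

# a one-meter-wide door spanning 100 px with f = 500 px
depth = cluster_depth(1.0, 100.0, 500.0)  # -> 5.0 meters
```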
4. SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm, estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor), performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements, along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide an estimate of the angular rate of the MAV during the turn.
the new measurement and the old position of the feature, are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration, such that the lower-weight particles are removed and higher-weight particles are replicated. This results in a cloud of random particles that tracks toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
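The weight-and-resample cycle described above can be sketched with a generic low-variance (systematic) resampler. This is a standard technique for Rao-Blackwellized particle filters, not the authors' exact implementation; the weights below are illustrative.

```python
import random

def resample(particles, weights):
    """Low-variance resampling: high-weight particles are replicated,
    low-weight particles are dropped; the particle count is preserved."""
    n = len(particles)
    step = sum(weights) / n
    r = random.uniform(0.0, step)   # single random offset for all draws
    out, c, i = [], weights[0], 0
    for m in range(n):
        u = r + m * step
        while u > c:
            i += 1
            c += weights[i]
        out.append(particles[i])
    return out

random.seed(0)
parts = ['A', 'B', 'C', 'D']
new_parts = resample(parts, [0.97, 0.01, 0.01, 0.01])
```

With one particle carrying nearly all the weight, resampling replicates it across the whole set, which is exactly the "cloud of particles tracking toward the best estimate" behavior described above.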
The positions of landmarks are stored by the particles, such as $\mathrm{Par}_n = (X_L^T, P)$, where $X_L = (x_{ci}, y_{ci})$ and $P$ is the $2 \times 2$ covariance matrix for the particular Kalman filter contained by $\mathrm{Par}_n$. The 6DOF vehicle state vector $x_v$ can be updated in discrete time steps of $(k)$, as shown in (14), where $R = (x_r, y_r, H)^T$ is the position in the inertial frame, from which the velocity in the inertial frame can be derived as $\dot{R} = v_E$. The vector $v_B = (v_x, v_y, v_z)^T$ represents the linear velocity of the body frame, and $\omega = (p, q, r)^T$ represents the body angular rate. $\Gamma = (\phi, \theta, \psi)^T$ is the Euler angle vector, and $L_{EB}$ is the Euler angle transformation matrix for $(\phi, \theta, \psi)$. The $3 \times 3$ matrix $T$ converts $(p, q, r)^T$ to $(\dot{\phi}, \dot{\theta}, \dot{\psi})$. At every step, the MAV is assumed to experience unknown linear and angular accelerations, $V_B = a_B \Delta t$ and $\Omega = \alpha_B \Delta t$, respectively.
$$x_v(k + 1) = \begin{pmatrix} R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\ \Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\ v_B(k) + V_B \\ \omega(k) + \Omega \end{pmatrix}. \qquad (14)$$
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter simultaneously locates the landmarks and updates the vehicle states $x_r, y_r, \theta_r$,
described by
$$x_v(k + 1) = \begin{pmatrix} \cos\theta_r(k)\, u_1(k) + x_r(k) \\ \sin\theta_r(k)\, u_1(k) + y_r(k) \\ u_2(k) + \theta_r(k) \end{pmatrix} + \gamma(k), \qquad (15)$$
where $\gamma(k)$ is the linearized input signal noise, $u_1(k)$ is the forward speed, and $u_2(k)$ is the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index $i$) as follows:
$$y_i = h(x) = \left( \sqrt{x_i^2 + y_i^2}, \; \tan^{-1}\left[\pm\frac{y_i}{x_i}\right], \; \psi \right)^T, \qquad (16)$$
where the $\psi$ measurement is provided by the infinity-point method.
This measurement equation $h_i(x(k))$ can be related to the states of the vehicle and the $i$th corner (landmark) at each time stamp $(k)$, as given in (17), where $x_v(k) = (x_r(k), y_r(k), \theta_r(k))^T$ is the vehicle state vector of the 2D vehicle kinematic model:
$$h_i(x(k)) = \begin{pmatrix} \sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\ \tan^{-1}\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\ \theta_r \end{pmatrix}, \qquad (17)$$
where $x_{ci}$ and $y_{ci}$ denote the position of the $i$th landmark.
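The planar motion update (15) and the observation model (17) can be sketched together as follows. The noise term $\gamma(k)$ is omitted, the time step is folded into the inputs, and the bearing uses the conventional `atan2` argument order, so sign conventions are ours rather than a literal transcription of (17).

```python
import math

def propagate(x, y, theta, u1, u2, dt=1.0):
    """Planar kinematic update per (15): forward speed u1, turn rate u2."""
    return (x + math.cos(theta) * u1 * dt,
            y + math.sin(theta) * u1 * dt,
            theta + u2 * dt)

def observe(xr, yr, theta_r, xc, yc):
    """Predicted range and bearing of landmark (xc, yc) from the
    vehicle pose (xr, yr, theta_r), in the spirit of (17)."""
    dx, dy = xc - xr, yc - yr
    return math.hypot(dx, dy), math.atan2(dy, dx) - theta_r

# drive forward 1.5 m along +x, then observe a landmark at (3, 4)
x, y, th = propagate(0.0, 0.0, 0.0, 1.5, 0.0)
rng, brg = observe(x, y, th, 3.0, 4.0)
```

In the particle filter, each particle would carry its own pose through `propagate` and weight itself by comparing `observe` predictions against the measured range and bearing.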
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity, but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution, which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane and move along the ground lines as the MAV moves. Assume that $p^k_{(x,y)}$, $k = 0, 1, 2, 3, \ldots, n$, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity $v_p$. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to the landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity $V_{B\max}$ is known, a pixel is expected to appear at
$$p^{k+1}_{(x,y)} = f\left(p^{k}_{(x,y)} + (v_B + V_B)\,\Delta t\right), \qquad (18)$$

where

$$\sqrt{\left(p^{k+1}_{(x)} - p^{k}_{(x)}\right)^2 + \left(p^{k+1}_{(y)} - p^{k}_{(y)}\right)^2} \qquad (19)$$
cannot be larger than $V_{B\max}\Delta t$, while $f(\cdot)$ is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time $k + 1$ is to be associated with a landmark that has appeared at time $k$ if and only if their pixel locations are within the association threshold. In other words, the association information from $k$ is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from $k$ when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster; thus, the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
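The pixel-neighborhood gate of (18) and (19) reduces to a distance check on the image plane; a sketch, where the gate threshold and pixel coordinates are illustrative values:

```python
import math

def associate(prev_pixels, new_pixel, max_disp_px):
    """Return the index of the time-k landmark whose pixel lies within
    the gate, or None to register the measurement as a new landmark."""
    best_i, best_d = None, max_disp_px
    for i, (px, py) in enumerate(prev_pixels):
        d = math.hypot(new_pixel[0] - px, new_pixel[1] - py)
        if d <= best_d:
            best_i, best_d = i, d
    return best_i

tracked = [(100, 240), (400, 250)]                      # pixels at time k
idx = associate(tracked, (104, 242), max_disp_px=15)    # within the gate
fresh = associate(tracked, (250, 240), max_disp_px=15)  # exceeds the gate
```

A measurement inside the gate reuses the association from time $k$; one outside it is treated as a not-before-seen landmark, avoiding a search of the global map.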
Although representing the map as a tree-based data structure would, in theory, yield an association time of $O(N \log N)$, our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine if two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor $D_t$ consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick a region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and is triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor $D_0$ is collected; this helps the MAV recognize the starting location if it is revisited.
Every time a descriptor $D_t$ is recorded, it contains the current time $t$ in terms of frame number, the disorderly region $I_{x,y}$ of size $x \times y$, and the estimate of the position and orientation of the MAV at frame $t$. Thus, every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed has happened at time $t = k$, where $t > k$, $D_k$ is compared with $D_t$ in terms of descriptor and landmarks, and the map positions of the MAV at times $t$ and $k$ are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as
$$R(x, y) = \frac{\sum_{x', y'} \left(T(x', y') - I(x + x', y + y')\right)^2}{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x + x', y + y')^2}}, \qquad (20)$$
where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
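Equation (20) is a normalized sum of squared differences between the template $T$ and the image patch; a direct sketch with a toy 2x2 patch as an illustrative input:

```python
import numpy as np

def match_score(T, I_patch):
    """Normalized SSD of (20): 0 for a perfect match, larger otherwise."""
    num = np.sum((T - I_patch) ** 2)
    den = np.sqrt(np.sum(T ** 2) * np.sum(I_patch ** 2))
    return num / den

T = np.array([[1.0, 2.0], [3.0, 4.0]])
perfect = match_score(T, T)             # identical patches
poor = match_score(T, T[::-1, ::-1])    # scrambled patch
```

Sliding this score over candidate offsets $(x, y)$ and keeping the minimum gives the descriptor comparison used for revisit detection.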
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive $x$-axis; therefore, the width of the corridor is represented by the $y$-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm, with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian $(x, y, z)$ position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm, with time-varying altitude. The altitude is represented by the $z$-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods marked with a dagger are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud, with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80-90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes, but it is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations, while successfully performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases, and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable, high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problem light source. Further reduction in contrast is possible
if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I3), the Department of Aerospace Engineering and the Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106-154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29-31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339-2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846-849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593-600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned...Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
(1) Start from level L(0) = 0 and sequence m = 0.
(2) Find d = min(‖h_a − h_b‖) in M, where h_a ≠ h_b.
(3) m = m + 1; Ψ‴(k) = merge([h_a, h_b]); L(m) = d.
(4) Delete from M the rows and columns corresponding to Ψ‴(k).
(5) Add to M a row and a column representing Ψ‴(k).
(6) If (∀h_i ∈ Ψ‴(k)) stop,
(7) else go to (2).

Algorithm 1: Disjoint cluster identification from heat map M.
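The merge loop of Algorithm 1 can be sketched in plain Python. This is an illustrative single-linkage implementation over Manhattan distances between helix triples (V, u, v); the function names and the explicit cutoff parameter are our own, standing in for the tree-cut level the text describes, not the paper's actual code:

```python
# Sketch of Algorithm 1: agglomerative merging over a distance matrix M.
# Each helix is a triple (V, u, v); distances are Manhattan distances.

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def disjoint_clusters(helices, cutoff):
    """Merge helices bottom-up; stop when the closest pair exceeds cutoff."""
    clusters = [[h] for h in helices]          # every helix starts alone
    while len(clusters) > 1:
        # Step (2): closest pair of distinct clusters (single linkage).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(manhattan(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:                         # tree cut: no modeling benefit
            break
        # Steps (3)-(5): merge the pair, replacing its two rows/columns.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

For example, two nearby helices and one distant one yield two clusters at a cutoff of 5.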
Figure 5: On-the-fly range measurements. Note the crosshair indicating the algorithm is currently using the infinity point for heading.
Figure 6: (a) Illustrates the accuracy of the two range measurement methods with respect to ground truth (flat line); the panel plots range (m) against sample number for the infinity-point method. (b) Residuals (m) for the top figure.
instantaneous velocities is likely to belong on the surface of one planar object, such as a door frame. Let a helix with a directional velocity be the triple h_i = (V_i, u_i, v_i)^T, where (u_i, v_i) represents the position of this particle on the image plane. At any given time (k), let Ψ be a set containing all these features on the image plane, such that Ψ(k) = {h_1, h_2, ..., h_n}. The z component of velocity, as obtained in (10), is the determining factor for φ. Since we are most interested in the set of helices in which this component is minimized, Ψ(k) is resampled such that

Ψ′(k) = {∀h_i : φ ≈ π/2 ∪ φ ≈ 3π/2},  (11)

sorted in increasing velocity order. Ψ′(k) is then processed through histogram sorting to reveal the modal helix set, such that

Ψ″(k) = max { Σ_{i=0}^{n} i, if (h_i = h_{i+1}); 0, otherwise }.  (12)

Ψ″(k) is likely to contain clusters that tend to be distributed with respect to objects in the scene, whereas the rest of the initial helix set from Ψ(k) may not fit this model. An agglomerative hierarchical tree T is used to identify the clusters. To construct the tree, Ψ″(k) is heat mapped, represented as a symmetric matrix M, with respect to the Manhattan distance between each individual pair of helices:

M = [ ‖h_0 − h_0‖  ⋯  ‖h_0 − h_n‖
            ⋮      ⋱      ⋮
      ‖h_n − h_0‖  ⋯  ‖h_n − h_n‖ ].  (13)

The algorithm to construct the tree from M is given in Algorithm 1.
The tree should be cut at the sequence m such that m + 1 does not provide a significant benefit in terms of modeling
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors, office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in Ψ‴(k) represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that Ψ(k + 1) = Ψ(k) + μ(Ψ‴(k)) as the best-effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, and that world object of known dimensions dim = (x, y)^T and a cluster Ψ‴(k) sufficiently coincide, cluster depth can be measured via f · dim/dim′, where dim is the actual object dimensions, f is the focal length, and dim′ represents the object dimensions on the image plane.
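As a small illustration of this known-object ranging, the pinhole relation depth = f · dim/dim′ can be computed directly; the meter-wide door and 700-pixel focal length below are made-up example values, not measurements from the paper:

```python
# Pinhole depth from a known object size: z = f * dim / dim_prime,
# where f is the focal length in pixels, dim the true object width (m),
# and dim_prime the measured width on the image plane (pixels).

def depth_from_known_size(f_px, true_width_m, pixel_width):
    return f_px * true_width_m / pixel_width

# A nominally meter-wide door imaged 100 px wide by a 700 px focal-length camera:
z = depth_from_known_size(700.0, 1.0, 100.0)   # 7.0 m
```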
4. SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor), performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements, along with
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines (the hallway lines and the angular rate ω = (d/dt)θ are annotated in the figure). A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide an estimation of the angular rate of the MAV during the turn.
the new measurement and old position of the feature, are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled at every iteration, such that the lower-weight particles are removed and the higher-weight particles are replicated. This results in a cloud of random particles that tracks towards the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
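The weight-and-resample step described above can be sketched as follows. This is a minimal illustration assuming a Gaussian weight on a scalar measurement innovation and plain multinomial resampling; the paper's exact noise model and particle structure are not reproduced here:

```python
import math
import random

# Hypothetical weight model: Gaussian likelihood of a scalar innovation
# (difference between predicted and measured landmark position).
def weight(innovation, sigma):
    return math.exp(-0.5 * (innovation / sigma) ** 2)

def resample(particles, weights, rng=random.Random(0)):
    """Multinomial resampling: replicate high-weight particles, drop low ones."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return [particles[_pick(probs, rng.random())] for _ in particles]

def _pick(probs, u):
    # Walk the cumulative distribution until it passes the random draw u.
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1
```

With all the weight on one particle, resampling returns only copies of that particle, which is exactly the "replicate high, remove low" behavior described.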
The positions of landmarks are stored by the particles, such as Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2 × 2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps of (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3 × 3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:

x_v(k + 1) = ( R(k) + L_EB(φ, θ, ψ)(v_B + V_B)Δt
               Γ(k) + T(φ, θ, ψ)(ω + Ω)Δt
               v_B(k) + V_B
               ω(k) + Ω ).  (14)
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by

x_v(k + 1) = ( cos θ_r(k) u_1(k) + x_r(k)
               sin θ_r(k) u_1(k) + y_r(k)
               u_2(k) + θ_r(k) ) + γ(k),  (15)
where γ(k) is the linearized input signal noise, u_1(k) is the forward speed, and u_2(k) is the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index i) as follows:
y_i = h(x) = ( √(x_i² + y_i²), tan⁻¹[±y_i/x_i], ψ )^T,  (16)

where the ψ measurement is provided by the infinity-point method.
This measurement equation h_i(x(k)) can be related to the states of the vehicle and the ith corner (landmark) at each time stamp (k), where x_v(k) = (x_r(k), y_r(k), θ_r(k))^T is the vehicle state vector of the 2D vehicle kinematic model, as given in (17):

h_i(x(k)) = ( √((x_r(k) − x_ci(k))² + (y_r(k) − y_ci(k))²)
              tan⁻¹((y_r(k) − y_ci(k))/(x_r(k) − x_ci(k))) − θ_r(k)
              θ_r ),  (17)

where x_ci and y_ci denote the position of the ith landmark.
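Equations (15) and (17) can be sketched together as the predict and measure steps applied to one particle. This is a simplified illustration under stated assumptions: the noise terms γ(k) are omitted, the time step is absorbed into u_1 and u_2 as in (15), and the bearing is written with the standard atan2 convention rather than the printed quotient form:

```python
import math

# State is (x_r, y_r, theta_r); u1 is forward speed per step, u2 the
# angular increment; (x_ci, y_ci) is a landmark position.

def predict(state, u1, u2):
    """Propagate the 2D vehicle state one step, as in (15), noise omitted."""
    x, y, th = state
    return (x + math.cos(th) * u1,
            y + math.sin(th) * u1,
            th + u2)

def measure(state, landmark):
    """Expected range and bearing of a landmark from the vehicle, as in (17)."""
    x, y, th = state
    xc, yc = landmark
    rng = math.hypot(xc - x, yc - y)
    bearing = math.atan2(yc - y, xc - x) - th
    return rng, bearing
```

In a Rao-Blackwellized filter, each particle would run `predict` on its own state and compare `measure` against the observed range-bearing pair to form its weight.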
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane to move along the ground lines as the MAV moves. Assume that p^k_(x,y), k = 0, 1, 2, 3, ..., n, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity V_Bmax is known, a pixel is expected to appear at

p^(k+1)_(x,y) = f(p^k_(x,y) + (v_B + V_B)Δt),  (18)

where

√((p^(k+1)_x − p^k_x)² + (p^(k+1)_y − p^k_y)²)  (19)

cannot be larger than V_Bmax Δt, while f(·) is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time k + 1 is to be associated with a landmark that has appeared at time k if and only if their pixel locations are within the association threshold. In other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has an improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster; thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
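The association gate described above reduces to a nearest-neighbor test in pixel space: match a detection to the closest predicted landmark pixel within the maximum displacement, else declare a new landmark. A minimal sketch, with an illustrative threshold and coordinates of our own choosing:

```python
import math

# new_px: detected pixel (x, y) at time k+1.
# known_px: predicted landmark pixels carried over from time k.
# max_disp: V_Bmax * dt projected to the image plane (association threshold).

def associate(new_px, known_px, max_disp):
    """Return index of the matched landmark, or None for a new landmark."""
    best_i, best_d = None, max_disp
    for i, (x, y) in enumerate(known_px):
        d = math.hypot(new_px[0] - x, new_px[1] - y)
        if d <= best_d:
            best_i, best_d = i, d
    return best_i
```

A detection two pixels from a known landmark matches it; a detection far outside the gate returns `None` and would be registered as new.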
Although representing the map as a tree-based data structure would, in theory, yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing transformation invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine if two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick a region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor D_0 is collected; this helps the MAV in recognizing the starting location if it is revisited.

Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_(x,y) of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check if it happened before. For instance, if it indeed has happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, it means the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

R(x, y) = Σ_{x′,y′} (T(x′, y′) − I(x + x′, y + y′))² / √(Σ_{x′,y′} T(x′, y′)² · Σ_{x′,y′} I(x + x′, y + y′)²),  (20)

where a perfect match is 0 and poor matches are represented by larger values up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
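Equation (20) is the normalized squared-difference score; a direct pure-Python transcription for a single offset (x, y) is sketched below. Nested lists stand in for image arrays; OpenCV's `cv2.matchTemplate` with `TM_SQDIFF_NORMED` computes the same quantity over all offsets:

```python
import math

# T: descriptor template (2D list), I: image (2D list), (x, y): patch offset.
# Returns 0.0 for a perfect match; larger values indicate poorer matches.

def sqdiff_normed(T, I, x, y):
    num = den_t = den_i = 0.0
    for yp in range(len(T)):
        for xp in range(len(T[0])):
            t = T[yp][xp]
            i = I[y + yp][x + xp]
            num += (t - i) ** 2      # squared difference term
            den_t += t ** 2          # template energy
            den_i += i ** 2          # patch energy
    return num / math.sqrt(den_t * den_i)
```

Comparing a descriptor against the exact patch it came from returns 0, while any mismatch returns a strictly positive score.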
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.
The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed at the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm, with time-varying altitude (axes: hallway length (m), hallway width (m), and altitude (m)). The altitude is represented by the z-axis, and it is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per one video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods highlighted with † are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes, but it is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular camera based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors provechallenging to our proposed method bright lights reflectivesurfaces haze and shadows These artifacts introduce twomain problems (1) they can alter chromatic clarity localmicrocontrast and exposure due to their unpredictable high-energy nature and (2) they can appear as false objectsespeciallywhen there is bloom surrounding objects in front ofproblem light source Further reduction in contrast is possible
if scattering particles in the air are dense We have come toobserve that preventative and defensive approaches to suchissues are promising Antireflective treatment on lenses canreduce light bouncing off of the lens and programming theaircraft to move for a very small distance upon detection ofglare can eliminate the unwanted effects Innovative andadaptive application of servo-controlled filters before thelenses can minimize or eliminate most if not all reflectionsThe light that causes glare is elliptically polarized due tostrong phase correlation This is as opposed to essential lightwhich is circularly polarized Filters can detect and blockpolarized light from entering the camera thereby blockingunwanted effects Application of purpose designed digitalimaging sensors that do not involve a Bayes filter can alsohelp Most of the glare occurs in green light region andtraditional digital imaging sensors have twice as many greenreceptors as red and blue Bayes design has been inspiredfrom human eye which sees green better as green is themost structurally descriptive light for edges and cornersThispaper has supplementary material (see Supplementary Mate-rial available online at httpdxdoiorg1011552013374165)available from the authors which show experimental resultsof the paper
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), Information Infrastructure Institute (I3), Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
Journal of Electrical and Computer Engineering 15
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. Davison, M. Nicholas, and S. Olivier, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
Figure 7: While we emphasize hallway-like indoor environments, our range measurement strategy is compatible with a variety of other environments, including outdoors: office environments, ceilings, sidewalks, and building sides, where orthogonality in architecture is present. A minimum of one perspective line and one feature intersection is sufficient.
the clusters. After this step, the set of velocities in \(\Psi'''(k)\) represents the largest planar object in the field of view with the most consistent rate of pixel displacement in time. The system is updated such that \(\Psi(k+1) = \Psi(k) + \mu(\Psi'''(k))\) as the best-effort estimate, as shown in Figure 8.
It is a future goal to improve the accuracy of this algorithm by exploiting known properties of typical objects. For instance, single doors are typically a meter wide. It is trivial to build an internal object database with templates for typical consistent objects found indoors. If such an object of interest could be identified by an arbitrary object detection algorithm, and that world object of known dimensions \(\dim = (x, y)^T\) and a cluster \(\Psi'''(k)\) sufficiently coincide, cluster depth can be measured via \(f \dim / \dim'\), where \(\dim\) is the actual object dimension, \(f\) is the focal length, and \(\dim'\) represents the object dimension on the image plane.
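As a sketch of the relation above (the pinhole model \(z = f \dim / \dim'\)), assuming a hypothetical door template of known width; the function and parameter names here are ours, not the paper's:

```python
def depth_from_known_object(f_pixels, real_width_m, image_width_px):
    """Pinhole-camera depth estimate: z = f * dim / dim', where dim is the
    known real-world width and dim' its projection on the image plane."""
    if image_width_px <= 0:
        raise ValueError("object must span at least one pixel")
    return f_pixels * real_width_m / image_width_px

# A door cluster (assumed ~1 m wide, as suggested in the text) spanning
# 200 px under an assumed 800 px focal length lies about 4 m ahead.
z = depth_from_known_object(f_pixels=800.0, real_width_m=1.0, image_width_px=200.0)
```

The focal length and pixel span are illustrative values only; in practice both come from camera calibration and the detected cluster extent.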
4 SLAM Formulation
Our previous experiments [16, 17] showed that, due to the highly nonlinear nature of the observation equations, traditional nonlinear observers such as the EKF do not scale to SLAM in larger environments containing a vast number of potential landmarks. Measurement updates in the EKF require quadratic time complexity due to the covariance matrix, rendering the data association increasingly difficult as the
Figure 8: This graph illustrates the accuracy of the Helix bearing algorithm, estimating 200 samples of perfect 95-degree turns (calibrated with a digital protractor) performed at various locations with increasing clutter, at random angular rates not exceeding 1 radian per second, in the absence of known objects.
map grows. An MAV with limited computational resources is particularly impacted by this complexity behavior. SLAM utilizing a Rao-Blackwellized particle filter, similar to [23], is a dynamic Bayesian approach to SLAM, exploiting the conditional independence of measurements. A random set of particles is generated using the noise model and dynamics of the vehicle, in which each particle is considered a potential location for the vehicle. A reduced Kalman filter per particle is then associated with each of the current measurements. Considering the limited computational resources of an MAV, maintaining a set of landmarks large enough to allow for accurate motion estimations, yet sparse enough so as not to produce a negative impact on the system performance, is imperative. The noise model of the measurements along with
Figure 9: The Helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide estimation for the angular rate of the MAV during the turn.
the new measurement and old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration, such that the lower-weight particles are removed and higher-weight particles are replicated. This results in a cloud of random particles that tracks toward the best estimation results, which are the positions that yield the best correlation between the previous position of the features and the new measurement data.
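The weighting-and-resampling step described above can be sketched as follows; this is a generic systematic-resampling illustration under a Gaussian innovation weight, not the authors' exact implementation:

```python
import math
import random

def likelihood_weight(innovation, sigma):
    """Gaussian weight: large when the predicted landmark position
    correlates well with the measured position (small innovation)."""
    return math.exp(-0.5 * (innovation / sigma) ** 2)

def resample(particles, weights):
    """Systematic resampling: high-weight particles are replicated and
    low-weight particles are dropped, keeping the particle count fixed."""
    n = len(particles)
    step = sum(weights) / n
    u = random.uniform(0.0, step)  # single random offset
    out, c, i = [], weights[0], 0
    for _ in range(n):
        while c <= u:              # advance to the particle owning interval u
            i += 1
            c += weights[i]
        out.append(particles[i])
        u += step
    return out
```

With all the weight on one particle, resampling collapses the cloud onto it; with uniform weights it leaves the population essentially unchanged.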
The positions of landmarks are stored by the particles, such as \(\mathrm{Par}_n = (X_L^T, P)\), where \(X_L = (x_{ci}, y_{ci})\) and \(P\) is the \(2 \times 2\) covariance matrix for the particular Kalman filter contained by \(\mathrm{Par}_n\). The 6DOF vehicle state vector \(x_v\) can be updated in discrete time steps of \((k)\) as shown in (14), where \(R = (x_r, y_r, H)^T\) is the position in the inertial frame, from which the velocity in the inertial frame can be derived as \(\dot{R} = v_E\). The vector \(v_B = (v_x, v_y, v_z)^T\) represents the linear velocity of the body frame, and \(\omega = (p, q, r)^T\) represents the body angular rate. \(\Gamma = (\phi, \theta, \psi)^T\) is the Euler angle vector, and \(L_{EB}\) is the Euler angle transformation matrix for \((\phi, \theta, \psi)\). The \(3 \times 3\) matrix \(T\) converts \((p, q, r)^T\) to \((\dot{\phi}, \dot{\theta}, \dot{\psi})\). At every step, the MAV is assumed to experience unknown linear and angular accelerations, \(V_B = a_B \Delta t\) and \(\Omega = \alpha_B \Delta t\), respectively:

\[
x_v(k+1) =
\begin{pmatrix}
R(k) + L_{EB}(\phi, \theta, \psi)\,(v_B + V_B)\,\Delta t \\
\Gamma(k) + T(\phi, \theta, \psi)\,(\omega + \Omega)\,\Delta t \\
v_B(k) + V_B \\
\omega(k) + \Omega
\end{pmatrix}.
\quad (14)
\]
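A minimal numerical sketch of the state update in (14). The matrix conventions (Z-Y-X Euler rotation, standard Euler-rate mapping) are our assumption, as the paper does not spell them out:

```python
import math

def mat_vec(M, v):
    """3x3 matrix times 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def L_EB(phi, theta, psi):
    """Body-to-inertial Euler rotation matrix (Z-Y-X convention assumed)."""
    cph, sph = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    cps, sps = math.cos(psi), math.sin(psi)
    return [[cth * cps, sph * sth * cps - cph * sps, cph * sth * cps + sph * sps],
            [cth * sps, sph * sth * sps + cph * cps, cph * sth * sps - sph * cps],
            [-sth,      sph * cth,                   cph * cth]]

def T(phi, theta):
    """Maps body angular rates (p, q, r) to Euler angle rates."""
    return [[1.0, math.sin(phi) * math.tan(theta), math.cos(phi) * math.tan(theta)],
            [0.0, math.cos(phi),                  -math.sin(phi)],
            [0.0, math.sin(phi) / math.cos(theta), math.cos(phi) / math.cos(theta)]]

def step(R, gamma, v_b, omega, dV, dOmega, dt):
    """One discrete step of Eq. (14); dV = a_B*dt and dOmega = alpha_B*dt
    are the unknown acceleration increments."""
    v_new = [v + d for v, d in zip(v_b, dV)]
    w_new = [w + d for w, d in zip(omega, dOmega)]
    R_new = [R[i] + mat_vec(L_EB(*gamma), v_new)[i] * dt for i in range(3)]
    g_new = [gamma[i] + mat_vec(T(gamma[0], gamma[1]), w_new)[i] * dt
             for i in range(3)]
    return R_new, g_new, v_new, w_new
```

At level hover (zero Euler angles) the rotation reduces to identity, so a pure body-frame forward velocity advances the inertial position along the x-axis.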
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states \(x_r, y_r, \theta_r\), described by

\[
x_v(k+1) =
\begin{pmatrix}
\cos\theta_r(k)\, u_1(k) + x_r(k) \\
\sin\theta_r(k)\, u_1(k) + y_r(k) \\
u_2(k) + \theta_r(k)
\end{pmatrix}
+ \gamma(k), \quad (15)
\]
where \(\gamma(k)\) is the linearized input signal noise, \(u_1(k)\) is the forward speed, and \(u_2(k)\) the angular velocity. Let us consider one instantaneous field of view of the camera, in which the center of two ground corners on opposite walls is shifted. From the distance measurements described earlier, we can derive the relative range and bearing of a corner of interest (index \(i\)) as follows:

\[
y_i = h(x) = \left( \sqrt{x_i^2 + y_i^2},\; \tan^{-1}\!\left[\pm\frac{y_i}{x_i}\right],\; \psi \right)^T, \quad (16)
\]

where the \(\psi\) measurement is provided by the infinity-point method.
This measurement equation can be related with the states of the vehicle and the \(i\)th corner (landmark) at each time stamp \((k)\) as given in (17), where \(x_v(k) = (x_r(k), y_r(k), \theta_r(k))^T\) is the vehicle state vector of the 2D vehicle kinematic model:

\[
h_i(x(k)) =
\begin{pmatrix}
\sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\[4pt]
\tan^{-1}\!\left( \dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)} \right) - \theta_r(k) \\[4pt]
\theta_r
\end{pmatrix},
\quad (17)
\]

where \(x_{ci}\) and \(y_{ci}\) denote the position of the \(i\)th landmark.
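Equations (15) and (17) can be illustrated together. We use `atan2` in place of the sign-ambiguous \(\tan^{-1}\) ratio and omit the noise term \(\gamma(k)\), so this is an interpretation rather than the authors' code:

```python
import math

def motion_model(x_r, y_r, theta_r, u1, u2):
    """Eq. (15): planar kinematics with forward speed u1 and turn rate u2
    (noise term gamma omitted in this sketch)."""
    return (x_r + u1 * math.cos(theta_r),
            y_r + u1 * math.sin(theta_r),
            theta_r + u2)

def expected_measurement(x_r, y_r, theta_r, x_ci, y_ci):
    """Eq. (17), range and bearing rows: predicted observation of
    landmark i from the current pose."""
    dx, dy = x_ci - x_r, y_ci - y_r
    rng = math.hypot(dx, dy)            # Euclidean range to the corner
    bearing = math.atan2(dy, dx) - theta_r
    return rng, bearing
```

In a Rao-Blackwellized filter, each particle would propagate its pose through `motion_model` and weight itself by comparing `expected_measurement` against the camera-derived range and bearing.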
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existing landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is
computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution, which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appear on the image plane to move along the ground lines as the MAV moves. Assume that \(p^k_{(x,y)}\), \(k = 0, 1, 2, 3, \ldots, n\), represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity \(v_p\). Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity \(V_{B\max}\) is known, a pixel is expected to appear at

\[
p^{k+1}_{(x,y)} = f\!\left( p^k_{(x,y)} + (v_B + V_B)\,\Delta t \right), \quad (18)
\]

where

\[
\sqrt{ \left( p^{k+1}_{(x)} - p^k_{(x)} \right)^2 + \left( p^{k+1}_{(y)} - p^k_{(y)} \right)^2 } \quad (19)
\]

cannot be larger than \(V_{B\max}\Delta t\), while \(f(\cdot)\) is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time \(k+1\) is to be associated with a landmark that has appeared at time \(k\) if and only if their pixel locations are within the association threshold; in other words, the association information from \(k\) is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from \(k\) when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration. The farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
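A sketch of the gating rule in (18)-(19): a detection is matched to the nearest predicted pixel only if it lies within the reachable radius \(V_{B\max}\Delta t\). The helper name and the `scale` stand-in for \(f(\cdot)\) are ours, not the paper's:

```python
import math

def associate(p_new, known_pixels, v_bmax, dt, scale=1.0):
    """Match a newly detected landmark pixel against predicted pixels from
    frame k. Returns the index of the nearest prediction if it is within
    v_bmax*dt (converted to pixels by `scale`, our stand-in for f(.)),
    otherwise None, meaning the landmark is registered as new."""
    limit = v_bmax * dt * scale
    best_i, best_d = None, float("inf")
    for i, p in enumerate(known_pixels):
        d = math.hypot(p_new[0] - p[0], p_new[1] - p[1])  # Eq. (19)
        if d < best_d:
            best_i, best_d = i, d
    return best_i if best_i is not None and best_d <= limit else None
```

Because the gate is applied in pixel space, the match is unaffected by noise in the MAV position estimate, which is the accuracy advantage claimed in the text.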
Although representing the map as a tree-based data structure would in theory yield an association time of \(O(N \log N)\), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor \(D_t\) consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick a region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and triggered when turns are encountered (i.e., the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor \(D_0\) is collected, which helps the MAV in recognizing the starting location if it is revisited.

Every time a descriptor \(D_t\) is recorded, it contains the current time \(t\) in terms of frame number, the disorderly region \(I_{x \times y}\) of size \(x \times y\), and the estimate of the position and orientation of the MAV at frame \(t\). Thus every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed has happened at time \(t = k\), where \(t > k\), \(D_k\) is compared with \(D_t\) in terms of descriptor and landmarks, and the map positions of the MAV at times \(t\) and \(k\) are expected to match closely; otherwise, the map is diverging in a quantifiable manner.
The comparison formulation can be summarized as

\[
R(x,y) = \frac{\sum_{x',y'} \left( T(x',y') - I(x+x',\, y+y') \right)^2}
{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',\, y+y')^2}}, \quad (20)
\]

where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
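Equation (20) is the normalized squared-difference score (the same quantity OpenCV computes as `TM_SQDIFF_NORMED`); a direct, unoptimized transcription:

```python
import math

def match_score(T, I, x, y):
    """Eq. (20): normalized squared difference between descriptor template T
    and the region of image I at offset (x, y). 0 means a perfect match;
    poorer matches give larger values. T and I are row-major 2D lists."""
    h, w = len(T), len(T[0])
    num = sum((T[r][c] - I[y + r][x + c]) ** 2
              for r in range(h) for c in range(w))
    den = math.sqrt(sum(T[r][c] ** 2 for r in range(h) for c in range(w)) *
                    sum(I[y + r][x + c] ** 2 for r in range(h) for c in range(w)))
    return num / den if den else 0.0
```

Sliding `(x, y)` over the current frame and thresholding the minimum score is how a stored turn descriptor would be re-detected on a revisit.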
5. Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.

The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed at the positive \(x\)-axis; therefore, the width of the corridor is represented by the \(y\)-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map, representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian \((x, y, z)\) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the \(z\)-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
The numbers in Table 1 are gathered after the map has matured. Methods marked with † are mutually exclusive; for example, the Helix bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

    Image acquisition and edge filtering      10%
    Line and slope extraction                  2%
    Landmark extraction                       20%†
    Helix bearing                             20%†
    Ranging algorithms                   below 1%
    Rao-Blackwellized particle filter         50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).

Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, despite the fact that we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow the development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.

Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors, to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.

Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problem light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation; this is as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue; the Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
if scattering particles in the air are dense We have come toobserve that preventative and defensive approaches to suchissues are promising Antireflective treatment on lenses canreduce light bouncing off of the lens and programming theaircraft to move for a very small distance upon detection ofglare can eliminate the unwanted effects Innovative andadaptive application of servo-controlled filters before thelenses can minimize or eliminate most if not all reflectionsThe light that causes glare is elliptically polarized due tostrong phase correlation This is as opposed to essential lightwhich is circularly polarized Filters can detect and blockpolarized light from entering the camera thereby blockingunwanted effects Application of purpose designed digitalimaging sensors that do not involve a Bayes filter can alsohelp Most of the glare occurs in green light region andtraditional digital imaging sensors have twice as many greenreceptors as red and blue Bayes design has been inspiredfrom human eye which sees green better as green is themost structurally descriptive light for edges and cornersThispaper has supplementary material (see Supplementary Mate-rial available online at httpdxdoiorg1011552013374165)available from the authors which show experimental resultsof the paper
Acknowledgments
The research reported in this paper was in part supportedby the National Science Foundation (Grant ECCS-0428040)Information Infrastructure Institute (1198683) Department ofAerospace Engineering and Virtual Reality Application Cen-ter at Iowa State University Rockwell Collins and Air ForceOffice of Scientific Research
References
[1] DHHubel and TNWiesel ldquoReceptive fields binocular inter-action and functional architecture in the catrsquos visual cortexrdquoTheJournal of Physiology vol 160 pp 106ndash154 1962
[2] N Isoda K Terada S Oe and K IKaida ldquoImprovement ofaccuracy for distance measurement method by using movableCCDrdquo in Proceedings of the 36th SICE Annual Conference (SICErsquo97) pp 29ndash31 Tokushima Japan July 1997
[3] R Hartley and A ZissermanMultiple View Geometry in Com-puter Vision Cambridge University Press 2nd edition 2003
[4] F Ruffier and N Franceschini ldquoVisually guided micro-aerialvehicle automatic take off terrain following landing and windreactionrdquo in Proceedings of the IEEE International Conferenceon Robotics and Automation pp 2339ndash2346 New Orleans LoUSA May 2004
[5] F Ruffier S Viollet S Amic and N Franceschini ldquoBio-inspired optical flow circuits for the visual guidance of micro-air vehiclesrdquo in Proceedings of the International Symposium onCircuits and Systems (ISCAS rsquo03) vol 3 pp 846ndash849 BangkokThailand May 2003
[6] J Michels A Saxena and A Y Ng ldquoHigh speed obstacle avoid-ance using monocular vision and reinforcement learningrdquo inProceedings of the 22nd International Conference on MachineLearning (ICML rsquo05) vol 119 pp 593ndash600 August 2005
Journal of Electrical and Computer Engineering 15
[7] A Saxena J Schulte and A Y Ng ldquoDepth estimation usingmonocular and stereo cuesrdquo in Proceedings of the 20th inter-national joint conference on Artifical intelligence (IJCAI rsquo07) pp2197ndash2203 2007
[8] N Snavely S M Seitz and R Szeliski ldquoPhoto tourism explor-ing photo collections in 3DrdquoACMTransactions onGraphics vol25 no 3 2006
[9] A W Fitzgibbon and A Zisserman ldquoAutomatic camera recov-ery for closed or open image sequencesrdquo in Proceedings of theEuropean Conference on Computer Vision pp 311ndash326 June1998
[10] ADavisonMNicholas and SOlivier ldquoMonoSLAM real-timesingle camera SLAMrdquo IEEE Transactions on Pattern Analysisand Machine Intelligence vol 29 no 6 pp 1052ndash1067 2007
[11] L Clemente A Davison I Reid J Neira and J Tardos ldquoMap-ping large loops with a single hand-held camerardquo in Proceedingsof the Robotics Science and Systems Conference June 2007
[12] F Dellaert W Burgard D Fox and S Thrun ldquoUsing thecondensation algorithm for robust vision-based mobile robotlocalizationrdquo in Proceedings of the IEEE Computer Society Con-ference onComputer Vision and Pattern Recognition (CVPR rsquo99)pp 588ndash594 June 1999
[13] N Cuperlier M Quoy P Gaussier and C Giovanangeli ldquoNav-igation and planning in an unknown environment using visionand a cognitive maprdquo in Proceedings of the IJCAI WorkshopReasoning with Uncertainty in Robotics 2005
[14] G Silveira E Malis and P Rives ldquoAn efficient direct approachto visual SLAMrdquo IEEE Transactions on Robotics vol 24 no 5pp 969ndash979 2008
[15] A P Gee D Chekhlov A Calway and W Mayol-CuevasldquoDiscovering higher level structure in visual SLAMrdquo IEEETransactions on Robotics vol 24 no 5 pp 980ndash990 2008
[16] K Celik S-J Chung and A K Somani ldquoMono-vision cornerSLAM for indoor navigationrdquo in Proceedings of the IEEE Inter-national Conference on ElectroInformation Technology (EITrsquo08) pp 343ndash348 Ames Iowa USA May 2008
[17] K Celik S-J Chung and A K Somani ldquoMVCSLAM mono-vision corner SLAM for autonomous micro-helicopters in GPSdenied environmentsrdquo in Proceedings of the AIAA GuidanceNavigation and Control Conference Honolulu Hawaii USAAugust 2008
[18] K Celik S J Chung and A K Somani ldquoBiologically inspiredmonocular vision based navigation and mapping in GPS-denied environmentsrdquo in Proceedings of the AIAA Infotech atAerospace Conference and Exhibit and AIAA UnmannedUnli-mited Conference Seattle Wash USA April 2009
[19] K Celik S-J ChungM Clausman andA K Somani ldquoMonoc-ular vision SLAM for indoor aerial vehiclesrdquo in Proceedings ofthe IEEERSJ International Conference on Intelligent Robots andSystems St Louis Mo USA October 2009
[20] J Shi and C Tomasi ldquoGood features to trackrdquo in Proceedings ofthe IEEE Computer Society Conference on Computer Vision andPattern Recognition pp 593ndash600 June 1994
[21] H Bay A Ess T Tuytelaars and L van Gool ldquoSpeeded-UpRobust Features (SURF)rdquo Computer Vision and Image Under-standing vol 110 no 3 pp 346ndash359 2008
[22] K Celik and A K Somani ldquoWandless realtime autocalibrationof tactical monocular camerasrdquo in Proceedings of the Interna-tional Conference on Image Processing Computer Vision andPattern Recognition (IPCV rsquo12) Las Vegas Nev USA 2012
[23] M Montemerlo S Thrun D Koller and B Wegbreit ldquoFast-SLAM a factored solution to the simultaneous localization andmapping problemrdquo in Proceedings of the AAAI National Con-ference on Artificial Intelligence pp 593ndash598 2002
[24] J P How B Bethke A Frank D Dale and J Vian ldquoReal-timeindoor autonnomous vehicle test environmentrdquo IEEE ControlSystems Magazine vol 28 no 2 pp 51ndash64 2008
Journal of Electrical and Computer Engineering 9
Figure 9: The helix bearing algorithm exploits the optical flow field resulting from the features not associated with architectural lines. A reduced helix association set is shown for clarity. Helix velocities that form statistically identifiable clusters indicate the presence of large objects, such as doors, that can provide an estimate of the angular rate of the MAV during the turn. (Annotations in the figure: ω_n, V_n, ω = (d/dt)θ, Hallway-1 line-L, Hallway-1 line-R, Hallway-2 line-R.)
the new measurement and the old position of the feature are used to generate a statistical weight. This weight, in essence, is a measure of how well the landmarks in the previous sensor position correlate with the measured position, taking noise into account. Since each of the particles has a different estimate of the vehicle position, resulting in a different perspective for the measurement, each particle is assigned a different weight. Particles are resampled every iteration such that the lower-weight particles are removed and the higher-weight particles are replicated. This results in a cloud of random particles that tracks towards the best estimation results, which are the positions that yield the best correlation between the previous positions of the features and the new measurement data.
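The reweight-and-resample cycle described above can be sketched as follows. This is an illustrative fragment, not the flight code; systematic resampling is one common scheme among several, chosen here as an assumption:

```python
import numpy as np

def resample_particles(particles, weights, rng=None):
    """Systematic resampling: lower-weight particles are dropped and
    higher-weight particles are replicated, keeping the count fixed."""
    rng = rng or np.random.default_rng()
    n = len(particles)
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()                       # normalize to a distribution
    positions = (rng.random() + np.arange(n)) / n  # one stratified draw per slot
    cumulative = np.cumsum(weights)
    indices = np.searchsorted(cumulative, positions)
    return [particles[i] for i in indices]
```

A particle whose weight dominates the distribution will occupy most of the slots after resampling, which is exactly the replication behavior described in the text.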
The positions of landmarks are stored by the particles, such that Par_n = (X_L^T, P), where X_L = (x_ci, y_ci) and P is the 2×2 covariance matrix for the particular Kalman filter contained by Par_n. The 6DOF vehicle state vector x_v can be updated in discrete time steps of (k) as shown in (14), where R = (x_r, y_r, H)^T is the position in the inertial frame, from which the velocity in the inertial frame can be derived as Ṙ = v_E. The vector v_B = (v_x, v_y, v_z)^T represents the linear velocity of the body frame, and ω = (p, q, r)^T represents the body angular rate. Γ = (φ, θ, ψ)^T is the Euler angle vector, and L_EB is the Euler angle transformation matrix for (φ, θ, ψ). The 3×3 matrix T converts (p, q, r)^T to (φ̇, θ̇, ψ̇). At every step, the MAV is assumed to experience unknown linear and angular accelerations, V_B = a_B Δt and Ω = α_B Δt, respectively:

\[
x_v(k+1) = \begin{pmatrix}
R(k) + L_{EB}(\phi,\theta,\psi)\,(v_B + V_B)\,\Delta t \\
\Gamma(k) + T(\phi,\theta,\psi)\,(\omega + \Omega)\,\Delta t \\
v_B(k) + V_B \\
\omega(k) + \Omega
\end{pmatrix} \tag{14}
\]
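A minimal sketch of the discrete-time update in (14) follows. The Z-Y-X (roll-pitch-yaw) convention for L_EB and the standard Euler-rate matrix for T are our assumptions, since the paper does not spell out the conventions:

```python
import numpy as np

def euler_to_rotation(phi, theta, psi):
    """Body-to-inertial rotation matrix L_EB, Z-Y-X Euler convention (assumed)."""
    cph, sph = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cps, sps = np.cos(psi), np.sin(psi)
    return np.array([
        [cth * cps, sph * sth * cps - cph * sps, cph * sth * cps + sph * sps],
        [cth * sps, sph * sth * sps + cph * cps, cph * sth * sps - sph * cps],
        [-sth,      sph * cth,                   cph * cth],
    ])

def euler_rate_matrix(phi, theta):
    """Matrix T: maps body rates (p, q, r) to Euler angle rates."""
    return np.array([
        [1, np.sin(phi) * np.tan(theta), np.cos(phi) * np.tan(theta)],
        [0, np.cos(phi),                 -np.sin(phi)],
        [0, np.sin(phi) / np.cos(theta), np.cos(phi) / np.cos(theta)],
    ])

def propagate(R, Gamma, v_B, omega, a_B, alpha_B, dt):
    """One discrete step of Eq. (14); a_B and alpha_B are the unknown
    linear and angular accelerations sampled for this step."""
    V_B, Omega = a_B * dt, alpha_B * dt
    phi, theta, psi = Gamma
    R_next = R + euler_to_rotation(phi, theta, psi) @ (v_B + V_B) * dt
    Gamma_next = Gamma + euler_rate_matrix(phi, theta) @ (omega + Omega) * dt
    return R_next, Gamma_next, v_B + V_B, omega + Omega
```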
There is only a limited set of orientations a helicopter is capable of sustaining in the air at any given time without partial or complete loss of control. For instance, no useful lift is generated when the rotor disc is oriented sideways with respect to gravity. Moreover, the on-board autopilot incorporates IMU and compass measurements in a best-effort scheme to keep the MAV at hover in the absence of external control inputs. Therefore, we can simplify the 6DOF system dynamics to 2D system dynamics with an autopilot. Accordingly, the particle filter then simultaneously locates the landmarks and updates the vehicle states x_r, y_r, θ_r, described by

\[
x_v(k+1) = \begin{pmatrix}
\cos\theta_r(k)\,u_1(k) + x_r(k) \\
\sin\theta_r(k)\,u_1(k) + y_r(k) \\
u_2(k) + \theta_r(k)
\end{pmatrix} + \gamma(k) \tag{15}
\]
where 120574(119896) is the linearized input signal noise 1199061(119896) is the
forward speed and 1199062(119896) the angular velocity Let us consider
one instantaneous field of view of the camera in which thecenter of two ground corners on opposite walls is shiftedFrom the distance measurements described earlier we canderive the relative range and bearing of a corner of interest(index 119894) as follows
y119894= h (x) = (radic1199092
119894+ 1199102
119894 tanminus1 [plusmn
119910119894
119909119894
] 120595)
119879
(16)
where 120595 measurement is provided by the infinity-pointmethod
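Equations (15) and (16) can be transcribed compactly as below; the noise term γ(k) and the infinity-point ψ measurement are omitted for brevity, which is our simplification:

```python
import numpy as np

def step_2d(x, u1, u2):
    """Eq. (15): planar vehicle update for state x = (xr, yr, theta_r);
    u1 is forward speed per step, u2 angular velocity per step
    (the gamma(k) noise term is omitted in this sketch)."""
    xr, yr, th = x
    return np.array([xr + np.cos(th) * u1,
                     yr + np.sin(th) * u1,
                     th + u2])

def range_bearing(xi, yi):
    """Eq. (16): relative range and bearing of a corner at (xi, yi)
    in the camera frame (psi component omitted)."""
    return np.hypot(xi, yi), np.arctan2(yi, xi)
```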
This measurement equation can be related to the states of the vehicle and the ith corner (landmark) at each time stamp (k) as shown in (17), where x_v(k) = (x_r(k), y_r(k), θ_r(k))^T is the vehicle state vector of the 2D vehicle kinematic model, and x_ci and y_ci denote the position of the ith landmark:

\[
h_i(x(k)) = \begin{pmatrix}
\sqrt{(x_r(k) - x_{ci}(k))^2 + (y_r(k) - y_{ci}(k))^2} \\
\tan^{-1}\!\left(\dfrac{y_r(k) - y_{ci}(k)}{x_r(k) - x_{ci}(k)}\right) - \theta_r(k) \\
\theta_r
\end{pmatrix} \tag{17}
\]
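As a sketch, the expected measurement of (17) might be evaluated as below. We use atan2 rather than a bare arctangent so the bearing lands in the correct quadrant; that is an implementation choice of ours, not something the paper states:

```python
import numpy as np

def expected_measurement(xv, landmark):
    """Eq. (17): predicted range and bearing of the i-th landmark
    for vehicle state xv = (xr, yr, theta_r)."""
    xr, yr, th = xv
    xc, yc = landmark
    rng = np.hypot(xr - xc, yr - yc)
    bearing = np.arctan2(yc - yr, xc - xr) - th  # quadrant-safe form of the arctan term
    return np.array([rng, bearing, th])
```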
4.1. Data Association. Recently detected landmarks need to be associated with the existing landmarks in the map, such that each new measurement either corresponds to the correct existent landmark or else registers as a not-before-seen landmark. This is a requirement for any SLAM approach to function properly (i.e., Figure 11). Typically, the association metric depends on the measurement innovation vector. An exhaustive search algorithm that compares every measurement with every feature on the map associates landmarks if the newly measured landmark is sufficiently close to an existing one. This not only leads to landmark ambiguity but also is computationally intractable for large maps. Moreover, since the measurement is relative, the error of the vehicle position is additive with the absolute location of the measurement.
We present a new, faster, and more accurate solution, which takes advantage of predicted landmark locations on the image plane. Figure 5 gives a reference of how landmarks appearing on the image plane move along the ground lines as the MAV moves. Assume that p^k_(x,y), k = 0, 1, 2, 3, …, n, represents a pixel in time which happens to be contained by a landmark, and this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to landmark distance, here the center pixel of a landmark is referred to. Given that the expected maximum velocity V_Bmax is known, a pixel is expected to appear at

\[
p^{k+1}_{(x,y)} = f\left(p^{k}_{(x,y)} + (v_B + V_B)\,\Delta t\right) \tag{18}
\]

where

\[
\sqrt{\left(p^{k+1}_{(x)} - p^{k}_{(x)}\right)^2 + \left(p^{k+1}_{(y)} - p^{k}_{(y)}\right)^2} \tag{19}
\]

cannot be larger than V_Bmax Δt, while f(·) is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time k + 1 is to be associated with a landmark that has appeared at time k if and only if their pixel locations are within the association threshold; in other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration: the farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not our proposed methods.
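The gating rule of (18)-(19) amounts to a nearest-neighbour search with a hard displacement threshold on the image plane. A simplified greedy version, our sketch with a hypothetical `max_disp` standing in for the projected V_Bmax·Δt bound, could look like:

```python
import numpy as np

def associate(prev_pixels, new_pixels, max_disp):
    """Match each new landmark pixel to the closest unused previous pixel,
    but only if the displacement (Eq. (19)) stays within max_disp;
    anything beyond the gate registers as a new landmark."""
    matches, unmatched, used = {}, [], set()
    for j, p in enumerate(new_pixels):
        best, best_d = None, max_disp
        for i, q in enumerate(prev_pixels):
            if i in used:
                continue
            d = np.hypot(p[0] - q[0], p[1] - q[1])
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            unmatched.append(j)       # not seen before: new landmark
        else:
            used.add(best)
            matches[j] = best         # reuse association data from time k
    return matches, unmatched
```

Because each comparison is local to the gate, a match avoids the exhaustive global-map search described above.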
Although representing the map as a tree-based data structure would in theory yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.
Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder. This reduces the probability of finding multiple targets later. The system will pick the region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and is triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor, D_0, is collected; this helps the MAV recognize the starting location if it is revisited.

Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_xy of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed has happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, the map is diverging in a quantifiable manner.
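A descriptor record and the divergence check it enables might be structured as follows; the field names are illustrative, not taken from the paper:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Descriptor:
    """One stored signature D_t, recorded when a turn is encountered."""
    frame: int           # current time t, as a frame number
    region: np.ndarray   # the disorderly image region I of size x-by-y
    pose: tuple          # estimated (x, y, theta) of the MAV at frame t

def map_divergence(d_k, d_t):
    """Once D_k and D_t have been declared the same scene, the gap between
    the stored MAV map positions quantifies how far the map has diverged."""
    return float(np.hypot(d_t.pose[0] - d_k.pose[0],
                          d_t.pose[1] - d_k.pose[1]))
```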
The comparison formulation can be summarized as

\[
R(x,y) = \frac{\sum_{x',y'}\left(T(x',y') - I(x+x',\,y+y')\right)^2}{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',\,y+y')^2}} \tag{20}
\]

where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
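Equation (20) is a normalized squared-difference template comparison; a direct transcription is sketched below (the row/column indexing order of the image array is our assumption):

```python
import numpy as np

def match_score(T, img, x, y):
    """Eq. (20): normalized squared difference between template T and the
    same-size window of img at offset (x, y); 0 indicates a perfect match."""
    h, w = T.shape
    window = img[y:y + h, x:x + w].astype(float)
    T = T.astype(float)
    num = np.sum((T - window) ** 2)
    den = np.sqrt(np.sum(T ** 2) * np.sum(window ** 2))
    return num / den if den > 0 else 0.0
```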
5 Experimental Results
As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks to the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.

The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed at the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on-board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance. (Scale in meters.)
of any landmark for surveillance and identification purposes.

In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm, with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link, through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. (Axes: hallway length (m), hallway width (m), altitude (m).) The altitude is represented by the z-axis, and it is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP-IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per one video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.

The numbers in Table 1 are gathered after the map has matured. Methods highlighted with † are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80-90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for the MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
6 Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following-flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of attaining high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation, with a small, fully self-contained MAV helicopter as well as other robotic platforms. Our algorithms successfully adapt to various situations, while successfully performing the transition between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow the development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problem light source. Further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize, or eliminate, most if not all reflections. The light that causes glare is elliptically polarized due to strong phase correlation; this is as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue; the Bayer design has been inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supportedby the National Science Foundation (Grant ECCS-0428040)Information Infrastructure Institute (1198683) Department ofAerospace Engineering and Virtual Reality Application Cen-ter at Iowa State University Rockwell Collins and Air ForceOffice of Scientific Research
References
[1] DHHubel and TNWiesel ldquoReceptive fields binocular inter-action and functional architecture in the catrsquos visual cortexrdquoTheJournal of Physiology vol 160 pp 106ndash154 1962
[2] N Isoda K Terada S Oe and K IKaida ldquoImprovement ofaccuracy for distance measurement method by using movableCCDrdquo in Proceedings of the 36th SICE Annual Conference (SICErsquo97) pp 29ndash31 Tokushima Japan July 1997
[3] R Hartley and A ZissermanMultiple View Geometry in Com-puter Vision Cambridge University Press 2nd edition 2003
[4] F Ruffier and N Franceschini ldquoVisually guided micro-aerialvehicle automatic take off terrain following landing and windreactionrdquo in Proceedings of the IEEE International Conferenceon Robotics and Automation pp 2339ndash2346 New Orleans LoUSA May 2004
[5] F Ruffier S Viollet S Amic and N Franceschini ldquoBio-inspired optical flow circuits for the visual guidance of micro-air vehiclesrdquo in Proceedings of the International Symposium onCircuits and Systems (ISCAS rsquo03) vol 3 pp 846ndash849 BangkokThailand May 2003
[6] J Michels A Saxena and A Y Ng ldquoHigh speed obstacle avoid-ance using monocular vision and reinforcement learningrdquo inProceedings of the 22nd International Conference on MachineLearning (ICML rsquo05) vol 119 pp 593ndash600 August 2005
Journal of Electrical and Computer Engineering 15
[7] A Saxena J Schulte and A Y Ng ldquoDepth estimation usingmonocular and stereo cuesrdquo in Proceedings of the 20th inter-national joint conference on Artifical intelligence (IJCAI rsquo07) pp2197ndash2203 2007
[8] N Snavely S M Seitz and R Szeliski ldquoPhoto tourism explor-ing photo collections in 3DrdquoACMTransactions onGraphics vol25 no 3 2006
[9] A W Fitzgibbon and A Zisserman ldquoAutomatic camera recov-ery for closed or open image sequencesrdquo in Proceedings of theEuropean Conference on Computer Vision pp 311ndash326 June1998
[10] ADavisonMNicholas and SOlivier ldquoMonoSLAM real-timesingle camera SLAMrdquo IEEE Transactions on Pattern Analysisand Machine Intelligence vol 29 no 6 pp 1052ndash1067 2007
[11] L Clemente A Davison I Reid J Neira and J Tardos ldquoMap-ping large loops with a single hand-held camerardquo in Proceedingsof the Robotics Science and Systems Conference June 2007
[12] F Dellaert W Burgard D Fox and S Thrun ldquoUsing thecondensation algorithm for robust vision-based mobile robotlocalizationrdquo in Proceedings of the IEEE Computer Society Con-ference onComputer Vision and Pattern Recognition (CVPR rsquo99)pp 588ndash594 June 1999
[13] N Cuperlier M Quoy P Gaussier and C Giovanangeli ldquoNav-igation and planning in an unknown environment using visionand a cognitive maprdquo in Proceedings of the IJCAI WorkshopReasoning with Uncertainty in Robotics 2005
[14] G Silveira E Malis and P Rives ldquoAn efficient direct approachto visual SLAMrdquo IEEE Transactions on Robotics vol 24 no 5pp 969ndash979 2008
[15] A P Gee D Chekhlov A Calway and W Mayol-CuevasldquoDiscovering higher level structure in visual SLAMrdquo IEEETransactions on Robotics vol 24 no 5 pp 980ndash990 2008
[16] K Celik S-J Chung and A K Somani ldquoMono-vision cornerSLAM for indoor navigationrdquo in Proceedings of the IEEE Inter-national Conference on ElectroInformation Technology (EITrsquo08) pp 343ndash348 Ames Iowa USA May 2008
[17] K Celik S-J Chung and A K Somani ldquoMVCSLAM mono-vision corner SLAM for autonomous micro-helicopters in GPSdenied environmentsrdquo in Proceedings of the AIAA GuidanceNavigation and Control Conference Honolulu Hawaii USAAugust 2008
[18] K Celik S J Chung and A K Somani ldquoBiologically inspiredmonocular vision based navigation and mapping in GPS-denied environmentsrdquo in Proceedings of the AIAA Infotech atAerospace Conference and Exhibit and AIAA UnmannedUnli-mited Conference Seattle Wash USA April 2009
[19] K Celik S-J ChungM Clausman andA K Somani ldquoMonoc-ular vision SLAM for indoor aerial vehiclesrdquo in Proceedings ofthe IEEERSJ International Conference on Intelligent Robots andSystems St Louis Mo USA October 2009
[20] J Shi and C Tomasi ldquoGood features to trackrdquo in Proceedings ofthe IEEE Computer Society Conference on Computer Vision andPattern Recognition pp 593ndash600 June 1994
[21] H Bay A Ess T Tuytelaars and L van Gool ldquoSpeeded-UpRobust Features (SURF)rdquo Computer Vision and Image Under-standing vol 110 no 3 pp 346ndash359 2008
[22] K Celik and A K Somani ldquoWandless realtime autocalibrationof tactical monocular camerasrdquo in Proceedings of the Interna-tional Conference on Image Processing Computer Vision andPattern Recognition (IPCV rsquo12) Las Vegas Nev USA 2012
[23] M Montemerlo S Thrun D Koller and B Wegbreit ldquoFast-SLAM a factored solution to the simultaneous localization andmapping problemrdquo in Proceedings of the AAAI National Con-ference on Artificial Intelligence pp 593ndash598 2002
[24] J P How B Bethke A Frank D Dale and J Vian ldquoReal-timeindoor autonnomous vehicle test environmentrdquo IEEE ControlSystems Magazine vol 28 no 2 pp 51ndash64 2008
computationally intractable for large maps. Moreover, since the measurement is relative, the error in the vehicle position adds to the absolute location of the measurement.
We present a new, faster, and more accurate solution that takes advantage of predicted landmark locations on the image plane. Figure 5 illustrates how landmarks on the image plane appear to move along the ground lines as the MAV moves. Assume that p_k(x, y), k = 0, 1, 2, 3, ..., n, represents a pixel in time which happens to be contained by a landmark, and that this pixel moves along a ground line at the velocity v_p. Although landmarks often contain a cluster of pixels, the size of which is inversely proportional to the landmark distance, the center pixel of the landmark is used here. Given that the expected maximum velocity V_{B,max} is known, a pixel is expected to appear at

p_{k+1}(x, y) = f( p_k(x, y) + (v_B + V_B) Δt ),  (18)

where the pixel displacement

sqrt( (p_{k+1}(x) − p_k(x))^2 + (p_{k+1}(y) − p_k(y))^2 )  (19)

cannot be larger than V_{B,max} Δt, while f(·) is a function that converts a landmark range to a position on the image plane.

A landmark appearing at time k + 1 is to be associated with a landmark that appeared at time k if and only if their pixel locations are within the association threshold; in other words, the association information from k is used. Otherwise, if the maximum expected change in pixel location is exceeded, the landmark is considered new. We save computational resources by using the association data from k when a match is found, instead of searching the large global map. In addition, since the pixel location of a landmark is independent of the noise in the MAV position, the association has improved accuracy. To further improve the accuracy, there is also a maximum range beyond which the MAV will not consider landmarks for data association. This range is determined by taking the camera resolution into consideration: the farther a landmark is, the fewer pixels it has in its cluster, and thus the more ambiguity and noise it may contain. Considering the physical camera parameters, resolution, shutter speed, and noise model of the Logitech C905 camera, the MAV is set to ignore landmarks farther than 8 meters. Note that this is a limitation of the camera, not of our proposed methods.
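The gating rule of (18)-(19) together with the 8 m range cutoff can be sketched as follows. All constants (frame rate, maximum speed, and the linear pixel scale standing in for f(·)) are hypothetical placeholders for illustration, not the values used on Saint Vertigo.

```python
import math

# Illustrative constants; the paper only fixes the 8 m cutoff.
V_B_MAX = 2.0         # m/s, assumed maximum vehicle velocity
DT = 1.0 / 30.0       # s, assumed frame period
MAX_RANGE = 8.0       # m, landmarks beyond this are ignored
GATE_PX_PER_M = 40.0  # px per meter, a crude linear stand-in for f(.)

def associate(prev_landmarks, detections):
    """Greedy association of detections to previous-frame landmarks.

    prev_landmarks / detections: lists of dicts with pixel center
    (u, v) and estimated range r in meters. A detection matches the
    closest previous landmark inside the predicted-motion pixel gate;
    otherwise it starts a new landmark.
    """
    gate = V_B_MAX * DT * GATE_PX_PER_M  # max pixel displacement per frame
    matches, new = [], []
    for d in detections:
        if d["r"] > MAX_RANGE:  # too few pixels: ambiguous and noisy
            continue
        best, best_dist = None, gate
        for lm in prev_landmarks:
            dist = math.hypot(d["u"] - lm["u"], d["v"] - lm["v"])
            if dist <= best_dist:
                best, best_dist = lm, dist
        if best is not None:
            matches.append((best["id"], d))  # reuse association from k
        else:
            new.append(d)                    # exceeds gate: new landmark
    return matches, new
```

Because the gate operates purely on pixel coordinates, a match avoids any search of the global map, which is the source of the speedup described above.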
Although representing the map as a tree-based data structure would, in theory, yield an association time of O(N log N), our pixel-neighborhood-based approach already covers over 90% of the features at any time; therefore, a tree-based solution does not offer a significant benefit.
We also use a viewing-transformation-invariant scene-matching algorithm based on spatial relationships among objects in the images and illumination parameters in the scene. This is to determine whether two frames acquired under different extrinsic camera parameters have indeed captured the same scene. Therefore, if the MAV visits a particular place more than once, it can distinguish whether it has been to that spot before.

Our approach maps the features (i.e., corners, lines) and illumination parameters from one view in the past to the other in the present via affine-invariant image descriptors. A descriptor D_t consists of an image region in a scene that contains a high amount of disorder; this reduces the probability of finding multiple targets later. The system picks the region on the image plane with the most crowded cluster of landmarks to look for a descriptor, which is likely to be the part of the image with the most clutter, hence creating a more unique signature. Descriptor generation is automatic and is triggered when turns are encountered (i.e., by the Helix Bearing Algorithm). A turn is a significant, repeatable event in the life of a map, which makes it interesting for data association purposes. The start of the algorithm is also a significant event, for which the first descriptor D_0 is collected; this helps the MAV recognize the starting location if it is revisited.

Every time a descriptor D_t is recorded, it contains the current time t in terms of frame number, the disorderly region I_{x,y} of size x × y, and the estimate of the position and orientation of the MAV at frame t. Thus, every time a turn is encountered, the system can check whether it has happened before. For instance, if it indeed happened at time t = k, where t > k, D_k is compared with D_t in terms of descriptor and landmarks, and the map positions of the MAV at times t and k are expected to match closely; otherwise, the map is diverging in a quantifiable manner.
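The descriptor bookkeeping described above can be sketched as a small record plus a divergence check. Field names, thresholds, and the pose format are illustrative assumptions, not the paper's data structures; the match score is assumed to follow the paper's convention that 0 is a perfect match.

```python
from dataclasses import dataclass

@dataclass
class Descriptor:
    """One record per detected turn: frame time t, a high-disorder
    image patch I_xy, and the pose estimate at capture."""
    t: int        # frame number
    patch: list   # the x-by-y disorderly region (rows of pixels)
    pose: tuple   # assumed (x, y, yaw) estimate at frame t

def check_loop(d_k, d_t, match_score, patch_thresh=0.2, pos_thresh=1.0):
    """If the patch of D_t matches D_k (score near 0 = perfect match),
    the two stored map poses should agree; the residual quantifies
    map divergence. Thresholds are made-up illustrative values."""
    if match_score > patch_thresh:
        return None  # not the same scene; no loop-closure attempt
    dx = d_t.pose[0] - d_k.pose[0]
    dy = d_t.pose[1] - d_k.pose[1]
    drift = (dx * dx + dy * dy) ** 0.5
    return {"loop_closed": drift <= pos_thresh, "drift_m": drift}
```

A drift larger than the position threshold signals, in a quantifiable way, that the map has diverged between visits.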
The comparison formulation can be summarized as

R(x, y) = Σ_{x′,y′} ( T(x′, y′) − I(x + x′, y + y′) )² / sqrt( Σ_{x′,y′} T(x′, y′)² · Σ_{x′,y′} I(x + x′, y + y′)² ),  (20)

where a perfect match is 0 and poor matches are represented by larger values, up to 1. We use this to determine the degree to which two descriptors are related, as it represents the fraction of the variation in one descriptor that may be explained by the other. Figure 10 illustrates how this concept works.
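A direct transcription of (20), assuming T is the stored descriptor patch and I the current frame, can be written in a few lines; a real implementation would use an optimized template matcher, so this plain-Python sketch is for clarity only.

```python
def match_score(T, I, x, y):
    """Normalized squared-difference score of template T (2D list of
    pixels) against image I at offset (x, y), per Eq. (20):
    0 means a perfect match; larger values (up to 1) mean poor matches."""
    h, w = len(T), len(T[0])
    num = t_energy = i_energy = 0.0
    for yy in range(h):
        for xx in range(w):
            t = float(T[yy][xx])
            i = float(I[y + yy][x + xx])
            num += (t - i) ** 2   # squared difference term
            t_energy += t * t     # template energy (normalizer)
            i_energy += i * i     # patch energy (normalizer)
    return num / ((t_energy * i_energy) ** 0.5)
```

Scanning x and y over the image and keeping the minimum score locates the best candidate for the stored descriptor.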
5. Experimental Results

As illustrated in Figures 12, 13, and 14, our monocular vision SLAM correctly locates and associates landmarks with the real world. Figure 15 shows the results obtained in an outdoor experiment with urban roads. A 3D map is built by the addition of time-varying altitude and wall positions, as shown in Figure 16. The proposed methods prove robust to transient disturbances, since features inconsistent about their position are removed from the map.

The MAV assumes that it is positioned at (0, 0, 0) Cartesian coordinates at the start of a mission, with the camera pointed along the positive x-axis; therefore, the width of the corridor is represented by the y-axis. At any time during the mission, a partial map can be requested from the MAV via the Internet. The MAV also stores the map and important video frames (i.e., when a new landmark is discovered) on board for later retrieval. Video frames are time-linked to the map. It is therefore possible to obtain a still image of the surroundings
Figure 10: Data association metric; a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for surveillance and identification purposes.

In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm, with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link through which the sensory feedback and supervisory control commands are shared. These commands are simple directives, which are converted to the appropriate helicopter flight-surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm, with the state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway, as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the z-axis; it is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. The MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective-pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
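The supervisory-directive pathway between the navigation and flight computers might look like the following sketch. The directive names and control gains here are invented for illustration only; the paper does not specify the actual message set or the flight computer's mixing.

```python
# Illustrative sketch of the supervisory-directive link; the actual
# on-board protocol and gains are not described in the paper.
DIRECTIVES = {"FORWARD", "STOP", "YAW_LEFT", "YAW_RIGHT", "CLIMB", "DESCEND"}

def to_flight_surfaces(directive, magnitude=1.0):
    """Map a high-level directive from the navigation computer to
    (pitch, yaw_rate, collective) deltas that the flight computer
    would translate into servo commands."""
    if directive not in DIRECTIVES:
        raise ValueError("unknown directive: " + directive)
    table = {
        "FORWARD":   (0.1 * magnitude, 0.0, 0.0),
        "STOP":      (0.0, 0.0, 0.0),
        "YAW_LEFT":  (0.0, -0.2 * magnitude, 0.0),
        "YAW_RIGHT": (0.0, 0.2 * magnitude, 0.0),
        "CLIMB":     (0.0, 0.0, 0.05 * magnitude),
        "DESCEND":   (0.0, 0.0, -0.05 * magnitude),
    }
    return table[directive]
```

Keeping the directives this simple is what allows the two computers to share only a thin, dedicated link: the navigation computer never touches the flight surfaces directly.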
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization of the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.

The numbers in Table 1 were gathered after the map had matured. Methods marked with a dagger (†) are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of the Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80-90% utilization range. It should be stressed that this figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes, but it is not required for MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
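Per-task utilization figures of the kind reported in Table 1 can be collected with a simple per-frame timing harness. The sketch below is illustrative instrumentation, not the on-board code; task names are placeholders.

```python
import time
from collections import defaultdict

class FrameProfiler:
    """Accumulates per-task wall time across video frames and reports
    each task's share of the total measured budget."""

    def __init__(self):
        self.totals = defaultdict(float)

    def run(self, name, fn, *args):
        # Time one task invocation and charge it to its bucket.
        t0 = time.perf_counter()
        result = fn(*args)
        self.totals[name] += time.perf_counter() - t0
        return result

    def utilization(self):
        # Percentage share of each task in the measured total.
        whole = sum(self.totals.values()) or 1.0
        return {k: 100.0 * v / whole for k, v in self.totals.items()}
```

Wrapping each pipeline stage (edge filtering, ranging, particle filter, and so on) in `run()` yields a breakdown comparable in spirit to Table 1, though the on-board figures also include kernel overhead, as noted above.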
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application that requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have mainly been developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).

Since the proposed monocular-camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera's capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow the development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter moving through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and the independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problematic light source. A further reduction in contrast is possible if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters in front of the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green-light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design was inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has Supplementary Material (available online at http://dx.doi.org/10.1155/2013/374165) from the authors, which shows the experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I3), the Department of Aerospace Engineering and the Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106-154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29-31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339-2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846-849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593-600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197-2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311-326, June 1998.
[10] A. Davison, M. Nicholas, and S. Olivier, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052-1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588-594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969-979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980-990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343-348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593-600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593-598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51-64, 2008.
Journal of Electrical and Computer Engineering 11
Figure 10: Data association metric, where a descriptor is shown in the middle.
Figure 11: Map drift is one of the classic errors introduced by poor data association, or lack thereof, negatively impacting the loop-closing performance.
of any landmark for the surveillance and identification purposes.
In Figure 12, the traveled distance is on the kilometer scale. When the system completes the mission and returns to the starting point, the belief is within one meter of where the mission had originally started.
5.1. The Microaerial Vehicle Hardware Configuration. Saint Vertigo, our autonomous MAV helicopter, serves as the primary robotic test platform for the development of this study (see Figure 17). In contrast with other prior works that predominantly used wireless video feeds and the Vicon vision tracking system for vehicle state estimation [24], Saint Vertigo performs all image processing and SLAM computations on-board, with a 1 GHz CPU, 1 GB RAM, and 4 GB storage. The unit measures 50 cm with a ready-to-fly weight of 0.9 kg
Figure 12: Experimental results of the proposed ranging and SLAM algorithm, showing the landmarks added to the map representing the structure of the environment. All measurements are in meters. The experiment was conducted under incandescent ambient lighting.
and 0.9 kg of payload for adaptability to different missions. In essence, the MAV features two independent computers. The flight computer is responsible for flight stabilization, flight automation, and sensory management. The navigation computer is responsible for image processing, range measurement, SLAM computations, networking, mass storage, and, as a future goal, path planning. The pathway between them is a dedicated on-board link through which the sensory feedback and supervisory control commands are shared. These commands are simple directives which are converted to the appropriate helicopter flight-surface responses by the flight computer. The aircraft is IEEE 802.11 enabled, and all
Figure 13: (a) Experimental results of the proposed ranging and SLAM algorithm with state observer odometer trail. The actual floor plan of the building is superimposed later on a mature map to illustrate the accuracy of our method. Note that the floor plan was not provided to the system a priori. (b) The same environment mapped by a ground robot with a different starting point, to illustrate that our algorithm is compatible with different platforms.
Figure 14: Results of the proposed ranging and SLAM algorithm from a different experiment, with state observer ground truth. All measurements are in meters. The experiment was conducted under fluorescent ambient lighting and sunlight where applicable.
Figure 15: Results of the proposed ranging and SLAM algorithm from an outdoor experiment in an urban area. A small map of the area is provided for reference purposes (not provided to the algorithm), and it indicates the robot path. All measurements are in meters. The experiment was conducted under sunlight ambient conditions and dry weather.
Figure 16: Cartesian (x, y, z) position of the MAV in a hallway as reported by the proposed ranging and SLAM algorithm with time-varying altitude. The altitude is represented by the z-axis and is initially at 25 cm, as this is the ground clearance of the ultrasonic altimeter when the aircraft has landed. MAV altitude was intentionally varied by large amounts to demonstrate the robustness of our method to the climb and descent of the aircraft, whereas in a typical mission natural altitude changes are in the range of a few centimeters.
Figure 17: Saint Vertigo, the autonomous MAV helicopter, consists of four decks. The A deck contains the collective pitch rotor head mechanics. The B deck comprises the fuselage, which houses the power plant, transmission, main batteries, actuators, gyroscope, and the tail rotor. The C deck is the autopilot compartment, which contains the inertial measurement unit, all communication systems, and all sensors. The D deck carries the navigation computer, which is attached to a digital video camera visible at the front.
its features are accessible over the Internet or an ad hoc TCP/IP network. Among the other platforms shown in Figure 18, Saint Vertigo has the most limited computational resources.
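The two-computer split can be illustrated with a minimal sketch of the directive link described above, in which the navigation computer issues simple directives and the flight computer decodes them before mapping them to flight-surface responses. The message format, directive names, and JSON encoding here are illustrative assumptions, not the actual on-board protocol.

```python
import json

# Hypothetical directive vocabulary for the supervisory link; the real
# command set is not specified in the paper.
DIRECTIVES = {"FORWARD", "HOLD", "TURN_LEFT", "TURN_RIGHT", "CLIMB", "DESCEND"}

def encode_directive(name: str, magnitude: float) -> bytes:
    """Navigation-computer side: pack a supervisory directive for the
    dedicated on-board link."""
    if name not in DIRECTIVES:
        raise ValueError(f"unknown directive: {name}")
    return json.dumps({"cmd": name, "mag": magnitude}).encode()

def decode_directive(payload: bytes) -> tuple:
    """Flight-computer side: recover the directive before converting it
    to the appropriate flight-surface response."""
    msg = json.loads(payload.decode())
    return msg["cmd"], msg["mag"]
```

Keeping the directives this coarse is what allows the flight computer to remain a simple, reliable stabilizer while the navigation computer carries the heavy SLAM workload.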
5.2. Processing Requirements. In order to effectively manage the computational resources on a lightweight MAV computer, we keep track of the CPU utilization for the algorithms proposed in this paper. Table 1 shows a typical breakdown of the average processor utilization per video frame. Each corresponding task elucidated in this paper is visualized in Figure 2.
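The per-frame CPU bookkeeping described above can be sketched as follows. The task names mirror Table 1, while the profiling mechanism itself (wall-clock timing normalized to a percentage per frame) is an assumption made for illustration, not the authors' instrumentation.

```python
import time
from collections import defaultdict

class FrameProfiler:
    """Accumulates per-task time across video frames and reports each
    task's average share of the total processing load."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.frames = 0

    def timed(self, task, fn, *args):
        """Run one pipeline stage (e.g. 'landmark extraction') and
        charge its elapsed time to that task."""
        t0 = time.perf_counter()
        result = fn(*args)
        self.totals[task] += time.perf_counter() - t0
        return result

    def end_frame(self):
        self.frames += 1

    def breakdown(self):
        """Percentage share of each task, normalized so shares sum to 100."""
        total = sum(self.totals.values()) or 1.0
        return {k: 100.0 * v / total for k, v in self.totals.items()}
```

A breakdown like Table 1 then falls out of calling `timed` around each stage of the pipeline and averaging once the map has matured.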
The numbers in Table 1 are gathered after the map has matured. Methods highlighted with † are mutually exclusive; for example, the Helix Bearing algorithm runs only when the MAV is performing turns, while the ranging task is on standby. Particle filtering has a roughly constant load on the system
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% utilization range. It should be stressed that this numerical figure includes operating system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to construct the SLAM results and other miscellaneous on-screen display information inside the video memory in real time. This is used to monitor the system for our own debugging purposes, but it is not required for MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
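Restricting the point cloud to landmarks in the front detection range, as described above, amounts to a simple range-and-bearing cull against the vehicle pose. The 10 m range and 70° field of view below are illustrative assumptions, not the actual camera parameters.

```python
import math

def landmarks_in_view(landmarks, pose, max_range=10.0,
                      half_fov=math.radians(35)):
    """Keep only landmarks within max_range (meters) of the vehicle and
    inside the camera's horizontal field of view.

    pose      -- (x, y, heading_rad) of the MAV in map coordinates
    landmarks -- iterable of (x, y) map positions
    """
    px, py, heading = pose
    visible = []
    for lx, ly in landmarks:
        dx, dy = lx - px, ly - py
        if math.hypot(dx, dy) > max_range:
            continue  # beyond the front detection range
        bearing = math.atan2(dy, dx) - heading
        # Wrap the relative bearing to [-pi, pi] before the FOV test.
        bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
        if abs(bearing) <= half_fov:
            visible.append((lx, ly))
    return visible
```

Culling the map this way keeps the particle filter's per-frame data association cost bounded by what the camera could plausibly observe, rather than by the full map size.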
6. Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application which requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of reaching high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have been mainly developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.
  Image acquisition and edge filtering: 10%
  Line and slope extraction: 2%
  Landmark extraction: 20%†
  Helix bearing: 20%†
  Ranging algorithms: below 1%
  Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, presence of external objects, and time-varying altitude).
Since the proposed monocular-camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we have used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of the camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow development of efficient vision SLAM and data association algorithms that take advantage of the intermediate image processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problem light source. Further reduction in contrast is possible
if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters before the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. Application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most of the glare occurs in the green light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design has been inspired by the human eye, which sees green better, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I³), the Department of Aerospace Engineering and Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. Davison, M. Nicholas, and S. Olivier, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
International Journal of
AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Active and Passive Electronic Components
Control Scienceand Engineering
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
RotatingMachinery
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
Journal ofEngineeringVolume 2014
Submit your manuscripts athttpwwwhindawicom
VLSI Design
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Shock and Vibration
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawi Publishing Corporation httpwwwhindawicom
Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
SensorsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Navigation and Observation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
DistributedSensor Networks
International Journal of
12 Journal of Electrical and Computer Engineering
0 10 20 30(m)
0 10 20 30(m)
(a)
(b)
Figure 13 (a) Experimental results of the proposed ranging andSLAM algorithm with state observer odometer trail Actual floor-plan of the building is superimposed later on a mature map toillustrate the accuracy of our method Note that the floor plan wasnot provided to the system a priori (b) The same environmentmapped by a ground robotwith a different starting point to illustratethat our algorithm is compatible with different platforms
0 10 20 30(m)
0 10 20 30(m)
Figure 14 Results of the proposed ranging and SLAM algorithmfrom a different experiment with state observer ground truth Allmeasurements are in meters The experiment was conducted underfluorescent ambient lightning and sunlight where applicable
0(m)50 1000
(m)50 100
Figure 15 Results of the proposed ranging and SLAM algorithmfrom an outdoor experiment in an urban area A small map ofthe area is provided for reference purposes (not provided to thealgorithm) and it indicates the robot path All measurements arein meters The experiment was conducted under sunlight ambientconditions and dry weather
Hallway length (m)
4035 30
25
25
2020
1515
05
10 10
0
5 50 0
Hallway width (m
)
151
minus5
altit
ude (
m)
Heli
copt
er
Figure 16 Cartesian (119909 119910 119911) position of the MAV in a hallwayas reported by proposed ranging and SLAM algorithm with time-varying altitude The altitude is represented by the 119911-axis andit is initially at 25 cm as this is the ground clearance of theultrasonic altimeter when the aircraft has landed MAV altitude wasintentionally varied by large amounts to demonstrate the robustnessof our method to the climb and descent of the aircraft whereas ina typical mission natural altitude changes are in the range of a fewcentimeters
A
B
C
D
Figure 17 Saint Vertigo the autonomous MAV helicopter consistsof four decksTheAdeck contains collective pitch rotor headmecha-nics The B deck comprises the fuselage which houses the powerplant transmission main batteries actuators gyroscope and thetail rotor The C deck is the autopilot compartment which containsthe inertial measurement unit all communication systems andall sensors The D deck carries the navigation computer which isattached to a digital video camera visible at the front
its features are accessible over the internet or an ad hoc TCP-IP network Among the other platforms shown in Figure 18Saint Vertigo has the most limited computational resources
52 Processing Requirements In order to effectively managethe computational resources on a light weight MAV com-puter we keep track of the CPU utilization for the algorithmsproposed in this paper Table 1 shows a typical breakdown ofthe average processor utilization per one video frame Eachcorresponding task elucidated in this paper is visualized inFigure 2
The numbers in Table 1 are gathered after the map hasmatured Methods highlighted with dagger are mutually exclusivefor example the Helix Bearing algorithm runs only when theMAV is performing turns while ranging task is on standbyParticle filtering has a roughly constant load on the system
Journal of Electrical and Computer Engineering 13
Figure 18 Our algorithms have been tested on a diverse set of mobile platforms shown here Picture courtesy of Space Systems and ControlsLab Aerospace Robotics Lab Digitalsmithy Lab and Rockwell Collins Advanced technology Center
once the map is populated We only consider a limitedpoint cloud with landmarks in the front detection range ofthe MAV (see Section 41) The MAV typically operates at80ndash90 utilization range It should be stressed that thisnumerical figure includes operating system kernel processeswhich involve video-memory procedures as the MAV is notequipped with a dedicated graphics processor The MAVis programmed to construct the SLAM results and othermiscellaneous on-screen display information inside the videomemory in real time This is used to monitor the system forour own debugging purposes but not required for the MAVoperation Disabling this feature reduces the load and freesup processor time for other tasks that may be implementedsuch as path planning and closed-loop position control
6 Conclusion and Future Work
In this paper we investigated the performance of monocularcamera based vision SLAM with minimal assumptions aswell as minimal aid from other sensors (altimeter only) in acorridor-following-flight application which requires preciselocalization and absolute range measurement This is trueeven for outdoor cases because our MAV is capable of build-ing high speeds and covering large distances very rapidly andsome of the ground robots we have tested were large enoughto become a concern for traffic and pedestriansWhile widelyrecognized SLAM methods have been mainly developedfor use with laser range finders this paper presented newalgorithms formonocular vision-based depth perception and
14 Journal of Electrical and Computer Engineering
Table 1 CPU utilization of the proposed algorithms
Image acquisition and edge filtering 10Line and slope extraction 2Landmark extraction 20dagger
Helix bearing 20dagger
Ranging algorithms Below 1Rao-Blackwellized particle filter 50
bearing sensing to accurately mimic the operation of such anadvanced device We were able to integrate our design withpopular SLAM algorithms originally meant for laser rangefinders and we have experimentally validated its operationfor autonomous indoor and outdoor flight and navigationwith a small fully self-contained MAV helicopter as well asother robotic platforms Our algorithms successfully adapt tovarious situations while successfully performing the transi-tion between (eg turns presence of external objects andtime-varying altitude)
Since the proposed monocular camera vision SLAMmethod does not need initialization procedures the missioncan start at an arbitrary point Therefore our MAV can bedeployed to infiltrate an unknown building One future taskis to add the capability to fly through doors and windowsIndeed the system is only limited by the capabilities of thecamera such as resolution shutter speed and reaction timeAll of those limitations can be overcome with the properuse of lenses and higher fidelity imaging sensors despite wehave used a consumer-grade USB camera Since the ability toextract good landmarks is a function of the camera capabili-ties a purpose-built camera is suggested for futurework Sucha camera would also allow development of efficient visionSLAM and data association algorithms that take advantageof the intermediate image processing data
Our future vision-based SLAM and navigation strategyfor an indoorMAV helicopter through hallways of a buildingalso includes the ability to recognize staircases and thustraversemultiple floors to generate a comprehensive volumet-ric map of the building This will also permit vision-based3D path planning and closed-loop position control of MAVbased on SLAM Considering our MAV helicopter is capableof outdoor flight we can extend our method to the outdoorperimeter of buildings and similar urban environments byexploiting the similarities between hallways and downtowncity maps Further considering the reduction in weight andindependence from GPS coverage our work also permitsthe development of portable navigation devices for a widerarray of applications such as small-scale mobile robotics andhelmet or vest mounted navigation systems
Certain environments and environmental factors provechallenging to our proposed method bright lights reflectivesurfaces haze and shadows These artifacts introduce twomain problems (1) they can alter chromatic clarity localmicrocontrast and exposure due to their unpredictable high-energy nature and (2) they can appear as false objectsespeciallywhen there is bloom surrounding objects in front ofproblem light source Further reduction in contrast is possible
if scattering particles in the air are dense We have come toobserve that preventative and defensive approaches to suchissues are promising Antireflective treatment on lenses canreduce light bouncing off of the lens and programming theaircraft to move for a very small distance upon detection ofglare can eliminate the unwanted effects Innovative andadaptive application of servo-controlled filters before thelenses can minimize or eliminate most if not all reflectionsThe light that causes glare is elliptically polarized due tostrong phase correlation This is as opposed to essential lightwhich is circularly polarized Filters can detect and blockpolarized light from entering the camera thereby blockingunwanted effects Application of purpose designed digitalimaging sensors that do not involve a Bayes filter can alsohelp Most of the glare occurs in green light region andtraditional digital imaging sensors have twice as many greenreceptors as red and blue Bayes design has been inspiredfrom human eye which sees green better as green is themost structurally descriptive light for edges and cornersThispaper has supplementary material (see Supplementary Mate-rial available online at httpdxdoiorg1011552013374165)available from the authors which show experimental resultsof the paper
Acknowledgments
The research reported in this paper was in part supportedby the National Science Foundation (Grant ECCS-0428040)Information Infrastructure Institute (1198683) Department ofAerospace Engineering and Virtual Reality Application Cen-ter at Iowa State University Rockwell Collins and Air ForceOffice of Scientific Research
References
[1] DHHubel and TNWiesel ldquoReceptive fields binocular inter-action and functional architecture in the catrsquos visual cortexrdquoTheJournal of Physiology vol 160 pp 106ndash154 1962
[2] N Isoda K Terada S Oe and K IKaida ldquoImprovement ofaccuracy for distance measurement method by using movableCCDrdquo in Proceedings of the 36th SICE Annual Conference (SICErsquo97) pp 29ndash31 Tokushima Japan July 1997
[3] R Hartley and A ZissermanMultiple View Geometry in Com-puter Vision Cambridge University Press 2nd edition 2003
[4] F Ruffier and N Franceschini ldquoVisually guided micro-aerialvehicle automatic take off terrain following landing and windreactionrdquo in Proceedings of the IEEE International Conferenceon Robotics and Automation pp 2339ndash2346 New Orleans LoUSA May 2004
[5] F Ruffier S Viollet S Amic and N Franceschini ldquoBio-inspired optical flow circuits for the visual guidance of micro-air vehiclesrdquo in Proceedings of the International Symposium onCircuits and Systems (ISCAS rsquo03) vol 3 pp 846ndash849 BangkokThailand May 2003
[6] J Michels A Saxena and A Y Ng ldquoHigh speed obstacle avoid-ance using monocular vision and reinforcement learningrdquo inProceedings of the 22nd International Conference on MachineLearning (ICML rsquo05) vol 119 pp 593ndash600 August 2005
Journal of Electrical and Computer Engineering 15
[7] A Saxena J Schulte and A Y Ng ldquoDepth estimation usingmonocular and stereo cuesrdquo in Proceedings of the 20th inter-national joint conference on Artifical intelligence (IJCAI rsquo07) pp2197ndash2203 2007
[8] N Snavely S M Seitz and R Szeliski ldquoPhoto tourism explor-ing photo collections in 3DrdquoACMTransactions onGraphics vol25 no 3 2006
[9] A W Fitzgibbon and A Zisserman ldquoAutomatic camera recov-ery for closed or open image sequencesrdquo in Proceedings of theEuropean Conference on Computer Vision pp 311ndash326 June1998
Journal of Electrical and Computer Engineering 13
Figure 18: Our algorithms have been tested on a diverse set of mobile platforms, shown here. Picture courtesy of Space Systems and Controls Lab, Aerospace Robotics Lab, Digitalsmithy Lab, and Rockwell Collins Advanced Technology Center.
once the map is populated. We only consider a limited point cloud with landmarks in the front detection range of the MAV (see Section 4.1). The MAV typically operates in the 80–90% CPU utilization range. It should be stressed that this figure includes operating-system kernel processes, which involve video-memory procedures, as the MAV is not equipped with a dedicated graphics processor. The MAV is programmed to render the SLAM results and other miscellaneous on-screen-display information in video memory in real time. This is used to monitor the system for our own debugging purposes but is not required for MAV operation. Disabling this feature reduces the load and frees up processor time for other tasks that may be implemented, such as path planning and closed-loop position control.
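Restricting the active point cloud to the front detection range is what keeps the per-frame SLAM update cost bounded. A minimal sketch of such a gating step (hypothetical helper, not the paper's implementation; `landmarks` and `pose` are assumed names):

```python
import math

def landmarks_in_front(landmarks, pose, max_range, half_fov_rad):
    """Return only landmarks inside a forward-facing detection cone.

    `landmarks` is a list of (x, y) map points and `pose` is
    (x, y, heading) of the vehicle. Points behind the vehicle, outside
    the field of view, or beyond `max_range` are excluded, so the
    active point cloud stays small regardless of total map size.
    """
    px, py, heading = pose
    kept = []
    for lx, ly in landmarks:
        dx, dy = lx - px, ly - py
        if math.hypot(dx, dy) > max_range:
            continue
        # Bearing relative to the vehicle heading, wrapped to [-pi, pi].
        b = math.atan2(dy, dx) - heading
        b = math.atan2(math.sin(b), math.cos(b))
        if abs(b) <= half_fov_rad:
            kept.append((lx, ly))
    return kept
```

With a 90-degree field of view and a 10 m range, a landmark 5 m straight ahead is kept while one directly to the side or 20 m away is dropped.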
6 Conclusion and Future Work
In this paper, we investigated the performance of monocular-camera-based vision SLAM with minimal assumptions, as well as minimal aid from other sensors (altimeter only), in a corridor-following flight application that requires precise localization and absolute range measurement. This is true even for outdoor cases, because our MAV is capable of building high speeds and covering large distances very rapidly, and some of the ground robots we have tested were large enough to become a concern for traffic and pedestrians. While widely recognized SLAM methods have mainly been developed for use with laser range finders, this paper presented new algorithms for monocular vision-based depth perception and
Table 1: CPU utilization of the proposed algorithms.

Image acquisition and edge filtering: 10%
Line and slope extraction: 2%
Landmark extraction: 20%†
Helix bearing: 20%†
Ranging algorithms: below 1%
Rao-Blackwellized particle filter: 50%
bearing sensing to accurately mimic the operation of such an advanced device. We were able to integrate our design with popular SLAM algorithms originally meant for laser range finders, and we have experimentally validated its operation for autonomous indoor and outdoor flight and navigation with a small, fully self-contained MAV helicopter, as well as other robotic platforms. Our algorithms successfully adapt to various situations while performing the transitions between them (e.g., turns, the presence of external objects, and time-varying altitude).
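The integration hinges on converting monocular measurements into the range-bearing form that laser-based SLAM back ends expect. As a minimal sketch (not the paper's full method, which exploits architectural lines), assume a pinhole camera at known altitude h above a flat floor, with a level optical axis and intrinsics fx, fy, cx, cy; a floor feature observed at pixel (u, v) then yields a pseudo range-bearing observation:

```python
import math

def pixel_to_range_bearing(u, v, h, fx, fy, cx, cy):
    """Convert a floor-plane feature pixel into a (range, bearing) pair.

    Sketch under simplifying assumptions: level camera at height h,
    flat floor, no lens distortion. Pixels below the principal point
    (v > cy) correspond to visible floor points.
    """
    if v <= cy:
        raise ValueError("pixel is at or above the horizon")
    ground = h * fy / (v - cy)        # forward distance along the floor
    bearing = math.atan2(u - cx, fx)  # lateral angle from the optical axis
    rng = ground / math.cos(bearing)  # ground-plane range to the feature
    return rng, bearing
```

For example, with h = 1.5 m, fx = fy = 500, and principal point (320, 240), a feature at pixel (320, 390) lies 5 m straight ahead; feeding such (range, bearing) pairs to a Rao-Blackwellized particle filter makes the camera look like a laser range finder to the filter.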
Since the proposed monocular-camera vision SLAM method does not need initialization procedures, the mission can start at an arbitrary point. Therefore, our MAV can be deployed to infiltrate an unknown building. One future task is to add the capability to fly through doors and windows. Indeed, the system is only limited by the capabilities of the camera, such as resolution, shutter speed, and reaction time. All of those limitations can be overcome with the proper use of lenses and higher-fidelity imaging sensors, even though we used a consumer-grade USB camera. Since the ability to extract good landmarks is a function of camera capabilities, a purpose-built camera is suggested for future work. Such a camera would also allow the development of efficient vision SLAM and data-association algorithms that take advantage of the intermediate image-processing data.
Our future vision-based SLAM and navigation strategy for an indoor MAV helicopter flying through the hallways of a building also includes the ability to recognize staircases and thus traverse multiple floors to generate a comprehensive volumetric map of the building. This will also permit vision-based 3D path planning and closed-loop position control of the MAV based on SLAM. Considering that our MAV helicopter is capable of outdoor flight, we can extend our method to the outdoor perimeter of buildings and similar urban environments by exploiting the similarities between hallways and downtown city maps. Further, considering the reduction in weight and the independence from GPS coverage, our work also permits the development of portable navigation devices for a wider array of applications, such as small-scale mobile robotics and helmet- or vest-mounted navigation systems.
Certain environments and environmental factors prove challenging to our proposed method: bright lights, reflective surfaces, haze, and shadows. These artifacts introduce two main problems: (1) they can alter chromatic clarity, local microcontrast, and exposure due to their unpredictable, high-energy nature, and (2) they can appear as false objects, especially when there is bloom surrounding objects in front of the problematic light source. A further reduction in contrast is possible
if scattering particles in the air are dense. We have come to observe that preventative and defensive approaches to such issues are promising. Antireflective treatment on lenses can reduce light bouncing off of the lens, and programming the aircraft to move a very small distance upon detection of glare can eliminate the unwanted effects. Innovative and adaptive application of servo-controlled filters in front of the lenses can minimize or eliminate most, if not all, reflections. The light that causes glare is elliptically polarized due to strong phase correlation, as opposed to essential light, which is circularly polarized. Filters can detect and block polarized light from entering the camera, thereby blocking the unwanted effects. The application of purpose-designed digital imaging sensors that do not involve a Bayer filter can also help. Most glare occurs in the green-light region, and traditional digital imaging sensors have twice as many green receptors as red and blue. The Bayer design was inspired by the human eye, which sees green best, as green is the most structurally descriptive light for edges and corners. This paper has supplementary material (see Supplementary Material available online at http://dx.doi.org/10.1155/2013/374165), available from the authors, which shows experimental results of the paper.
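The two-to-one green ratio mentioned above comes from the Bayer color filter array, whose repeating 2×2 tile (e.g., RGGB) contains two green sites, one red, and one blue. A toy illustration of this layout (hypothetical helper; real sensors may use other tile orderings):

```python
from collections import Counter

def bayer_counts(width, height):
    """Count color-filter sites in an RGGB Bayer mosaic of a sensor.

    Each 2x2 tile is laid out as R G / G B, so green sites outnumber
    red sites (and blue sites) two to one on any even-sized sensor.
    """
    tile = {(0, 0): "R", (0, 1): "G", (1, 0): "G", (1, 1): "B"}
    counts = Counter()
    for y in range(height):
        for x in range(width):
            counts[tile[(y % 2, x % 2)]] += 1
    return counts
```

A 4×4 patch, for instance, contains 8 green, 4 red, and 4 blue sites, matching the human eye's heightened green sensitivity.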
Acknowledgments
The research reported in this paper was in part supported by the National Science Foundation (Grant ECCS-0428040), the Information Infrastructure Institute (I³), the Department of Aerospace Engineering and the Virtual Reality Application Center at Iowa State University, Rockwell Collins, and the Air Force Office of Scientific Research.
References
[1] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, pp. 106–154, 1962.
[2] N. Isoda, K. Terada, S. Oe, and K. Ikaida, "Improvement of accuracy for distance measurement method by using movable CCD," in Proceedings of the 36th SICE Annual Conference (SICE '97), pp. 29–31, Tokushima, Japan, July 1997.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2nd edition, 2003.
[4] F. Ruffier and N. Franceschini, "Visually guided micro-aerial vehicle: automatic take off, terrain following, landing and wind reaction," in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2339–2346, New Orleans, La, USA, May 2004.
[5] F. Ruffier, S. Viollet, S. Amic, and N. Franceschini, "Bio-inspired optical flow circuits for the visual guidance of micro-air vehicles," in Proceedings of the International Symposium on Circuits and Systems (ISCAS '03), vol. 3, pp. 846–849, Bangkok, Thailand, May 2003.
[6] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), vol. 119, pp. 593–600, August 2005.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 2197–2203, 2007.
[8] N. Snavely, S. M. Seitz, and R. Szeliski, "Photo tourism: exploring photo collections in 3D," ACM Transactions on Graphics, vol. 25, no. 3, 2006.
[9] A. W. Fitzgibbon and A. Zisserman, "Automatic camera recovery for closed or open image sequences," in Proceedings of the European Conference on Computer Vision, pp. 311–326, June 1998.
[10] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
[11] L. Clemente, A. Davison, I. Reid, J. Neira, and J. Tardos, "Mapping large loops with a single hand-held camera," in Proceedings of the Robotics: Science and Systems Conference, June 2007.
[12] F. Dellaert, W. Burgard, D. Fox, and S. Thrun, "Using the condensation algorithm for robust, vision-based mobile robot localization," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 588–594, June 1999.
[13] N. Cuperlier, M. Quoy, P. Gaussier, and C. Giovanangeli, "Navigation and planning in an unknown environment using vision and a cognitive map," in Proceedings of the IJCAI Workshop: Reasoning with Uncertainty in Robotics, 2005.
[14] G. Silveira, E. Malis, and P. Rives, "An efficient direct approach to visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 969–979, 2008.
[15] A. P. Gee, D. Chekhlov, A. Calway, and W. Mayol-Cuevas, "Discovering higher level structure in visual SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 980–990, 2008.
[16] K. Celik, S.-J. Chung, and A. K. Somani, "Mono-vision corner SLAM for indoor navigation," in Proceedings of the IEEE International Conference on Electro/Information Technology (EIT '08), pp. 343–348, Ames, Iowa, USA, May 2008.
[17] K. Celik, S.-J. Chung, and A. K. Somani, "MVCSLAM: mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments," in Proceedings of the AIAA Guidance, Navigation and Control Conference, Honolulu, Hawaii, USA, August 2008.
[18] K. Celik, S.-J. Chung, and A. K. Somani, "Biologically inspired monocular vision based navigation and mapping in GPS-denied environments," in Proceedings of the AIAA Infotech at Aerospace Conference and Exhibit and AIAA Unmanned...Unlimited Conference, Seattle, Wash, USA, April 2009.
[19] K. Celik, S.-J. Chung, M. Clausman, and A. K. Somani, "Monocular vision SLAM for indoor aerial vehicles," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, Mo, USA, October 2009.
[20] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
[21] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
[22] K. Celik and A. K. Somani, "Wandless realtime autocalibration of tactical monocular cameras," in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV '12), Las Vegas, Nev, USA, 2012.
[23] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: a factored solution to the simultaneous localization and mapping problem," in Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598, 2002.
[24] J. P. How, B. Bethke, A. Frank, D. Dale, and J. Vian, "Real-time indoor autonomous vehicle test environment," IEEE Control Systems Magazine, vol. 28, no. 2, pp. 51–64, 2008.
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawi Publishing Corporation httpwwwhindawicom
Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
SensorsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Navigation and Observation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
DistributedSensor Networks
International Journal of