DESIGN AND IMPLEMENTATION OF AUTONOMOUS VISION-GUIDED MICRO AIR VEHICLES
BY
SCOTT M. ETTINGER
A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
2001
ACKNOWLEDGMENTS
I would like to sincerely thank both Dr. Peter Ifju and Dr. Michael Nechyba for the invalu-
able opportunity to work alongside them and to share their knowledge, philosophies, and ideas. I
also thank them for the great creative freedom granted to me in pursuing this work and for their
endless enthusiasm and hands-on involvement during the course of my study. The value of their
influence cannot be overstated. I would also like to thank my family for their long-standing sup-
port through all of my decisions.
TABLE OF CONTENTS
page
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

CHAPTERS

1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
   1.1 Micro Air Vehicles . . . . . . . . . . . . . . . . . . . . . . . . 1
   1.2 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
   1.3 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
   1.4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 VISION ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . 8
   2.1 Vision System for Aircraft Control . . . . . . . . . . . . . . . . 8
   2.2 Initial Algorithm Attempts . . . . . . . . . . . . . . . . . . . . 9
   2.3 Revised Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 12
   2.4 Mathematical Representations . . . . . . . . . . . . . . . . . . . 13
3 REAL TIME IMPLEMENTATION OF THE VISION SYSTEM . . . . . . . . . . . . 19
   3.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 19
   3.2 Statistics Calculations . . . . . . . . . . . . . . . . . . . . . 23
   3.3 Algorithm Performance . . . . . . . . . . . . . . . . . . . . . . 25
4 CONTROL ALGORITHMS AND FLIGHT TESTING . . . . . . . . . . . . . . . . 30
   4.1 Extreme Attitude Detection . . . . . . . . . . . . . . . . . . . . 30
   4.2 Kalman Filtering . . . . . . . . . . . . . . . . . . . . . . . . . 36
   4.3 Feedback Control . . . . . . . . . . . . . . . . . . . . . . . . . 37
   4.4 Flight Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5 FUTURE WORK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
   5.1 Stability Subsystem . . . . . . . . . . . . . . . . . . . . . . . 42
   5.2 Vision Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 43
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science
DESIGN AND IMPLEMENTATION OF AUTONOMOUS VISION-GUIDED MICRO AIR VEHICLES
By
Scott M. Ettinger
August 2001
Chairman: Dr. Michael C. Nechyba
Major Department: Electrical and Computer Engineering
Our goal in this thesis is to develop Micro Air Vehicle (MAV) technologies that lead to prac-
tical applications. To accomplish this, we focus our work along two principal directions: (1) the
design of practical flight vehicles, and (2) the development of a control subsystem capable of sup-
porting autonomous flight missions. Our control system is based on a novel computer-vision-
based algorithm for attitude estimation of the flight vehicle. In this thesis, we first describe an
algorithm for robust detection and tracking of the horizon. This algorithm has been shown to per-
form well in varying weather conditions and over rapidly changing ground cover. We next discuss
algorithmic optimizations that allow the algorithm to be run at full frame rate. Then, we describe
how we use vision-based horizon tracking to develop flight stability control and error detection/
recovery for practical MAV flights. Our overall control approach has been successfully tested in
multiple autonomous MAV flights above scenery at the University of Florida and Fort Campbell,
Kentucky.
CHAPTER 1
INTRODUCTION
1.1 Micro Air Vehicles
Ever since man's first powered flight, research efforts have continually pushed the envelope
to create machines that are faster and/or larger than ever before. Now, however, there is an effort to
design aircraft at the other, largely unexplored end of the spectrum: aircraft that are as
small and slow as the laws of aerodynamics will permit. The desire for portable, low altitude aerial
surveillance has driven the development of aircraft on the scale of small birds. Vehicles in this
class of small-scale aircraft are known as Micro Air Vehicles, or MAVs, and have great potential for
applications in surveillance and monitoring tasks in areas either too remote or too dangerous to
send human agents. Equipped with small video cameras and transmitters, MAVs can image targets
that would otherwise remain inaccessible. MAVs are also capable of carrying an array of sensors
to obtain additional information including, for example, airborne chemical or radiation levels.
Current industry trends toward miniaturization of both electronics and communications
devices have enabled many recent advances in MAVs. As the technology improves further, more
and more tasks are being considered for potential MAV applications. Operational MAVs would
enable a number of important civilian missions, including chemical/radiation spill monitoring, for-
est fire reconnaissance, visual monitoring of volcanic activity, surveys of natural disaster areas,
and even inexpensive traffic and accident monitoring.
In the military, one of the primary roles for MAVs will be as small-unit battlefield surveil-
lance agents. As such, MAVs can act as an extended set of eyes in the sky for military units in the
field. This use of MAV technology is intended to reduce the risk to military personnel and to sig-
nificantly enhance intelligence capabilities. MAVs are particularly suited for such surveillance
tasks, as they are virtually undetectable from the ground. Even within visual range, they often go
unnoticed due to their resemblance to birds. This stealth property also lends itself to non-mili-
tary applications that require unobtrusive surveillance such as wildlife monitoring.
1.2 Challenges
In 1997, the United States Defense Advanced Research Projects Agency (DARPA) launched
an initiative to develop MAVs for the battlefield arena. This initiative prompted the development
and design of a number of flying vehicles including the six-inch wingspan Black Widow by
Aerovironment [5], the MicroStar project by Lockheed Sanders, and the MITE prototype vehicle
by Naval Research Laboratories [8]. At least one commercial company, MLB, has ventured into the
design of MAVs for sale on the consumer market [9]. These vehicles demonstrated that powered
flight is in fact possible on a very small scale.
There are a number of formidable challenges to designing aircraft at the MAV scale that are
not present when designing larger scale vehicles. These challenges fall into three broad categories:
(1) aerodynamic efficiency, (2) increased wing loading, and (3) stability and control. As vehicle
size decreases, the viscous effects of the airflow, which are generally ignored in the design of
large-scale aircraft, begin to have a significant impact on aerodynamic performance. On the MAV
scale, the laminar flow that prevails is easily separated, creating large separation bubbles, espe-
cially at higher angles of attack [17]. Even the best airfoils on the MAV scale have lift to drag
ratios almost an order of magnitude smaller than their larger scale counterparts [15].
The challenges related to wing loading are a direct result of the scale of these aircraft. As the
wingspan of flying vehicles decreases, the mass of the structures required for the vehicle increases
relative to the wing area. Biological studies have shown that the wing loading of birds and insects
increases linearly with a reduction in wing area due to the fundamental material properties of their
structures. This feature of scaling down requires all subsystems of MAVs to be as lightweight as
possible.
Stability and control presents perhaps the most difficult challenge in deploying operational
and usable MAVs. The low moments of inertia of MAVs make them vulnerable to rapid angular
accelerations, a problem further complicated by the fact that aerodynamic damping of angular
rates decreases with a reduction in wingspan. Another potential source of instability for MAVs is
the relative magnitudes of wind gusts, which are much higher at the MAV scale than for larger air-
craft. In fact, wind gusts can typically be equal to or greater than the forward airspeed of the MAV
itself. Thus, an average wind gust can immediately effect a dramatic change in the flight path of
these vehicles. From our own early flight tests, it has become clear that a very robust control sys-
tem is indeed required for practical flight missions on the MAV scale.
The design of an effective MAV control system is further complicated by the limits of cur-
rent sensor technology. The technologies used in rate and acceleration sensors on larger aircraft are
not currently available at the MAV scale. It has proven very difficult, if not impossible, to scale
these technologies down to meet the very low payload requirements of MAVs. While a number of
sensor technologies do currently exist in small enough packages to be used in MAV systems, there
is a tradeoff between accuracy and size. Take, for example, MEMS (Micro Electro-Mechanical
Systems) rate gyros and accelerometers. MEMS piezoelectric gyros, while only weighing approximately
one gram, have drift rates on the order of 100° per minute and are highly sensitive to
changes in temperature. While elaborate temperature calibration procedures can improve their
accuracy to some degree, their use in inertial navigation is difficult at best.
1.3 Solutions
Our overall goal in this work is to develop MAV technologies that lead to deployable vehi-
cles for practical applications. Given the challenges discussed in the previous section, this goal
necessitates work along two principal directions: (1) the design of practical flight vehicles, and (2)
the development of a control subsystem capable of supporting autonomous flight missions. It is
these two tasks that are the main focus of this thesis.
1.3.1 Micro Air Vehicle Design
Our goal in airframe design is to create an efficient flight vehicle with as much inherent sta-
bility as possible. By far, the best flying vehicles on the MAV scale are those found in nature.
Small birds can fly for great distances on a limited amount of fuel and have excellent ability to
maintain stable flight and navigate in even the most turbulent wind conditions. Thus, in our design
process we look to biological systems, not to directly emulate, but to gain insight from nature's
very successful and highly optimized designs.
In the design of conventional aircraft, wings are designed to be as rigid as possible in order
to avoid potentially catastrophic failures due to structural dynamics. Birds, however, do not have
such rigid wings; the wings of birds, and especially bats, exhibit a great deal of flexibility in their
structure. This suggests that in the design of MAVs, we may be able to take advantage of wing
deflections, or aeroelastic effects, caused by aerodynamic loading. The use of flexible wings for
MAVs introduces advantages for both aerodynamic efficiency and stability. Through a passive
mechanism of the flexible wing we call adaptive washout, wind gusts can be suppressed to reduce
their impacts on stability. A similar mechanism is used in the design of most modern wind surfing
sails to reduce the changes in force felt by the wind surfer's mast, allowing the rider to hold on in
gusty conditions.
In order to implement this flexible wing concept, we have developed new construction tech-
niques which make use of carbon fiber composite materials, resulting in structures that are very
durable and lightweight yet simple to construct.
1.3.2 Vision-Based Control System
In designing a control system, we require a sufficiently robust control strategy to support
autonomous MAV flights, while addressing the problems associated with current sensor technol-
ogy. Moreover, such a control system should conserve both weight and payload volume to accom-
modate the needs of ever smaller MAVs. Once again, we look towards biological systems for
inspiration. In studying the nervous (control) system of birds, one basic observation holds true for
virtually all of the thousands of different bird species: Birds rely heavily on sharp eyes and vision
to guide almost every aspect of their behavior [18].
The eyes of birds tend to be large in relation to the size of their head; in fact, for some birds,
such as the European starling, eyes make up fully 15% of the mass of their head, compared to 1%
for humans [14]. Not only is the relative size of avian eyes impressive, so too is their color percep-
tion and sharpness of vision. Photoreceptor (cone) densities in the foveae can be as high as 65,000
per square millimeter in bird eyes, compared to 38,000 per square millimeter for humans [18].
Some birds exhibit visual acuity three times that of humans; for example, the American Kestrel
can recognize an insect two millimeters in length from a distance as far away as 18 meters [4].
Moreover, a substantial portion of a bird s brain is devoted to processing visual stimuli. While
birds utilize other sensory input for control, including inner ear structures and tactile sensory
nerves distributed throughout the body, vision appears most critical to the capabilities of birds.
Inspired by their biological counterparts, we have developed a computer-vision based control
system for MAV flight stability and autonomy. Our system offers a number of advantages over
other approaches. A computer-vision based control system means that no additional sensors are
required on board the MAV itself, since small, lightweight video cameras¹ have been included on
almost all previous MAV designs to fulfill their primary mission of surveillance. It is important to
1. Full color, high resolution cameras weighing less than two grams have been fabricated onto silicon wafers using CMOS technology. These cameras are a maturing technology and are currently commercially available.
note, however, that up to now, on-board cameras had been used, not to control MAVs, but only to
relay images back to the remote MAV pilot.
A computer-vision based approach will also facilitate a number of autonomous MAV behav-
iors that have, as of yet, not been achieved; these behaviors include robust stability, target identifi-
cation and tracking, and landmark navigation. Moreover, many potential MAV missions include
the use of groups or swarms of MAVs, which requires the additional ability to determine the rel-
ative position of other vehicles within the group. All of these issues can be addressed through the
use of a vision-based control system.
Computer vision as a technology is only in its infancy. Up until recently, computer imple-
mentations of real-time vision systems were all but impossible due to the enormous amount of pro-
cessing power needed. Computer vision is, however, becoming more practical every day. Current
technology desktop computers are powerful enough to implement moderately complex computer
vision algorithms today. The power of extremely small microprocessors is expected to grow expo-
nentially to meet the needs of the rapidly growing handheld device industry (PDAs, cellular
phones, etc.); as such, embedded microprocessor systems small enough to be used on the MAV
scale will soon be powerful enough to implement computer vision control strategies on board.
Prior to this work, a University of Florida student, Gabriel Torres, made initial investigations
into a rudimentary form of vision control for flight using Cadmium Sulfide cells to sense the gen-
eral orientation of the horizon on a television monitor. While little or no other work exists in com-
puter-vision based control of aerial vehicles, recent vision-based systems have been developed for
the autonomous control of ground vehicles. Two examples of such systems are RALPH (Rapid
Lateral Position Handler) [12] and Demeter [11]. In RALPH, a forward-looking camera in a pas-
senger car detects the position of the road ahead, and adjusts its steering to stay on the road. This
system has been demonstrated on a journey from Washington, D.C. to San Diego, 98%+ of which
was driven autonomously. Demeter was implemented on board a farm tractor for automated har-
vesting of alfalfa. This system was used to automatically drive a harvester through crop fields by
seeing the line that separates cut from uncut crop.
Of course, the main difference between ground-based systems and flight-based systems is in
the static stability and degrees of freedom of the vehicles. Ground vehicles are constrained to three
degrees of freedom and are statically stable, while aerial vehicles operate with six degrees of free-
dom and may not be statically stable. As such, autonomous vision-based control of MAVs or other
aerial vehicles presents challenges that ground vehicles simply do not.
1.4 Overview
In this thesis, we first describe the vision-based flight stability system that we have devel-
oped, implemented and tested on MAVs. Chapter 2 describes the vision algorithm for horizon
detection and tracking, which lies at the core of the vision-based control system. Chapter 3 dis-
cusses algorithmic speed-ups that allow the algorithm to be run in real-time. Chapter 4 describes
the horizon-based control algorithm, and illustrates examples of autonomous flights. Finally,
Chapter 5 offers some concluding thoughts and ideas for future work.
CHAPTER 2
VISION ALGORITHMS
2.1 Vision System for Aircraft Control
In order to design a stability controller for any aircraft, a measurement of the vehicle's angular
orientation is required for feedback control. Normally this is estimated through the integration
of the vehicle s angular rates or accelerations. A vision-based system, in contrast, can provide a
direct measurement of the aircraft's orientation with respect to the ground. The two degrees of
freedom critical for stability are the bank angle (Φ) and the pitch angle (Θ). These two angles can
be directly determined from a line representing the horizon on an image from a forward facing
camera on the aircraft. The bank angle is easily determined as the inverse tangent of the slope of
the horizon line. The pitch angle cannot be exactly calculated from an arbitrary horizon line;
however, the pitch angle will be closely proportional to the percentage of the image above or below the
line. The percentage above or below will be represented by the symbols σa and σb respectively¹.
Since σa can be trivially calculated from σb (σa = 1 − σb), we will concern ourselves with the
use of σb.
Using a rectangular image, the relationship between σb and the actual pitch angle is nonlinear
and can be lightly coupled with Φ and possibly vary with camera distortions. A calibration
table can be used to resolve the discrepancy; however, our flight tests have shown adequate
performance using a linear relationship between Θ and σb.
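As a brief sketch of this mapping (not the thesis's exact implementation), the two control angles can be recovered from a horizon estimate as follows; the pitch gain k_pitch and the choice of zero pitch at σb = 0.5 are illustrative assumptions that a real system would calibrate against flight data:

```python
import math

def bank_and_pitch(m, sigma_b, k_pitch=1.0):
    """Estimate bank (Phi) and pitch (Theta) from a horizon line.

    m       : slope of the horizon line in image coordinates
    sigma_b : fraction of the image area below the horizon (0..1)
    k_pitch : hypothetical linear gain from sigma_b to pitch (radians)
    """
    bank = math.atan(m)                # Phi: inverse tangent of the slope
    pitch = k_pitch * (sigma_b - 0.5)  # Theta: linear in sigma_b (approximation)
    return bank, pitch
```

For a level horizon filling half the image (m = 0, σb = 0.5), both angles come out zero, consistent with straight-and-level flight.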
At the heart of the vision system is the horizon fitting algorithm. As the name implies, this
algorithm finds the line with the highest probability of separating ground from sky in a given
image. As with most computer vision endeavors, this problem appears at first glance to be almost
1. These symbols were chosen as a result of the comprehensive use of virtually every letter in both the Arabic and Greek alphabets within aerospace terminology.
embarrassingly simple. Unfortunately, however, we take for granted the tremendous amount of
processing responsible for our own perception of even something as simple as a horizon line.
While it may be common sense for us to identify the sky and the ground in an image, this com-
mon sense is the result of complex systems throughout the brain combined with years of experi-
ence and learning. This is never more evident than when attempting to devise an algorithm for a
sequential computer to try to perform even a simple task with computer vision. In the case of hori-
zon tracking, there are a number of additional challenges introduced by some of the characteristics
of typical MAV missions. As opposed to larger aircraft, MAVs typically fly at very
low altitudes over potentially rapidly changing terrain. Missions in urban areas can introduce a
wide variety of objects, each of which looks different. This situation significantly increases the
difficulty of identifying a horizon line as described below.
2.2 Initial Algorithm Attempts
When approaching this problem, it is tempting to assume that the sky will be high intensity
(light) while the ground will be low intensity (dark) and simply look for the edge between the two;
however, this approach rapidly breaks down in the presence of many of the types of ground
objects typically encountered during MAV flights. While this approach will work adequately when
flying over a uniform forest on a sunny day, it is doomed to failure for most useful MAV missions.
Unfortunately, our first attempt at a horizon tracking algorithm made exactly this type of assump-
tion.
This initial algorithm attempt was based upon the work done by Mark Ollis on the Demeter
Harvester system at Carnegie Mellon University [11]. The purpose of the Demeter system was to
use computer vision to detect the line between harvested and non-harvested crops in a field in
order to automate the harvesting process. The overall approach of the Demeter algorithm was to fit
a step function to each horizontal raster line in a two-dimensional image consisting of 8-bit values.
Our initial attempt at an algorithm employed a similar approach. The basic idea behind the algo-
rithm was to fit a step function to the intensity values of each vertical line in the image. This idea is
illustrated in Figure 2-1. Once the positions of the best-fit transitions for each vertical line in the
image are known, a linear regression was performed on the set of (x, y) positions of these points to
hopefully fit a line to the horizon. This process could be repeated using horizontal raster lines for
more accuracy at bank angles greater than 45 degrees. Each point along the line is evaluated by
calculating the mean and standard deviation values for the set of points on either side of the line.
The best-fit step position is the position that minimizes the variance on either side of the line. The
step fitting was performed using an efficient algorithm described in Ollis and Stentz [11].
Unlike in the Demeter harvester application, however, this algorithm cannot assume that every
vertical line in the image will include a transition from ground to sky. In order to function properly,
Figure 2-1: Vertical-line step fitting.
the algorithm must be able to determine when a transition has not occurred and to discard the point
generated along that line. This is done by using the means calculated for each side of the step func-
tion. If the means are not sufficiently separated from one another, the line is assumed not to con-
tain a horizon transition. A threshold value was used to make the determination. This algorithm
showed promise on the initial set of test data used to evaluate it. This data consisted of images with
only trees in the ground region taken on a sunny day with a high quality digital camera. The algo-
rithm was shown to work very well on this set of test data, even at greatly reduced resolutions.
When data from real MAV flights were tested, it immediately became clear that this algorithm
would perform poorly. Objects on the ground with similar intensities to that of the sky cause large
errors in the horizon estimates using this algorithm. An example of this type of error is shown in
Figure 2-2. A number of attempts were made to improve the algorithm using robust outlier rejection
techniques and other heuristics, with only slight improvements. While the algorithm could provide
Figure 2-2: Failure of initial algorithm.
good estimates under certain conditions, it would be of very limited use for practical MAV mis-
sions.
2.3 Revised Algorithm
There are two major deficiencies in the previously described algorithm. First, and most
detrimental, is the fact that it makes a priori assumptions about the image data, specifically about the
intensities of sky and ground. Another significant flaw is that, while color information is
available, the algorithm collapses it into intensity values, immediately discarding at least
two-thirds of the information available. In light of these issues, a new
algorithm was developed. This new algorithm makes no prior assumptions about the image data;
in fact it is able to build a model for use in further control algorithms as described below. The only
real assumption made for this algorithm is that the horizon can be approximated by a straight line.
The new technique is partially derived from an introspective analysis of the way humans may
approach this problem. It is clear that humans do not attempt to fit a step function to every vertical
line in an image. While we may employ some form of edge detection, it is hypothesized that in
general, using our knowledge of the physical world, we look for two distinct, continuous regions
in the image, each of which has its own distinct properties of color and texture. To be sure,
there are also far more complex mechanisms at work in the human mind. For example, our knowl-
edge of the world can also give us clues for proper identification (e.g., buildings are not usually in
the sky). For the purposes of this work, it was desired to develop an algorithm able to run in real
time on current desktop PC technology and on embedded processor technology in the near future.
For this reason, algorithm complexity is reduced by limiting ourselves to the use of color informa-
tion, as texture information requires considerable computation to obtain while color information is
immediately available from image data.
The approach described above lends itself well to statistical modeling methods. The basic
principle behind the new algorithm is to model both sky and ground as a statistical distribution in
color space. The task is to then find the sets of points within the image that have the highest likeli-
hood of fitting the given distributions. Using our knowledge that the horizon can be approximated
by a line, we can limit our search to the sets of points consisting of continuous regions in the image
separated by a straight line. This approach can be described as fitting two polygons to both the sky
and ground regions. In the case of a straight-line horizon, the sum of the sides of the two polygon
regions will always equal 8. As an aside, a more computationally intensive algorithm could use the
same approach to fit N-sided polygons to the two regions to more accurately describe a non-linear
horizon.
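To make this statistical view concrete, here is a hedged sketch of a cost for one candidate split of the image into sky and ground pixel sets, scoring each region by the trace (sum of eigenvalues) of its RGB covariance matrix; treat this scalar reduction as one illustrative choice, not necessarily the measure developed in the text:

```python
import numpy as np

def horizon_cost(sky_px, gnd_px):
    """Cost of a candidate horizon split: lower is better.

    sky_px, gnd_px : (N, 3) arrays of RGB pixels assigned to each
    region (each needs at least two pixels for a covariance).
    Each region is scored by the trace of its RGB covariance matrix,
    a scalar measure of spread about the region's mean color.
    """
    cost = 0.0
    for px in (sky_px, gnd_px):
        cov = np.cov(px, rowvar=False)        # 3x3 covariance in RGB space
        cost += np.trace(np.atleast_2d(cov))  # sum of eigenvalues
    return cost
```

A split that cleanly separates two tight color clusters scores lower than one that mixes sky pixels into the ground region, which is exactly the behavior the search over candidate lines exploits.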
2.4 Mathematical Representations
In this algorithm, two representations of a horizon line are used. For the stability and control
section of the system, the (Φ, σb) definition of a line is more convenient, while for image
processing, the more familiar slope (m)-intercept (b) form is necessary. A transformation was
created to convert between the two representations. Since closed form solutions for these
transformations are difficult if not impossible to derive, the transformations are evaluated on a
case-by-case basis according to the 8 possible cases of a line intersecting with a rectangle. Each
case is distinguished by which sides of the rectangle are intersected by the line. These cases are
summarized in Figure 2-3. While these transforms are not necessarily elegant, they execute
quickly. The transformation from (m, b) form to (Φ, σb) is the more straightforward of the two.
Once the particular intersection case is determined, the area below the line is calculated. σb is
simply the ratio of this area to the total area of the rectangle, while Φ is calculated as the inverse
tangent of the slope m. The transform from (Φ, σb) to (m, b) is less intuitive. First, the slope m is easily
determined by taking the tangent of Φ. The value of b must then be evaluated on a case-by-case
basis. There are six different cases to consider concerning the slope of the line. These cases are
shown in Figure 2-4(a). The slope can be less than, equal to, or greater than the aspect ratio of the
image, and it can be positive or negative, resulting in 6 total cases. For each of these six cases, there
are 3 possible sets of polygons defined by the rectangle bounding the image and the line in
question, as shown in Figure 2-4(b). Which of these sets the line actually defines can be determined by
evaluating the value of σb for the two lines which pass through the corners of the image with the
given slope, as shown in Figure 2-4(b). If the target value of σb is less than the value of the
minimum corner line, then the line corresponds to case 1 in Figure 2-4(b). If the value is in between
the values at the two corner lines, then the line corresponds to case 2; otherwise, the
Figure 2-3: 8 cases of a line intersecting a rectangle.
Figure 2-4: Transform to slope intercept form.
line corresponds to case 3. Once the configuration of polygons is known, an equation can be
written to equate the area of the polygon(s) below the line to the value of b. To reduce the volume of
code, the number of cases is cut in half by first solving the transform for a positive slope and later
adjusting the intercept value if the slope is indeed negative. While all of this transformation case
evaluation may seem to add undesirable complexity to the code, it is far simpler than the
alternative of trying to represent a circular image in a linear block of memory. These transforms incur
negligible overhead, while any attempt to transform the image into a more convenient form would
require a significant reorganization of memory. For this work, the limits of the (Φ, σb) space are
defined to be Φ = [-π/2, π/2] and σb = [0%, 100%].
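The forward transform can be sanity-checked numerically without enumerating the intersection cases. This sketch approximates the area below the line y = m·x + b in a W × H image (with y measured up from the bottom edge, an assumed convention) by clamping the line to the image and integrating across the width; the thesis's closed-form case analysis would replace this loop in a real-time system:

```python
import math

def line_to_phi_sigma(m, b, W, H, steps=10000):
    """Convert a horizon line y = m*x + b to (Phi, sigma_b) for a
    W x H image, by numeric integration of the area below the line.

    The line height is clamped to [0, H] at each sample so the area
    stays inside the image rectangle, mimicking the polygon cases.
    """
    dx = W / steps
    area = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx                 # midpoint of the i-th strip
        y = min(max(m * x + b, 0.0), H)    # clamp line to the image
        area += y * dx
    sigma_b = area / (W * H)               # fraction of image below line
    phi = math.atan(m)                     # bank angle from slope
    return phi, sigma_b
```

A level line at half height returns (0, 0.5), and a 45° diagonal through the corner of a square image also bisects the area, as expected.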
As mentioned above, the sky and ground regions of each image frame are modeled as a sta-
tistical distribution. While a more complex distribution will yield a better fitting of the data, there
is a trade-off with speed of execution. For this work, the sky and ground regions were chosen to be
represented as Gaussian distributions in RGB space. The choice of RGB space was again for speed
of execution reasons, as RGB data is immediately available from most video capture systems (as
opposed to other color spaces — HSV, etc.). The principle of the new algorithm is to perform a
search through the potential set of lines in (Φ, σb) space to find the line with the highest likelihood
of being the best-fit horizon line. In order to do this, a cost function is required to evaluate the fit-
ness of a given line. Using the Gaussian model, the sky and ground regions are represented as two
sets of points, each distributed about a separate mean point in RGB space. To create a cost func-
tion, a quantitative method is required to measure how well the two sets of points defined by a
given line fit the Gaussian distributions. This is accomplished by evaluating the means and covari-
ance matrices of the two sets of points. The variances from the means of the two distributions
give a good measure of the fitness of the two sets of points. This can be intuitively rationalized by
considering two different possible horizon lines for a given image. Assuming that the means of the
actual sky and ground regions are distinct (a requirement for a detectable horizon — usually even
for humans!), the line that best separates the two regions will have the lowest variance from the
mean. If the line is drawn such that the ground region is contaminated by pixels that are actually
sky (and vice versa), the incorrect points will stray farther from the mean of the region thus
increasing the variance. The incorrect points will also skew the mean slightly — increasing the vari-
ance associated with some of the points that are actually ground as well. In order to produce a
cost function with a scalar value, it is necessary to obtain a single measure of the variance from a
three-dimensional covariance matrix. The three eigenvalues of the covariance matrix represent the
variance of the distribution from the mean along its three principal axes. A natural measurement to
use for the cost function is the product of the eigenvalues of the covariance matrix. This quantity
equates to a measurement of the volume of the variance of the distribution. This function has the
added advantage that it can be calculated very quickly as the determinant of the covariance matrix.
For visualization purposes, we will consider the horizon fitting as a maximization problem and
invert the cost function. Let us call this inverted cost function the fitness function, given by
F = [ |ΣG| + |ΣS| ]⁻¹   (2-1)

ΣG = 1/(ng − 1) · Σ_{i=1..ng} (xi − µG)^T (xi − µG) ,  for all points below the line   (2-2)

ΣS = 1/(ns − 1) · Σ_{i=1..ns} (xi − µS)^T (xi − µS) ,  for all points above the line   (2-3)

µG = (1/ng) · Σ_{i=1..ng} xi ,  for all points below the line   (2-4)

µS = (1/ns) · Σ_{i=1..ns} xi ,  for all points above the line   (2-5)
where µG denotes the mean vector in RGB space for the ground pixels, µS denotes the same mean vector for the sky pixels, ΣG denotes the covariance matrix for the ground pixels, and ΣS denotes the covariance matrix for the sky pixels. A typical fitness function surface is shown in Figure 2-5(a) (for the inset image in Figure 2-5), while Figure 2-5(b) illustrates the pixel distribution of ground pixels (green circles) and sky pixels (blue crosses) in RGB space for the best-fit line. Finding the best-fit horizon line now becomes a task of finding the global maximum on this fitness surface.

Overall, this fitness function performs very well; however, the determinant can become unstable with near-singular matrices. There are some image conditions that can cause this to become a problem. The solution to this problem is discussed in the implementation section of Chapter 3, where we also explain algorithmic speed-ups to reduce the number of computations for the means and covariance matrices for each line.
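As a concrete sketch, the per-region statistics and the determinant-based fitness of (2-1) might look as follows in pure Python. The function names are ours, and a real implementation would use the integer-sum optimizations described in Chapter 3 rather than this direct two-pass form.

```python
def mean_and_cov(pixels):
    """Mean vector and 3x3 covariance matrix of a list of (R, G, B) tuples,
    using the unbiased 1/(n-1) normalization of equations (2-2)/(2-3)."""
    n = len(pixels)
    mu = [sum(p[k] for p in pixels) / n for k in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in pixels:
        d = [p[k] - mu[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j]
    for i in range(3):
        for j in range(3):
            cov[i][j] /= (n - 1)
    return mu, cov

def det3(m):
    """Determinant of a 3x3 matrix (equal to the product of its eigenvalues)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fitness(ground_pixels, sky_pixels):
    """Inverted cost of (2-1): F = 1 / (|Sigma_G| + |Sigma_S|)."""
    _, cov_g = mean_and_cov(ground_pixels)
    _, cov_s = mean_and_cov(sky_pixels)
    return 1.0 / (det3(cov_g) + det3(cov_s))
```

With two tight, distinct clusters the fitness is high; mixing pixels across the line inflates both covariances and drives the fitness down, which is exactly the behavior the search exploits.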
Figure 2-5: Typical fitness surface and corresponding pixel distributions of ground and sky pixels.
CHAPTER 3
REAL TIME IMPLEMENTATION OF THE VISION SYSTEM
3.1 Implementation
With the fitness function now defined, the task is to effectively search the (Φ,σb) space to
find the best line to represent the horizon. For the single peak surface shown in Figure 2-5, an
accelerated gradient based search could quickly find the global maximum on the surface. Unfortu-
nately, the fitness surface is not always so well behaved. Figure 3-1 shows a fitness surface with
multiple peaks. Depending upon initial values, a gradient-based search could end up at one of the
local maxima. It is, however, possible to limit the search space to only those regions where the
Figure 3-1: Fitness surface with local maxima.
horizon line can realistically occur. If a dynamic model of the aircraft performance is available, an
estimate of the aircraft's angular orientation (Φ,Θ) can be calculated prior to the horizon fitting of each frame in the video sequence. The search space can then be limited to a small region around the point in (Φ,σb) space corresponding to the model estimate, thus eliminating a large percentage
of the search area. The confinement of the search space can dramatically reduce the computation
required to find the horizon estimate. This technique can be used in implementations where pro-
cessing power is limited.
For this work, the processing was performed on a ground-based desktop PC. Since a large
amount of processing power is available on this platform, an approach was taken to provide the
most robust performance possible at the cost of increased computation. It must be noted that while
the high-level search algorithm implementation was tailored for a desktop PC, the remainder of
the algorithm was written to execute as quickly as possible with future porting to embedded sys-
tems in mind. The algorithm begins by performing a coarse search over the full range of (Φ,σb)
space. An equally spaced grid of points (Φ,σb) is evaluated across the full search space. The point
found during this search with the highest fitness value is then used as the starting point for an
accelerated search for the maximum. The initial search starts the accelerated search with a coarse
estimate of the global maximum in order to avoid falling into any local maxima in the fitness sur-
face. To reduce the amount of computation required for the initial search, the fitness function is
evaluated after first sub-sampling the image to a lower resolution. The test system used was pow-
ered by a 900 MHz x86 processor running under Mandrake Linux 8.0. The code was written in
C++ and compiled with the GNU compiler suite. On this system, it was possible to run the initial search at a resolution of 80 by 60 pixels with a grid of 36 by 36 points, and a final resolution of 320 by 240 pixels. While this configuration was selected to maximize the use of the processor on the test system, the algorithm has been tested using resolutions for the initial search down to 32 by 24 pixels with a grid of points down to 12 by 12, reducing the amount of computation required by
a factor of 50. While it is difficult to quantitatively evaluate the performance of the algorithm,
viewing the real-time output of the system reveals only a slight reduction in accuracy when using
the reduced resolution and grid. After the initial search is performed on the downsampled image,
an accelerated, refined search is performed using the full resolution image. An example of this
search is shown in Figure 3-2. The search begins at the point in (Φ,σb) space found during the
coarse search. The search is performed iteratively by evaluating the fitness function at the starting
point and at four additional neighboring points spaced at distances ∆Φ and ∆σb. For the first itera-
tion, the values of ∆Φ and ∆σb are set to a value based on the grid spacing of the initial search. The
point found to have the highest fitness (of the 5 points evaluated per iteration) is then set as the
new starting point, and the values of ∆Φ and ∆σb are divided in half. Pseudo code for this search is
shown in Figure 3-3. This search rapidly converges upon the point that defines the line with the
highest fitness value. For this work, the vast majority of the computation time is spent in the initial
search since a large number of grid points are evaluated.
Figure 3-2: Accelerated search example (path through (Φ, σb) space from the starting point to the maximum-fitness point).
Φ = Φinitial          (point found from initial search)
σb = σb,initial
∆Φ = ∆Φinitial        (defines the search space)
∆σb = ∆σb,initial
for i = 1 to max_iterations:
    max = f(Φ, σb);  Φmax = Φ;  σb,max = σb
    test = f(Φ + ∆Φ, σb)
    if (test > max):  max = test;  Φmax = Φ + ∆Φ;  σb,max = σb
    test = f(Φ − ∆Φ, σb)
    if (test > max):  max = test;  Φmax = Φ − ∆Φ;  σb,max = σb
    test = f(Φ, σb + ∆σb)
    if (test > max):  max = test;  Φmax = Φ;  σb,max = σb + ∆σb
    test = f(Φ, σb − ∆σb)
    if (test > max):  max = test;  Φmax = Φ;  σb,max = σb − ∆σb
    Φ = Φmax;  σb = σb,max
    ∆Φ = ∆Φ/2;  ∆σb = ∆σb/2
end loop

Figure 3-3: Accelerated search pseudocode.
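The two-stage search, a coarse grid over the full range followed by the iterative refinement of Figure 3-3, can be sketched as follows. The names are ours, and f stands for any fitness function over (Φ, σb).

```python
def accelerated_search(f, phi0, sigma0, d_phi, d_sigma, max_iterations=10):
    """Iterative refinement from Figure 3-3: evaluate the current point and
    its four axis-aligned neighbours, move to the best of the five, and
    halve both step sizes each iteration."""
    phi, sigma = phi0, sigma0
    for _ in range(max_iterations):
        candidates = [(phi, sigma),
                      (phi + d_phi, sigma), (phi - d_phi, sigma),
                      (phi, sigma + d_sigma), (phi, sigma - d_sigma)]
        phi, sigma = max(candidates, key=lambda p: f(*p))
        d_phi, d_sigma = d_phi / 2.0, d_sigma / 2.0
    return phi, sigma

def coarse_then_fine(f, phi_grid, sigma_grid):
    """Coarse grid search over the full (phi, sigma_b) range, then refine
    from the best grid point with steps equal to the grid spacing."""
    best = max(((p, s) for p in phi_grid for s in sigma_grid),
               key=lambda ps: f(*ps))
    return accelerated_search(f, best[0], best[1],
                              phi_grid[1] - phi_grid[0],
                              sigma_grid[1] - sigma_grid[0])
```

Starting the refinement from the coarse-grid winner is what guards against the local maxima visible in Figure 3-1.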
3.2 Statistics Calculations
3.2.1 Low Level Optimization
The most frequently performed operation in the horizon fitting algorithm is the calculation
of the means and covariances of the two sets of points defined by a given line. The system spends by far the majority of its time (an estimated 85% or more) performing this operation. It is therefore critical
to optimize this calculation for speed. The first observation that can speed up this calculation is the
fact that the covariance matrix will always be symmetric. Since this is the case, it is only necessary
to calculate the six independent values of the 3x3 matrix. Equations (2-1)-(2-5) gave the calculation of the covariance matrix. This calculation can be modified by factoring out the subtractions of the means to allow the calculation to be performed in a single pass through the data. The
resulting equations are shown below.
Gc = Σ_{i=1..ng} xi^T xi ,  for all pixels below the line   (3-1)

Sc = Σ_{i=1..ns} xi^T xi ,  for all pixels above the line   (3-2)

Gm = Σ_{i=1..ng} xi ,  for all pixels below the line   (3-3)

Sm = Σ_{i=1..ns} xi ,  for all pixels above the line   (3-4)

µG = Gm / ng   (3-5)

µS = Sm / ns   (3-6)

ΣG = [ng/(ng − 1)] · ( Gc/ng − µG^T µG )   (3-7)

ΣS = [ns/(ns − 1)] · ( Sc/ns − µS^T µS )   (3-8)

where ng and ns are the number of pixels in the ground and sky regions, respectively.

Since the covariance matrix is symmetric, only 6 of the 9 elements of the covariance sums need to be calculated. For the pass through the data, this equates to 9 additions and 6 multiplications for each pixel in the image. To further increase speed, these sums are kept as 32-bit integer values, so the vast majority of the calculation in the algorithm is performed with integer operations. As image resolution increases, 9 additions and 6 multiplications for each pixel can become unmanageable. Through higher-level optimizations, however, we can eliminate the need to perform this calculation on every pixel in the image.

3.2.2 Higher Level Optimization

Consider two different lines that are close to each other in (Φ,σb) space. The majority of pixels on either side of the two lines will be the same. Looking at (3-1), it is evident that when calculating the covariance matrices for two different lines with only a slight difference in Φ or σb, the majority of the multiplications and additions performed in (3-1) will be the same, since most of the pixels on either side of the line are the same. A significant reduction in computation can be gained by keeping track of the values of Gc and Sc between consecutive evaluations of the fitness function. Once the covariance matrices have been initially calculated for a single line (using all of the pixels in the image), the covariance matrices can be calculated for any subsequent line by performing the calculations using only the pixels that switch sides from one line to the next. Taking advantage of this technique, the amount of calculation can be even further reduced by minimizing the distance in (Φ,σb) space between sequential points to be evaluated. This provides the minimum number of differing pixels between sequential line evaluations. For this reason the initial coarse search of the (Φ,σb) space is performed as shown in Figure 3-4.
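A sketch of the single-pass sums of (3-1) through (3-4), together with the incremental update applied when pixels switch sides between two nearby candidate lines, might look as follows (function names are ours):

```python
def region_sums(pixels):
    """Single-pass accumulation of the channel sums (Gm/Sm) and the six
    independent entries of the outer-product sum (Gc/Sc) for one region,
    as in equations (3-1)-(3-4)."""
    m = [0, 0, 0]
    c = [0, 0, 0, 0, 0, 0]            # rr, rg, rb, gg, gb, bb
    for r, g, b in pixels:
        m[0] += r; m[1] += g; m[2] += b
        c[0] += r * r; c[1] += r * g; c[2] += r * b
        c[3] += g * g; c[4] += g * b; c[5] += b * b
    return m, c

def move_pixels(m_from, c_from, m_to, c_to, pixels):
    """Incrementally update the running sums for the pixels that switch
    sides between two nearby lines, instead of rescanning the image."""
    dm, dc = region_sums(pixels)
    for k in range(3):
        m_from[k] -= dm[k]; m_to[k] += dm[k]
    for k in range(6):
        c_from[k] -= dc[k]; c_to[k] += dc[k]
```

Because the sums are plain integers, moving a pixel's contribution from one region to the other is exact, and the means and covariances of (3-5) through (3-8) can be rebuilt from the updated sums at any time.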
3.3 Algorithm Performance
The performance of the new algorithm is far more robust than that of the previous algorithm.
Figures 3-5 through 3-7 show the results of the new algorithm on a number of example video
frames. Figure 3-5 shows the results using the same image that was shown to fail using the earlier
algorithm. Here, the algorithm is able to find the correct horizon line. The correct horizon appears
as a clear peak on the fitness surface. Figure 3-6 illustrates the very robust nature of the algorithm.
Even though the video signal in this image is badly corrupted by video transmission noise, the
algorithm is still able to find the correct horizon line. Figure 3-7 shows an example of the algorithm's performance on an image in which the horizon is not entirely linear. The algorithm is still
able to find a line which best approximates the actual horizon. There are a few types of images,
however, that will cause problems using the current fitness function. These failures are due to the
use of the determinant in the fitness function. The key to understanding the reason for the failure
Figure 3-4: Coarse search pattern minimizing distance between evaluations.
Figure 3-5: Results of new algorithm on an image that caused the previous algorithm to fail.
can be found in the graphs of the pixel distributions. Nearly all of the pixels in these types of
images are distributed very closely to the gray axis (R=G=B). This results in distributions with a
large variance along the principal axis, but very small variances along the other two axes. The resulting covariance matrix has one large eigenvalue and two that approach zero. For the extreme case of black and white images, two of the eigenvalues will be exactly equal to zero. This is the reason for the failure of the algorithm to resolve the horizon line. Although there may be a large variance along the principal axis of the distributions for many lines being evaluated, the value of
the determinant will be driven to zero, since it is equal to the product of the eigenvalues.

Figure 3-6: Results of new algorithm on an image that is badly corrupted by video transmission noise.

The solution to this problem is to add another term to the fitness function. While the determinant is the
proper function to use when enough color information is available, a different measure is required
when the covariance matrix is nearly singular. When this is the case, we want to use the variances
contained in the non-zero eigenvalues of the covariance matrix. This is done by adding a term consisting of the square of the sum of the eigenvalues. This term will not become zero when only one or two of the eigenvalues are driven to zero. While the determinant term will dominate the fitness function when enough color information is available, the squared sum of the eigenvalues term will provide a measure of the variance even when the covariance matrix is nearly singular. The new fitness
function is shown in (3-9).
F = [ |ΣG| + |ΣS| + (λG1 + λG2 + λG3)² + (λS1 + λS2 + λS3)² ]⁻¹   (3-9)

This fitness function also works on the degenerate case of black and white images. Our tests show, however, that the use of color information is critical to the correct identification of the horizon line for many images. Without the use of color there is often simply not enough information available to avoid errors due to ground objects. While humans are still able to correctly process these images, we believe that it is accomplished through the use of both texture information and higher-order common-sense mechanisms.

Figure 3-7: Algorithm performance on a non-linear horizon.

Most of the images causing nearly singular covariance matrices can be attributed to the camera used to collect the data. The camera used in the test
setup was necessarily very small, as the payload capacity of MAVs is minimal. The camera
is a single chip CMOS type exhibiting poor color characteristics. With higher quality camera
images, these types of nearly singular frames occur very infrequently, but can still be processed
correctly using the new fitness function. The other source of these types of frames is distortion due
to the transmission of the video signal. For this work, all processing was done on the ground,
requiring a video transmission downlink from the MAV. Color information can occasionally be
lost through the transmission as well, so it is important for these types of frames to be handled
properly.
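Because the sum of a matrix's eigenvalues equals its trace, the added term in (3-9) needs no eigendecomposition. A minimal sketch of the augmented fitness (function names are ours):

```python
def robust_fitness(cov_g, cov_s):
    """Fitness of equation (3-9): the determinant term dominates when colour
    information is plentiful, while the squared eigenvalue-sum term (equal
    to the squared trace) keeps F finite when a covariance matrix is
    near-singular, e.g. for grayscale-like frames."""
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    def tr(m):
        # Trace = sum of eigenvalues, so tr(m)**2 = (lambda1+lambda2+lambda3)**2.
        return m[0][0] + m[1][1] + m[2][2]
    return 1.0 / (det3(cov_g) + det3(cov_s) + tr(cov_g) ** 2 + tr(cov_s) ** 2)
```

For a rank-one covariance (all pixels on the gray axis) the determinants vanish, yet the trace terms keep the fitness well-defined.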
CHAPTER 4
CONTROL ALGORITHMS AND FLIGHT TESTING
4.1 Extreme Attitude Detection
Even given a perfect horizon fitting algorithm, a control system will invariably encounter
the problem of what to do when the camera is not actually looking at the horizon. If the aircraft is
heading straight towards the ground, there will be no horizon visible in the camera image; however, the control system will certainly be required to take action to correct the situation. It is desirable, then, to be able to detect when the horizon is not in view of the camera and, if so, to determine what
action to take in order to bring the horizon back into view. Humans are able to perform this task
very easily. Getting this to happen on a sequential computer, however, is once again a considerable
challenge. Humans are able to do this by using two types of information. First, they know what the
sky and ground look like. Humans use their memory of the color and texture properties of the
sky and ground (along with higher order reasoning - e.g. clouds are not on the ground) to deter-
mine whether they are looking at all sky or all ground. The other piece of information that a human
pilot uses is the past history of the location of the horizon line. If the horizon line was recently near
the top of the image, it is logical that a subsequent image without a horizon line is probably a view
of the ground. We can use these same two types of information to try to quantitatively determine if
the horizon line exists in the image and if not, to determine whether we are looking at the sky or
the ground. We can also use this same process to check the validity of the horizon fitting on images
that do contain a horizon. This is a valuable check that can eliminate any frame errors that may
occur. This is especially useful when performing ground based processing, since video transmis-
sion errors can cause transient noise in the video signal.
To give the computer algorithm some knowledge of what the sky and ground look like, we
can take advantage of the statistics that are generated during the horizon fitting algorithm. After
the successful horizon fitting of an image with a real horizon present, the algorithm has generated
a Gaussian distribution model for both the sky and the ground of that image. The general proce-
dure for the error detection algorithm is to keep a running statistical model for both the sky and the
ground — updating the models with each known good frame. With each new frame, the result of the
horizon fitting can be checked by comparing the sky and ground models for the current frame with
the known statistical models for sky and ground. If the distributions on either side of the line in the
current frame both appear to be more similar to the known ground distribution, then it would
appear that the aircraft is pointing towards the ground. Conversely, if they both match the sky bet-
ter, then it is advisable to nose downward. Interestingly, if the sky in the current frame matches the
ground model while the ground in the current frame matches better with the sky model, we can
potentially conclude that we are flying upside down. This method does require a bootstrapping
technique to ensure that the models for sky and ground are initially valid. Upon startup, the camera
must be looking at a valid horizon for a set number of frames.
While these detection techniques can tell us when something appears to be wrong, they are
really only useful in the context of the recent history of the horizon line. For example, if the hori-
zon line has been close to the bottom of the image (nose-up aircraft attitude) for the last few
frames, and the statistical models for the very next frame indicate that the image is all ground, it is
more likely that an error has occurred in either the video transmission or the image processing
since the aircraft dynamics render such a change in pitch attitude impossible. We can use this strat-
egy to determine when to take action to return the horizon back into view. If the horizon line has
been close to the top of the image in recent frames when an all-ground detection is made, then it is
advisable to take action to pull the nose of the aircraft up. When a model matching error occurs
that does not make sense in the context of the recent horizon line history, we can assume that an
error has occurred and use this error detection to throw out the bad measurement.
4.1.1 Implementation
In order to implement the extreme attitude detection scheme described above, mathematical representations of both the horizon line history and the sky and ground models are required. For
the horizon line history, we are really only concerned with the history of the recent values of σb. A
running average of the measurement of σb over a set number of frames is kept for use as the esti-
mate of the horizon line history. By trial and error, a value of 10 frames was determined to be an
appropriate number. This running average will be referred to as σavg. The models for sky and
ground are kept as separate sets of means and covariance matrices. Upon startup of the system, the
camera is assumed to be pointed such that the horizon is in its view. When the first frame of video
is processed by the system, the means and covariance matrices of the ground and sky models are
set equal to those found by the horizon fitting algorithm. The system then begins to update the
models using the results of the horizon fitting algorithm for a set number of initialization frames.
For the current system, 100 initialization frames were used. It is necessary to continually update
the sky and ground models as the aircraft flies to account for changes in lighting associated with
orientation and changes in landscape, etc. The model updates are performed using the following
equations.
ΣGm = α·ΣGm + (1 − α)·ΣG   (4-1)

ΣSm = α·ΣSm + (1 − α)·ΣS   (4-2)

µGm = α·µGm + (1 − α)·µG   (4-3)

µSm = α·µSm + (1 − α)·µS   (4-4)
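The update equations (4-1) through (4-4) are an exponentially weighted moving average. A minimal sketch follows; the default α = 0.95 is an assumed value, not one quoted in the text, and the dictionary layout is ours.

```python
def update_models(model, frame, alpha=0.95):
    """Exponentially weighted update of equations (4-1)-(4-4).  `model` and
    `frame` map region name -> (mu, cov), where mu is a 3-vector and cov a
    3x3 matrix; alpha controls how quickly the running models track the
    statistics of the current frame."""
    updated = {}
    for region in ("sky", "ground"):
        mu_m, cov_m = model[region]
        mu_f, cov_f = frame[region]
        mu_new = [alpha * a + (1 - alpha) * b for a, b in zip(mu_m, mu_f)]
        cov_new = [[alpha * cov_m[i][j] + (1 - alpha) * cov_f[i][j]
                    for j in range(3)] for i in range(3)]
        updated[region] = (mu_new, cov_new)
    return updated
```

Applying the update only on frames the validity test accepts, as the text prescribes, prevents corrupted frames from polluting the models.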
where ΣGm, ΣSm, µGm, and µSm are the model covariance matrices and means, respectively, while ΣG, ΣS, µG, and µS are the covariance matrices and means for the current frame. The constant α controls how rapidly the models change. After an initial model is established, the system can perform its test to assess the validity of the horizon in the current frame. We
now have available the sky and ground model distributions along with the distributions of the
regions chosen by the horizon fitting algorithm for the current frame. These four distributions are
used as input to the valid horizon test. It is desired to determine which of the known sky or ground
models more closely resembles the distributions in the two regions of the current frame. A quantitative way to determine the similarity of two Gaussian distributions is required. This similarity is measured by calculating the Mahalanobis distance of each mean from the other distribution. The sum of these distances is a good measure of similarity. This measure is
effectively a distance between the means scaled to reflect the variances of the distributions. The
smaller the value, the more similar are the two distributions. These similarity values are calculated
using the following equations:
D1 = (µS − µSm)^T ΣSm⁻¹ (µS − µSm) + (µSm − µS)^T ΣS⁻¹ (µSm − µS)   (4-5)

D2 = (µS − µGm)^T ΣGm⁻¹ (µS − µGm) + (µGm − µS)^T ΣS⁻¹ (µGm − µS)   (4-6)

D3 = (µG − µSm)^T ΣSm⁻¹ (µG − µSm) + (µSm − µG)^T ΣG⁻¹ (µSm − µG)   (4-7)

D4 = (µG − µGm)^T ΣGm⁻¹ (µG − µGm) + (µGm − µG)^T ΣG⁻¹ (µGm − µG)   (4-8)

The value of D1 is the measure of similarity between the region selected as the sky by the horizon fitting algorithm in the current frame and the known sky model. D2 is the similarity between this possible sky region and the known ground model. Likewise, the values of D3 and D4 are the similarity measures for the region selected as ground by the horizon fitting algorithm to the known sky and ground models, respectively. Once these values have been calculated, conclusions
can be drawn about the current frame. For a normal frame where the horizon fitting algorithm cor-
rectly identifies the horizon line, the value of D2 will be greater than the value of D1 and the value
of D3 will be greater than the value of D4. The set of possible outcomes for this comparison are
summarized as follows:
Case 1: D1 < D2 and D3 > D4 : Normal
Case 2: D1 > D2 and D3 > D4 : All ground
Case 3: D1 < D2 and D3 < D4 : All sky
Case 4: D1 > D2 and D3 < D4 : Upside down?
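The outcome table and the history-based decision rule can be sketched as follows. The threshold values 0.8 and 0.2 are those quoted in the text; the function names and return strings are ours.

```python
def classify_frame(d1, d2, d3, d4):
    """Decision table for the validity test.  d1..d4 are the similarity
    measures of equations (4-5)-(4-8); smaller means more similar."""
    if d1 < d2 and d3 > d4:
        return "normal"
    if d1 > d2 and d3 > d4:
        return "all ground"
    if d1 < d2 and d3 < d4:
        return "all sky"
    return "upside down?"

def act(case, sigma_avg, pull_up_thresh=0.8, nose_down_thresh=0.2):
    """Combine the validity test with the horizon-line history sigma_avg
    to pick an action.  Non-normal cases that contradict the recent history
    are treated as frame errors and rejected."""
    if case == "normal":
        return "normal control"
    if case == "all ground" and sigma_avg > pull_up_thresh:
        return "pull up"
    if case == "all sky" and sigma_avg < nose_down_thresh:
        return "nose down"
    return "reject frame"
```

The rejection branch implements the rule described in the text: a sudden all-ground reading after a nose-up history is more plausibly a transmission or processing error than a real attitude change.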
The results of this test can now be combined with the past history of the horizon line to
decide which action to take. If the frame is determined to be normal by the validity test, then the
measurements of Φ and σb are considered to be accurate, and commands sent to the airframe are
determined by the normal control system loop described in a following section. The distributions
generated by the validated frame are used to update the sky and ground models. The models are
only updated if the validity test returns normal. If the validity test returns a higher likelihood of all
ground, the past history of the horizon line is consulted to determine which action to take. If the
value of σavg is above a set threshold, then the system goes into a pull-up mode that sends com-
mands to the aircraft to rapidly increase its pitch angle Θ. A value of 0.8 was used for this threshold. While the system is in pull-up mode, the statistical models are not updated, since the
horizon lines will be incorrect (probably ground on both sides). While the system is in pull-up mode, σavg is updated using the measured value of σb only if the validity test proves the current frame to have a normal horizon; otherwise, σavg is updated using a value of 1.01. The system will stay in pull-up mode until a valid horizon is detected. Similarly, if the validity test returns a higher
likelihood of all sky and the value of σavg is below a given threshold value, the system goes into a
nose down mode. The value used for this threshold was 0.2. If the validity test returns any of the
non-normal cases and the value of σavg is between the two thresholds, this frame is considered to
be an error. In this case, the measurements of Φ and σb from the previous frame are used as the
estimates for the current frame. This detection system performs well. It is again difficult to quantitatively assess the accuracy of the system on real-time data, since there is no correct answer to compare against. Qualitative viewing of the output, however, along with successful flight tests, indicates that the system works well. Initial tests of the system were performed by playing previously recorded video of human-piloted MAV flights into the system. Figure 4-1 shows a sample
of recorded output data from the system in which the video signal is corrupted by noise. The graph
of the estimates of the pitch angle is overlaid onto the graph of the error detection trigger. When
an error is detected, the error detection trigger has a value of 1. It can be seen from the figure that
the algorithm uses the last known good estimate during the video transmission error instead of
using the results from the bad frames. Figure 4-2 shows an example of extreme attitude detection.
As soon as the horizon disappears off the top of the image, the error detection trigger value jumps
Figure 4-1: Pitch angle estimates during video transmission noise.
to 1. The controller goes into pull-up mode until the horizon reappears near the top of the image.
The trigger value remains at 1 until the controller exits pull-up mode.
4.2 Kalman Filtering
In order to improve the angle estimates for use in the control system, the angle estimate val-
ues (after being processed by the extreme attitude detector) are passed through a Kalman filter. A
Kalman filter provides an optimal estimate given a dynamic system model and a time series of
measurements. A Kalman filter can perform well even without an accurate dynamic model. In our
case, the Kalman filter has the effect of removing high-frequency noise from the system measurements and of eliminating any radical single-frame errors not caught by the error detection system.
The main purpose for the Kalman filter in these tests, however, is to eliminate unnecessary small
control surface deflections due to noise. A dynamic model for our MAV airframes is currently
being developed at NASA Langley Research Center. Since the inputs to the system are known, the
Figure 4-2: Extreme attitude detection.
model can be used with the Kalman filter to provide an estimate of the airframe orientation with
excellent accuracy. This model was not available in time to incorporate it into this work. The Kal-
man filter provides adequate results, however, using a simple constant velocity model. Using this
model, the filter has the effect of reducing the unwanted noise while eliminating any large jumps
in the measurements. With the code infrastructure in place, the dynamic model can be added when
it is available to improve the accuracy of the angle estimates.
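A one-dimensional constant-velocity Kalman filter of the kind described can be sketched as follows. The process and measurement noise values q and r are assumed, not taken from the text, and the function name is ours.

```python
def kalman_cv(measurements, dt=1.0 / 30.0, q=0.01, r=0.05):
    """Constant-velocity Kalman filter for smoothing one angle track.
    State is [angle, angular rate]; only the angle is measured (at the
    30 Hz video frame rate).  Returns the filtered angle estimates."""
    x = [measurements[0], 0.0]                 # state estimate
    p = [[1.0, 0.0], [0.0, 1.0]]               # covariance estimate
    out = []
    for z in measurements:
        # Predict: angle advances at the current rate estimate.
        x = [x[0] + dt * x[1], x[1]]
        p = [[p[0][0] + dt * (p[1][0] + p[0][1]) + dt * dt * p[1][1] + q,
              p[0][1] + dt * p[1][1]],
             [p[1][0] + dt * p[1][1],
              p[1][1] + q]]
        # Update with the measured angle (H = [1, 0]).
        s = p[0][0] + r
        k = [p[0][0] / s, p[1][0] / s]         # Kalman gain
        y = z - x[0]                            # innovation
        x = [x[0] + k[0] * y, x[1] + k[1] * y]
        p = [[(1 - k[0]) * p[0][0], (1 - k[0]) * p[0][1]],
             [p[1][0] - k[1] * p[0][0], p[1][1] - k[1] * p[0][1]]]
        out.append(x[0])
    return out
```

With a richer airframe model in place of the constant-velocity assumption, the same predict/update structure yields the improved orientation estimates anticipated in the text.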
4.3 Feedback Control
A very simple control strategy was implemented for flight-test validation of the system. A con-
trol system was desired to take a set of angles (Φ,Θ) as inputs to command the actual orientation of
the aircraft. For simplicity, the bank and pitch angles are treated as independent systems. For both
angles, a simple control loop is implemented using the measured angle in the feedback loop. Each control loop contains both a proportional and a differential term with associated gains. These
control loops are updated with every video frame (30 Hz). The system was tested initially with the
differential gains set to zero. The resulting system was inherently stable and adequate for limited
autonomous flight applications. The differential gains were then introduced to improve perfor-
mance. Without beginning a full discussion of control dynamics, the differential gains allow for an
increase in the proportional gains without introducing instability. As with most dynamic systems,
an increase in proportional feedback gain can lead to undamped oscillations in both the pitch and
bank angles due to the dynamics of the system. The differential terms in the feedback loops act to
damp the oscillations. This allows for a higher value for the proportional gain, leading to improved
response time without unstable dynamics.
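The per-axis proportional-plus-differential update described above, run once per 30 Hz frame for bank and pitch independently, can be sketched as (gains and names are ours):

```python
def pd_step(command, measured, prev_error, kp, kd, dt=1.0 / 30.0):
    """One update of a proportional-differential feedback loop for a single
    axis: deflection = kp * error + kd * d(error)/dt.  Returns the control
    surface deflection and the current error (to pass into the next call)."""
    error = command - measured
    deflection = kp * error + kd * (error - prev_error) / dt
    return deflection, error
```

Called once per frame for each axis, the kd term damps the oscillations that a higher kp would otherwise excite, which matches the tuning behavior reported in the text.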
4.4 Flight Tests
Figure 4-3 shows a pictorial diagram of the flight test setup. The video signal is transmitted
from the airborne MAV to the ground based PC where the processing is performed. The PC then
sends control surface commands to the MAV through a custom designed interface. This interface
allows the PC to control a standard Futaba radio transmitter through an RS-232 serial port. The
MAV used for test flights had a wingspan of 18 inches. This platform was selected for use in initial
testing both for its increased dynamic time constants and its ability to carry a high powered video
transmitter. Figure 4-4 shows a picture of this MAV. The on-board camera is a monolithic CMOS
type camera with a 1/3 inch sensor area. The camera is connected to an 80 mW video transmitter.
The MAV is powered by electric propulsion and has differential elevons for control. The software
is written to support both elevon and rudder-elevator control designs. The PC interface and trans-
mitter are shown in Figure 4-5. This interface uses a PIC microcontroller to translate serial com-
mands from the PC into the pulse width modulated signals required for input to the transmitter. A
carbon fiber housing was constructed to hold the circuit board and port connectors for the inter-
face. Prior to launch, the aircraft was held in an orientation in which the horizon was in view of the
camera. This allows the software to build its initial models of the sky and the ground. Flights were
controlled by a human pilot from launch until the MAV could reach a sufficient altitude. At this
point, control was transferred to the automated system. The radio transmitter is equipped with an
Figure 4-3: Experimental setup for video-based flight control.
video signal
video
servo control (radio link)
desired heading
video signal
vision-based control
MAV
antenna
39
override button to allow the human pilot to regain control at any time if necessary. A joystick connected to the PC was used as the input to the control loops; the joystick input effectively commands a bank and pitch angle for the aircraft to follow. Later flights used a pre-programmed set of maneuvers for truly autonomous flight. Controller gains were set by trial and error over a series of test flights. Uninterrupted autonomous flights of over 8 minutes were performed, ending only due to video transmission interference. A sample of data recorded from a typical flight is shown in Figure 4-6 and Figure 4-7.
After control had been transferred to the system, a user who had never piloted any type of aircraft was able to easily guide the MAV around the field. Qualitatively, the control system provides far better stability than a human pilot in both steady, level flight and coordinated turns. More complex control schemes should further improve upon this performance.
Figure 4-4: MAV used for test flights.
Figure 4-5: Radio transmitter and serial interface.
Figure 4-6: Bank angle flight data: commanded (red), measured (blue).
Figure 4-7: Pitch estimate flight data: commanded (red), measured (green).
CHAPTER 5
FUTURE WORK
5.1 Stability Subsystem
The systems developed in this work have taken only the first steps toward a user-friendly MAV platform capable of fully autonomous utility missions. The stability subsystem developed here meets all of the goals set in its initial design. Its performance, however, can be improved with further development. First, the inclusion of an accurate dynamic model in the existing control framework will improve the accuracy of the angular estimates. An improved dynamic model can also allow a further reduction of the search space required to find the best-fit horizon line, with a corresponding reduction in computation. Future work will also bring the migration of the system from the desktop PC platform to embedded processor technology, placing the entire system on board the aircraft. Embedded microprocessors with the required processing power and form factor are expected to be available within two years, and it is likely feasible to implement a reduced-resolution version of the vision-based stability system on embedded technology available today. Another possibility is an embedded multiprocessor system, since the image processing code is easily parallelizable. Moving to an on-board system has advantages for some MAV applications. First, it eliminates the errors associated with a video transmission link. In addition, it eliminates the need for any radio emissions from the MAV itself during autonomous missions, a great advantage when detection is a key issue. For these types of missions, I envision fully autonomous flights without any radio emissions from the MAV until it reaches its target destination, at which time transmission of
the video signal can begin. The use of on-board processing also eliminates the possibility of MAV missions being disrupted by radio-frequency jamming.
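One way to realize the search-space reduction mentioned above: if a dynamic model predicts the next bank and pitch values, the horizon-line search can be confined to a small window around the prediction instead of sweeping the full parameter space. The window half-widths and grid resolution below are illustrative assumptions, not values from the thesis.

```python
# Hedged sketch: restricting the horizon-line search to a window around
# a model-predicted (bank, pitch) estimate. Half-widths and step counts
# are illustrative assumptions.

def search_window(pred_bank, pred_pitch,
                  bank_halfwidth=0.2, pitch_halfwidth=0.1, steps=11):
    """Yield candidate (bank, pitch) pairs near the predicted state."""
    for i in range(steps):
        bank = pred_bank - bank_halfwidth + 2 * bank_halfwidth * i / (steps - 1)
        for j in range(steps):
            pitch = pred_pitch - pitch_halfwidth + 2 * pitch_halfwidth * j / (steps - 1)
            yield bank, pitch

# 11 x 11 = 121 candidates to score, instead of a full global search.
candidates = list(search_window(pred_bank=0.1, pred_pitch=0.4))
```

Each candidate would then be scored with the same best-fit criterion used by the full search; only the number of evaluations changes.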
The performance of the system can also be increased with the use of a model-based control algorithm. The control strategy used in this work was a very simple proportional-differential feedback loop. A model-based controller using state feedback from the vision subsystem should be implemented once a particular MAV platform is settled upon. This type of controller can be tailored to meet mission performance requirements.
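The state-feedback idea can be sketched on a toy model: with the full state available, the control becomes u = -Kx rather than feedback on a single angle error. The second-order roll model and the gain values below are illustrative assumptions and do not come from the thesis.

```python
# Hedged sketch of full-state feedback (u = -K x) on a toy roll model;
# the dynamics and gains are illustrative assumptions.
DT = 1.0 / 30.0  # update once per video frame

def step(x, u):
    """Euler-discretized toy roll dynamics: state x = [bank, bank_rate]."""
    bank, rate = x
    return [bank + DT * rate, rate + DT * u]

K = [4.0, 1.5]  # hand-chosen state-feedback gains (hypothetical)

x = [0.5, 0.0]  # start banked 0.5 rad
for _ in range(300):  # 10 seconds of simulated flight
    u = -(K[0] * x[0] + K[1] * x[1])
    x = step(x, u)
# The feedback drives the bank angle toward zero.
```

In a real design, K would come from the identified MAV model (e.g. via pole placement or LQR) rather than being hand-chosen, which is exactly the tailoring the text refers to.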
5.2 Vision Algorithms
The possibilities for the further use of vision processing for MAVs are nearly endless. Future work will build upon the stability subsystem developed here to provide a wide range of behaviors and capabilities. The next phase of work will begin with the development of an object detection and tracking system. Many higher-level capabilities can be derived from an object tracking engine. One of the primary capabilities to be implemented will be automated surveillance of a given target, which involves following an object and keeping it within view of the camera image. An even higher-level behavior can be developed to actively search a given area for objects with a known vision profile (color model, texture model, etc.) and, once they are found, begin automated surveillance. An object tracking engine can also be used to derive a system providing landmark navigation and possibly heading and yaw rate information. These abilities have the advantage that they can operate in situations where Global Positioning System (GPS) signals are compromised or unavailable. A third behavior made possible by an object tracking system is automated targeted landing. Used in conjunction with the stability subsystem, an algorithm could be developed to visually find a target landing area and perform a safe, automated landing. This behavior would provide a very user-friendly vehicle for automated launch and return missions.
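A minimal sketch of the color-model matching such a search behavior might use: each pixel is scored against a Gaussian color model of the target and thresholded. The model parameters, threshold, and test image are all illustrative assumptions.

```python
# Hedged sketch: scoring pixels against a Gaussian color model of a
# known target. Model means/variances and the threshold are illustrative.

def pixel_score(rgb, mean, var):
    """Negative squared Mahalanobis distance (diagonal covariance)."""
    return -sum((c - m) ** 2 / v for c, m, v in zip(rgb, mean, var))

def detect(image, mean, var, threshold=-6.0):
    """Return (row, col) of pixels matching the target color model."""
    hits = []
    for r, row in enumerate(image):
        for c, rgb in enumerate(row):
            if pixel_score(rgb, mean, var) > threshold:
                hits.append((r, c))
    return hits

# Tiny 2x2 test image: two reddish pixels, two others.
image = [[(200, 30, 30), (10, 10, 10)],
         [(30, 200, 30), (190, 40, 35)]]
target_mean, target_var = (195, 35, 32), (400.0, 400.0, 400.0)
hits = detect(image, target_mean, target_var)
```

A real profile would likely combine color with the texture models mentioned above and operate on connected regions rather than isolated pixels; this sketch shows only the per-pixel scoring step.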
The vision-based control strategy developed in this work holds great potential for use in future MAV designs. To our knowledge, the test flights performed in this work were the first of their kind, with the aircraft controlled autonomously solely through computer vision.
BIOGRAPHICAL SKETCH
Scott Ettinger was born in Gainesville, Florida, in 1972, but from the age of two grew up in
Stuart, FL. After a brief college experience at the University of Virginia in 1989-90, Scott entered
the workforce as a computer programmer and network engineer among other colorful jobs. In
1995 he returned to Gainesville to pursue his academic career. After transferring from community
college as a junior, Scott earned a Bachelor of Science degree in electrical engineering from the
University of Florida in 1999. He has since worked as a research assistant in the Aerospace Engi-
neering, Mechanics, and Engineering Science Department while earning a Master of Science
degree in electrical engineering.