FPGA-Embedded Driver Assistance System
Based On Computer Vision
Ricardo Acevedo-Avila, Miguel Gonzalez-Mendoza, Andres-David Garcia-Garcia
Department of Electrical and Electronic Engineering
Instituto Tecnológico y de Estudios Superiores de Monterrey, Campus Estado de México.
Atizapán de Zaragoza, Estado de México, México
richacevedo@hotmail.com; {mgonza, garcia.andres}@itesm.mx
Abstract — In this paper we present the design of a Driver Assistance System based on computer vision, whose main purpose is to assist the driver by providing vital information on the traffic environment and the vehicle's actions. The design targets an embedded computing platform: due to its physical limitations in memory and computing power, existing image processing algorithms developed for PC platforms are not suitable for this application. Instead, simple and efficient image processing algorithms must be developed to fit our embedded architecture. The Driver Assistance System provides three main functionalities: Lane Detection, Obstacle Detection and Lane-Change Detection. The core algorithms have been developed and simplified using the Matlab Design Environment and then described as custom hardware components to be implemented on FPGA hardware.
Keywords — Intelligent Systems, Embedded Driver Assistance System, Image Processing, Computer Vision, Lane Detection, Lane-Change Detection, Obstacle Detection, Perspective Transformation, FPGA Hardware Development.
I. INTRODUCTION
Road traffic accidents are a serious socio-economic problem and one of the top ten causes of death [1]. The potential human and economic implications are large and cause continuous government and industry spending. Research in vehicle safety systems is an essential component of the solution to this problem.
Computer vision-based Driver Assistance Systems are technology designed and developed to improve traffic safety using the existing road infrastructure and computing platforms, such as Personal Computers, to run many functionalities (road recognition, lane and vehicle detection, tracking, etc.) aimed at safeguarding the occupants of the vehicle.
The main objective of this paper is to demonstrate how to implement the functionalities of a full computer-based Driver Assistance System on FPGA hardware. An Altera Cyclone II EP2C35F672C6N FPGA [2] is used in this work, mainly because of its low cost, board-integrated components and design environment. Due to the complexity of image processing algorithms, it is necessary to use simplified models restricted to the resources offered by the development board.
As mentioned before, the proposed system is composed of three main image processing-based algorithms or modules:
1) Lane Detection. 2) Lane-Change Detection. 3) Obstacle Detection.
Lane detection is an important functionality of any Driver Assistance System: it is used to find lane and road boundaries in given images, and to detect lane-change events that may occur without the driver noticing. Nissan has recently developed a lane departure warning system based on this technology [3].
Lane detection is especially difficult in urban environments due to parked and moving vehicles, people, trees, buildings, shadows and other noise sources in the scene [4]. We present a fast and efficient approach to lane detection restricted to highway environments; hardware-based image filters and algorithms are designed to take advantage of this simplified environment.
Obstacle Detection is also a vital functionality for vehicle safety. To perform obstacle detection, snapshots of a video stream can be taken and used as input to a computer, which ultimately performs the actual detection [4]. Our main concern is that obstacle detection can often be quite slow for some real-time applications, so we need a simple yet effective method of road detection.
ROSSUM 2011, June 27-28, 2011, Xalapa, Ver., Mexico
All the different modules of this system are designed as custom hardware components. The source image is smoothed out by a chain of filters, including operations of dilation-erosion, averaging and thresholding.
The core component of the system is a "perspective distortion corrector"; its main objective is to compute a top-view image of the actual scene, based on Inverse Perspective Mapping (IPM) [5] theory. By running a simple, embedded software algorithm on this corrected image, we can detect lane changes and obstacles.
The proposed solution forms part of a first, experimental, iteration of the system. To investigate the viability of this design, experimental tests have been carried out under real scenarios.
Optimization is still needed, but the basic functionality of the core modules has been tested to work.
II. SYSTEM DESIGN
A. Lane detection and Road Model Simplification.
The Hough transform is a commonly used method for detecting lines and circles [6], frequently applied in lane detection [7] scenarios due to its robustness to light variations and image noise. However, its complexity [8] demands significant computing power, well beyond the restrictions of this application's hardware platform.
It is possible to simplify the road model if we remove the perspective distortion introduced by the camera using Inverse Perspective Mapping. Figure 1 shows the image distorted by the camera lens, along with the true image that can be obtained if a perspective correction method is applied. It is important to note that this operation is only a pixel mapping between two image planes (the image plane and the world plane).
The method used for perspective distortion correction requires 4 non-collinear points to create the mapping between the two planes. The 4 points correspond to each of the 4 corners of the trapezoid formed on the image plane and their equivalent positions on the world plane. These points generate a set of 8 equations [9]. The equations can be solved offline, and a pair of matrices containing the world plane pixels can be obtained.
The perspective distortion can be modeled by the projective transformation written in equation (1), where $\mathbf{x}$ and $\mathbf{x}'$ are homogeneous 3-vectors representing a point in the image plane and the world plane respectively, and $H$ is a homogeneous, non-singular 3x3 matrix:

$\mathbf{x}' = H\mathbf{x}$  (1)

With the 4 point correspondences discussed above, it is possible to extract 8 equations to generate matrix $H$. Fixing $h_{33} = 1$, each correspondence $(x_i, y_i) \leftrightarrow (x_i', y_i')$ contributes the two equations shown in (2):

$x_i'(h_{31}x_i + h_{32}y_i + 1) = h_{11}x_i + h_{12}y_i + h_{13}$
$y_i'(h_{31}x_i + h_{32}y_i + 1) = h_{21}x_i + h_{22}y_i + h_{23}$, for $i = 1, \ldots, 4$  (2)

It is possible to solve this set of linear equations using any mathematical software.
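Since the 8 equations are solved offline, any mathematical package will do. The following Python sketch (an illustration, not the paper's Matlab code) builds and solves the 8x8 system using the four image-plane corners from Figure 2; the world-plane target coordinates are hypothetical values chosen to form a rectangle.

```python
# Offline computation of the homography H from 4 point correspondences,
# fixing h33 = 1 so only 8 unknowns remain (pure-Python Gaussian elimination).
# The image-plane points are taken from Figure 2; the world-plane targets
# below are illustrative values, not the ones used in the paper.

def solve_homography(src, dst):
    """src, dst: lists of four (x, y) pairs. Returns H as a 3x3 nested list."""
    # Each correspondence (x, y) -> (u, v) yields two linear equations:
    #   u = h11*x + h12*y + h13 - h31*x*u - h32*y*u
    #   v = h21*x + h22*y + h23 - h31*x*v - h32*y*v
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * u, -y * u]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -x * v, -y * v]); b.append(v)
    # Gaussian elimination with partial pivoting on the 8x8 system A h = b.
    n = 8
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    h = [0.0] * n
    for r in range(n - 1, -1, -1):
        h[r] = (M[r][n] - sum(M[r][c] * h[c] for c in range(r + 1, n))) / M[r][r]
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_h(H, x, y):
    """Map an image-plane pixel through H and de-homogenize."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

# Image-plane trapezoid corners from Figure 2, and hypothetical
# world-plane (rectified) targets forming a rectangle.
src = [(244, 194), (384, 194), (17, 344), (617, 344)]
dst = [(100, 0), (540, 0), (100, 480), (540, 480)]
H = solve_homography(src, dst)
```

Mapping each image-plane corner through the resulting H reproduces its world-plane target, which is exactly the per-pixel table the hardware stores in ROM.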
Figure 1. Distortion of the world plane and true world plane (left: image plane; right: world plane).

Figure 2. (a) Camera image and (b) rectified image. Notice the four point correspondences on each image plane (Point 1: x = 244, y = 194; Point 2: x = 384, y = 194; Point 3: x = 17, y = 344; Point 4: x = 617, y = 344).
B. FPGA Implementation
Once matrix H is obtained, the world plane position for any image plane pixel can be computed, and this data can be stored in Read-Only Memory. In our proposed FPGA architecture, each pixel composing the distorted image is thus re-assigned to a new position according to the data stored in ROM, as shown in Figure 3.
Figure 3. Each incoming pixel is located in a new position according
to the data stored in a ROM block.
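In software terms, the ROM-driven remapping of Figure 3 is just a precomputed lookup table. The Python sketch below illustrates the mechanism on a toy image; the dimensions and table contents are illustrative, not the paper's actual design.

```python
# Software model of the ROM-driven pixel remapping of Figure 3: for every
# image-plane pixel, the "ROM" stores its precomputed world-plane position;
# streaming pixels are simply re-addressed on the fly.
# Image dimensions and the toy ROM contents here are illustrative only.

W_IN, H_IN = 8, 4      # toy image-plane size
W_OUT, H_OUT = 8, 4    # toy world-plane size

# "ROM": one precomputed (x', y') entry per image-plane pixel.  A real table
# would come from applying matrix H offline; here we use a horizontal mirror
# just to exercise the mechanism.
rom = {(x, y): (W_IN - 1 - x, y) for y in range(H_IN) for x in range(W_IN)}

def rectify(frame):
    """Re-address every incoming pixel according to the ROM table."""
    out = [[0] * W_OUT for _ in range(H_OUT)]
    for y in range(H_IN):
        for x in range(W_IN):            # incoming pixel stream
            xr, yr = rom[(x, y)]         # ROM lookup: rectified position
            out[yr][xr] = frame[y][x]    # write into the world-plane buffer
    return out

frame = [[x + 10 * y for x in range(W_IN)] for y in range(H_IN)]
rect = rectify(frame)
```

The hardware performs exactly this re-addressing, with the table burned into a ROM block and the frame buffered in RAM.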
To fully implement this idea in FPGA hardware
three major components are needed: Read-Only
Memory to store the rectified position data,
Random-access memory to hold each incoming
video frame and a RAM/ROM controller to sync up
the data transmission between the two memory
blocks.
The RAM/ROM controller is designed as a finite-
state machine model with four main states (Figure
4): State 0: Internal variables initialization.
State 1: Data request to ROM. (World plane position of the
incoming Image plane pixel stream).
State 2: ROM data received, data request to RAM.
State 3: RAM data received and sent to rectified image
container (in this case, a VGA controller so we can
visualize the final image on a computer monitor).
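The four states above can be modeled behaviorally as a small software state machine; the Python sketch below abstracts the ROM, RAM and VGA transactions as plain function calls (all names and data are illustrative, not the actual HDL).

```python
# Behavioral model of the four-state RAM/ROM controller.  The memory
# transactions are stubbed out as plain Python callables; state names
# follow the paper, everything else is an illustrative abstraction.

INIT, ROM_REQ, RAM_REQ, SEND_VGA = 0, 1, 2, 3

def run_controller(rom_read, ram_read, vga_write, pixel_stream):
    """Cycle the FSM once for each incoming pixel address."""
    for addr in pixel_stream:
        state = INIT
        world_pos = data = None
        while True:
            if state == INIT:              # State 0: init internal variables
                world_pos = data = None
                state = ROM_REQ
            elif state == ROM_REQ:         # State 1: request world-plane
                world_pos = rom_read(addr) #   position of the incoming pixel
                state = RAM_REQ
            elif state == RAM_REQ:         # State 2: ROM data received,
                data = ram_read(addr)      #   request pixel data from RAM
                state = SEND_VGA
            else:                          # State 3: send to VGA controller
                vga_write(world_pos, data)
                break

# Toy ROM table, RAM frame and "VGA" sink, purely for demonstration.
rom_table = {0: (5, 0), 1: (4, 0), 2: (3, 0)}
ram_frame = {0: 17, 1: 23, 2: 42}
screen = {}
run_controller(rom_table.get, ram_frame.get,
               lambda pos, val: screen.update({pos: val}),
               pixel_stream=[0, 1, 2])
```

Each pixel address walks the FSM through States 0-3 once, ending with a write of the pixel value at its rectified position.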
C. Lane Coordinates Extraction
Once the image has been rectified, it is necessary
to extract the coordinates of the two lanes along the
horizontal axis. Figure 5 shows the ideal rectified
image obtained after the perspective transformation.
Figure 5. Ideal rectified image. Each lane centroid is marked with a
red rectangle.
These two white lanes can each be represented as a perfect rectangle, with width $w$ and height $h$. If we consider the distance $v$ from the bottom-left corner of the image to the start of the white rectangle, then the center coordinate $c$ of the lane can be simply calculated as:

$c = v + \frac{w}{2}$  (3)
In a real image obtained after the rectification
(Figure 6), the two lanes hardly resemble perfect
rectangles. It is possible to process each row of the
image independently and then compute the average
of each value to obtain the final center coordinate of
the white lane.
Figure 6. Real rectified image.
Figure 4. The finite-state machine model that syncs up and controls the data transmission between the RAM and ROM blocks.
Consider that each row of the image is processed from left to right. First, we count each black pixel and store the final value in an accumulator variable $V_n$. The white pixels are also counted, and their sum is stored in another accumulator variable $W_n$.

Once a whole row (320 pixels) has been processed, the row counter variable is increased by one unit. When the number of processed rows reaches a predefined bound $H$, the processing is over and we compute the average of each accumulator variable:
For black pixels:

$\bar{V} = \frac{1}{H}\sum_{n=1}^{H} V_n$  (5)

For white pixels:

$\bar{W} = \frac{1}{H}\sum_{n=1}^{H} W_n$  (6)

The lane centroid on the horizontal axis can then be computed as:

$C = \bar{V} + \frac{\bar{W}}{2}$  (7)
Remember that a pure hardware implementation is the final platform for this algorithm, so we need a way to deal with the divisions in eqs. (5), (6) and (7). If we process only 16 image rows ($H = 2^4$), the divisions in (5) and (6) become arithmetic right shifts of four positions; for eq. (7), a right shift of just one position suffices.
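The shift-based arithmetic can be checked in software; this Python sketch mirrors the hardware computation on a synthetic 320-pixel row pattern (the row contents and the leading-black-run/white-run counting convention are illustrative assumptions).

```python
# Software model of the centroid extraction of Section II.C: per row, count
# black pixels before the lane (V) and white lane pixels (W), accumulate
# over H = 2**4 = 16 rows, then replace the divisions of eqs. (5)-(7) by
# arithmetic right shifts exactly as the hardware does.
# The synthetic rows below are illustrative.

H_ROWS = 16          # 2**4 rows -> dividing by 16 is a right shift of 4
ROW_WIDTH = 320

def row_counts(row):
    """Count black pixels before the white run (V) and white pixels (W)."""
    v = w = 0
    i = 0
    while i < len(row) and row[i] == 0:   # leading black run
        v += 1
        i += 1
    while i < len(row) and row[i] == 1:   # white lane run
        w += 1
        i += 1
    return v, w

def lane_centroid(rows):
    v_acc = w_acc = 0
    for row in rows[:H_ROWS]:
        v, w = row_counts(row)
        v_acc += v
        w_acc += w
    v_avg = v_acc >> 4          # eq. (5): divide by 16 via right shift
    w_avg = w_acc >> 4          # eq. (6)
    return v_avg + (w_avg >> 1) # eq. (7): centroid = v + w/2

# 16 identical rows: 100 black pixels, a 20-pixel white lane, black after.
row = [0] * 100 + [1] * 20 + [0] * (ROW_WIDTH - 120)
c = lane_centroid([row] * H_ROWS)
```

For this pattern the averages are 100 and 20, so the centroid lands at pixel 110, matching eq. (3) applied to a perfect rectangle.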
D. Data Processing
After computing both centroids, we can finally
determine if a lane is present on the road, a lane-
change is taking place or if an obstacle is lying
between these two points. To detect an obstacle we
can threshold the area between the two lanes, if an
obstacle exists, it will show up as a blob of white
pixels.
If the blob of pixels exceeds a certain security distance, as shown in Figure 7, we can alert the driver that an obstacle lies directly in front of the vehicle.
Figure 7. Simple obstacle detection. The binary image shows a blob
of white pixels indicating the presence of an obstacle.
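A minimal software model of this check, assuming a binary image, two known lane centroids and an illustrative security threshold expressed as a row count:

```python
# Simple obstacle check, following Section II.D: threshold the region
# between the two lane centroids and flag an obstacle when the white blob
# grows past a security distance.  Thresholds, image size and the test
# pattern are illustrative choices, not the paper's exact values.

def obstacle_present(binary_img, left_c, right_c, security_rows=3):
    """binary_img: rows of 0/1 pixels, row 0 = far, last row = near."""
    blob_rows = 0
    for row in binary_img:
        # any white pixel between the two lane centroids counts as blob
        if any(row[left_c:right_c]):
            blob_rows += 1
    return blob_rows >= security_rows

# Toy 6x12 binary image with a 4-row-tall blob between centroids 3 and 9.
img = [[0] * 12 for _ in range(6)]
for r in range(2, 6):
    for c in range(5, 8):
        img[r][c] = 1
alarm = obstacle_present(img, left_c=3, right_c=9)
```

An empty inter-lane region leaves the alarm off; a blob spanning more rows than the security distance raises it.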
Detecting a lane change is a little trickier, as we need to track both lane centroids through time; this is accomplished by defining a "processing window". The processing window evaluates the changes observed over 10 video frames; then a simple algorithm concludes whether the vehicle has, in fact, changed lanes.
The basic idea is simple: divide the bottom of the image into two major areas, Left and Right, and further divide those areas into two halves, so we end up with the four sub-areas shown in Figure 8. Each lane centroid will cross each region at a different time, depending on the vehicle's direction.
Left Right
Figure 8. Lane-change detection. Notice the four sub-
areas at the bottom. The red vertical lines represent each
of the lanes.
If the left centroid moves to the right and the right centroid disappears from the scene, a change to the left has probably occurred. Conversely, if the right centroid moves to the left and the left centroid eventually disappears, we can infer that a change to the right has just occurred. We still need a threshold value to be reasonably sure that the vehicle is really changing direction.
We define the minimum distance thresholds that both centroids have to cross as the two halves of the left and right areas in Figure 8 (red lines). As defined previously, the processing window spans 10 frames. If the centroids are beyond the minimum distance threshold in at least 6 of those 10 frames, we conclude that a real lane-change is taking place at that instant in time.
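The 6-of-10 decision rule can be sketched in a few lines of Python; the centroid traces, thresholds and the exact crossing test below are illustrative assumptions, not the paper's C code.

```python
# Lane-change decision over the 10-frame processing window (Section II.D):
# per frame we record whether each centroid has crossed its minimum-distance
# threshold (the red half-lines of Figure 8); if that holds in at least 6 of
# the 10 frames, a lane change is declared.  Frame data here is illustrative.

WINDOW, MIN_HITS = 10, 6

def lane_change(frames, left_thr, right_thr):
    """frames: list of (left_centroid, right_centroid), None = lost centroid.
    Returns 'left', 'right' or None for the last full window."""
    win = frames[-WINDOW:]
    # change to the LEFT: left centroid drifts right past its threshold
    # while the right centroid leaves the scene (and symmetrically).
    left_hits = sum(1 for l, r in win
                    if l is not None and l > left_thr and r is None)
    right_hits = sum(1 for l, r in win
                     if r is not None and r < right_thr and l is None)
    if left_hits >= MIN_HITS:
        return "left"
    if right_hits >= MIN_HITS:
        return "right"
    return None

# 10 frames: from frame 4 on, the left centroid is past x = 120 and the
# right centroid has disappeared, so a change to the left is detected.
frames = [(80, 260), (90, 250), (100, 240)] + \
         [(130 + i, None) for i in range(7)]
event = lane_change(frames, left_thr=120, right_thr=200)
```

With 7 qualifying frames out of 10, the rule fires; a stable trace with both centroids in place yields no event.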
This algorithm has been implemented as a simple program written in C running on a NIOS II soft-core CPU; the NIOS II is a 32-bit embedded-processor architecture designed specifically for Altera FPGAs. Its configuration can be listed as follows:
1) Standard core.
2) 20 KBytes of on-chip memory.
3) JTAG UART for host communication and debugging.
III. RESULTS
Tests on real-world scenarios were carried out, with the results evaluated by human designers. Each test involved the evaluation of 10 frames of video within a video sequence of varying length. It is important to note that the system yields a result after 10 frames of video are evaluated; if a detection shows up in at least 6 of the 10 frames, we conclude that the detection is, in fact, positive.
The test vehicle changed lanes randomly through
the whole video. Figure 9 shows the detection of a
right lane-change with the design discussed in this
paper.
Figure 9. Lane-change detection test.
The system actively detects and correctly identifies the lane and the lane-change, provided that the lane markers are correctly painted on the road and the camera is correctly positioned. Tests performed on the lane detection and lane-change sequences resulted in 100% accuracy for each video sequence. One point to note, however, is that no occlusion occurred during the test sequences, so performance would be expected to degrade under occlusion.
The core hardware implementation of the system is shown in Figure 10.
We have included image pre-processing stages, composed of a closing filter and a binarization filter placed before the actual perspective correction component, to increase the system's robustness to light variations and environmental noise.

Figure 10. System Core Architecture (ROM, Memory Controller, Centroid Finder, Image Filters, RAM, and the NIOS II CPU).
Table 1 summarizes the resource consumption of the proposed design, which has been proved to work at a maximum frequency of 44.69 MHz.
TABLE I. DESIGN RESOURCE CONSUMPTION

FPGA Circuit | LE     | Memory bits | Fmax (MHz)
Proposed     | 2,282  | 192,384     | 44.69
EP2C35       | 33,216 | 483,840     | 44.69
EP2C70       | 68,416 | 1,125 K     | 44.69
The EP2C35 FPGA circuit from Altera has been used as the testing platform; however, the memory bits required by our design exceed those offered by this particular FPGA. The full design can be implemented on the EP2C70 circuit; as shown in Table 1, the memory block requirements are fully met by that circuit.
IV. CONCLUSIONS
In this work we proposed an implementation of a Driver Assistance System running on embedded hardware, which is a real and promising solution for improved traffic and road security. A simplified model of road and lane detection using perspective transformation was developed to take advantage of a hardware-configurable environment.
One of the crucial components of this system is the perspective corrector based on IPM. It is important to note that, as shown in eq. (2), this solution is completely independent of intrinsic and extrinsic camera parameters, as long as the camera is correctly positioned on the vehicle.
This approach has proved to be feasible and reliable according to the experiments conducted; there is still room for code and resource optimization, and in the near future it will be possible to use the provided modules fully integrated into an actual on-board vehicle system. During development of the system, a processing time of 0.008147 seconds per frame was achieved without full optimization of all components. The processing time is expected to decrease in future iterations.
ACKNOWLEDGMENT
The authors would like to express their gratitude
to Dr. Cuauhtémoc Carbajal, Dr. Alfredo Santana
and Dr. Sadegh Babaii, for their continuous help
and advice in the discussions leading up to this
work.
REFERENCES
[1] The World Health Organization "The top 10 causes of Death"
http://www.who.int/mediacentre/factsheets/fs310/en/index.html Web. 07 May 2011.
[2] Cyclone II Literature Website by Altera:
http://www.altera.com/literature/lit-cyc2.jsp Web. 02 May 2011.
[3] Nissan's "All Around Collision Free" Prototype To Demonstrate Advanced Accident Avoidance Systems At Its World Congress
http://www.nissannews.com/newsrelease.do;jsessionid=EBF6F1B4AEC92778E31524B6C91816E7?&id=626&mid=1 Web. 23 April 2011.
[4] Aly, M. "Real Time Detection of Lane Markers in Urban Streets". Computer Vision Lab, Electrical Engineering, California Institute of Technology, Pasadena, 2008.
[5] Bertozzi, M. "Stereo Inverse Perspective Mapping: Theory and Applications". Dipartimento di Ingegneria dell'Informazione, Università di Parma, Parma, Italy, 1998.
[6] Coifman, B. "A Real-Time Computer Vision system for Vehicle Tracking and Traffic Surveillance" Institute of Transportation Studies, University of California, Berkeley, California, 1998.
[7] McDonald, J. "Application of the Hough Transform to Lane Detection and Following on High Speed Roads". Signals & Systems Group, Department of Computer Science, National University of Ireland, Maynooth, Ireland, 2001.
[8] Yang, G., "Computer Vision Hough Transform" Department of Computing, Imperial College London, 2005.
[9] Hartley, R., Zisserman, A. "Multiple View Geometry in Computer Vision" Second Edition, Cambridge University Press, Cambridge, U.K., 2003.