IEEE Sixth International Conference on Signal-Image Technology and Internet-Based Systems (SITIS 2010), Kuala Lumpur, 15-18 December 2010
Car Plate Character Extraction and Recognition Using Stroke Analysis
S. Setumin, U.U. Sheikh, and S.A.R Abu-Bakar, Senior Member, IEEE
Computer Vision, Video and Image Processing Lab, Dept. of Microelectronics and Computer
Engineering, Faculty of Electrical Engineering, Universiti Teknologi Malaysia
[email protected], [email protected]
Abstract—In this paper we address the issue of
recognizing nonstandard Malaysian car license plates.
These plates contain nonstandard characters such as
italic, cursive and connected letters, which most plate
recognition systems are unable to recognize. We propose
a technique using stroke extraction and analysis to
recognize these nonstandard characters. The proposed
technique first extracts the contour of the character and
then the stroke directions, which are used for classification.
The advantage of this method is that the system requires
no training. From the experiments performed, the
method has a correct recognition accuracy of 95% and
even works with standard car plates.
Keywords-car plate, extraction, recognition, stroke
analysis
I. INTRODUCTION
In this paper, a Malaysian license plate recognition
algorithm using stroke analysis technique is presented.
There are three motivations behind this work. Firstly,
unlike other countries, Malaysia releases special car
plates to commemorate special events in the country,
such as the launch of certain national car models (e.g.
Proton, Wira, Putra) or certain events that have taken
place (e.g. OIC, Putrajaya, BAMbee). These plates do not
conform to the standard of regular plates and thus
cannot be recognized by current plate recognition
systems. Therefore, a car plate recognition system for
these car plates is required. Secondly, we want to
propose a technique that does not require training of
the system each time a new special plate is released.
Finally, even though the focus is on Malaysian car
plates, the proposed technique should be applicable to
other types of characters such as the Chinese or
Japanese characters.
Generally, strokes can be divided into two groups:
simple strokes and complex strokes (Figure 1). There
are many terms synonymous with stroke analysis, such
as chain code, stroke approximation and boundary
representation. Stroke analysis is a powerful
technique for character recognition, yet it is rather
difficult and complicated to implement. The
biggest challenge is that of extracting strokes from the
character images. Thus, various researchers have
proposed various different approaches for obtaining
these strokes from the character image. Our proposed
algorithm comprises two main processes: (1)
character extraction and (2) stroke analysis. In
addition, the stroke analysis itself consists of two
parts: (1) stroke tracing and (2) stroke recognition.
(a) (b)
Figure 1. Stroke groups; (a) simple (b) complex
This paper is organized as follows. Review of
related works is given in the next section. Section 3
presents the extraction of strokes from character
images and their validation. Section 4 discusses the
analysis of the strokes in determining the character
they represent. Section 5 presents the refinement
mechanism in reducing ambiguities and thus
improving the performance of the recognition. Section
6 presents results obtained from our proposed method.
II. REVIEW OF RELATED WORKS
Character extraction is one of the required stages
prior to character recognition. The aim is to separate
each character so that it can be fed into the recognition
stage. One of the character extraction techniques that
is normally used is the horizontal-vertical (HV)
histogram projection that analyzes the spaces between
characters in the plate region by examining total pixels
in each row and column [1, 3, 5, 10, 13]. In another
technique [7], it was shown that the labeling analysis
was very useful in extracting characters with some
angle orientation and different sizes. They showed that
the labeling technique allowed the segmentation of
plate characters independent of the plate orientation,
position, size and distance between the camera and the
plate position. The extracted character is then resized
[1, 3, 5, 13] into a predefined size prior to the
character recognition process. The normal approach is
to resize using bicubic interpolation technique [13].
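The HV-projection idea surveyed above can be sketched as follows. This is an illustrative implementation on a binary image stored as nested Python lists, not the cited papers' code; function and variable names are assumptions.

```python
def segment_by_projection(img):
    # Vertical projection profile: count foreground pixels per column,
    # then split the plate at the empty-column gaps between characters.
    ncols = len(img[0])
    profile = [sum(row[c] for row in img) for c in range(ncols)]
    segments, start = [], None
    for c, total in enumerate(profile):
        if total > 0 and start is None:
            start = c                        # a character run begins
        elif total == 0 and start is not None:
            segments.append((start, c - 1))  # run ended at previous column
            start = None
    if start is not None:
        segments.append((start, ncols - 1))
    return segments
```

For example, on a tiny two-character image `[[1,1,0,1,0],[1,0,0,1,1]]` the profile is `[2,1,0,2,1]`, giving the column ranges `[(0, 1), (3, 4)]`. A horizontal profile applied row-wise splits multi-line plates in the same way.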
In the area of character recognition, there have been
two major approaches; template matching [1, 2, 5, 6]
and neural networks [3, 7-9, 13, 15]. Template
matching technique basically compares characters
with stored templates and classifies them using some
distance function. In the template matching process,
the templates used by Lee et al. [1] were the character
images themselves. Lim et al. [2] recognized
characters by comparing the created templates based
on the features of the input characters with the features
stored in the database. In [5], Hotelling transform (HT)
was used whereby the recognition was based on a
distance measure between the Hotelling transformed
counterparts and the Hotelling transformed prototypes.
Neural networks, however, require a huge training set
as input during the training process. Characters with
features similar to the training set will have a high
probability of being correctly recognized. Parisi et al.
[3] and Koval et al. [13] proposed the Feed Forward
Neural Network, while the works in [7, 8, 15] were based
on the Multilayer Neural Network, and in [9] a
Backpropagation Neural Network was used. On average, both
methods achieved accuracy of more than 90%.
III. CHARACTER EXTRACTION
Character extraction is an important preprocessing
stage prior to stroke analysis. In the stroke analysis
stage, an input image of a character with one pixel
thickness is required. To realize this, the character
extraction process is designed to meet this
requirement. Figure 2 shows a binarized plate number
that needs to be segmented into individual characters.
Connected component analysis is used to measure the
properties of each blob for segmentation. Next, each character will
go through morphological thinning to obtain a one-
pixel width representation of the character.
Figure 2. Binarized car plates prior to character extraction process
A. Character Classification
The character classification process categorizes the
input character into either a special plate group or a
normal plate group. It processes the received image,
performs a thinning algorithm and then analyzes the
entire object. The Object Ratio (R) is computed, that
is, the width-to-height ratio of each segmented object
in the image. An object with R < 1.5 is classified into
the normal plate group, and otherwise into the special
plate group. This process effectively distinguishes
separated characters from cursive characters based on
the object ratio.
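As a rough sketch of this classification rule (only the 1.5 threshold comes from the text; the function name, inputs, and labels are illustrative):

```python
def classify_blob(width, height):
    # Object ratio R = width / height of a segmented, thinned blob.
    # R < 1.5 -> separated character (normal plate group); otherwise the
    # blob is taken to be a cursive/connected group (special plate group).
    r = width / height
    return "normal" if r < 1.5 else "special"
```

A 20x40-pixel blob (R = 0.5) would be classified as a normal-plate character, while a 120x40-pixel connected group (R = 3.0) falls into the special plate group.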
Once the object has been classified into either a
normal plate or special plate, it will undergo the
extraction process. Special plate extraction process in
general will take slightly longer processing time than
that of the normal plate. This is because the character
template used for the special plate is bigger
(cursive letter groups) than the template for the normal
plate (single letters). In the extraction process, each
character is separated by cropping the input image
vertically based on the width of a single object.
B. Character Refinement
During the character extraction process, some tails
from the adjacent characters might be included. This is
particularly true for slanting characters (such as italic
characters). Such inclusion will cause errors during the
tracing algorithm and thus these outliers must be
removed. Since these outliers are normally small in
size, we followed the “leave the biggest blob and
remove others” strategy. This can be achieved by the
following steps:
1. Calculate the area, A(n), of each object n.
2. Find the largest object and its area, BA:

   BA = max_n A(n)   (1)

3. Retain the coordinates (r_i, c_j), i = 1, ..., N, of all
   pixels within the biggest object:

   I(r_i, c_j) = 1 if (r_i, c_j) lies within the biggest
   object, and 0 otherwise   (2)

   where N is the number of pixels in the biggest object.
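The "leave the biggest blob and remove others" strategy can be sketched with a plain flood-fill labeling pass. The 8-connectivity choice and the nested-list image representation are assumptions for illustration.

```python
from collections import deque

def keep_biggest_blob(img):
    # Label 8-connected foreground components via BFS flood fill, keep
    # only the component with the largest area (BA = max_n A(n)), and
    # zero out all smaller blobs (the outliers / character tails).
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                blob, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    blob.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                q.append((ny, nx))
                if len(blob) > len(best):
                    best = blob
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out
```

Only the pixels of the largest component survive, which removes the small tails left over from adjacent slanted characters.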
C. Character Resizing
Every extracted character is different in its size.
This difference can be caused by the different distance
between the camera and the plate number. Since the
extracted characters need to be compared with
predefined templates, each character must be resized
in order to minimize error during the tracing process.
However, the resizing process is not without a
problem. For example, since the extracted object for a
letter “1” or “I” is smaller than the predefined
template, the algorithm stretches the width so as to
match with the template size. Resizing thus changes
the shape of the character. To solve this, a blank
template image is created prior to the resizing process.
Any object that has a width smaller than the template
will simply be copied and placed at the centre of the
blank image. Hence, no stretching of the character's
width is performed.
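A minimal sketch of this copy-to-centre step (names are illustrative, and it assumes the character already fits inside the template):

```python
def place_on_template(char_img, tmpl_h, tmpl_w):
    # Copy a narrow character (e.g. '1' or 'I') to the centre of a blank
    # tmpl_h x tmpl_w template instead of stretching its width.
    h, w = len(char_img), len(char_img[0])
    top, left = (tmpl_h - h) // 2, (tmpl_w - w) // 2
    out = [[0] * tmpl_w for _ in range(tmpl_h)]
    for r in range(h):
        for c in range(w):
            out[top + r][left + c] = char_img[r][c]
    return out
```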
D. Character Shifting
Some plate numbers need more handling especially
those that have combinations of characters. An
example is the “BAMbee” plate number. Many
problems can occur when the lowercase characters in
the cropped image (the letters 'b' and 'e') are retained
at their original positions. One of the problems is that
the tracing algorithm might fail to find the starting
coordinate of the character, because the character is
located in the lower half of the image. Normally such
a blob is considered an invalid character. Another
problem is that, in the
recognition stage the chain code is normally matched
with the codes of capital letters. This results in
recognition failure on lowercase characters. Therefore,
an extracted character that behaves like a lowercase
character is shifted to the centre of the template.
IV. STROKE ANALYSIS
Stroke analysis is used to recognize the extracted
characters. The idea is to trace every character in a
way that follows how a human writes a character
(letters or numbers). Hence, no
matter what font style is used on the plate number, the
algorithm should be able to recognize the character.
The algorithm consists of two main processes; (1)
character tracing and (2) character recognition.
A. Tracing Algorithm
Character tracing is the first stage in stroke analysis.
After the thinning process the stroke of a character is
generated. The stroke is the sequence of pixel directions
recorded during the tracing procedure. To start the tracing of the
character, the first pixel needs to be identified. Here
we assume that humans write a character from top to
bottom. Hence, the first tracing coordinate should be
located at the top of the character image. Using a
left-to-right, top-to-bottom scan, the starting point is
the first pixel found and is given the value 1. Other
pixels are given values depending on their relative
positions, as shown in Figure 3.
Figure 3. Direction structure
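Since Figure 3 is not reproduced here, the following sketch assumes a Freeman-style numbering of the eight neighbour directions, together with the left-to-right, top-to-bottom search for the starting pixel described above. Both the code assignment and the function name are assumptions.

```python
# Illustrative 8-neighbour direction codes (Freeman-style); the exact
# numbering of Figure 3's direction structure is not known, so this
# assignment is an assumption used only for illustration.
DIRECTIONS = {
    1: (-1, 0),   # up
    2: (-1, 1),   # up-right
    3: (0, 1),    # right
    4: (1, 1),    # down-right
    5: (1, 0),    # down
    6: (1, -1),   # down-left
    7: (0, -1),   # left
    8: (-1, -1),  # up-left
}

def find_start_pixel(img):
    # Left-to-right, top-to-bottom scan: the first foreground pixel
    # found is taken as the starting coordinate of the trace.
    for r, row in enumerate(img):
        for c, v in enumerate(row):
            if v:
                return (r, c)
    return None
```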
For characters such as “2”, “3”, and “5”, the starting
coordinate is not the top most pixel for obvious
reasons. To correct this, a starting-pixel refinement
process is applied. Based on our assumption, the
starting coordinate should not be located at the bottom
of a character; rather, it should lie either at the
topmost point or within the top third. Therefore,
the character image is first divided into half of its
height and then the refinement algorithm will search
for the best starting pixel. The refinement process will
find a white pixel that has only one neighbor that is
not located at any of the borders (of the half image). If
the scanning cannot find any new pixels, the first pixel
is retained as the starting pixel.
From this starting pixel the scanning algorithm will
scan its neighborhood looking for the next valid pixel.
This pixel is then marked as “X” in the direction
structure as shown in Figure 3. The marked pixel will
be ignored in the next scan so as to avoid redundancy
while tracing. The character is traced by following
white pixels in the image. Every path taken during the
tracing is recorded. The path records three
variables: Chain, Next, and Done.
Chain stores the trace code as given in Figure 3.
Next will store all white pixels neighboring X, and
Done will store those pixels that have been marked as
X. These variables will keep the relative position of
relevant pixels.
During the tracing process, if there is more than
one neighboring pixel (as in the case of an intersection),
then a validity check will be performed. In such a case,
the locations of other pixels will be stored in variable
Next for the next scanning step. The tracing is
considered complete when there are no longer any
white pixels found in the image.
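The tracing loop with its Chain/Next/Done bookkeeping might be sketched as follows. The direction numbering and the traversal order are assumptions, not Figure 3's exact codes, and real strokes would additionally need the starting-pixel refinement described earlier.

```python
def trace_strokes(img):
    # Start at the first white pixel found top-to-bottom, left-to-right;
    # follow foreground pixels, appending one direction code per step
    # (Chain), queueing unvisited neighbours (Next) and marking visited
    # pixels (Done). Tracing ends when no unvisited white pixels remain.
    dirs = [(-1, 0, 1), (-1, 1, 2), (0, 1, 3), (1, 1, 4),
            (1, 0, 5), (1, -1, 6), (0, -1, 7), (-1, -1, 8)]
    rows, cols = len(img), len(img[0])
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if img[r][c]), None)
    if start is None:
        return []
    chain, done, nxt = [], set(), [start]
    while nxt:
        y, x = nxt.pop()
        if (y, x) in done:
            continue
        done.add((y, x))
        for dy, dx, code in dirs:
            ny, nx_ = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx_ < cols and img[ny][nx_]
                    and (ny, nx_) not in done and (ny, nx_) not in nxt):
                chain.append(code)
                nxt.append((ny, nx_))
    return chain
```

On a thinned vertical bar such as the letter "I", this trace yields a run of identical "down" codes.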
B. Matching Algorithm
The next process then is to match the chain code
with a character. We follow a simple brute-force
matching scheme whereby the chain code is compared
with every template character in a sequential manner.
Once a match is found, the process will stop the search
and return the character.
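A minimal sketch of this brute-force matching; the template contents below are made up purely for illustration, whereas a real system would store chain codes traced from reference characters.

```python
def match_character(chain, templates):
    # Brute-force matching: compare the traced chain code with every
    # stored template in sequence; stop at the first match.
    for label, tmpl in templates.items():
        if chain == tmpl:
            return label
    return None

# Illustrative (made-up) templates mapping a character to its chain code.
templates = {"I": [5, 5], "L": [5, 5, 3]}
```

`match_character([5, 5], templates)` would return `"I"`; an unmatched chain returns `None`.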
V. AMBIGUITY CHECK
Some characters will be misclassified as they look
alike. To reduce such errors we introduce a secondary
confirmation process. In this process, the character
undergoes a skeletonizing process rather than a thinning
process. By using the skeleton, the edges of a
character can be taken as features to differentiate
between them. We found that through this secondary
confirmation process we can differentiate the
following character pairs; “B and 8”, “D and O”, “5
and S”, and “K and X”.
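The paper does not specify which skeleton features it uses; one simple illustrative feature in this spirit is the number of skeleton end-points, i.e. foreground pixels with exactly one 8-connected neighbour.

```python
def count_endpoints(skel):
    # Count skeleton end-points: foreground pixels having exactly one
    # 8-connected foreground neighbour. This is only one possible
    # skeleton feature, shown for illustration.
    rows, cols = len(skel), len(skel[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            neighbours = sum(
                skel[r + dr][c + dc]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols)
            if neighbours == 1:
                count += 1
    return count
```

A straight stroke has two end-points, while a closed loop has none, so such counts can serve as a secondary check between look-alike characters.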
VI. RESULTS
Figure 4 shows the result for extraction and
recognition of a sample special car plate. We tested
our algorithm on 122 samples of car plates
consisting of 15 special plates, 93 normal plates, 6 taxi
plates, and 8 foreign plates. The proposed method was
able to correctly recognize 95% of the special car plates,
99% of the normal plates, 100% of the taxi plates and
75% of the foreign car plates. The average accuracy is 95%.
VII. CONCLUSION
We have presented in this paper a technique to
extract and recognize car plates by using stroke
analysis. In doing so, we remove the constraint on the
character style and font used in the plate number. The
technique we used consists of three major stages:
character extraction, character tracing, and character
recognition. Combining the three steps, we have
achieved an average accuracy of 95% on various car
plates, including normal and special car plates.
(a)
(b)
(c)
Figure 4. Special plate, (a) input image, Recognition of (b) “P”, (c)
“utra”.
ACKNOWLEDGEMENT
The authors would like to express their gratitude to
Universiti Teknologi Malaysia (UTM) for supporting
this research work.
REFERENCES
[1] Eun Ryung Lee, Pyeoung Kee Kim, and Hang Joon Kim. Automatic Recognition of a Car License Plate Using Color Image Processing. Proceedings of ICIP, Austin, USA, IEEE, 1994, pp. 301-306.
[2] B. L. Lim, Wenzheng Yeo, K. Y. Tan, and C. Y. Teo. A Novel DSP-Based Real-Time Character Classification and Recognition Algorithm for Car Plate Detection and Recognition. Proceedings of ICSP 1998, IEEE, 1998, pp. 1269-1272.
[3] R. Parisi, E. D. Di Claudio, G. Lucarelli, and G. Orlandi. Car Plate Recognition by Neural Networks and Image Processing. IEEE, 1998, vol. III, pp. 195-198.
[4] V. Turchenko, V. Kochan, V. Koval, A. Sachenko, and G. Markowsky. Smart Vehicle Screening System Using Artificial Intelligence Methods. IEEE, 2003.
[5] Hans A. Hegt, Ron J. De la Haye, and Nadeem A. Khan. A High Performance License Plate Recognition System. IEEE, 1998.
[6] Dai Yan, Ma Hongqing, Liu Jilin, and Li Langang. A High Performance License Plate Recognition System Based on the Web Technique. IEEE, 2001.
[7] N. Vazquez, M. Nakano, and H. Perez Meana. Automatic System for Localization and Recognition of Vehicle Plate Numbers. Journal of Applied Research and Engineering, 2002, pp. 63-77.
[8] Siti Norul Huda Sheikh Abdullah, Marzuki Khalid, and Rubiyah Yusof. Comparison of Feature Extractors in License Plate Recognition. IEEE, 2007.
[9] Wei-Khing For, Karianto Leman, How-Lung Eng, Boon-Fong Chew, and Kong-Wah Wan. A Multi-Camera Collaboration Framework for Real-Time Vehicle Detection and License Plate Recognition on Highways. IEEE, 2008.
[10] Luis Salgado, Jose M. Menendez, Enrique Rendon, and Narciso Garcia. Automatic Car Plate Detection and Recognition Through Intelligent Vision Eng. IEEE, 1999.
[11] Nor Amizam Jusoh, Jasni Md Zain, and Tuty Asmawaty Abd Kadir. Enhancing Thinning Method for Malaysian Car Plates Recognition. IEEE, 2007.
[12] Balazs Enyedi, Lajos Konyha, and Kalman Fazekas. Real-Time Number Plate Localization Algorithms. Journal of Electrical Engineering, vol. 57, no. 2, 2006, pp. 66-77.
[13] V. Koval, V. Turchenko, V. Kochan, A. Sachenko, and G. Markowsky. Smart License Plate Recognition System Based on Image Processing Using Neural Network. IEEE, 2003.
[14] Cesar Garcia-Osorio, Jose-Francisco Diez-Pastor, Juan J. Rodriguez, and Jesus Maudes. License Plate Number Recognition.
[15] Rafael C. Gonzalez, Richard E. Woods, and Steven L. Eddins. Digital Image Processing Using MATLAB. Prentice Hall, 2003.