
Car Plate Character Extraction and Recognition Using Stroke Analysis

S. Setumin, U.U. Sheikh, and S.A.R. Abu-Bakar, Senior Member, IEEE

Computer Vision, Video and Image Processing Lab, Dept. of Microelectronics and Computer Engineering, Faculty of Electrical Engineering, Universiti Teknologi Malaysia

[email protected], [email protected]

Abstract—In this paper we address the issue of recognizing nonstandard Malaysian car license plates. These plates contain nonstandard characters such as italic, cursive and connected letters, which most plate recognition systems are unable to recognize. We propose a technique using stroke extraction and analysis to recognize these nonstandard characters. The proposed technique first extracts the contour of the character and then the stroke direction, which are used for classification. The advantage of this method is that the system requires no training. From the experiments performed, the method has a correct recognition accuracy of 95% and even works with standard car plates.

Keywords-car plate, extraction, recognition, stroke analysis

I. INTRODUCTION

In this paper, a Malaysian license plate recognition algorithm using a stroke analysis technique is presented. There are three reasons behind the motivation of this work. Firstly, unlike other countries, Malaysia releases special car plates to commemorate special events, such as the launch of certain national car models (e.g. Proton, Wira, Putra) or certain events that have taken place (e.g. OIC, Putrajaya, BAMbee). These plates do not conform to the standard of regular plates and thus cannot be recognized by current plate recognition systems. Therefore, a car plate recognition system for these plates is required. Secondly, we want to propose a technique that does not require retraining of the system each time a new special plate is released. Finally, even though the focus is on Malaysian car plates, the proposed technique should be applicable to other types of characters, such as Chinese or Japanese characters.

Generally, strokes can be divided into two groups: simple strokes and complex strokes (Figure 1). Many terms are used synonymously with stroke analysis, such as chain code, stroke approximation and boundary representation. Stroke analysis is a powerful technique in character recognition, yet it is rather difficult and complicated to implement. The biggest challenge is extracting the strokes from the character image, and researchers have proposed many different approaches for doing so. Our proposed algorithm comprises two main processes: (1) character extraction and (2) stroke analysis. In addition, the stroke analysis itself consists of two parts: (1) stroke tracing and (2) stroke recognition.

Figure 1. Stroke groups: (a) simple, (b) complex

This paper is organized as follows. A review of related works is given in the next section. Section 3 presents the extraction and validation of strokes from character images. Section 4 discusses the analysis of the stroke in determining the character it represents. Section 5 presents the refinement mechanism for reducing ambiguities and thus improving recognition performance. Section 6 presents the results obtained with our proposed method.

II. REVIEW OF RELATED WORKS

Character extraction is one of the required stages prior to character recognition. The aim is to separate each character so that it can be fed into the recognition stage. One commonly used character extraction technique is horizontal-vertical (HV) histogram projection, which analyzes the spaces between characters in the plate region by examining the total number of pixels in each row and column [1, 3, 5, 10, 13]. In another technique [7], it was shown that labeling analysis is very useful in extracting characters with some angle of orientation and of different sizes. The authors showed that the labeling technique allows segmentation of plate characters independent of the plate orientation, position, size and distance between the camera and the plate. The extracted character is then resized [1, 3, 5, 13] to a predefined size prior to the character recognition process; the usual approach is to resize using bicubic interpolation [13].
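As an illustration of the HV projection idea (a minimal sketch in Python, not code from any of the cited works), the vertical projection of a binarized plate can be used to split the plate at empty columns; the threshold below is an assumption, not a value from any of the papers:

```python
import numpy as np

def segment_by_vertical_projection(binary_plate, min_width=3):
    """Split a binarized plate (character pixels = 1, background = 0) into
    character regions wherever the column-wise pixel sum drops to zero.
    min_width is an assumed noise threshold."""
    col_sums = binary_plate.sum(axis=0)      # vertical projection profile
    has_ink = col_sums > 0
    segments, start = [], None
    for x, flag in enumerate(has_ink):
        if flag and start is None:
            start = x                        # a character run begins
        elif not flag and start is not None:
            if x - start >= min_width:
                segments.append((start, x))  # a character run ends
            start = None
    if start is not None and binary_plate.shape[1] - start >= min_width:
        segments.append((start, binary_plate.shape[1]))
    return [binary_plate[:, a:b] for a, b in segments]
```

A horizontal projection applied in the same way would trim the rows above and below the characters.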

In the area of character recognition, there have been two major approaches: template matching [1, 2, 5, 6] and neural networks [3, 7-9, 13, 15]. Template matching compares characters with stored templates and classifies them using some distance function. In the template matching process, the templates used by Lee et al. [1] were the character images themselves. Lim et al. [2] recognized characters by comparing templates created from the features of the input characters with the features stored in a database. In [5], the Hotelling transform (HT) was used, whereby recognition was based on a distance measure between the Hotelling-transformed characters and the Hotelling-transformed prototypes.

Neural networks, however, require a large training set as input during the training process. Characters with features similar to the training set have a high probability of being correctly recognized. Parisi et al. [3] and Koval et al. [13] proposed feed-forward neural networks, the works in [7, 8, 15] were based on multilayer neural networks, and in [9] a backpropagation neural network was used. On average, both approaches achieved accuracies of more than 90%.
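As a generic illustration of the template-matching idea described above (a sketch only; none of the cited systems necessarily uses this exact distance), a binary character image resized to the template dimensions can be compared to every stored template with a pixel-wise Hamming distance:

```python
import numpy as np

def template_match(char_img, templates):
    """Pick the template with the fewest mismatching pixels.
    'templates' is an assumed dict mapping characters to binary images
    of the same shape as char_img."""
    best_char, best_dist = None, np.inf
    for character, template in templates.items():
        dist = np.count_nonzero(char_img != template)   # mismatching pixels
        if dist < best_dist:
            best_char, best_dist = character, dist
    return best_char
```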

III. CHARACTER EXTRACTION

Character extraction is an important preprocessing stage prior to stroke analysis. The stroke analysis stage requires an input character image with one-pixel stroke thickness, and the character extraction process is designed to provide this. Figure 2 shows a binarized plate number that needs to be segmented into individual characters. Connected component analysis is used to measure the properties of each blob for segmentation. Next, each character goes through morphological thinning to obtain a one-pixel-wide representation of the character.

Figure 2. Binarized car plates prior to character extraction process
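A minimal sketch of this extraction step, assuming scikit-image is available; the area threshold and the left-to-right ordering are assumptions, not details given in the paper:

```python
from skimage.measure import label, regionprops
from skimage.morphology import thin

def extract_and_thin(binary_plate, min_area=50):
    """Label connected components in the binarized plate, crop each blob,
    and thin it to a one-pixel-wide stroke image."""
    labels = label(binary_plate)
    blobs = []
    for region in regionprops(labels):
        if region.area < min_area:
            continue                                  # drop small noise blobs
        r0, c0, r1, c1 = region.bbox
        mask = labels[r0:r1, c0:c1] == region.label   # isolate this blob only
        blobs.append((c0, thin(mask)))                # keep column for ordering
    blobs.sort(key=lambda item: item[0])              # left-to-right reading order
    return [thinned for _, thinned in blobs]
```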

A. Character Classification

The character classification process categorizes the input character into either the special plate group or the normal plate group. It processes the received image, performs the thinning algorithm and then analyzes the entire object. The object ratio, defined as the width-to-height ratio of each segmented object in the image, is computed. An object with a ratio below 1.5 is classified into the normal plate group; otherwise it is classified into the special plate group. This process effectively distinguishes separated characters from cursive characters based on the object ratio.
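A minimal sketch of this classification rule, assuming the blob has already been cropped to its bounding box (the 1.5 threshold is the value given in the paper):

```python
def classify_blob(blob):
    """Classify a cropped blob as a separated ('normal') character or a
    cursive/connected ('special') group using its width-to-height ratio."""
    height, width = blob.shape
    object_ratio = width / height       # width-to-height ratio of the blob
    return "normal" if object_ratio < 1.5 else "special"
```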

Once the object has been classified as belonging to either a normal plate or a special plate, it undergoes the extraction process. Special plate extraction in general takes slightly longer than normal plate extraction, because the character template used for special plates (cursive letters) is larger than the template used for normal plates (single letters). In the extraction process, each character is separated by cropping the input image vertically based on the width of a single object.

B. Character Refinement

During the character extraction process, some tails from adjacent characters might be included. This is particularly true for slanting characters (such as italic characters). Such inclusions will cause errors during the tracing algorithm, so these outliers must be removed. Since the outliers are normally small, we follow a "leave the biggest blob and remove the others" strategy, achieved by the following steps:

1. Calculate the area, A(n), of each object n.

2. Find the largest object and its area, B_A:

   B_A = max_n A(n)   (1)

3. Retain the coordinates, (r_i, c_j), of all pixels within the biggest object:

   S = {(r_i, c_j) : pixel (r_i, c_j) belongs to the object with area B_A},  i = 1, ..., N   (2)

where N is the number of pixels in the biggest object.
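A minimal sketch of this refinement step, assuming scikit-image connected component labeling (the helper name is ours, not the authors'):

```python
from skimage.measure import label, regionprops

def keep_biggest_blob(char_img):
    """Remove tails of adjacent characters by keeping only the largest
    connected component ('leave the biggest blob and remove the others')."""
    labels = label(char_img)
    regions = regionprops(labels)
    if not regions:
        return char_img
    biggest = max(regions, key=lambda r: r.area)   # B_A = max_n A(n)
    return labels == biggest.label                 # keep only pixels of the biggest object
```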

C. Character Resizing

Every extracted character differs in size; the difference can be caused by the varying distance between the camera and the plate. Since the extracted characters need to be compared with predefined templates, each character must be resized in order to minimize errors during the tracing process. However, the resizing process is not without problems. For example, since the extracted object for the letter "1" or "I" is narrower than the predefined template, the algorithm would stretch its width to match the template size, and resizing would thus change the shape of the character. To solve this, a blank template image is created prior to the resizing process. Any object whose width is smaller than the template is simply copied and placed at the centre of the blank image; hence, no stretching of the character's width is performed.
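A minimal sketch of this idea; the template dimensions are an assumption, since the paper does not state the exact size used:

```python
import numpy as np

def fit_to_template(char_img, template_shape=(40, 20)):
    """Place a narrow character (e.g. '1' or 'I') onto a blank template
    at its centre instead of stretching it to the template width."""
    th, tw = template_shape
    h, w = char_img.shape
    if w >= tw:
        # wide enough: an ordinary resize (e.g. bicubic) would be applied here
        return char_img
    canvas = np.zeros(template_shape, dtype=char_img.dtype)
    top = max((th - h) // 2, 0)
    left = (tw - w) // 2
    canvas[top:top + min(h, th), left:left + w] = char_img[:th, :]
    return canvas
```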

D. Character Shifting

Some plate numbers need extra handling, especially those that contain combinations of uppercase and lowercase characters. An example is the "BAMbee" plate number. Several problems can occur when the lowercase characters in the cropped image (the letters "b" and "e") are retained at their original positions. One problem is that the tracing algorithm might fail to find the starting coordinate of the character, because the character is located below half the height of the image; normally such a blob is considered an invalid character. Another problem is that, in the recognition stage, the chain code is normally matched against the codes of capital letters, which results in recognition failure on lowercase characters. Therefore, an extracted character that behaves like a lowercase character is shifted to the centre of the template.

IV. STROKE ANALYSIS

Stroke analysis is used to recognize the extracted characters. The idea is to trace every character in a way that tries to follow how a human writes a character (letters or numbers). Hence, no matter what font style is used on the plate number, the algorithm should be able to recognize the character. The algorithm consists of two main processes: (1) character tracing and (2) character recognition.

A. Tracing Algorithm

Character tracing is the first stage in stroke analysis. After the thinning process, the stroke of a character is generated; the stroke is the sequence of pixel directions recorded during the tracing procedure. To start tracing the character, the first pixel needs to be identified. Here we assume that humans write a character from top to bottom, so the first tracing coordinate should be located at the top of the character image. Using a left-to-right, top-to-bottom scanning approach, the starting point is the first pixel found and is given the value 1. Other pixels are given values depending on their position relative to it, as shown in Figure 3.

Figure 3. Direction structure

For characters such as "2", "3", and "5", the starting coordinate found this way is not the topmost pixel, for obvious reasons. To correct this, a starting-pixel refinement process is applied. Based on our assumption, the starting coordinate should not be located at the bottom part of a character; rather, it should be located either at the topmost position or about one third from the top. Therefore, the character image is first divided at half of its height, and the refinement algorithm searches this upper half for the best starting pixel: a white pixel that has only one neighbor and is not located on any of the borders of the half image. If the scan cannot find such a pixel, the first pixel is retained as the starting pixel.
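A minimal sketch of one reading of this refinement; the description is ambiguous about whether the pixel itself or its single neighbor must stay off the border, so keeping the pixel itself off the border is an assumption here:

```python
import numpy as np

def find_start_pixel(thinned):
    """Scan the top half of the thinned character (left-to-right,
    top-to-bottom) for a foreground pixel with exactly one 8-connected
    neighbour that is not on the border of the half image; otherwise fall
    back to the first foreground pixel found."""
    h, w = thinned.shape
    half = thinned[: h // 2, :]
    first = None
    for r in range(half.shape[0]):
        for c in range(w):
            if not half[r, c]:
                continue
            if first is None:
                first = (r, c)                     # fallback candidate
            if r in (0, half.shape[0] - 1) or c in (0, w - 1):
                continue                           # skip border pixels
            neighbours = half[r - 1:r + 2, c - 1:c + 2].sum() - 1
            if neighbours == 1:                    # an end-point of the stroke
                return (r, c)
    if first is not None:
        return first
    # top half empty: fall back to the first foreground pixel of the whole image
    rows, cols = np.nonzero(thinned)
    return (rows[0], cols[0]) if rows.size else None
```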

From this starting pixel, the scanning algorithm scans the neighborhood looking for the next valid pixel. This pixel is then marked as "X" in the direction structure shown in Figure 3. A marked pixel is ignored in subsequent scans to avoid redundancy while tracing. The character is traced by following the white pixels in the image, and every path taken during the tracing is recorded. The path records three variables: Chain, Next, and Done. Chain stores the trace code as given in Figure 3, Next stores all white pixels neighboring X, and Done stores the pixels that have already been marked as X. These variables keep the relative positions of the relevant pixels.

During the tracing process, if there is more than one neighboring pixel (as in the case of an intersection), a validity check is performed, and the locations of the other pixels are stored in the variable Next for the next scanning step. The tracing is considered complete when no unvisited white pixels remain in the image.
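A much simplified sketch of the tracing idea, assuming the character has already been thinned and a starting pixel found; the direction numbering and the exact handling of Chain, Next and Done are assumptions based on the description above, not the authors' exact procedure:

```python
def trace_strokes(thinned, start):
    """Walk along the one-pixel-wide character from the starting pixel,
    recording a direction code for each move; branch pixels are revisited
    later via the pending list (the role of 'Next' in the text)."""
    # assumed direction codes for the eight neighbours
    offsets = {(-1, -1): 1, (-1, 0): 2, (-1, 1): 3,
               (0, 1): 4, (1, 1): 5, (1, 0): 6,
               (1, -1): 7, (0, -1): 8}
    h, w = thinned.shape
    done = set()                 # pixels already marked as X ('Done')
    pending = [start]            # pixels still to be examined ('Next')
    chain = []                   # recorded direction codes ('Chain')
    while pending:
        r, c = pending.pop()
        done.add((r, c))
        moved = True
        while moved:
            moved = False
            for (dr, dc), code in offsets.items():
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and thinned[nr, nc] and (nr, nc) not in done:
                    chain.append(code)
                    done.add((nr, nc))
                    pending.append((r, c))   # come back for other branches
                    r, c = nr, nc
                    moved = True
                    break
    return chain
```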

B. Matching Algorithm

The next process is to match the chain code with a character. We follow a simple brute-force matching scheme in which the chain code is compared with every template character sequentially. Once a match is found, the search stops and the character is returned.
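A minimal sketch of this brute-force matching, assuming a dictionary of reference chain codes (the template store itself is not described in the paper):

```python
def match_character(chain, templates):
    """Compare the traced chain code against every stored template code in
    sequence and return the first character that matches."""
    for character, template_code in templates.items():
        if chain == template_code:
            return character
    return None   # no template matched
```

In practice some tolerance (for example an edit distance rather than exact equality) would likely be needed to absorb small tracing variations; the paper does not specify this.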

V. AMBIGUITY CHECK

Some characters will be misclassified because they look alike. To reduce such errors, we introduce a secondary confirmation process in which the character undergoes a skeletonizing process rather than thinning. Using the skeleton, the edges of a character can be taken as features to differentiate between similar characters. We found that this secondary confirmation process can differentiate the following character pairs: "B and 8", "D and O", "5 and S", and "K and X".

VI. RESULTS

Figure 4 shows the result of extraction and recognition for a sample special car plate. We tested our algorithm on 122 samples of car plates consisting of 15 special plates, 93 normal plates, 6 taxi plates, and 8 foreign plates. The proposed method correctly recognized 95% of the special car plates, 99% of the normal plates, 100% of the taxi plates and 75% of the foreign car plates. The average accuracy is 95%.

VII. CONCLUSION

We have presented in this paper a technique to extract and recognize car plates using stroke analysis. In doing so, we remove the constraint on the character style and font used in the plate number. The technique consists of three major stages: character extraction, character tracing, and character recognition. Combining the three stages, we achieved an average accuracy of 95% on various car plates, including normal and special car plates.

Figure 4. Special plate: (a) input image; recognition of (b) "P" and (c) "utra".

ACKNOWLEDGEMENT

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) for supporting this research work.

REFERENCES

[1] Eun Ryung Lee, Pyeoung Kee Kim, and Hang Joon Kim. "Automatic Recognition of a Car License Plate Using Color Image Processing." Proceedings of ICIP, Austin, USA, IEEE, 1994, pp. 301-306.
[2] B.L. Lim, Wenzheng Yeo, K.Y. Tan, and C.Y. Teo. "A Novel DSP-Based Real-Time Character Classification and Recognition Algorithm for Car Plate Detection and Recognition." Proceedings of ICSP 1998, Ngee Ann Polytechnic, China, IEEE, 1998, pp. 1269-1272.
[3] R. Parisi, E.D. Di Claudio, G. Lucarelli, and G. Orlandi. "Car Plate Recognition by Neural Networks and Image Processing." IEEE, 1998, vol. III, pp. 195-198.
[4] V. Turchenko, V. Kochan, V. Koval, A. Sachenko, and G. Markowsky. "Smart Vehicle Screening System Using Artificial Intelligence Methods." IEEE, 2003.
[5] Hans A. Hegt, Ron J. De la Haye, and Nadeem A. Khan. "A High Performance Licence Plate Recognition System." IEEE, 1998.
[6] Dai Yan, Ma Hongqing, Liu Jilin, and Li Langang. "A High Performance Licence Plate Recognition System Based on the Web Technique." IEEE, 2001.
[7] N. Vazques, M. Nakano, and H. Perez Meana. "Automatic System for Localization and Recognition of Vehicle Plate Numbers." Journal of Applied Research and Engineering, 2002, pp. 63-77.
[8] Siti Norul Huda Sheikh Abdullah, Marzuki Khalid, and Rubiyah Yusof. "Comparison of Feature Extractors in Licence Plate Recognition." IEEE, 2007.
[9] Wei-Khing For, Karianto Leman, How-Lung Eng, Boon-Fong Chew, and Kong-Wah Wan. "A Multi-Camera Collaboration Framework for Real-Time Vehicle Detection and Licence Plate Recognition on Highways." IEEE, 2008.
[10] Luis Salgado, Jose M. Menendez, Enrique Rendon, and Narciso Garcia. "Automatic Car Plate Detection and Recognition Through Intelligent Vision Eng." IEEE, 1999.
[11] Nor Amizam Jusoh, Jasni Md Zain, and Tuty Asmawaty Abd Kadir. "Enhancing Thinning Method for Malaysian Car Plates Recognition." IEEE, 2007.
[12] Balazs Enyedi, Lajos Konyha, and Kalman Fazekas. "Real-Time Number Plate Localization Algorithms." Journal of Electrical Engineering, vol. 57, no. 2, 2006, pp. 66-77.
[13] V. Koval, V. Turchenko, V. Kochan, A. Sachenko, and G. Markowsky. "Smart License Plate Recognition System Based on Image Processing Using Neural Network." IEEE, 2003.
[14] Cesar Garcia-Osorio, Jose-Francisco Diez-Pastor, Juan J. Rodriguez, and Jesus Maudes. "License Plate Number Recognition."
[15] Rafael C. Gonzalez, Richard E. Woods, and Steven L. Eddins. Digital Image Processing Using MATLAB. Prentice Hall, 2003.