[ieee tencon 2007 - 2007 ieee region 10 conference - taipei, taiwan (2007.10.30-2007.11.2)] tencon...

An Approach to Multi-Font Numeral Recognition

N Santosh Arjun, IEEE Student Member, G Navaneetha, G Vishnu Preethi and T Karthik Babu Department of Electronics and Communication Engineering, University College of Engineering (A), Osmania University.

Hyderabad 500 007, India.

Abstract- Recognition of numerals has been a research area for many years because of its various applications, but little research has addressed the recognition of multi-font numerals. The approaches proposed so far suffer from large computation time and from the training needed to obtain feature vectors. They can be extended to recognize more fonts, but their accuracy then decreases rapidly. To eliminate these drawbacks, we propose a method that recognizes numerals in 17 fonts at sizes from 8 to 72, with an accuracy of 99.76% on a database of 2890 numeral images. The method requires less computation time to recognize a numeral while maintaining high accuracy. It first uses the Euler number of a numeral to sort the numerals into groups, and then uses distinct features of each numeral to recognize it.

I. INTRODUCTION

With the recent emergence and widespread application of multimedia technologies, there is an increasing demand to create a paperless environment. Numbers come in different sizes, shapes and fonts. The need to identify machine-printed or hand-written numbers led to the development of a distinct field of research known as Optical Character Recognition (OCR). Recognition of visual patterns and characters independent of position, size, and orientation has been a goal of much recent research. OCR technology would not have developed as quickly without adequate processor and data storage resources. There are two approaches in OCR [7]. The first, template matching, is based on the principle of superposition; it is strong at global matching but very sensitive to positional change. The second, structural analysis, relies on feature extraction; it can detect the local stroke features of characters, but its efficiency depends on identifying features unique to each character.

There are numerous approaches to the recognition of numerals, which differ in the type of features extracted and in the way they are extracted. They include root mean square Polynomial Classifiers, Neural Networks, Hidden Markov Models, Fuzzy Logic designs, Knowledge-Based systems, Syntactical Methods, Genetic Algorithms, and Support Vector Methods.

Lingaraju et al. [3] proposed a method for recognition of numerals using Hermite curves. The Hermite control points and tangent vectors are used as the main features. The method requires size normalization and thinning as pre-processing steps. The reported recognition time of 5.21 seconds is relatively large.

Z. Chi et al. [5] proposed a method in which the SOM (self-organizing map) algorithm is used in the learning phase to produce prototypes, which together with the corresponding variances are used to determine fuzzy regions and membership functions. The algorithm then generates fuzzy rules by learning from training patterns. In the recognition stage, a fuzzy rule based classifier is employed to classify an input pattern. An unsure pattern is re-classified by a SOM classifier.

Rocha J. et al. [6] proposed a method which uses neural networks. In neural networks, the performance of a system deteriorates very rapidly with deterioration in the quality of the input or with the introduction of new fonts. In other words, such systems do not adapt easily to a changed environment. The training phase aims at exposing the system to a large number of fonts and their natural variants. S. W. Lee [4] proposed a scheme for off-line recognition of totally unconstrained hand-written characters using a simple multi-layer cluster neural network trained with the back propagation algorithm. It is shown that the use of genetic algorithms avoids the problem of getting trapped in local minima when training the multi-layer cluster neural network.

Impedovo et al. [10] and others [8, 9] proposed a method based on a set of Fourier descriptors for recognizing non-stylized numerals. The Fourier descriptor for a curve is based either on the amplitude and phase of harmonics or on a complex function of a point moving along the boundary. The Fourier descriptor is invariant with respect to position and scale but depends on the starting point of the boundary tracing. This type of classification is time-inefficient, and its accuracy is found to be comparatively low.

All the methods discussed above are effective in recognizing numerals, but they require large computation time, mainly for pre-processing stages such as normalization, thinning and skeletonization. Their accuracy also decreases as they are applied to a large number of fonts. In this paper we propose a method which addresses these drawbacks by recognizing many fonts with high accuracy and low computation time (the time required to recognize a numeral). The proposed method has shown superior performance when compared to the recent methods developed for multi-font numerals.

In section II, we discuss two recent multi-font numeral recognition methods. In section III, we propose a new method, which is tested on 17 different fonts at various sizes with an accuracy of 99.76%. Section IV discusses the experimental results, followed by conclusions and references.

II. MULTI-FONT NUMERAL RECOGNITION – A REVIEW

A. Fuzzy Based Approach

Hanmandlu et al. [2] proposed a fuzzy based approach. The numerals are fit into a standard size window of 42 x 32. To avoid discontinuities, the un-thinned numeral is normalized and later thinned. This thinned image is then used for extracting features. The 42 x 32 binary array is partitioned by horizontal and vertical grid lines into a 6 x 4 grid of boxes, and all boxes are analysed in sequential order. Taking the bottom left corner as the absolute origin (0,0), the co-ordinate distances (vector distances) of all the pixels of a box are calculated. Dividing the total of these distances by the number of pixels in the box gives a normalized vector distance (γ) for each box. The features consist of the normalized vector distances and angles obtained from each box. The means and variances of these features are adapted using two structural parameters, which model their variations across the different fonts and styles of numerals. The features extracted from the size-normalized and thinned binary array form the input to the recognition process. This method requires thinning and normalization, and therefore a larger computation time.

B. Directional Density Vectors Based Approach

B. V. Dhandra et al. [1] proposed a method which aims at producing a system that can recognize printed numerals of any size, shape and font with reasonably low computation time. In this method of off-line numeral recognition, the directional density vector is used as the main feature in the recognition process. The isolated binary numeral image is pre-processed to make it noise free, and the image is cropped by fitting a minimum bounding rectangle. The outer densities of pixels are computed row by row up to the outer border of the numeral. This procedure is applied in four directions, viz. bottom, top, left and right.

The ratios of the computed directional densities of pixels to the total area are stored as the feature vector. The system was trained on all the numerals from 0 to 9 in different font styles, treating the recognition process as one class per numeral. The mean directional densities of pixels for each numeral, taken over the available fonts, are then computed and stored in a mean feature vector library. A topological feature, the Euler number, is also used as a pre-classification step: dividing the numeral classes according to their Euler number reduces the time required to search for the nearest mean vector. The nearest mean vector is then determined by a decision tree based minimum distance nearest neighbor classifier using the Euclidean Distance (ED).
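The pre-classification and nearest-mean search described above can be illustrated roughly as follows. This is a minimal sketch, not the implementation of [1]: the function name `nearest_mean`, the four-element feature layout and every numeric value in `mean_library` are hypothetical placeholders chosen only for illustration.

```python
import math

# Digits grouped by Euler number (objects minus holes).
EULER_GROUPS = {1: [1, 2, 3, 5, 7], 0: [0, 4, 6, 9], -1: [8]}

def nearest_mean(feature, euler_number, mean_library):
    """Return the digit whose mean vector is closest (Euclidean distance)
    to `feature`, searching only digits with the given Euler number."""
    candidates = EULER_GROUPS[euler_number]
    return min(candidates, key=lambda d: math.dist(feature, mean_library[d]))

# Hypothetical mean vectors (bottom, top, left, right densities).
# These numbers are placeholders, not values from the cited paper.
mean_library = {
    0: [0.10, 0.10, 0.10, 0.10],
    4: [0.30, 0.05, 0.20, 0.15],
    6: [0.10, 0.25, 0.10, 0.30],
    9: [0.25, 0.10, 0.30, 0.10],
}

print(nearest_mean([0.12, 0.22, 0.11, 0.28], 0, mean_library))  # 6
```

Restricting the search to one Euler group means at most five distances are computed per numeral instead of ten.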

The fuzzy based approach requires thinning and normalization, and thus a larger computation time. The directional density vectors based method can also be improved by reducing its computation time. From experimentation, we found that instead of computing directional densities in four directions, it is enough to consider the sequence pattern on the right side boundary for some numerals and the Euler number for the others. Only if the patterns of two numerals collide is a further step needed, as discussed in the following section.

III. PROPOSED METHOD

In this section we propose a new approach to multi-font numeral recognition. The proposed method aims at producing a system that can recognize numerals of any size, shape and font with relatively low computation time. In this method of off-line numeral recognition, the isolated binary numeral image is pre-processed to make it noise free, and the image is cropped by fitting a minimum bounding rectangle. The cropped image is inverted, and the Euler number is used to classify the image as a pre-classification step. The Euler number is computed as the number of objects in the image minus the number of holes in those objects. As a result, the numerals 1, 2, 3, 5 and 7 are classified into one group with Euler number ‘+1’. The numerals 4, 6, 9 and 0, corresponding to Euler number ‘0’, are classified into another group. The numeral 8 has an Euler number of ‘-1’. The Euler number therefore recognizes the numeral 8 in the first step, as it is the only numeral in the class ‘-1’. The other two classes need further processing.
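As a rough illustration of the pre-classification step, the sketch below computes the Euler number of a binary image by counting 2 x 2 pixel patterns (bit quads), a standard technique for 8-connected objects. It is an assumption that this matches the routine used in the experiments; the name `euler_number` and the toy images are hypothetical, and the image is taken to be a list of rows with foreground pixels equal to 1.

```python
def euler_number(image):
    """Euler number (objects minus holes, 8-connectivity) of a binary
    image given as a list of rows of 0/1 values, via bit-quad counts."""
    height, width = len(image), len(image[0])
    # Pad with a one-pixel border of zeros so that every foreground pixel
    # is covered by four 2x2 windows.
    padded = [[0] * (width + 2)]
    padded += [[0] + list(row) + [0] for row in image]
    padded += [[0] * (width + 2)]
    q1 = q3 = qd = 0
    for r in range(height + 1):
        for c in range(width + 1):
            a, b = padded[r][c], padded[r][c + 1]
            d, e = padded[r + 1][c], padded[r + 1][c + 1]
            total = a + b + d + e
            if total == 1:
                q1 += 1
            elif total == 3:
                q3 += 1
            elif total == 2 and a == e:  # the two set pixels are diagonal
                qd += 1
    return (q1 - q3 - 2 * qd) // 4

# Candidate numerals for each Euler number, as listed in the text.
GROUPS = {1: (1, 2, 3, 5, 7), 0: (0, 4, 6, 9), -1: (8,)}

ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]            # one object, one hole
eight = ring + ring[1:]       # one object, two holes
bar = [[1], [1], [1]]         # one object, no hole

print(euler_number(ring), euler_number(eight), euler_number(bar))  # 0 -1 1
print(GROUPS[euler_number(eight)])  # (8,)
```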

For the class of numerals with Euler number ‘0’, the image is split horizontally into two equal halves. The Euler number of the upper half is found, and the images for which it is ‘0’ are identified as 9. If this Euler number is ‘1’, the image can be 0, 6 or 4. Next, the Euler number of the lower half of the original image is found. Euler number ‘0’ identifies the image as 6. The remaining images, with Euler number ‘1’, can be either 4 or 0. For these images, the upper three-fourths of the original image is taken and its Euler number is obtained. An image with Euler number ‘0’ is determined to be 4, and one with Euler number ‘1’ is taken as the numeral 0.
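The decision sequence for the Euler number ‘0’ class can be sketched as below. The function names are hypothetical, the Euler numbers of the sub-images are assumed to be computed elsewhere, and the rounding used for odd image heights is an assumption, since the text does not specify it.

```python
def split_regions(image):
    """Return the upper half, lower half and upper three-fourths of a
    cropped binary image given as a list of rows."""
    height = len(image)
    half = height // 2                  # assumed rounding for odd heights
    three_quarters = (3 * height) // 4  # assumed rounding
    return image[:half], image[half:], image[:three_quarters]

def classify_euler_zero(e_upper, e_lower, e_upper_three_quarters):
    """Resolve 9, 6, 4 and 0 from the Euler numbers of the three regions."""
    if e_upper == 0:                 # closed loop in the upper half
        return 9
    if e_lower == 0:                 # closed loop in the lower half
        return 6
    if e_upper_three_quarters == 0:  # loop closes within the top 3/4
        return 4
    return 0

rows = [[0]] * 8
upper, lower, top34 = split_regions(rows)
print(len(upper), len(lower), len(top34))  # 4 4 6
print(classify_euler_zero(1, 1, 0))        # 4
```

The three tests are ordered so that each one removes exactly one numeral from the candidate set, and the later Euler numbers need to be computed only when the earlier tests fail.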

For the class of numerals with Euler number ‘+1’, the pattern of arrangement of 0s and 1s in the rightmost column of the cropped image is taken as the feature vector, as shown in Figure 1. Runs of repeated values are collapsed into a single symbol: for example, if the rightmost column is ‘0000011101111111111111111100’, then ‘01010’ is saved as the feature vector for this sequence. The feature vector obtained from the rightmost column is compared with the saved library of vectors to recognize the numeral. Ambiguity between numerals exists for the sequences ‘010’ and ‘01’. In this case, the feature vector is obtained at a distance of 1/8th of the width (of the cropped image) from the right boundary of the image, because slight, barely noticeable differences arise in the boundary patterns due to variations in fonts at larger sizes. For the ambiguous sequence ‘10’, the feature vector is calculated at the centre of the numeral image, as the numerals 5 and 7 can be distinguished comfortably, even for larger fonts, using the feature vectors from the centre alone. There are three possibilities for the feature vector of 7 due to variations in fonts. As shown in Figure 1, the font ‘Impact’ yields the feature vector ‘101’, ‘Estrangelo Edessa’ yields ‘10’, and ‘Garamond’ yields ‘1010’.
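The reduction of a pixel column to its feature vector can be sketched as follows. The helper names are hypothetical, and the exact column indices chosen for the 1/8th-width and centre samples are assumptions about rounding that the text does not state.

```python
from itertools import groupby

def collapse_runs(bits):
    """Collapse each run of repeated symbols into a single symbol."""
    return "".join(symbol for symbol, _ in groupby(bits))

def column_pattern(image, column):
    """Top-to-bottom pattern of one column of a binary image given as a
    list of rows of 0/1 values."""
    return collapse_runs("".join(str(row[column]) for row in image))

def sample_columns(width):
    """Column indices of the rightmost column, the column 1/8th of the
    width in from the right boundary, and the centre column."""
    return width - 1, width - 1 - width // 8, width // 2

print(collapse_runs("0000011101111111111111111100"))  # 01010
print(sample_columns(32))                             # (31, 27, 16)
```

Because only the order of the runs is kept and not their lengths, the same pattern is obtained at any font size, which is why no size normalization is needed.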

The feature vectors computed at the right side boundary of a numeral image, obtained in a similar manner, are summarized below.

Pattern ‘1’ directly identifies the numeral as 1.

Patterns ‘0101’ and ‘101’ confirm the numeral as 2.

Patterns ‘01010’ and ‘1010’ identify the image as 3 and 5 respectively.

Patterns ‘010’, ‘10’ and ‘01’ can give ambiguous results due to variations between different fonts of the same numeral, so these patterns need an additional processing step.

A ‘10’ pattern in the outermost column can correspond to either 5 or 7. To resolve between these two numerals, the pattern of pixels is observed exactly at the centre of the image. Patterns ‘10101’ and ‘101010’ correspond to numeral 5, while ‘01010’, ‘1’, ‘10’, ‘101’ and ‘1010’ identify the image as numeral 7.

‘010’ and ‘01’ patterns in the outermost column can correspond to 1, 2, 3, 5 or 7. To resolve between these numerals, the pattern of pixels is observed at a distance of one-eighth of the width from the right boundary of the numeral, because slight variations between fonts can change the pattern at the boundary, and the pattern at 1/8th distance gives stable results. Here patterns ‘1010’ and ‘101010’ identify 5, pattern ‘0101’ identifies 2, patterns ‘01010’, ‘010’, ‘0101010’ and ‘010101010’ confirm the numeral 3, ‘01’ and ‘1’ correspond to 1, and ‘10’ corresponds to 7.
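The pattern rules listed above can be collected into lookup tables, as in the following sketch. The table and function names are hypothetical, and returning `None` for a pattern that is not listed is an assumption; the text does not say how unlisted patterns are handled.

```python
# Patterns in the rightmost column that identify a numeral directly.
RIGHT_EDGE = {"1": 1, "0101": 2, "101": 2, "01010": 3, "1010": 5}

# Patterns at the centre column, used when the rightmost column is '10'.
CENTRE = {"10101": 5, "101010": 5,
          "01010": 7, "1": 7, "10": 7, "101": 7, "1010": 7}

# Patterns at 1/8th of the width from the right boundary, used when the
# rightmost column is '010' or '01'.
EIGHTH = {"1010": 5, "101010": 5, "0101": 2,
          "01010": 3, "010": 3, "0101010": 3, "010101010": 3,
          "01": 1, "1": 1, "10": 7}

def classify_euler_one(right, eighth, centre):
    """Identify a numeral with Euler number +1 from its column patterns.
    Returns None for a pattern that is not in the tables (assumption)."""
    if right == "10":
        return CENTRE.get(centre)
    if right in ("010", "01"):
        return EIGHTH.get(eighth)
    return RIGHT_EDGE.get(right)

print(classify_euler_one("01010", "", ""))   # 3
print(classify_euler_one("10", "", "101"))   # 7
print(classify_euler_one("010", "1010", "")) # 5
```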

A. Algorithm

Input: Binary noise-free numeral image

Output: Recognition of the numeral in the image

Step 1: Invert the image and insert the minimum bounding rectangle box.

Step 2: Find the Euler number of the image.

Step 3: If the Euler number is ‘-1’, the numeral is 8; go to END. Else if the Euler number is ‘1’, go to Step 7.

Step 4: If the Euler number is ‘0’, divide the image into two halves horizontally and calculate their Euler numbers. If the Euler number of the top half is ‘0’, the numeral is 9; go to END.

Step 5: If the Euler number of the bottom half is ‘0’, the numeral is 6; go to END.

Step 6: Obtain the Euler number for the top 3/4th part of the image. If it is ‘0’, the numeral is 4; otherwise the numeral is 0. Go to END.

Step 7: Obtain the feature vector as discussed previously and determine the numeral from it.

Step 8: END

Figure 1: Example numeral images for calculating patterns for number 7: a) Impact, b) Estrangelo Edessa, c) Garamond.

Table 1: Samples of font styles used under study

Sample Numerals Font Type

0123456789 Arial

0123456789 Arial Narrow

0123456789 Book Antiqua

0123456789 Century Gothic

0123456789 Courier

0123456789 Estrangelo Edessa

0123456789 Garamond

0123456789 Georgia

0123456789 Impact

0123456789 Lucida Console

0123456789 Palatino Linotype

0123456789 Sylfaen

0123456789 Shruthi

0123456789 Tahoma

0123456789 Times New Roman

0123456789 Trebuchet MS

0123456789 Verdana

IV. RESULTS

For experimentation, we used the 17 font styles summarized in Table 1: Arial, Arial Narrow, Book Antiqua, Century Gothic, Courier, Estrangelo Edessa, Garamond, Georgia, Impact, Lucida Console, Palatino Linotype, Sylfaen, Shruthi, Tahoma, Times New Roman, Trebuchet MS and Verdana. Each font was tested at sizes from 8 to 72 in steps of 4, giving 17 images per font for each numeral. The total number of numeral images used for testing is 2890 (17 font types * 17 images per font * 10 numerals). The accuracy obtained is 99.76%, as shown in Table 2, and the computation time is low: the average time taken is 0.0135 seconds when implemented in MATLAB 7.04 on a Pentium IV machine with 1 GB RAM running Windows XP. Thus the proposed method is faster in computation and higher in accuracy.
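The dataset size and accuracy figures can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this section and in Table 2; the variable names are illustrative.

```python
sizes = list(range(8, 73, 4))      # font sizes 8, 12, ..., 72
fonts = 17
numerals = 10

images_per_numeral = fonts * len(sizes)
total_images = images_per_numeral * numerals

# Correctly classified images per numeral, from Table 2.
correct = {0: 289, 1: 289, 2: 289, 3: 288, 4: 289,
           5: 283, 6: 289, 7: 289, 8: 289, 9: 289}

total_correct = sum(correct.values())
overall = round(100 * total_correct / total_images, 2)

print(len(sizes), images_per_numeral, total_images)       # 17 289 2890
print(total_correct, overall)                             # 2883 99.76
print(round(100 * correct[5] / images_per_numeral, 2))    # 97.92
```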

V. CONCLUSIONS

In this paper a new method is proposed for multi-font printed numeral recognition which is invariant to size. The proposed method does not require normalization and is thinning free. It classifies the numerals using the Euler number and the pattern of arrangement of the pixel values in the numeral image. The computation time is 0.0135 seconds, which is relatively low compared to the existing methods, and the accuracy is high at 99.76%. There is always a trade-off between computational complexity and accuracy. The accuracy can still be increased by adding further steps to resolve the ambiguities that cause errors in recognizing a numeral.

VI. REFERENCES

[1] B. V. Dhandra et al., “Multi-font Numeral Recognition without Thinning based on Directional Density of Pixels”, IEEE International Conference on Digital Information Management (ICDIM-2006), Bangalore, India, 6-8 Dec. 2006.

[2] Hanmandlu et al., “Fuzzy based approach to recognition of multifont numerals”, Proceedings NCDAR, India, 2003, pp. 118-126.

[3] Lingaraju G. M., C. Sujata, D. L. Prabhakar, A. N. Shantarajappa, “Cognition and Recognition of Numerals using Hermite Curves”, Proc. of 2nd National Conf. on Document Analysis and Recognition (NCDAR), Mandya, India, 2003, pp. 139-144.

[4] S. W. Lee, “Offline Recognition of Totally Unconstrained handwritten numerals using multi layer cluster neural network”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 648-652, June 1996.

[5] Z. Chi, J. Wu and H. Yan, “Hand written numeral recognition using Self Organising maps and fuzzy rules”, Pattern Recognition, vol. 28, no. 1, pp. 59-66, 1995.

[6] Rocha J. and T. Pavlidis, “A Shape Analysis Model with Applications to a Character Recognition System”, IEEE PAMI, vol. 16, no. 4, 1994, pp. 393-404.

[7] Mori S. et al., “Historical review of OCR, research and development”, Proceedings of the IEEE, vol. 80, no. 7, 1992, pp. 1029-1058.

[8] Shridhar M. and A. Badreldin, “High accuracy character recognition algorithm using Fourier and topological descriptors”, Pattern Recognition, vol. 17, 1984, pp. 515-523.

[9] Lai M. T. Y. and C. Y. Suen, “Automatic recognition by Fourier descriptors and boundary encoding”, Pattern Recognition, vol. 14, 1981, pp. 383-393.

[10] Impedovo S. et al., “A Fourier descriptors set for recognizing non-stylized numerals”, IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-8, 1978, pp. 640-645.

Table 2: Results of Classification of Numerals

Numeral | Total Images | Correct Classification | % accuracy

0 289 289 100%

1 289 289 100%

2 289 289 100%

3 289 288 99.65%

4 289 289 100%

5 289 283 97.92%

6 289 289 100%

7 289 289 100%

8 289 289 100%

9 289 289 100%

Total 2890 2883 99.76%
