International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME
FUZZY RULE BASED CLASSIFICATION AND RECOGNITION
OF HANDWRITTEN HINDI CURVE SCRIPT
Gunjan Singh1, Avinash Pokhriyal1, Sushma Lehri2
1 (Faculty of Management & Computer Application, RBS College, Agra, India) 2 (Professor, IET, Dr. B. R. Ambedkar University, Agra, India)
ABSTRACT
This paper presents a novel system for the classification and recognition of
handwritten Hindi script using a fuzzy rule based approach. Classifying and recognizing
handwritten Hindi script is a complex task because the characters are cursive and
share many similar features. The ability of fuzzy logic to deal with vague and
imprecise data makes it appropriate for such problems. In this paper, we focus on two- or
three-letter words without modifiers. Prior to recognition, handwritten words are
preprocessed and segmented into individual characters. The performance of an optical
character recognition system depends heavily on the procedure used to extract quality
features from characters. During the classification stage, characters are classified into seven
classes using fuzzy if-then rules based on one of the most important components of Hindi
characters, the vertical bar. Features such as curves, lines, junction points and end points
are used at the recognition stage. A 3x3 mask is used to extract features from the character
image. The system was tested on a total of 450 words written by 30 different people.
Experimental results show that the proposed method achieves an overall average
recognition rate of 92.02%. The proposed system has been implemented in the
MATLAB 2009 environment.
Keywords: Classification, Fuzzy rule based approach, Handwritten Hindi curve script,
Vertical bar, 8-neighbourhood
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING
& TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print) ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), pp. 337-357 © IAEME:www.iaeme.com/ijcet.asp Journal Impact Factor (2012): 3.9580 (Calculated by GISI) www.jifactor.com
1. INTRODUCTION
Character recognition is a broad field that studies the machine recognition of
characters in various application domains. It includes the recognition of machine-printed
as well as handwritten characters. Handwritten character recognition covers
characters written by a person, either online or offline. Recognition of machine-printed
characters is comparatively easy because the characters have a uniform size, font
and thickness and a regular shape. Handwritten character recognition is harder because
writing styles vary, so characters may differ in size, width and orientation. A
comparison of both approaches is given in [1]. In this paper, we present a fuzzy rule
based classification and recognition system for handwritten Hindi script.
Hindi is one of the official languages of India. It is the world's third most commonly used
language after Chinese and English. Hindi script has 13 vowels ('SWARS') and 33
consonants ('VYANJANS') in its basic character set. All the characters share two
features: (i) their cursive nature and (ii) the presence of a header line ('SHIROREKHA').
The header line is a distinctive element of the script. These features differentiate
the script from English and other Latin scripts. Words are formed by combining characters,
half characters and/or modifiers under a header line. Fig. 1 shows the basic character set, a list of
modifiers and a few words.
Figure 1. (a) Basic character set, (b) swars (vowels) and corresponding matras
(modifiers), and (c) a few Hindi language words
Nowadays Hindi is used worldwide in many fields such as banking, medicine,
science and technology. Many Hindi words are being included in leading
dictionaries and other vocabulary-building tools. With this increasing popularity,
automatic Hindi recognition systems have become important. Research in this
area started in the early 1970s. In 1977, Sethi and Chatterjee [2] presented a constrained
recognition system for handwritten Hindi characters. In [3], Sinha and Mahabala presented a
syntactic pattern analysis system for the recognition of machine-printed and handwritten
characters. The first complete OCR system for machine-printed characters was presented in [4].
Recognition of handwritten Hindi characters is still difficult for a machine as characters are
cursive in nature and show many similarities, such as the presence of a header line, the presence or
absence of a vertical bar, and loops and curves. A survey of handwriting recognition was
presented by Plamondon and Srihari [5] in 2000. Most work has focused on the recognition of
individual characters, and little attention has been paid to the recognition of words,
sentences or text. Recognition of words is difficult because words must first be segmented into
individual characters. In the present paper, we propose a fuzzy rule based classification and
recognition system for handwritten Hindi curve script words of two or three letters without
modifiers.
Fuzzy logic is an organized method for solving problems that involve vague, ambiguous,
imprecise, noisy, or missing input data. The concept of fuzzy logic was introduced by Dr. Lotfi
A. Zadeh in 1965 [13]. According to Dr. Zadeh, fuzzy logic is a mathematical tool for dealing
with uncertainty. In contrast to crisp logic, which deals with precise values, it is a form of
multi-valued logic that supports approximate reasoning. It therefore
gives a machine a better means of simulating human reasoning. This ability to handle
approximation makes it appropriate for problems such as handwritten character recognition.
This paper is organized in five sections. Section 2 reviews work done in the field
of handwritten Hindi character recognition. Section 3 presents the proposed system. Section 4
shows the experimental results. Section 5 concludes the paper.
2. LITERATURE REVIEW
Hanmandlu et al. [6] presented a fuzzy model based recognition system for
handwritten Hindi characters with 90.65% accuracy. The system performs a coarse
classification of the preprocessed character image by dividing it into 3x3 windows and
determining the presence and position of the vertical bar. Features are then extracted by applying
the box approach. For recognition, an exponential variant of a fuzzy membership function,
constructed using the normalized vector distance, is used. Mukherji and Rege [7] presented
a shape feature and fuzzy logic based offline handwritten character recognition system for the
language with an 86.4% recognition rate. Structural features, such as end points and junction points,
and an adaptive thinning algorithm are used to segment characters into strokes. Crisp
and fuzzy features are then extracted for each stroke of the character. Classification is
performed in two stages. Pre-classification uses a tree classifier in which characters are
classified by the presence and position of the vertical line. Final classification and
recognition use unordered stroke classification based on mean stroke features.
In [8], a handwritten Hindi vowel recognition system is presented in which vowels
are segmented into five groups using a projection approach. To extract the core character, the
header line is removed by applying horizontal projection and modifiers are removed using
vertical projection. Features are extracted using invariant moments. Holambe and
Thool [9] presented a system for the recognition of printed and handwritten Devanagari script
using support vector machine and k-nearest neighbour classifiers. Singh, Mittal
and Ghosh [10] evaluated a support vector machine with a radial basis function kernel and a
k-nearest neighbour classifier and achieved 93.8% accuracy. Two methods, the curvelet transform and
character geometry, were used for extracting features.
3. PROPOSED SYSTEM
The proposed system works in six stages: preprocessing, segmentation, normalization,
classification, feature extraction and recognition. Flow diagram is shown in Fig.2.
3.1 Preprocessing
During preprocessing, the following operations are performed on the collected data
to make it suitable for further processing:
(i) Scanning: Handwritten word samples, collected from various people, are
scanned with an optical scanner or camera to convert the data into a gray scale
image.
(ii) Noise reduction: Noise may be introduced into the image during scanning. To
reduce it, the following operations are performed:
   (a) Filtering: to reduce noise and false points, a nonlinear spatial filter, the median
   filter, is applied. A predefined mask is slid over the image and
   the value of the centre pixel is replaced by the median of the intensity values in the
   neighbourhood of that pixel [14].
   (b) Dilation: gaps in characters are filled by dilation using a
   structuring element [14].
   (c) Erosion: spurious objects are eliminated from the image by applying
   erosion.
(iii) Slant correction: the characters in a word may be inclined
upwards or downwards, which makes feature extraction difficult.
Slant correction is therefore performed using the method of [12].
(iv) Binarization: in this paper, features are extracted from binary images of
characters, so the image must be converted to binary form. Global
thresholding is applied for binarization. The method chooses a single
threshold value for the whole image and sets pixels whose
value is greater than the threshold to 1 and all others to 0.
(v) Thinning: finally, the binary image is thinned to single-pixel width by the method
presented in [11].
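As a rough illustration of the filtering and binarization steps above, the following sketch applies a 3x3 median filter and a global threshold to a small gray scale image stored as a list of rows. The function names, the decision to leave border pixels unchanged and the threshold value passed by the caller are assumptions made for this example; it is not the MATLAB implementation used in the paper.

```python
def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [img[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out


def global_threshold(img, threshold):
    """Set pixels greater than the threshold to 1 and all others to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in img]
```

For example, a single bright noise pixel surrounded by dark pixels is replaced by the dark value, because the median of the nine values in the window ignores the outlier.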
Figure 2. Flow diagram of the proposed system: Start, Preprocessing (Scanning; Noise Reduction: Filtering, Erosion, Dilation; Slant Correction; Binarization; Thinning), Segmentation, Normalization, Classification, Feature Extraction, Recognition
3.2 Segmentation
The thinned image of a word is segmented into individual characters using projection histograms as
follows:
(i) First, the horizontal histogram is taken to get the upper and lower boundaries of the
word.
(ii) Then the vertical histogram is taken to get the region of each character.
(iii) The number of regions may exceed the number of characters in
the word, for example when the vertical bar of a character is not
connected to the rest of the character. In that case, the region of the vertical bar, which has the highest
peak value, is considered to be a part of the character to its left.
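Steps (i) and (ii) can be sketched as follows for a binary image stored as a list of rows (1 = foreground). The function names and the `gap_level` parameter are hypothetical; the paper does not state how a column is judged to be empty, and the merging rule of step (iii) is not covered here.

```python
def row_bounds(img):
    """Return (top, bottom) indices of the first and last rows containing ink."""
    profile = [sum(row) for row in img]            # horizontal histogram
    inked = [i for i, count in enumerate(profile) if count > 0]
    return (inked[0], inked[-1]) if inked else None


def column_regions(img, gap_level=0):
    """Split the image into column ranges using the vertical histogram.

    A column whose pixel count is <= gap_level is treated as a gap
    between characters (gap_level is an assumption of this sketch).
    """
    profile = [sum(col) for col in zip(*img)]      # vertical histogram
    regions, start = [], None
    for c, count in enumerate(profile):
        if count > gap_level and start is None:
            start = c                              # a region begins
        elif count <= gap_level and start is not None:
            regions.append((start, c - 1))         # a region ends
            start = None
    if start is not None:
        regions.append((start, len(profile) - 1))
    return regions
```

With the default `gap_level=0`, a header line that joins the characters would leave no empty column, so a value of 1 would presumably be needed for a thinned word whose header line is still present.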
3.3 Normalization
Binary images of individual characters are normalized to a size of 9x9.
3.4 Classification
All Hindi characters are made up of three main components: the header line
(SHIROREKHA), the vertical bar, and curves. In the proposed method, we use the vertical bar
to classify characters. TABLE 1 shows the features (presence or absence, length,
position and connectedness of the vertical bar, and number of junction points) on which
the classes of characters are based. A character can belong to one class only.
Table 1: Features used for classification
Feature                                      Symbol   Values
Presence of vertical bar                     VB       P (present), NP (not present)
Position of vertical bar                     POS      M (middle), RE (right end)
Length of vertical bar                       LEN      S (20%-30% of the character width W), L (70%-80% of the character width W)
Connectedness of vertical bar to character   CON      C (connected), NC (not connected)
Number of junction points                    JP       1, 2, 3, 4, or 5
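Table 1 gives LEN as two separate ranges and does not say how a length between them is treated. One common way to grade such a feature in fuzzy logic is with membership functions. The sketch below is only an illustration of that idea: the function names and the breakpoints at 0.30, 0.50 and 0.70 are assumptions, not values taken from the paper.

```python
def short_membership(ratio):
    """Degree (0..1) to which a bar-length ratio counts as S (short)."""
    if ratio <= 0.30:
        return 1.0
    if ratio >= 0.50:
        return 0.0
    return (0.50 - ratio) / 0.20   # linear fall between 0.30 and 0.50


def long_membership(ratio):
    """Degree (0..1) to which a bar-length ratio counts as L (long)."""
    if ratio >= 0.70:
        return 1.0
    if ratio <= 0.50:
        return 0.0
    return (ratio - 0.50) / 0.20   # linear rise between 0.50 and 0.70


def length_label(ratio):
    """Pick the label with the larger membership degree."""
    return "S" if short_membership(ratio) >= long_membership(ratio) else "L"
```

A ratio inside either range of Table 1 receives full membership in the corresponding label, while a ratio in between receives a partial degree instead of being left undefined.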
A junction point is a point with 3 or more pixels in its neighbourhood. The method of
extracting these features is given in algorithms VERTICALBAR_INFO and
NUM_JUNCTIONPOINTS. A movable 3X3 mask (Fig. 3), which
shows the 8-neighbourhood of the pixel P0, is applied to the image.
P8 P1 P2
P7 P0 P3
P6 P5 P4
Figure 3: 3X3 mask
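The labelling of the mask can be written down as a table of offsets. The sketch below returns the values of P1 to P8 around a pixel of a binary image stored as a list of rows; the names `OFFSETS` and `neighbourhood` are hypothetical, and treating positions outside the image as background is an assumption of this example.

```python
# Offsets (row, column) of P1..P8 relative to P0, following Fig. 3:
#   P8 P1 P2
#   P7 P0 P3
#   P6 P5 P4
OFFSETS = {
    "P1": (-1, 0), "P2": (-1, 1), "P3": (0, 1), "P4": (1, 1),
    "P5": (1, 0), "P6": (1, -1), "P7": (0, -1), "P8": (-1, -1),
}


def neighbourhood(img, row, col):
    """Return a dict mapping P1..P8 to the pixel values around (row, col).

    Positions outside the image are reported as 0 (background).
    """
    rows, cols = len(img), len(img[0])
    result = {}
    for name, (dr, dc) in OFFSETS.items():
        r, c = row + dr, col + dc
        inside = 0 <= r < rows and 0 <= c < cols
        result[name] = img[r][c] if inside else 0
    return result
```

The number of neighbouring pixels used in the algorithms is then the sum of the returned values.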
In these algorithms, following notations are used:
CP -- current pixel
CL -- current location
COUNT_1 -- counter variable to count the number of pixels. Initial value is set to 0.
COUNT_2 -- counter variable to count the number of junction points. Initial value is set to 0.
ROW -- current row number
COL -- current column number
Algorithm VERTICALBAR_INFO
To determine the information about the vertical bar, do the following:
1. Starting from the last column of the first row, i.e. ROW == 0 and COL == 8, slide
   the mask over the binary image of the character and check:
   (i) IF the pixel is a foreground pixel THEN call it P0.
       IF the number of neighbouring pixels of P0 ≥ 3 and one of them is P5 THEN do
       the following:
       (a) Set CP = P0.
       (b) Set N = COL.
       (c) Increase COUNT_1 by 1.
   (ii) ELSE move to the next column to the left and repeat step (i) while COL ≥ 4.
2. To identify the presence of the vertical bar, check the value of COUNT_1.
   IF COUNT_1 == 1
      THEN VB is P
      ELSE VB is NP.
3. To identify the position of the vertical bar, check the value of N.
   IF N ≥ 8
      THEN POS is RE
      ELSE POS is M.
4. To identify the length and connectedness of the vertical bar to the character, check POS.
   (i) IF POS == M
       THEN do the following for as long as P5 is a foreground pixel:
       (a) Set P0 = P5 (move one pixel down).
       (b) Increase COUNT_1 by 1.
   (ii) IF COUNT_1 > 3
        THEN LEN is L
        ELSE LEN is S.
   (iii) IF POS == RE
         THEN set CP = P0 and check the following for as long as P5 is a foreground pixel:
         IF P6 OR P7 OR P8 exists
            THEN CON is C
            ELSE CON is NC.
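The sketch below is a simplified stand-in for the mask procedure above, not a transcription of it. It scans the columns from the right edge to the middle, takes the longest vertical run of foreground pixels as the bar, and derives VB, POS and LEN from it. The function name, the `min_run` parameter and the use of the longest run are assumptions of this example; connectedness (CON) is left out.

```python
def vertical_bar_info(img, min_run=2):
    """Estimate VB, POS and LEN from the longest vertical run of pixels.

    img is a binary image as a list of rows (9x9 after normalization).
    min_run is a hypothetical minimum run length for a bar to count.
    """
    rows, cols = len(img), len(img[0])
    best_col, best_len = None, 0
    # Scan columns from the right edge to the middle, as in step 1.
    for c in range(cols - 1, cols // 2 - 1, -1):
        run = 0
        for r in range(rows):
            run = run + 1 if img[r][c] else 0
            if run > best_len:
                best_col, best_len = c, run
    if best_len < min_run:
        return {"VB": "NP"}
    return {
        "VB": "P",
        # Step 3: the bar is at the right end only in the last column.
        "POS": "RE" if best_col >= cols - 1 else "M",
        # Step 4(ii): more than 3 pixels counts as a long bar.
        "LEN": "L" if best_len > 3 else "S",
    }
```

For a 9x9 image the scanned columns are 8 down to 4, which matches the range used in step 1.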
Algorithm NUM_JUNCTIONPOINTS
To determine the number of junction points, do the following:
1. Starting from the upper left corner pixel, slide the mask over the
   image from left to right.
2. Find the first foreground pixel P0.
   IF the number of neighbouring pixels of P0 ≥ 3
      THEN increase COUNT_2 by 1
      ELSE P0 = P3 (move to the next pixel on the right).
3. Repeat step 2 until the lower right pixel is reached.
4. Set JP = COUNT_2.
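A direct reading of this algorithm, using the definition of a junction point as a foreground pixel with 3 or more foreground pixels in its 8-neighbourhood, can be sketched as follows. The function names are hypothetical and positions outside the image are treated as background, which is an assumption of this example.

```python
def count_neighbours(img, row, col):
    """Count the foreground pixels in the 8-neighbourhood of (row, col)."""
    rows, cols = len(img), len(img[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue                      # skip P0 itself
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols:
                total += img[r][c]
    return total


def num_junction_points(img):
    """Scan row by row and count foreground pixels with >= 3 neighbours."""
    count = 0
    for r, row in enumerate(img):
        for c, pixel in enumerate(row):
            if pixel and count_neighbours(img, r, c) >= 3:
                count += 1
    return count
```

Under this literal rule, several adjacent pixels around a single branch can each satisfy the condition, so the count may be larger than the number of visible branches in a stroke.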
Using the above algorithms, the following fuzzy rules are formed to classify the
characters into one of the seven classes. The flow process is shown in Fig. 4.
(i) IF VB == NP THEN character belongs to class A ( )
(ii) IF VB == P AND POS == M AND LEN == L THEN character belongs to class B
( )
(iii) IF VB == P AND POS == M AND LEN == S AND JP < 2 THEN character
belongs to class C( )
(iv) IF VB == P AND POS == M AND LEN == S AND JP ≥ 2 THEN character belongs
to class D ( )
(v) IF VB == P AND POS== RE AND CON == NC THEN character belongs to class
E ( )
(vi) IF VB == P AND POS == RE AND CON == C AND JP <4 THEN character
belongs to class F( )
(vii) IF VB == P AND POS == RE AND CON == C AND JP ≥ 4 THEN character
belongs to class G( )
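Rules (i) to (vii) can be written as a single function over the symbols of Table 1. The function name and argument names are hypothetical, and returning None for a combination that no rule covers is an assumption of this sketch.

```python
def classify(vb, pos=None, length=None, con=None, jp=0):
    """Map the vertical-bar features of Table 1 to a class A..G."""
    if vb == "NP":
        return "A"                             # rule (i)
    if pos == "M":
        if length == "L":
            return "B"                         # rule (ii)
        if length == "S":
            return "C" if jp < 2 else "D"      # rules (iii), (iv)
    if pos == "RE":
        if con == "NC":
            return "E"                         # rule (v)
        if con == "C":
            return "F" if jp < 4 else "G"      # rules (vi), (vii)
    return None                                # no rule applies
```

Because the conditions are mutually exclusive, a character is assigned to at most one class.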
Figure 4. Flow process of classification
3.5 Feature Extraction
Steps for extracting features are given in the following algorithm.
Algorithm FEATURE_REC
1. Remove the header line by applying the following method:
   (i) Apply the 3X3 movable mask to the normalized image and scan the first row
       from right to left.
   (ii) IF the pixel is a foreground pixel THEN call it P0.
        IF P7 is a foreground pixel OR P0 is an end point OR P0 is a
        disconnected component
           SET P0 = 0
        ELSE move to the left pixel.
   (Legend of Fig. 4: VB = vertical bar; A = absent; POS = position of vertical bar; RE = right end; M = middle; LEN = length of vertical bar; L = large; S = small; JP = junction point; CON = connectedness of vertical bar; NC = not connected.)
   The image is scanned from right to left to avoid deleting character pixels in
   characters such as ( ), because these characters, except ( ), may be written
   in two ways: (a) the header line covers the whole character, or (b) the header line
   covers only half or a portion of the character. In the first case, this step may
   delete pixels that are common to the header line and the character, both in the characters
   mentioned above and in characters such as ( ), and may produce some
   disconnected components with a small number of pixels.
2. Delete disconnected components as follows:
   (i) Scan the second row of the image from left to right.
   (ii) Find the first foreground pixel P0.
   (iii) IF P3 == 1
            IF P3 has no other foreground pixel in its 8-neighbourhood
               THEN SET P0 = 0 AND P3 = 0
         ELSE IF P5 == 1
            IF P5 has no other foreground pixel in its 8-neighbourhood
               THEN SET P0 = 0 AND P5 = 0
Fig. 5 shows the process of deleting header line from character and its result.
Figure 5: (a) Character with header line, (b) character without header line and
with a disconnected component, (c) character after removing the disconnected component
3. Apply the 3X3 movable mask to the normalized image of the classified character and
   scan the image row by row from top to bottom. Collect the following information for
   junction points and end points:
   (i) N1: total number of junction points
   (ii) N2: total number of end points
   (iii) JPi: i-th junction point, where i = 1 to N1
   (iv) EPi: i-th end point, where i = 1 to N2
   (v) Curve(JPi): curve on the i-th junction point (Table 2)
   (vi) Curve(EPi): curve on the i-th end point
   (vii) Line(JPi): line on the i-th junction point (Table 2)
   (viii) Line(EPi): line on the i-th end point
   (ix) Loop(JPi): loop on the i-th junction point
   (x) D1(i): direction of the next end point from the i-th end point
   (xi) D2(i): direction of the next junction point from the i-th junction point
Values and symbols of the different types of curves, lines and loops are given in
TABLE 2.
Table 2: Values and symbols for curves, lines and loop
Feature   Value               Symbol
Curve     Left curve          LC
          Upper left curve    ULC
          Lower left curve    LLC
          Right curve         RC
          Upper right curve   URC
          Lower right curve   LRC
          U curve             U
Line      Vertical line       VL
          Horizontal line     HL
          Back slash          BS
Loop      Present             P
          Not present         NP
Different forms of the above curves, lines and loops are shown in Fig. 6.
In the following algorithm, these notations are used:
PS -- starting point
CL -- current location
CP -- current pixel
COUNT -- counter variable. Initial value is set to 0.
Algorithm CURVE_LINE_LOOP_INFO
To determine the nature of the curve, do the following:
Slide the mask over the binary image of the classified character row by row from bottom to top.
Let P be the first foreground pixel. Call it the current pixel (CP).
1. If CP is a junction point or end point, then check the 8-neighbourhood of CP.
   (a) IF P1 is true THEN
          (i) Repeat till P1 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (b) IF P3 is true THEN
          (i) Repeat till P3 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (c) IF P8 is true THEN
          (i) Repeat till P8 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (d) IF P1 OR P2 is true THEN
          (i) Repeat till P1 OR P2 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (e) IF P1 OR P8 is true THEN
          (i) Repeat till P1 OR P8 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (f) IF P2 OR P3 OR P4 is true THEN
          (i) Repeat till P2 OR P3 OR P4 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (g) IF P4 OR P5 is true THEN
          (i) Repeat till P4 OR P5 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
   (h) IF P6 OR P7 OR P8 is true THEN
          (i) Repeat till P6 OR P7 OR P8 is encountered
          (ii) Increase COUNT by 1.
       ELSE stop.
2. Check the following to know the type of curve and line:
   (i) IF step 1(h) is true
          IF step 1(a) is true
             IF step 1(f) is true
                IF COUNT ≥ 3
                   THEN Curve is LC
   (ii) ELSE IF step 1(e) is true
           IF step 1(f) is true
              IF COUNT ≥ 2
                 THEN Curve is ULC
   (iii) ELSE IF step 1(h) is true
            IF step 1(e) is true
               IF COUNT ≥ 2
                  THEN Curve is LLC
   (iv) ELSE IF step 1(f) is true
           IF step 1(a) is true
              IF step 1(h) is true
                 IF COUNT ≥ 3
                    THEN Curve is RC
   (v) ELSE IF step 1(d) is true
          IF step 1(h) is true
             IF COUNT ≥ 2
                THEN Curve is URC
   (vi) ELSE IF step 1(f) is true
           IF step 1(e) is true
              IF COUNT ≥ 2
                 THEN Curve is LRC
   (vii) ELSE IF step 1(g) is true
            IF step 1(h) OR step 1(f) is true
               IF step 1(d) is true
                  IF COUNT ≥ 3
                     THEN Curve is U
   (viii) IF step 1(a) is true
             IF COUNT ≥ 2
                THEN Line is VL
   (ix) IF step 1(b) is true
           IF COUNT ≥ 2
              THEN Line is HL
   (x) IF step 1(c) is true
          IF COUNT ≥ 2
             THEN Line is BS
3. If CP is a junction point, then do the following to check the presence of a loop:
   IF step 1(h) is true
      IF step 1(a) OR step 1(g) is true
         IF step 1(f) is true
            IF Pi == CP AND COUNT ≥ 5
               THEN Loop is P.
Figure 6: Different types of curves, lines and loops: (a) Left curve (LC), (b) Upper left curve (ULC), (c) Lower
left curve (LLC), (d) Right curve (RC), (e) Upper right curve (URC), (f) Lower right curve (LRC), (g)
U curve (U), (h) Vertical line (VL), Horizontal line (HL), Backward slash (BS), (i) Loop
3.6 Recognition
Fuzzy rules are used for recognition. Class wise rules applied for characters are:
1. IF Class is A
      IF Curve (EP1) == RC
         THEN character is ( )
      ELSE IF Curve (JP1) == LRC
         IF N2 == 4 OR D1 (3) == P3
            THEN character is ( )
            ELSE character is ( )
2. IF Class is B
      IF Curve (EP2) == LC
         THEN character is ( )
      ELSE IF Curve (EP2) == URC
         IF Curve (JP1) == LC OR Loop (JP1) == P
            THEN character is ( )
            ELSE character is ( )
3. IF Class is C
      IF Curve (EP1) == LC
         THEN character is ( )
      ELSE IF Curve (EP1) == RC
         IF N2 == 3
            THEN character is ( )
            ELSE character is ( )
4. IF Class is D
      IF Curve (EP1) == LC
         THEN character is ( )
      ELSE IF Curve (JP1) == LC
         IF N2 < 2
            THEN character is ( )
         ELSE IF N2 == 2
            THEN character is ( )
         ELSE character is ( )
      ELSE IF Loop (JP1) == P
         IF Curve (JP1) == RC OR URC
            THEN character is ( )
            ELSE character is ( )
5. IF Class is E
      IF Loop (JP1) == P
         IF N1 == 2
            THEN character is ( )
            ELSE character is ( )
      ELSE IF Curve (EP1) == U
         THEN character is ( )
6. IF Class is F
      IF N2 > 3
         IF Curve (EP1) == ULC
            THEN character is ( )
         ELSE IF Curve (EP1) == RC OR Curve (EP2) == RC
            THEN character is ( )
         ELSE character is ( )
      ELSE IF N2 == 3
         IF Curve (JP1) == LLC
            THEN character is ( )
         ELSE IF Curve (JP1) == U
            THEN character is ( )
         ELSE IF Curve (EP1) == ULC
            THEN character is ( )
         ELSE character is ( )
      ELSE
         IF Curve (JP1) == U
            THEN character is ( )
         ELSE IF Curve (JP1) == LLC
            THEN character is ( )
         ELSE IF Curve (JP1) == LC OR Loop (JP1) == P
            THEN character is ( )
         ELSE character is ( )
7. IF Class is G
      IF N2 > 4
         IF Curve (EP1) == RC
            THEN character is ( )
         ELSE IF Line (EP1) == BS
            IF D2 (1) == P3 OR D2 (2) == P3
               THEN character is ( )
               ELSE character is ( )
      ELSE IF N2 == 4
         IF Loop (JP1) == P
            THEN character is ( )
            ELSE character is ( )
      ELSE
         IF Curve (JP1) == LLC OR U
            THEN character is ( )
         ELSE IF Curve (JP1) == LC
            THEN character is ( )
         ELSE IF Loop (JP1) == P
            IF Loop (JP3) == P OR Line (EP2) == HL
               THEN character is ( )
               ELSE character is ( )
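To show how one of these rule sets could be coded, the sketch below implements the class C rules only. The function name is hypothetical, and the labels "C-1", "C-2" and "C-3" are placeholders for the three characters of the class, whose glyphs are not reproduced in this text.

```python
def recognise_class_c(curve_ep1, n2):
    """Apply the class C rules to Curve(EP1) and the end-point count N2.

    Returns a placeholder label, or None when no rule applies.
    """
    if curve_ep1 == "LC":
        return "C-1"                       # Curve(EP1) == LC
    if curve_ep1 == "RC":
        # N2 == 3 separates the two remaining characters of the class.
        return "C-2" if n2 == 3 else "C-3"
    return None
```

The other classes follow the same pattern, with each nested IF in the rules becoming one branch of the function.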
Table 3: Summary of fuzzy rules for each character
Class  N1   N2   Curve(JP)   Curve(EP)  Line(JP)  Line(EP)  Loop(JP)  D1   D2   D3   Character
A      ---  ---  ---         RC         ---       ---       ---       ---  ---  ---  ( )
       ---  ---  ---         ---        LRC       ---       ---       ---  ---  ---  ( )
       ---  4    ---         ---        LRC       ---       ---       P3   ---  ---  ( )
B      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---  ( )
       ---  ---  ---         URC        ---       ---       ---       ---  ---  ---  ( )
       ---  ---  LC          URC        ---       ---       P         ---  ---  ---  ( )
C      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---  ( )
       ---  ---  ---         RC         ---       ---       ---       ---  ---  ---  ( )
       ---  3    ---         RC         ---       ---       ---       ---  ---  ---  ( )
D      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---  ( )
       ---  <2   LC          ---        ---       ---       ---       ---  ---  ---  ( )
       ---  2    LC          ---        ---       ---       ---       ---  ---  ---  ( )
       ---  ---  LC          ---        ---       ---       ---       ---  ---  ---  ( )
       ---  ---  ---         ---        ---       ---       P         ---  ---  ---  ( )
       ---  ---  RC OR URC   ---        ---       ---       P         ---  ---  ---  ( )
E      2    ---  ---         ---        ---       ---       P         ---  ---  ---  ( )
       ---  ---  ---         ---        ---       ---       P         ---  ---  ---  ( )
       ---  ---  ---         U          ---       ---       ---       ---  ---  ---  ( )
F      ---  >3   ---         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  >3   ---         ULC        ---       ---       ---       ---  ---  ---  ( )
       ---  >3   ---         RC         ---       ---       ---       ---  ---  ---  ( )
       ---  3    ---         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  3    LLC         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  3    U           ---        ---       ---       ---       ---  ---  ---  ( )
       ---  3    ---         ULC        ---       ---       ---       ---  ---  ---  ( )
       ---  <3   ---         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  <3   U           ---        ---       ---       ---       ---  ---  ---  ( )
       ---  <3   LLC         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  <3   LC          ---        ---       ---       P         ---  ---  ---  ( )
G      ---  >4   ---         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  >4   ---         RC         ---       ---       ---       ---  ---  ---  ( )
       ---  >4   ---         ---        ---       BS        ---       P3   P3   ---  ( )
       ---  4    ---         ---        ---       ---       ---       ---  ---  ---  ( )
       ---  4    ---         ---        ---       ---       P         ---  ---  ---  ( )
       ---  <4   LLC OR U    ---        ---       ---       ---       ---  ---  ---  ( )
       ---  <4   LC          ---        ---       ---       ---       ---  ---  ---  ( )
       ---  <4   ---         ---        ---       HL        P         ---  ---  ---  ( )
       ---  <4   ---         ---        ---       ---       P         ---  ---  ---  ( )
4. EXPERIMENTAL RESULTS
The dataset was created by collecting handwritten word samples from 30 people of
different age groups. Each person was asked to write 15 predecided words, giving 450
word samples in total. A part of the dataset is shown in Fig. 7.
Figure 7: Word samples taken for experiment
These word samples were scanned using a flat-bed scanner at 300 dpi. Results of the operations
performed during the recognition process on the scanned image of a word are shown in
Fig. 8.
Figure 8. Results of the operations performed during preprocessing, segmentation
and classification on a sample word (panels: original image, filtered image, eroded and dilated image, binarized image, thinned image, segmented image, classification; the three segmented characters satisfy VB == NP; VB == P, POS == RE, CON == C, JP ≥ 4; and VB == P, POS == RE, CON == C, JP < 4, which places them in classes A, G and F respectively)
After classification, the features listed in TABLE 2 are extracted for each character by
applying algorithm FEATURE_REC, and are then used for recognition.
The recognition rate for each word sample and for the proposed method overall is given in TABLE 4.
Table 4. Average recognition rate of selected words
Sample word   Recognition rate   Recognition rate   Recognition rate   Avg. recognition
              of character 1     of character 2     of character 3     rate
S1            92.15%             94.08%             88.23%             91.48%
S2            94%                90.11%             87.23%             90.44%
S3            90.93%             97.26%             95.06%             94.41%
S4            94.14%             90.17%             90%                91.43%
S5            83.66%             93.96%             92.07%             89.89%
S6            95%                93.48%             84.36%             90.94%
S7            95.22%             92.01%             89.76%             92.33%
S8            96.31%             92.45%             91.19%             93.31%
S9            88.42%             92.31%             94.21%             91.64%
S10           89.75%             83.52%             93.46%             88.91%
S11           90.68%             88.99%             ---                89.83%
S12           96.29%             94.43%             ---                95.36%
S13           88.57%             93.91%             ---                91.24%
S14           96.81%             97.44%             ---                97.12%
S15           87.41%             96.80%             ---                92.10%
Overall Average Recognition Rate                                       92.02%
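The averages in Table 4 can be reproduced from the per-character rates. The sketch below stores each rate in hundredths of a percent so that the arithmetic stays in integers. It assumes that averages are truncated, not rounded, to two decimal places, which is what the tabulated values suggest; the names used are hypothetical.

```python
# Per-character recognition rates from Table 4, in hundredths of a percent.
RATES = {
    "S1": (9215, 9408, 8823), "S2": (9400, 9011, 8723),
    "S3": (9093, 9726, 9506), "S4": (9414, 9017, 9000),
    "S5": (8366, 9396, 9207), "S6": (9500, 9348, 8436),
    "S7": (9522, 9201, 8976), "S8": (9631, 9245, 9119),
    "S9": (8842, 9231, 9421), "S10": (8975, 8352, 9346),
    "S11": (9068, 8899), "S12": (9629, 9443), "S13": (8857, 9391),
    "S14": (9681, 9744), "S15": (8741, 9680),
}


def truncated_mean(values):
    """Integer mean, truncated (assumed to match the table's rounding)."""
    return sum(values) // len(values)


def as_percent(hundredths):
    """Format a value in hundredths of a percent, e.g. 9202 -> '92.02%'."""
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# Average per word, then the average of those word averages.
WORD_AVERAGES = {word: truncated_mean(r) for word, r in RATES.items()}
OVERALL = truncated_mean(list(WORD_AVERAGES.values()))
```

Here `as_percent(OVERALL)` gives the overall figure as the mean of the 15 word averages, so two-letter and three-letter words carry equal weight.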
Figure 9. Graphical representation of recognition rate of sample words (horizontal axis: sample words S1 to S15; vertical axis: recognition rate, 84 to 98)
5. CONCLUSION
In this paper, we have presented a novel method for classification and recognition of
simple Hindi language two or three letter words without modifiers using a fuzzy rule based
approach. Characters are first classified into seven different classes and then recognized class
wise. A few misclassification cases arise due to the presence of some similar-shaped
characters, such as ( ) & ( ) and ( ) & ( ), and of characters which can be written in more than one
way, such as ( ) & ( ). We have extracted features for all the basic characters of the language
for the recognition process. The algorithms developed perform well and give fine results as the most
prominent features, such as vertical bar, curves, loops and lines, are used at the classification and
recognition stages. Experimental results verify the significance of the proposed system with a
92.02% recognition rate. Fuzzy logic performs better than other methods as it can deal with
imprecise, incomplete and vague data efficiently without losing any important information. In
future, we will work to achieve better results and to improve the recognition rate by
emphasizing more on characters having similar shape, such as ( ), and on Hindi words with
modifiers.
REFERENCES
Journal Papers:
[1] N. Arica and F.T. Yarman-Vural, An overview of character recognition focused on
off line hand writing, C99-06-C-203, 2000, IEEE.
[2] I.K. Sethi and B. Chatterjee, Machine recognition of constrained hand printed
Devnagari, Pattern Recognition, vol. 9, no. 2, 1977, pp. 69-75.
[3] R.M.K. Sinha and H. Mahabala, Machine recognition of Devnagari script, IEEE
Trans. System, Man Cybern. 9, 1979, 435-441.
[4] S. Palit, B.B. Chaudhuri, P.P. Das, B.N. Chatterjee, Pattern Recognition, Image
Processing and Computer Vision, Narosa Publishing House, India, 1995, 163-168.
[5] R. Plamondon and S.N. Srihari, On-line and off-line handwriting recognition: A
comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell., vol. 22(1), 2000,
pp. 63-84.
[6] M. Hanmandlu, O.V.R. Murthy and V.K. Madasu, Fuzzy model based recognition of
handwritten Hindi characters, 0-7695-3067-2/07, 2007, IEEE.
[7] P. Mukherji and P.P. Rege, Shape Feature and Fuzzy Logic Based Offline Devnagari
Handwritten Optical Character Recognition, Journal of Pattern Recognition Research
4, 2009, 52-68.
[8] R.J. Ramteke, Invariant moments based feature extraction for handwritten Devnagari
vowel recognition, IJCA (0975-8887), Vol. 1, No. 18, 2010.
[9] A.N. Holambe and R.C. Thool, Printed and handwritten character & number recognition
of Devanagari script using SVM and KNN, Int. Journal of Recent Trends in
Engineering and Technology, Vol. 3, No. 2, May 2010.
[10] B. Singh, A. Mittal and D. Ghosh, An evaluation of different feature extractors and
Classifiers for offline handwritten Devnagari character recognition, Journal of Pattern
Recognition Research 2, 2011, 269-277.
[11] A. Pokhriyal and S. Lehri, MERIT: Minutiae Extraction Using Rotation Invariant
Thinning, International Journal of Engineering Science & Technology, vol. 2(7),
2010, 3225-3235.
[12] Primekumar K.P and Sumam Mary Idicula, Performance Of On-Line Malayalam
Handwritten character Recognition Using HMM and SFAM, International journal of
Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 115-125,
Published by IAEME.
Proceeding Papers:
[12] P. Mukherji, P.P. Rege and L.K. Pradhan, Analytical Verification System for
Handwritten Devnagari Script, Proceedings of the Sixth IASTED VIIP, pp. 237-242,
Palma De Mallorca, Spain, August 2006.
Books:
[13] S.N. Sivanandam and S.N. Deepa, Principles of Soft Computing (Second Edition,
Wiley-India).
[14] R.C. Gonzalez and R.E. Woods, Digital Image Processing (Second Edition, Prentice
Hall).