
FRINGE MAP BASED CHARACTER SEGMENTATION

R.SINDHU1

P.HEMA KUMAR2

1 P.G. Student, Department of Electronics and Communication Engineering, Swarnandhra Institute of Engineering and Technology, Narasapur

2 Assistant Professor, Department of Electronics and Communication Engineering, Swarnandhra Institute of Engineering and Technology, Narasapur

1 [email protected]

2 [email protected]

Abstract

Accurate segmentation of text lines from printed or handwritten documents is an important task in any document processing system. It becomes a challenging and complex problem for several reasons. Situations arise where text from neighboring lines overlaps the white space area or touches text of the current line. Complications may also arise when, due to varying skew, text lines curve along the page in varied trajectories. These situations are beyond the scope of common algorithms developed for printed documents. We propose an approach based on fringe maps to generate segmenting paths between adjacent text lines.

1. Introduction

OCR stands for Optical Character Recognition. It is a widespread technology for recognizing text inside images, such as scanned documents and photos. OCR technology is used to convert virtually any kind of image containing written text (typed, handwritten or printed) into machine-readable text data.

OCR technology became popular in the early 1990s through attempts to digitize historic newspapers. Since then the technology has undergone several improvements. Nowadays, solutions deliver near-perfect OCR accuracy. Advanced methods like Zonal OCR are used to automate complex document-based workflows.

Probably the most well-known use case for OCR is converting printed paper documents into machine-readable text documents. Once a scanned paper document has gone through OCR processing, the text of the document can be edited with word processors like Microsoft Word or Google Docs. Before OCR technology was available, the only option for digitizing printed paper documents was to retype the text manually.

International Journal of Research

Volume VIII, Issue VI, JUNE/2019

ISSN NO:2236-6124

Page No:2295


Not only was this massively time-consuming, it also introduced inaccuracies and typing errors.

2.1 OPTICAL CHARACTER RECOGNITION (OCR):

Optical Character Recognition (OCR) is a piece of software that converts printed text and images into a digitized form that can be manipulated by a machine. Unlike the human brain, which can very easily recognize the text or characters in an image, machines are not intelligent enough to perceive the information available in an image. Therefore, a large number of research efforts have attempted to transform a document image into a format understandable by a machine. OCR is a complex problem because of the variety of languages, fonts and styles in which text can be written, and the complex rules of languages.

2.2 GENERATIONS OF OCR:

The generations of OCR are

First generation OCR systems

Second generation OCR systems

Third generation OCR systems

Fourth generation OCR systems (Today’s generation).

2.3 Process of OCR:

The process of OCR is a composite activity comprising different phases. These phases are shown in Figure 2.3.1.


Figure 2.3.1: Preprocessing of OCR

2.4 SEGMENTATION:

Segmentation is an integral part of any text based recognition system. The segmentation phase basically includes three steps: line segmentation, word segmentation and character segmentation.

Methods | Disadvantages

Histogram approach | It causes some loss in the text line area.

Header line and base line approach | Header lines are rows with the maximum number of black pixels and base lines are rows with the minimum number of black pixels.

Hough transform approach | It requires a relatively large amount of memory and a long computation time.

Smearing method | The consecutive black pixels along the horizontal direction are smeared.

Grouping approach | It does not work when most of the edge pixels are isolated.

Gradient based approach | Gradient magnitude and orientation of each pixel are explicitly used to group the pixels.

2.5 METHODS OF CHARACTER SEGMENTATION

1. Water reservoir method

2. Split profile method

3. Segmentation model

4. Line and Zone Separation

5. Syllable segmentation

6. Water drop fall algorithm

6.1 Types of Drop fall Algorithm

6.2 Top-Left Drop Fall

6.3 Top-Right Drop Fall

6.4 Bottom-Left Drop Fall

6.5 Bottom-Right Drop Fall

2.6 FRINGE MAP METHOD:

To follow this method, it is necessary to understand the concept of fringe maps. These are related to distance transforms [18] for binary images. In a fringe map, each pixel is represented by a fringe number. Every black pixel has a fringe number of zero. White pixels have a fringe number that is a positive integer, namely the distance from the nearest black pixel using an L2 metric. In other words, a white pixel with fringe number x states that:

1) It is x pixels away from its nearest black pixel.

2) It is surrounded by at least (x-1) white pixels in all directions.

3. OPTICAL CHARACTER RECOGNITION (OCR)

3.1 PROJECTION PROFILE METHOD:

Projection profile method is of two types

1. Horizontal projection profile method

2. Vertical projection profile method.

3.2 Horizontal projection profile method:

The horizontal projection profile method calculates the sum of all white pixels in every row. The resulting histogram of the image is used for line segmentation.

3.3 Vertical projection profile method:

The vertical projection profile method calculates the sum of all white pixels in every column and plots the histogram of the computed counts. It is used for word and character segmentation.
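The two profiles can be illustrated with a minimal sketch. It assumes the page has already been binarized into a 0/1 mask in which 1 marks the pixels being counted (the text pixels); the function names `projection_profiles` and `runs_of_text` are hypothetical and not taken from the paper.

```python
import numpy as np

def projection_profiles(text_mask):
    """Return (horizontal, vertical) projection profiles of a 2-D 0/1 mask."""
    mask = np.asarray(text_mask, dtype=int)
    horizontal = mask.sum(axis=1)  # one count per row
    vertical = mask.sum(axis=0)    # one count per column
    return horizontal, vertical

def runs_of_text(profile):
    """Return (start, end) pairs, end exclusive, where the profile is non-zero."""
    runs, start = [], None
    for idx, count in enumerate(profile):
        if count > 0 and start is None:
            start = idx                      # a run of text begins
        elif count == 0 and start is not None:
            runs.append((start, idx))        # the run ends at an empty row/column
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

# Toy page with two text lines separated by an empty row.
page = np.array([
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])
h, v = projection_profiles(page)
line_runs = runs_of_text(h)                    # rows occupied by each text line
first_line = page[line_runs[0][0]:line_runs[0][1]]
word_runs = runs_of_text(projection_profiles(first_line)[1])
```

Splitting only where the profile drops to zero is the simplest possible rule. It is exactly the rule that fails when adjacent lines touch or overlap, because the profile then never reaches zero between them.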

PROPOSED METHOD

Text line segmentation in printed or handwritten document images is one of the crucial tasks in document image analysis (DIA). The performance of a DIA system depends critically on the accuracy of text line segmentation. Errors at this stage cannot easily be overcome or undone by the following stages that perform character recognition or post processing. Text line segmentation is usually seen as a simple or solved problem in the ideal situation where documents contain distinct white space between lines. However, for historical documents or Indic scripts this is not always true. The main challenges encountered in such nonconventional text line segmentation arise when:

1. Adjacent text lines are very close or touch each other.

2. The angle between text lines varies.

3. Text lines are curvilinear.

4. The skew directions vary.


Figure 3.3.1: An example of documents with touching, overlapping

and skewed text lines

Fig 3.3.1 shows examples of documents with touching, overlapping and skewed text. Touching and overlapping of text lines occurs in printed or handwritten documents because of narrow regions of white space between adjacent text lines. Most Indian scripts such as Telugu, Kannada, Tamil, Bangla and Malayalam have documents that suffer from overlapping and touching lines due to vowel modifiers and consonant modifiers. An example is shown in Fig 3.3.2 for Telugu text, where these are called Matras and Vottus respectively.

The appearance of skewed lines in the text image makes the problem more complex. The problem is compounded if the lines in a text image are skewed with different orientations. Such lines are called multi-skewed lines.


Figure 3.3.2: Telugu script showing the consonant modifiers and vowel modifiers of a character.

Amongst the traditional text line segmentation algorithms, the projection profiles method [18, 20, 25] is suitable for clearly separated lines, but it cannot handle text lines that overlap, touch or have different skew angles. Connected component based methods [18, 24] and the smearing method [18, 25] also do not perform well on these documents.

The text line segmentation method proposed in this paper is based on fringe maps [19]. Its objective is to find a segmenting path between two adjacent lines to separate them. Our motivation is to let the white spaces guide the segmentation, much as a human reader follows the gaps between the lines of text.

FRINGE MAP METHOD:

To follow this method, it is necessary to understand the concept of fringe maps. These are related to distance transforms [27] for binary images. In a fringe map [18, 26], each pixel is represented by a fringe number. Every black pixel has a fringe number of zero. White pixels have a fringe number that is a positive integer, namely the distance from the nearest black pixel using an L2 metric. In other words, a white pixel with fringe number x states that:

1) It is x pixels away from its nearest black pixel.

2) It is surrounded by at least (x-1) white pixels in all directions.

For us the second point is also important. We assume the inputs to be normal binary images of text documents in which the printing or writing is dark and the background is light. Black pixels (the writing) are represented as '0' and white pixels as '-1'. To generate a fringe map for the input binary image, we start by examining each neighbor of each black pixel and write a one in each neighbor that is a white pixel (value -1). Horizontal, vertical and diagonal neighbors are examined. Having done this, we examine each neighbor of each pixel with a one in it and write twos in the neighbors that are -1. We continue growing fringes until there are no more pixels with value -1. An example fringe map is shown in Figure 3.3.3. The concept of fringe distances was originally proposed by Brown [19] to recognize characters in images. Here we have a novel application, since we use the fringe maps to segment text lines.
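The growth procedure described above can be sketched as a breadth-first expansion from all black pixels at once. This is an illustrative sketch and not the authors' implementation: the function name `fringe_map` and the input convention (0 for black, any non-zero value for white) are assumptions. Note that growing through all eight neighbours yields the chessboard distance, which can differ from a strict L2 (Euclidean) distance.

```python
from collections import deque
import numpy as np

def fringe_map(binary):
    """Grow fringes outward from black pixels.

    binary: 2-D array, 0 = black (writing), non-zero = white (background).
    Returns an int array with 0 on black pixels and the fringe number on
    white pixels. Pixels stay -1 only if the image has no black pixel.
    """
    img = np.asarray(binary)
    rows, cols = img.shape
    fringe = np.where(img == 0, 0, -1)  # white pixels start as -1, as in the text
    queue = deque((int(r), int(c)) for r, c in np.argwhere(fringe == 0))
    neighbours = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                  if (dr, dc) != (0, 0)]  # horizontal, vertical and diagonal
    while queue:
        r, c = queue.popleft()
        for dr, dc in neighbours:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and fringe[nr, nc] == -1:
                fringe[nr, nc] = fringe[r, c] + 1  # next fringe level
                queue.append((nr, nc))
    return fringe

# A single black pixel in the middle of a 5 x 5 white image.
image = np.ones((5, 5), dtype=int)
image[2, 2] = 0
fm = fringe_map(image)
```

Because every black pixel is placed in the queue before any white pixel is visited, each white pixel receives its value from the nearest black pixel, which matches the "ones, then twos, and so on" description.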

Figure 3.3.3: An example fringe map of a Telugu image and the Peak Fringe Number (in circle) between two consecutive zeros in the vertical direction.

4.1 IMPLEMENTATION:

The proposed algorithm for text line segmentation of printed or

handwritten document images deals with the following challenges:

(1) Components of neighboring text lines may touch or overlap as shown in Fig

(2) Text lines that appear in the document may not have a simple linear

skew and could be multiskew.

The proposed algorithm segments lines in three stages, as shown in the block diagram. The first stage generates a fringe map for the given input binary image. In the second stage, Peak Fringe Numbers (PFNs) are located in the fringe map, a filtering operation is performed on them, and the PFNs between text lines are determined. In the last stage, a segmenting path between lines is generated by joining the PFNs.

It is important to note that each background pixel in the fringe map is represented by a fringe number that is a positive integer, while black pixels (object pixels) have a fringe number of zero. In the next step, we find the Peak Fringe Number (PFN) between two zeros (black pixels) in the vertical direction in the fringe map. A PFN with value x is surrounded by at least (x-1) white pixels. PFNs are white pixels that may lie between the lines or in the white space inside a character. We are interested in determining the PFNs between the lines and need to reject the latter. In the last step, we generate a segmenting path between two adjacent lines by joining PFNs. With this overview of the processing, we now give more details.

Figure 4.1.1: Proposed text line segmentation method block diagram.

4.2 Generating Fringe Map:

A fringe map is generated for the given input binary image. In a fringe map, each pixel is represented by a fringe number. A pixel with fringe number x is surrounded by at least (x-1) white pixels in all directions. We use the maximum fringe numbers to find the white pixels between lines.

Definition of PFN:

In a fringe map, the Peak Fringe Number (PFN) is defined as the white pixel with the maximum fringe value between two consecutive black pixels in the direction of interest. An example PFN is shown in Fig 4.4, where the direction of interest is the vertical direction.
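A minimal sketch of this definition for the vertical direction follows. The function name `peak_fringe_numbers` is hypothetical, and two choices are assumptions that the text does not specify: white runs that touch the top or bottom image border are ignored because they are not bounded by two black pixels, and on a tie the first maximum in the run is taken.

```python
import numpy as np

def peak_fringe_numbers(fringe):
    """Return a list of (row, col, value) PFNs, scanning each column top to bottom."""
    fringe = np.asarray(fringe)
    pfns = []
    for col in range(fringe.shape[1]):
        column = fringe[:, col]
        black_rows = np.flatnonzero(column == 0)
        # Look at every pair of vertically consecutive black pixels.
        for top, bottom in zip(black_rows[:-1], black_rows[1:]):
            if bottom - top > 1:                  # at least one white pixel between
                gap = column[top + 1:bottom]
                offset = int(np.argmax(gap))      # first maximum on ties (assumption)
                pfns.append((int(top) + 1 + offset, col, int(gap[offset])))
    return pfns

# Toy fringe map: a one-pixel gap (rows 0 to 2) and a wider gap (rows 2 to 6).
toy = np.array([
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [2, 2, 2],
    [1, 1, 1],
    [0, 0, 0],
])
pfns = peak_fringe_numbers(toy)
```

In the toy map the narrow gap produces PFNs of value 1 and the wide gap produces PFNs of value 2, which mirrors the difference between white space inside a character and white space between lines.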

Locate and Determine the PFNs between lines:

In the following we assume that the default direction for PFNs is vertical. We scan the fringe map along columns and locate the PFNs. The PFNs may be present inside a connected component or outside the connected components (between the lines), as shown in Figure 5. We are interested only in the PFNs outside the connected components, through which we can find a segmenting path between lines. Generally, we observe that the values of PFNs inside the connected components are less than the values of PFNs outside them. We can therefore use this observation to separate the PFNs and pick out those that we need.

A simple threshold based method can be used to separate the required PFNs. Here we use the arithmetic mean of all the PFNs as the threshold T. We observe that, as expected, the PFNs greater than or equal to T are usually found between the text lines. The PFNs that pass this filter are called filtered peak fringe numbers (FPFNs). Fig 4.5 shows these FPFNs, whose fringe number is greater than or equal to T. FPFNs play an important role in generating the segmenting path between lines.
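The mean-threshold filter can be sketched in a few lines. The representation of a PFN as a `(row, col, value)` tuple and the function name `filter_pfns` are assumptions made for illustration only.

```python
def filter_pfns(pfns):
    """Keep the PFNs whose value is at least the arithmetic mean T.

    pfns: list of (row, col, value) tuples.
    Returns (T, fpfns); T is 0.0 when the list is empty.
    """
    if not pfns:
        return 0.0, []
    threshold = sum(value for _, _, value in pfns) / len(pfns)
    fpfns = [p for p in pfns if p[2] >= threshold]
    return threshold, fpfns

# Two small PFNs (inside characters) and two larger ones (between lines).
sample = [(1, 0, 1), (4, 0, 2), (1, 1, 1), (4, 1, 2)]
T, fpfns = filter_pfns(sample)
```

Here T is 1.5, so only the two PFNs of value 2 survive as FPFNs.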

4.3 Text Line Segment Path Generation:

Now a text line segmenting path must be found that separates adjacent text lines and is generated by joining the FPFNs. It is quite possible that the filtering process leaves gaps, so that the FPFNs cannot be joined easily. We therefore analyze the FPFNs further and find a value M1 that is greater than or equal to the arithmetic mean of the FPFNs.

Figure 4.3.1: Regions and filtered PFNs

Tentative segmentation paths can then be made by joining all those FPFNs that are greater than or equal to M1. However, it is not so simple, since discontinuities may arise. The following procedure explains the search for these FPFNs and how they are joined to a tentative segmentation path. We scan the fringe map along columns from left to right, search for FPFNs with fringe value greater than or equal to M1, and build a partial segment path. The end points of this partial segment path are called Marked Segment Points (MSPs). Consider a point at position (i, j) in the current column j, up to which a partial path has already been constructed. Beyond this point we may not immediately find any points that could be added to the path straight away. However, there may be another FPFN in the neighborhood where the path can continue.

The issue is to find a local connection between these points. The points that need to be joined to the path are found by searching with relaxed constraints on the fringe values, according to the two cases described below. Consider the point p at (i, j). We search for the FPFN nearest to it in a square window of size N × N, where N = 2 × M1, with boundaries: Top as (i - M1), Bottom as (i + M1), Left as (j + 1) and Right as (j + 2 × M1).
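The window search can be sketched as follows. The text does not say how "nearest" is measured or what happens at the image border, so this sketch assumes squared Euclidean distance, clips the window to the image, and treats M1 as an integer. The names `forward_window` and `nearest_fpfn` are hypothetical.

```python
def forward_window(i, j, m1, n_rows, n_cols):
    """Inclusive bounds (top, bottom, left, right) of the forward search window."""
    top = max(i - m1, 0)                  # Top as (i - M1), clipped to the image
    bottom = min(i + m1, n_rows - 1)      # Bottom as (i + M1)
    left = j + 1                          # Left as (j + 1)
    right = min(j + 2 * m1, n_cols - 1)   # Right as (j + 2 x M1)
    return top, bottom, left, right

def nearest_fpfn(i, j, m1, fpfn_positions, n_rows, n_cols):
    """Return the (row, col) of the FPFN closest to (i, j) in the window, or None."""
    top, bottom, left, right = forward_window(i, j, m1, n_rows, n_cols)
    candidates = [(r, c) for r, c in fpfn_positions
                  if top <= r <= bottom and left <= c <= right]
    if not candidates:
        return None                       # corresponds to Case 2 in the text
    # Squared Euclidean distance is an assumption; the paper only says "nearest".
    return min(candidates, key=lambda p: (p[0] - i) ** 2 + (p[1] - j) ** 2)

positions = [(9, 7), (14, 6), (10, 12), (11, 9)]
window = forward_window(10, 5, 3, 20, 30)
hit = nearest_fpfn(10, 5, 3, positions, 20, 30)
miss = nearest_fpfn(10, 5, 3, [(14, 6)], 20, 30)
```

A returned position leads to Case 1 and a result of `None` leads to Case 2.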

Now p is an FPFN greater than or equal to M1, and we set this FPFN as an MSP. For column j+1 onwards, we add points to the segmenting path, called segment points, according to the following cases:

Case 1: An FPFN is found at (m, n). We set this FPFN point as an MSP and then set the local maximum fringe number as a segment point in each column of the window from the (j+1)th column up to the (n-1)th column.

Case 2: In the extreme condition that no FPFN is found in the window, we set a local maximum fringe number less than or equal to M1 as a segment point in each column of the window from the (j+1)th column to the (j + M1)th column.
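Both cases reduce to picking, in each column of a column range, the row inside the window that carries the local maximum fringe number. A minimal sketch of that step is given below. The function name `column_segment_points` and its `cap` parameter are hypothetical; `cap` models the "less than or equal to M1" restriction of Case 2, and taking the first maximum on ties is an assumption.

```python
import numpy as np

def column_segment_points(fringe, top, bottom, col_from, col_to, cap=None):
    """Pick one segment point per column between col_from and col_to (inclusive).

    Rows are restricted to top..bottom (inclusive). If cap is given, only
    fringe values <= cap are eligible, as in Case 2.
    """
    fringe = np.asarray(fringe)
    points = []
    for col in range(col_from, col_to + 1):
        window = fringe[top:bottom + 1, col]
        if cap is not None:
            window = np.where(window <= cap, window, -1)  # mask values above cap
            if window.max() < 0:
                continue                                  # nothing eligible here
        offset = int(np.argmax(window))                   # first maximum on ties
        points.append((top + offset, col))
    return points

gap = np.array([
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
case1_points = column_segment_points(gap, 0, 4, 1, 2)
case2_points = column_segment_points(gap, 0, 4, 1, 2, cap=1)
```

Without a cap the points follow the ridge of the gap (row 2). With `cap=1` the ridge is excluded and the first eligible row is chosen instead.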

The above procedure results in partial segmenting paths between lines. To join the resulting partial segmenting paths, we now look backwards. We scan the fringe map along columns from right to left and search for the MSPs to join the partial paths, such that for a point p at the ith row and jth column:

If p(i, j) is an MSP, then we set that MSP as a segment point. Then we search for the FPFN, MSP or segment point nearest to it in the window of size N × N, where N = 2 × M1, with boundaries: Top as (i - M1), Bottom as (i + M1), Right as (j - 1) and Left as (j - 2 × M1). Again two cases may arise:

Case 1: An FPFN, MSP or segment point is found at (m, n). We set the local maximum fringe number as a segment point in each column of the window from the (j-1)th column to the (n+1)th column, and if the point at (m, n) is an FPFN we set it as an MSP.

Case 2: No FPFN, MSP or segment point is found in the window. We set a local maximum fringe number less than or equal to M1 as a segment point in each column of the window from the (j-1)th column to the (j - 2 × M1 - 1)th column and set the (j - 2 × M1)th column point as an MSP. The result of segment path generation is shown in Fig 4.6.

CONCLUSION

The projection profile method is used for line, word and character segmentation. For Telugu scripts, however, some of the lines are not segmented properly. We therefore use the fringe map method to segment the lines of Telugu script accurately. The projection profile method is applied to different languages, and we measure the accuracy of line, word and character segmentation for each of them.

FUTURE SCOPE

The future scope of our project is to segment the lines rather than only displaying them, and to apply the result to the projection profile method for character segmentation. With the projection profile method, some characters that are connected together are not segmented in different languages. Such characters can be segmented properly by using the water drop fall method.


REFERENCES

1. V. K. Koppula and A. Negi, "Using fringe maps for text line segmentation in printed or handwritten document images," in 2010 Second Vaagdevi International Conference on Information Technology for Real World Problems (VCON'10), Warangal, India, Dec 2010, pp. 83–88.

2. R. L. Brown, "The fringe distance measure: an easily calculated image distance measure with recognition results comparable to Gaussian blurring," IEEE Trans. Syst., Man, Cybern., vol. 24, no. 1, pp. 111–115, 1994.

3. A.V.S. Venkat Rao, N., "Canonical syllable segmentation of Telugu document images," TENCON 2008 - IEEE Region 10 Conference, pp. 1-5, Nov 2008.

4. Y. K. Chen, "Segmentation of handwritten connected numeral string using background and foreground analysis," Proc. 15th ICPR, pp. 598-601, 2000.

5. Fujisawa, "Segmentation methods for character recognition from segmentation to document structure analysis," Proceedings of the IEEE, vol. 80, pp. 1079-1092, 1992.

6. Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing," second edition, Pearson Education, ISBN: 81-7808-629-8. Image segmentation and extraction.
