
Post on 29-Jul-2020


Recognition of Cursive Roman Handwriting – Past, Present and Future

H. Bunke

bunke@iam.unibe.ch

Department of Computer Science, University of Bern

Neubrückstrasse 10, CH-3012 Bern, Switzerland

Acknowledgments:

- S. Günter, T. Varga, M. Zimmermann

- Swiss National Science Foundation (20-5287.97 and IM2)

Recognition of Cursive Roman Handwriting – Past, Present and Future – p.1/61

Introduction

optical character recognition (OCR)

(figure: taxonomy tree with successive two-way splits)

• Oriental script / Roman script
• machine printed text / handwritten text
• on-line / off-line
• isolated / cursive


Introduction

(why) is it difficult?

• large variation in personal handwriting style

• different writing instruments

• segmentation problem

• large vocabulary (possibly open)


Introduction

hundert


Introduction

is there any future need for automatic handwriting recognition?

• applications with commercial potential: address, form and check reading

• digital libraries, transcription of historical archives

• "non-death" of paper and new devices for handwriting acquisition


Contents

1. Introduction

2. State of the Art

3. Current Developments

4. Future Trends

5. Conclusion


Document Image Preprocessing

standard operations include

• noise filtering

• binarization

• thinning

• skew correction

• slant correction

• estimation of baseline and main writing zones

• horizontal and vertical scaling

• additional problem-dependent methods to separate handwriting from background
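As an illustration of one of the operations listed above, the following is a minimal sketch of binarization. The slides do not say which thresholding method was used; Otsu's method is assumed here purely as a common choice, and the function names `otsu_threshold` and `binarize` are hypothetical.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the grey level (0..255) that maximizes the between-class variance.

    `gray` is assumed to be an array of unsigned 8-bit grey values.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # weight of the class "grey level <= t"
    mu = np.cumsum(prob * levels)   # cumulative (unnormalized) mean
    mu_total = mu[-1]
    w1 = 1.0 - w0                   # weight of the class "grey level > t"
    valid = (w0 > 0) & (w1 > 0)     # thresholds that leave both classes non-empty
    between = np.zeros(256)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

def binarize(gray: np.ndarray) -> np.ndarray:
    """Return 1 for ink (dark pixels) and 0 for background (assumes dark ink on light paper)."""
    return (gray <= otsu_threshold(gray)).astype(np.uint8)

# Tiny synthetic example: dark stroke (value 30) on light paper (value 220).
page = np.full((5, 5), 220, dtype=np.uint8)
page[2, 1:4] = 30
ink = binarize(page)
```

A global threshold is only a starting point; the problem-dependent background separation mentioned in the last bullet would replace or follow it.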


Document Image Preprocessing

(figure: preprocessing example with panels: original image, binarized image, thinned image, estimation of slant, deslanted image, deslanted and deskewed image, estimation of writing zones, final result)


Isolated Character Recognition

• usually cast as a classification problem

• consists of preprocessing, feature extraction, and classification

features for isolated character recognition:

• raw pixels

• derived from series expansion, moments, etc.

• projection based features, contour based features

• structural features: end points, forks, junctions, etc.
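To make one of the feature types above concrete, here is a small sketch of projection-based features. The number of bins, the normalization, and the name `projection_features` are assumptions chosen for illustration, not details taken from the slides.

```python
import numpy as np

def projection_features(char_img, bins: int = 8) -> np.ndarray:
    """Horizontal and vertical ink projections, resampled to `bins` values each.

    `char_img` is a binary image (1 = ink). The result has length 2 * bins
    and sums to 1 unless the image is empty.
    """
    img = np.asarray(char_img, dtype=float)
    row_profile = img.sum(axis=1)   # ink per row
    col_profile = img.sum(axis=0)   # ink per column

    def resample(profile: np.ndarray) -> np.ndarray:
        # Average the profile over `bins` nearly equal chunks.
        chunks = np.array_split(profile, bins)
        return np.array([c.mean() if c.size else 0.0 for c in chunks])

    feat = np.concatenate([resample(row_profile), resample(col_profile)])
    total = feat.sum()
    return feat / total if total > 0 else feat

# A vertical bar in column 3 of an 8 x 8 image.
bar = np.zeros((8, 8), dtype=int)
bar[:, 3] = 1
features = projection_features(bar)
```

Resampling to a fixed length makes characters of different sizes comparable, which is what a classifier with a fixed input dimension needs.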


Isolated Character Recognition

classifiers for isolated character recognition:

• nearest-neighbor

• Bayes classifier

• neural nets

• SVM, etc.

which classifier is best?

• depends on many factors, for example, available training set, number of free parameters, time & memory constraints, etc.
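The simplest classifier in the list above, nearest-neighbor, can be sketched in a few lines. The Euclidean distance and the toy feature vectors are assumptions for illustration only.

```python
import numpy as np

def nearest_neighbor(train_x: np.ndarray, train_y: list, x: np.ndarray):
    """Return the label of the training vector closest to x (Euclidean distance)."""
    distances = np.linalg.norm(train_x - x, axis=1)
    return train_y[int(np.argmin(distances))]

# Toy training set: two feature vectors per class.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = ["a", "a", "b", "b"]
label = nearest_neighbor(train_x, train_y, np.array([0.8, 0.9]))
```

The trade-offs named in the bullet show up directly here: there are no free parameters to train, but memory and classification time grow with the size of the training set.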


Cursive Word Recognition

• major problem: segmentation

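• Sayre’s paradox: a word cannot be segmented before it is recognized, and cannot be recognized before it is segmented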

• three approaches

− holistic
− segmentation-based (oversegment and merge)
− segmentation-free (Hidden Markov Models, HMM)


Hidden Markov Models (HMMs)

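(figure: a sliding window is moved over the handwriting image; at each window position t a feature vector is extracted and passed to a linear HMM; shown for successive positions t = 0, 1, ..., 6)

feature vector at window position t: $x^t = (x^t_1, \ldots, x^t_n)^T$

HMM: states S1, S2, S3, ... with self-transitions P11, P22, P33, forward transitions P12, P23, and an output distribution P(X) attached to each state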
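The sliding-window scheme can be sketched as follows, under several assumptions that the slides do not confirm: a window one pixel column wide, three simple geometric features per column, diagonal Gaussian output distributions, and a model that starts in S1 and must end in its last state. All function names are hypothetical.

```python
import numpy as np

def sliding_window_features(binary_img: np.ndarray) -> np.ndarray:
    """One feature vector per pixel column of a binary image (1 = ink).

    Features: ink fraction, vertical center of gravity, topmost ink row,
    the last two normalized by the image height.
    """
    h, w = binary_img.shape
    rows = np.arange(h)
    feats = np.zeros((w, 3))
    for t in range(w):
        col = binary_img[:, t]
        n = col.sum()
        feats[t, 0] = n / h
        if n > 0:
            feats[t, 1] = (rows * col).sum() / (n * h)
            feats[t, 2] = np.flatnonzero(col)[0] / h
        else:  # empty column: neutral default values
            feats[t, 1] = 0.5
            feats[t, 2] = 1.0
    return feats

def log_gauss(x, mean, var) -> float:
    """Log density of a diagonal Gaussian."""
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))

def forward_log_likelihood(X, stay, means, variances) -> float:
    """log P(X | model) for a linear HMM computed with the forward algorithm.

    State i loops with probability stay[i] (0 < stay[i] < 1) and otherwise
    moves to state i + 1.
    """
    n_states = len(stay)
    log_alpha = np.full(n_states, -np.inf)
    log_alpha[0] = log_gauss(X[0], means[0], variances[0])
    for x in X[1:]:
        new = np.full(n_states, -np.inf)
        for j in range(n_states):
            from_self = log_alpha[j] + np.log(stay[j])
            from_prev = log_alpha[j - 1] + np.log(1 - stay[j - 1]) if j > 0 else -np.inf
            new[j] = np.logaddexp(from_self, from_prev) + log_gauss(x, means[j], variances[j])
        log_alpha = new
    return float(log_alpha[-1])

# Toy image: a short vertical stroke in the middle column.
img = np.zeros((4, 3), dtype=int)
img[1:3, 1] = 1
feats = sliding_window_features(img)
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters once whole words or text lines are scored.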


General Text Recognition

• segmentation-based: segment line of text into individual words, then use cursive word recognizer

• segmentation-free: segmentation and recognition are integrated

− concatenate HMM word models to word sequence (or sentence) models
− use constraints to narrow down the search space, for example, soft constraints derived from n-gram language models


Segmentation-free Word Sequence Recognition


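• concatenation of HMMs

(figure: word models w1, w2, ..., wn repeated at each word position of the sequence)

transition weights between word positions: $p(w^1_i)$, $p(w^2_i \mid w^1_j)$, $p(w^3_i \mid w^2_j)$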

• bi-gram language model

word    next word    probability
to      the          0.009333
to      be           0.002239
to      a            0.000138
to      have         0.000105
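The way a bi-gram model scores a word sequence can be sketched as below. The four probabilities are the ones tabulated above and are treated here as $p(\text{next word} \mid \text{word})$; the slide does not state whether they are conditional or joint values. The unigram probability for the first word and the floor value for unseen pairs are placeholders invented for illustration.

```python
import math

# Values from the bi-gram table, keyed by (word, next word).
BIGRAM = {
    ("to", "the"): 0.009333,
    ("to", "be"): 0.002239,
    ("to", "a"): 0.000138,
    ("to", "have"): 0.000105,
}

def sequence_log_prob(words, unigram, bigram, floor=1e-8):
    """log p(w1) + sum of log p(w_k | w_{k-1}); unseen events get `floor` (an assumption)."""
    logp = math.log(unigram.get(words[0], floor))
    for prev, cur in zip(words, words[1:]):
        logp += math.log(bigram.get((prev, cur), floor))
    return logp

# Hypothetical unigram probability for the first word.
UNIGRAM = {"to": 0.5}
score_the = sequence_log_prob(["to", "the"], UNIGRAM, BIGRAM)
score_have = sequence_log_prob(["to", "have"], UNIGRAM, BIGRAM)
```

In a recognizer these log probabilities would be added to the HMM scores at the word transitions, so that "to the" is preferred over "to have" when the image evidence is ambiguous.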


Recognition Experiment

(plot: word recognition rate [%], axis range 40 to 80, versus vocabulary size [n], axis range 0 to 8000; curves: Simple Sentence Model, Unigram Model, Bigram Model)


Some Recent Trends

• databases for development and performance evaluation

• multiple classifier systems

• synthetic training data


Databases

• isolated characters and words:
− CEDAR
− NIST
− CENPARMI
− ELT9
− IRESTE
− ...

• cursively handwritten text:
− Senior/Robinson, PAMI 1998
− Elliman/Sherkat, ICDAR 2001
− IAM, collection in progress (since about 1997)


Some Details of the IAM Database

• more than 1,500 scanned pages of handwritten text

• material from over 600 individual writers
− 95,000 correctly segmented words
− over 13,000 lines of text
− over 5,000 complete sentences

• covering a vocabulary of over 12,000 words

• ground truth and lexical tags available (LOB corpus)


Multiple Classifier Systems

• motivation: use a group of experts rather than a single expert

• many approaches to handwriting recognition have been proposed using MCS’s

• often the basic classifiers are constructed ’by hand’

• recently, so-called ensemble methods have been proposed:
− they require only a single classifier to be constructed by hand
− the classifier ensemble is generated automatically


Multiple Classifier Systems (2)

"classical" approach

(figure: input → classifiers c1, ..., cn → combiner → result)


Multiple Classifier Systems (3)

ensemble method

(figure: the classifiers c1, ..., cn are generated automatically from a single base classifier c; input → c1, ..., cn → combiner → result)


Issues in MCS’s

• ensemble generation
− bagging
− feature subspace
− boosting
− others

• combination
− voting
− rank sum
− weighted voting
− trainable classifier
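Both issues can be illustrated with a short sketch: bagging as one generation method, and voting, rank sum and weighted voting as combination rules. The training function `train_fn`, the tie-breaking rules, and the handling of labels missing from a ranking are assumptions made for this example.

```python
import random
from collections import Counter

def bagging(train_set, train_fn, n_classifiers=5, seed=0):
    """Train each classifier on a bootstrap replicate (sampling with replacement)."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_classifiers):
        replicate = [rng.choice(train_set) for _ in range(len(train_set))]
        ensemble.append(train_fn(replicate))
    return ensemble

def plurality_vote(labels):
    """Label output by most classifiers; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def rank_sum(rankings):
    """Each classifier supplies a ranked list (best first); the lowest rank total wins.

    A label missing from a ranking is charged that ranking's length.
    Ties go to the alphabetically first label.
    """
    labels = {label for ranking in rankings for label in ranking}
    totals = {
        label: sum(r.index(label) if label in r else len(r) for r in rankings)
        for label in labels
    }
    return min(sorted(totals), key=totals.get)

def weighted_vote(labels, weights):
    """Like plurality voting, but each classifier's vote counts with its weight."""
    scores = {}
    for label, weight in zip(labels, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Three classifiers disagreeing on a word image.
vote = plurality_vote(["cat", "cot", "cat"])
borda = rank_sum([["cat", "cot", "cut"], ["cot", "cat", "cut"], ["cat", "cut", "cot"]])
weighted = weighted_vote(["cat", "cot", "cot"], [0.7, 0.2, 0.2])
```

The weights in weighted voting would typically come from each classifier's performance on a validation set; a trainable combiner, the last item in the list, learns the combination rule itself.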


Some Results

recognition rates achieved by various ensemble generation methods

algorithm recognition rate

Bagging 68.11%

AdaBoost 68.67%

random subspace 67.35%

feature selection 71.58%

original classifier 66.23%


Synthetic Generation of Training Data

• all recognizers need to be trained

• the larger the training set, the better the performance ("you never have enough training data")

• but collection of training data is expensive

• previous work on generation of synthetic training data:
− machine printed OCR [Baird et al.]
− Arabic and Chinese OCR
− isolated characters
− (synthetic handwriting for other purposes [Guyon, Plamondon])


Synthetic Generation of Training Data

• no work on synthetic training data generation for cursive Roman handwriting recognition

• two approaches:
− using templates
− applying geometric distortions to existing handwritten text
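The second approach can be illustrated with one simple geometric distortion. The slides do not specify which distortions were applied; a horizontal shear (a change of slant) is assumed here as an example, and `shear_image` and `synthesize_variants` are hypothetical names.

```python
import numpy as np

def shear_image(img: np.ndarray, slant: float) -> np.ndarray:
    """Shift each row horizontally in proportion to its height above the bottom row.

    Positive `slant` leans the writing to the right. Shifts are rounded to
    whole pixels and the canvas is widened so that no ink is lost.
    """
    h, w = img.shape
    pad = int(np.ceil(abs(slant) * (h - 1)))
    out = np.zeros((h, w + pad), dtype=img.dtype)
    for r in range(h):
        shift = int(round(slant * (h - 1 - r)))
        if slant < 0:
            shift += pad  # keep column indices non-negative
        out[r, shift:shift + w] = img[r]
    return out

def synthesize_variants(img: np.ndarray, n: int, max_slant: float = 0.3, seed: int = 0):
    """Generate n randomly slanted copies of an existing word image."""
    rng = np.random.default_rng(seed)
    return [shear_image(img, float(rng.uniform(-max_slant, max_slant))) for _ in range(n)]

# A 3 x 2 block of ink, sheared by one pixel per row.
block = np.ones((3, 2), dtype=np.uint8)
sheared = shear_image(block, 1.0)
variants = synthesize_variants(block, n=4)
```

Each distorted copy keeps the label of the original word, so the training set grows without additional transcription effort.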


Synthetic Handwriting from Templates

• templates extracted from forms

• templates extracted from running text, using HMM in forced alignment mode


Synthetic Handwriting from Templates (3)

• disadvantages:
− all instances of a character are identical
− no ligatures


Synthetic Handwriting from N-Grams

• compile a list of frequent 3- and 2-tuples from an electronic corpus

• extract templates of these tuples from a handwritten text, using forced alignment

• split the given text into available tuples and generate the synthetic handwriting
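The splitting step can be sketched as follows. The slides do not describe the splitting strategy; a greedy longest-match scan that prefers 3-tuples over 2-tuples and falls back to single characters is assumed here.

```python
def split_into_tuples(text: str, available: set) -> list:
    """Split text into available 3- and 2-tuples, falling back to single characters."""
    pieces = []
    i = 0
    while i < len(text):
        for size in (3, 2):
            chunk = text[i:i + size]
            if len(chunk) == size and chunk in available:
                pieces.append(chunk)
                i += size
                break
        else:
            # No template covers this position: use the single character.
            pieces.append(text[i])
            i += 1
    return pieces

# Tuples for which handwritten templates are assumed to exist.
templates = {"the", "he", "re"}
parts = split_into_tuples("there", templates)
```

The synthetic word image would then be assembled by concatenating the template images for each piece, which preserves the ligatures inside a tuple.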


Some Results

(plot: recognition rate [%], axis range 60 to 74, for each training set, axis ticks 0 to 5)

• 1193 word instances; 16 writers; 357 word vocabulary

• 80% training; 20% testing; 5-fold cross validation

• training sets:
− 1 = natural training data
− 2 = synthetic training data
− 3 = synthetic training data
− 4 = synthetic training data

• test data: always natural

• except for the training data (natural/synthetic) identical conditions for all experiments (same training/test words; same size of training/test set etc.)
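The evaluation protocol (80% training, 20% testing, 5-fold cross validation) can be sketched with a simple index split. How the 1193 word instances were actually assigned to folds is not stated on the slide; a random shuffle with a fixed seed is assumed.

```python
import random

def k_fold_splits(n_items: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 1193 word instances, as in the experiment described above.
splits = list(k_fold_splits(1193, k=5))
```

Each instance serves as test data exactly once, and each fold uses about 80% of the data for training and 20% for testing, matching the ratio given in the bullet.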


Future Perspectives

• some random comments:

− MCS’s
− synthetic training data
− enhanced HMMs (for example, 2D)
− enhanced language models
− etc.


Future Perspectives

• to reach a new quality of recognition we need to go from text transcription to text understanding:

− include syntactic and semantic text analysis
− include task-specific knowledge (in addition to statistical parameter estimation)


Who can read this?

When I was in high school, my physics teacher - whose name was Mr. Bader - called me down one day after physics class and said, "You look bored; I want to tell you something interesting." Then he told me something which I found fascinating, and have, since then, always found fascinating. ... The subject # is this - the principle of least action.

Richard P. Feynman: The Feynman Lectures, Volume II.


Who can read this?

(the same passage in Hungarian)

Középiskolás koromban, egy nap a fizikatanárom - Bader úrnak hívták - magához hívott fizikaóra után és azt mondta: "Unottnak látszol; szeretnék mondani neked valami érdekeset." Majd elmondott valamit, amit elbűvölőnek találtam, és azóta is mindig elbűvölőnek találom ... A legkisebb hatás elvéről van szó.

Richard P. Feynman: The Feynman Lectures, Volume II.


Integration of Grammatical Knowledge

• prerequisites:

− a word sequence recognizer that produces an n-best list (see before)
− a stochastic context-free grammar
− a parser to compute the probability of a sentence or the most probable parse tree

• procedure:

− reorder the n-best list from the recognizer taking parse probabilities into account

final score = recognition score + γ f(parse probability)

where γ is a normalization factor and f(.) is a normalization function


Example of Grammatical Knowledge Integration

Rank Recognition Score Candidate Sentence

1 23923.6 She has put up the value other money .

2 23921.8 She has put up the value of her money .

3 23890.3 She had put up the value other money .

4 23888.4 She had put up the value of her money .

5 23854.3 She has put up the value at her money .

Rank Parse Prob. Candidate Sentence

1 1.58352e-19 She had put up the value of her money .

2 4.62861e-20 She has put up the value of her money .

3 1.12458e-21 She has put up the value at her money .

4 2.63105e-22 She had put up the value other money .

5 7.69052e-23 She has put up the value other money .
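The reordering rule, final score = recognition score + γ f(parse probability), can be tried on the five candidates tabulated above. The slides give neither γ nor f; the natural logarithm for f and γ = 1.0 are assumptions chosen only to make the sketch runnable, and a higher recognition score is taken to be better, as the ranking in the first table suggests.

```python
import math

# (candidate sentence, recognition score, parse probability) from the two tables above.
N_BEST = [
    ("She has put up the value other money .", 23923.6, 7.69052e-23),
    ("She has put up the value of her money .", 23921.8, 4.62861e-20),
    ("She had put up the value other money .", 23890.3, 2.63105e-22),
    ("She had put up the value of her money .", 23888.4, 1.58352e-19),
    ("She has put up the value at her money .", 23854.3, 1.12458e-21),
]

def rerank(n_best, gamma=1.0, f=math.log):
    """Return (sentence, final score) pairs sorted by recognition score + gamma * f(parse prob)."""
    scored = [(sentence, rec + gamma * f(prob)) for sentence, rec, prob in n_best]
    return sorted(scored, key=lambda item: item[1], reverse=True)

reordered = rerank(N_BEST)
best_sentence = reordered[0][0]
```

Under these assumed settings the second recognizer hypothesis, "She has put up the value of her money .", moves to the top of the list; other choices of γ and f would weight the grammar differently and may produce a different order.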


Some Experimental Results

(plot: sentence recognition rate [%], axis range 6 to 34, versus rank [n], axis range 0 to 100; curves: Reordered 100-Best List, Baseline System)


Future Challenge

• to deal with human factors (i.e. errors and abnormalities introduced by humans)

− statistical modeling has proven very useful
− however, we also need to incorporate task-specific knowledge provided by human experts


Conclusions

• the recognition of cursive Roman handwriting has been a subject of research for several decades

• for specific tasks some level of maturity has been reached and commercial systems have become available

• some other tasks, particularly the recognition of unconstrained general text, need much more research

• these tasks are interesting for practical applications

• there do exist promising directions to further develop the field
