online handwriting recognition - ?· a.h. toselli (iti - upv) online handwriting recognition 5 /...

Download OnLine Handwriting Recognition - ?· A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 5 / 34.…

Post on 01-May-2019

214 views

Category:

Documents

1 download

Embed Size (px)

TRANSCRIPT

OnLine Handwriting Recognition(Master Course of HTR)

Alejandro H. Toselli

Departamento de Sistemas Informticos y ComputacinUniversidad Politcnica de Valencia

February 26, 2008

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 1 / 34

Outline

1 Introduction

2 Feature Extraction for On-Line HTR

3 DFT - Data Global Modelling

4 DFT - Data Local Modelling

5 Experimental Results

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 2 / 34

Introduction

On-Line Handwriting Recognition Context

Acquisition types:

Tabla Digitalizadora

Video

OFFLINEONLINE

Escner

Cmara

Usual applications:

Sign recognition and verification.

Recognition of postal codes.

Recognition of courtesy amount in bank checks.

Transcription of ancient documents.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 3 / 34

Introduction

Motivations

Time-based Feature Extraction adequate for using with HMM.

Because of the similarity existing between the nature of on-LineHTR and Speech recognition.

Tamil HCR Competition.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 4 / 34

Introduction

Tamil HCR Competition - Motivation

Isolated character samples of 156 Tamil characters.Written by native Tamil writers.Off-Line and On-Line corpus version.

Some examples of Tamils symbols:

User 54

User 111

Evaluation criterion is the highest top-choice accuracy at zeroreject rate.50683 labelled samples available for training purposes.26926 unlabelled samples available for testing purposes.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 5 / 34

Introduction

On-Line Recognition Representation

On-Line handwriting is represented by a points sequence in R2 spaceordered in time. Parametric curve.

A isolated handwritten character/word:

A series of consecutive pen-downs and pen-up strokes.

Each stroke is a sequence of coordinates (xt , yt) t = 0..N1.

Some on-line devices included the pressure made on each point.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 6 / 34

Feature Extraction for On-Line HTR

Our Traditional Feature Extraction

Seven Feature Extractions in function of the time:

xt and yt coordinates normalized between 0 and 100.

x t and y

t first derivatives normalized between 0 and 1.

x t and y

t second derivatives.

All derivatives are calculated using:

ft =C

c=1(rt+c rtc)

2C

c=1 c2

rt {x t , y

t , x

t , y

t }

curvature:

kt =x t y

t x

t y

t(

x t2 + y t

2)3/2

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 7 / 34

Feature Extraction for On-Line HTR

DFT representation

On-Line handwriting can be represented

by a sequence of coordinates (xt , yt) t = 0..N1.

also as a function ft : t (xt + iyt) C, or ft : t (xt , yt) R2.

Pair of Discrete Fourier Transforms:

Direct Transform Fn =N1

t=0

ft e2itnN

Inverse Transform ft =1N

N1

n=0

Fn e2itnN

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 8 / 34

Feature Extraction for On-Line HTR

Global/Local Data Modelling

DFT Feature Extraction can be applied to:

the whole character stroke - Global Data Modelling

local stroke segments - Local Data Modelling

We use two different classifiers:

K-NN classifier for global data modelling.

Continuous density HMMs for local data modelling.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 9 / 34

DFT - Data Global Modelling

Introduction - DFT Modelling Example

2700 2800 2900 3000 3100 3200 3300 3400 3500 3600 3700

0 10 20 30 40 50 60 70

Coo

rd X

Time

850

900

950

1000

1050

1100

1150

1200

1250

0 10 20 30 40 50 60 70

Coo

rd Y

Time

100

1000

10000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 10 / 34

DFT - Data Global Modelling

Data Global Modelling - Direct DFT

DFT feature extraction examples using the first 40 coefficients:

100

1000

10000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

10

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

100000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

100

1000

10000

-1 -0.5 0 0.5 1

Am

plitu

de

Frequency

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 11 / 34

DFT - Data Global Modelling

Data Global Modelling - Inverse DFT

Examples of reconstructed samples considering different number ofFourier-Coefficients.

Original 2 FC 4 FC 6 FC 12 FC 20 FC 30 FC 60 FC

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 12 / 34

DFT - Data Local Modelling

Local Data Modelling

Local DFT Features Extraction:

Sliding-Window running through the whole stroke.

Window size: 40 points.

Moving step: 1 points.

Hamming windows.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 13 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 14 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 15 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 16 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 17 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 18 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 19 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 20 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 21 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 22 / 34

DFT - Data Local Modelling

Demo 1

Exit from the Demo

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 23 / 34

DFT - Data Local Modelling

Sliding-Window Segments

Org-Sign Spectrum Rbt-Sign

10

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

Org-Sign Spectrum Rbt-Sign

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

100

1000

10000

-1 -0.5 0 0.5 1

10

100

1000

10000

-1 -0.5 0 0.5 1

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 24 / 34

DFT - Data Local Modelling

Assembling of Regenerated Stroke Segments

Original Signal

Reconstructed from 4 Fourier-Coefficients

Sliding-Window, 40 points windows-size and

20 points moving-step.

A.H. Toselli (ITI - UPV) OnLine Handwriting Recognition 25 / 34

Experimental Results

Tamil Corpus Partition

Isolated character samples of 156 Tamil characters.

Written by native Tamil writers.

Experiments carried out with 2 different versions of the corpus.

Full version:

Training Test Total#Writers 90 27 117#Samples 39618 11065 50683

Reduced version:

Training Test Total#Writers 89 26 115#Samples 6240 1560 7800

A.H. Toselli (ITI - UPV) OnLine Handwriting

Recommended

View more >