University of Fribourg, Switzerland
Department of Computer Science
DIVA research group
Master Seminar: Hidden Markov Models (HMMs): an universal tool?, 2005
Dr. Jean Hennebert
Prof. Rolf Igold

Biometrics: Writer verification with HMMs

September 2005

Seminar paper by:
David Baechler
Juchstr. 31
CH-1712 Tafers



Contents

1 Introduction

2 Distinctions
2.1 Identification
2.2 Verification

3 HMM based recognizers
3.1 About HMMs
3.2 Advantages of HMMs
3.3 Idea behind the presented verification system
3.4 Character models
3.5 Writer verification

4 Experiments
4.1 Database
4.2 Confidence measure
4.3 Results

5 Related work
5.1 Texture analysis problem
5.2 Classification problem
5.3 Morphologic processing of projection profiles
5.4 Set of features of each text line
5.5 Distribution of edge fragments
5.6 Graphemes as features
5.7 Hamming distance

6 Conclusion


Abstract

Biometric identification is becoming more and more important. But biometrics does not only cover physical traits. A person can also be identified by individual characteristics such as their voice or handwriting. The caveat is that these characteristics can be influenced by the individual.

This seminar paper describes a writer verification approach using HMMs, developed by Andreas Schlapbach and Horst Bunke at the University of Bern. They achieved a good result with an EER of about 2.5%. The FAR is below 1% at an FRR of 16%.

Additionally, some related work based on HMMs and other methods is mentioned.

1 Introduction

Writer verification (signature) is a biometric method. There is a close relationship between writer identification/verification and handwriting recognition. Hidden Markov models are by far the predominant approach to solving this problem.

This seminar paper gives a short overview of off-line, text-independent handwriting verification using HMMs. It is a summary of two papers by Andreas Schlapbach and Horst Bunke of the University of Bern. They published a paper on handwriting identification [1] and an extended paper that also covers verification [2]. Further related work mentioned in those papers is presented as well.

2 Distinctions

2.1 Identification

Writer identification is the task of determining the author of a sample of handwriting from a set of writers.

2.2 Verification

Writer verification is the task of deciding whether or not a handwritten text has been written by a certain person.


3 HMM based recognizers

3.1 About HMMs

The fundamental assumption of a Hidden Markov Model (HMM) is that the process to be modeled has a finite number of states. These states change once per time step in a random but statistically predictable way. To be more precise, the state at any given time depends only on the state at the previous time step.

An HMM is a statistical model where the system to be modeled is assumed to be a Markov process with unknown parameters. The challenge is to determine the hidden parameters from the observable parameters, based on this assumption.

In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. A Hidden Markov Model adds outputs: each state has a probability distribution over the possible output tokens.
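To make these definitions concrete, the following minimal sketch computes the probability of an observed token sequence under a small discrete HMM with the standard forward algorithm. The two-state model, its probabilities, and the names `forward_likelihood`, `start`, `trans`, and `emit` are illustrative assumptions and are not taken from the papers summarized here.

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM (forward algorithm)."""
    n_states = len(start)
    # alpha[i]: probability of the tokens seen so far, with the chain ending in state i
    alpha = [start[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Toy two-state model; every number here is made up for illustration.
start = [0.6, 0.4]                 # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]   # trans[i][j] = P(next state j | current state i)
emit = [[0.9, 0.1], [0.2, 0.8]]    # emit[i][k] = P(output token k | state i)

likelihood = forward_likelihood(start, trans, emit, [0, 1, 0])
print(likelihood)
```

The hidden state sequence never appears in the result: the forward pass sums over all state paths, which is what lets an HMM score an observation sequence without an explicit segmentation.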

3.2 Advantages of HMMs

HMM-based recognizers have a number of advantages over other approaches to handwriting analysis. They can model characters of variable width. Shape variations and noise are no longer major problems. A text line can be segmented implicitly into words and characters, which is difficult to achieve explicitly. Finally, the standard algorithms for training and testing are well known.

3.3 Idea behind the presented verification system

An HMM-based text line recognizer is built for each writer. A text line of an unknown writer is presented to each recognizer. The results are then ranked. Each HMM can be understood as an expert specialized in recognizing the handwriting of one person.

3.4 Character models

All text lines are normalized. A sliding window moves from left to right and extracts nine features, three global and six local ones. For each upper and lower case character an individual HMM is built. Each character HMM consists of 14 states connected in a linear topology and embedded between two non-emitting states. The character models are then concatenated to model a complete text line.
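The topology described above can be sketched as a transition matrix. The sketch assumes that "linear" means each emitting state either loops on itself or advances to its right neighbor, and it uses a made-up self-loop probability of 0.5 (in a real system these values are learned during training). The function names `linear_char_model` and `concatenate` are hypothetical.

```python
def linear_char_model(n_emitting=14, stay=0.5):
    """Transition matrix of one character HMM: entry state, n emitting states, exit state.

    State 0 is the non-emitting entry state and the last state is the
    non-emitting exit state. `stay` is an assumed self-loop probability.
    """
    size = n_emitting + 2
    trans = [[0.0] * size for _ in range(size)]
    trans[0][1] = 1.0                      # entry state jumps into the first emitting state
    for s in range(1, n_emitting + 1):
        trans[s][s] = stay                 # remain in the same state
        trans[s][s + 1] = 1.0 - stay       # advance to the next state (or to the exit)
    return trans


def concatenate(models):
    """Chain character models by merging each exit state with the next entry state."""
    total = sum(len(m) - 2 for m in models) + 2
    trans = [[0.0] * total for _ in range(total)]
    trans[0][1] = 1.0
    offset = 0                             # emitting states placed so far
    for m in models:
        n = len(m) - 2
        for s in range(1, n + 1):
            for t in range(1, n + 2):      # column n + 1 is this model's exit state
                trans[offset + s][offset + t] += m[s][t]
        offset += n
    return trans


char_model = linear_char_model()
line_model = concatenate([char_model, char_model, char_model])
print(len(char_model), len(line_model))
```

Because the exit state of one character is merged with the entry state of the next, leaving the last emitting state of a character leads directly into the first emitting state of its successor. This is how a chain of character models can cover a whole text line.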


3.5 Writer verification

For each writer a text line recognizer is built and trained with data coming from that writer only. The resulting recognizer for each writer is an expert on the handwriting style of that person.

A text line submitted with a claimed identity has to be tested to determine whether it really comes from that writer. The verification criterion used by the system is based on a confidence measure. It is calculated as the difference between the log-likelihood score of the claimed writer's recognizer and the average log-likelihood score, normalized by the length of the text. If the confidence measure is above a certain threshold, the text line is accepted as coming from the claimed writer.

4 Experiments

4.1 Database

The data is taken from the IAM database [4]. It consists of two sets, a client data set and an impostor data set. The client data set contains 4037 text lines coming from 100 writers. The impostor data set is made of 626 text lines from 20 writers unknown to the system. Each impostor text line is presented seven times with the identity of a writer known to the system (626 × 7 = 4382 lines).

Note, however, that the data was collected for recognition and not for verification.

Figure 1: normalized text lines


4.2 Confidence measure

The decision criterion for the writer verification task is the following confidence measure:

$$cm_{\text{text line}} = \frac{l_{\text{claimed identity}} - l_{\text{avg}}}{\text{text line length}}$$

with

$$l_{\text{avg}} = \frac{1}{N} \sum_{j=1 \,\wedge\, j \neq r(t)}^{N+1} l_j$$

If the confidence measure is above a certain threshold, the text line is assigned to the claimed writer; otherwise it is rejected.
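The decision rule can be sketched in a few lines of code. This is one possible reading, not the authors' implementation: it assumes that the average is taken over the N best-scoring recognizers other than the claimed one, and that the text line length is available as a single number. The names `confidence_measure`, `verify`, and `n_cohort`, as well as all scores, are made up for illustration.

```python
def confidence_measure(scores, claimed, line_length, n_cohort):
    """Length-normalized gap between the claimed writer's score and a cohort average.

    scores: dict mapping writer id -> log-likelihood of the text line under
            that writer's recognizer.
    n_cohort: assumed meaning of N, the number of best competing scores averaged.
    """
    competitors = sorted(
        (score for writer, score in scores.items() if writer != claimed),
        reverse=True,
    )[:n_cohort]
    l_avg = sum(competitors) / len(competitors)
    return (scores[claimed] - l_avg) / line_length


def verify(scores, claimed, line_length, n_cohort, threshold):
    """Accept the claimed identity if the confidence measure exceeds the threshold."""
    return confidence_measure(scores, claimed, line_length, n_cohort) > threshold


# Made-up log-likelihood scores of one text line under four writer models.
scores = {"A": -1000.0, "B": -1100.0, "C": -1200.0, "D": -1500.0}

cm_true_claim = confidence_measure(scores, "A", line_length=50, n_cohort=2)
cm_false_claim = confidence_measure(scores, "C", line_length=50, n_cohort=2)
print(cm_true_claim, cm_false_claim)
```

Dividing by the line length keeps the measure comparable between short and long lines, since log-likelihood scores grow in magnitude with the number of observations.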

4.3 Results

The system performs well both at accepting clients and at rejecting impostors. Different variants of l_avg in the confidence measure are tested, and the best Equal Error Rate (EER) is about 2.5% (with the equation shown above).

The False Acceptance Rate (FAR) is smaller than 1% at a False Rejection Rate (FRR) of 16% and vice versa. What the experiment does not address is the problem of skilled forgeries (impostors). The influence of the text line length on the result is also not clear. The lines seem to be long enough to achieve a good result, but what is the optimal length?
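As a reminder of how these error rates relate to the threshold, the sketch below computes FAR and FRR for a given threshold and scans for the threshold where the two rates are closest, which approximates the EER. The scores are invented toy values and have nothing to do with the reported experiment. The function names are hypothetical.

```python
def far_frr(client_scores, impostor_scores, threshold):
    """FAR: share of impostor attempts accepted. FRR: share of client attempts rejected."""
    far = sum(s > threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s <= threshold for s in client_scores) / len(client_scores)
    return far, frr


def equal_error_rate(client_scores, impostor_scores):
    """Scan the observed scores as thresholds; return (threshold, FAR, FRR) with the smallest gap."""
    best = None
    for t in sorted(set(client_scores) | set(impostor_scores)):
        far, frr = far_frr(client_scores, impostor_scores, t)
        if best is None or abs(far - frr) < abs(best[1] - best[2]):
            best = (t, far, frr)
    return best


# Toy confidence values: clients should score high, impostors low.
clients = [2.0, 3.0, 1.5, 0.5, 2.5]
impostors = [-1.0, 0.0, 1.0, -2.0, 0.8]

threshold, far, frr = equal_error_rate(clients, impostors)
print(threshold, far, frr)
```

Raising the threshold lowers the FAR and raises the FRR, and lowering it does the opposite. This trade-off is why a single system can be quoted with a very low FAR at the price of a higher FRR.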

5 Related work

HMMs are not the only approach to signature verification; other systems exist. The subsections below describe this related work. Some of them probably even use HMMs without naming them as such or stating it explicitly.

This information is taken from [1] and [2] and is only meant to give a very short overview.¹

5.1 Texture analysis problem

Said et al. use global statistical features extracted from the whole image to solve the problem. Identification is therefore treated as a texture analysis problem.

5.2 Classification problem

Another approach is to classify a document into one of two classes: authorship or non-authorship. Cha et al. calculate the distance between two documents and then decide whether the result is positive or negative.

¹ The references to the related work are given in [1] and [2].


5.3 Morphologic processing of projection profiles

The approach of Zois et al. is to morphologically process projection profiles of single words. The profiles are processed in segments, and the resulting feature vectors are then classified with a neural network or a Bayesian classifier.

5.4 Set of features of each text line

Hertel et al. propose a k-nearest-neighbor classifier. A text is first segmented into text lines. From each text line a set of features is extracted and subsequently compared with a reference vector.

5.5 Distribution of edge fragments

Edge-based directional probability distributions can also be used as features. Bulacu et al. use two edge fragments in the neighborhood of a pixel and compute the joint probability distribution of the orientations of the two fragments.

5.6 Graphemes as features

Several papers deal with graphemes (single letters). Each handwriting is represented as a set of invariants, and these invariants are detected.

5.7 Hamming distance

Leedham et al. present a simple solution based on the Hamming distance between vectors of eleven features. This system works on handwritten digits.
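For readers unfamiliar with the measure, the Hamming distance counts the positions in which two equal-length vectors differ. The sketch below assumes binary feature values and uses two invented eleven-element vectors. It illustrates the distance only and does not reproduce the feature set of Leedham et al.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Two made-up binary vectors with eleven features each.
reference = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
sample = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]

distance = hamming_distance(reference, sample)
print(distance)
```

A small distance to a writer's reference vector would then count as evidence for that writer.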

6 Conclusion

HMMs are the predominant approach for the analysis of handwriting. There are approaches other than HMMs, but HMM-based methods show greater flexibility, especially with respect to shape variations and noise. They are therefore the ideal method for dealing with handwriting. This does not mean, however, that other approaches could not produce good results.

The results of the presented verification system are good. It would be interesting to see the results with skilled forgers and with a larger number of persons. The influence of the text line length on the result is also an open question. (What is the minimum length for producing acceptable results?)

Hence, despite the good results, there is work left for future research.


References

[1] Andreas Schlapbach and Horst Bunke: Off-line Handwriting Identification Using HMM Based Recognizers, IEEE, 2004.

[2] Andreas Schlapbach and Horst Bunke: Using HMM Based Recognizers for Writer Identification and Verification, Proceedings of the 9th Int'l Workshop on Frontiers in Handwriting Recognition (IWFHR-9 2004), pages 167-172, IEEE, 2004.

[3] Andreas Schlapbach: Using HMM Based Recognizers for Writer Identification and Verification, handout of presentation at University of Fribourg, May 31, 2005.

[4] The Homepage of the IAM Database: http://www.iam.unibe.ch/~zimmerma/iamdb/iamdb.html
