Chapter 9: Creating and Maintaining Database. Presented by Zhiming Liu. Instructor: Dr. Bebis.


Post on 16-Dec-2015


Chapter 9 Creating and Maintaining Database

Presented by Zhiming Liu

Instructor: Dr. Bebis


Outline

• Introduction

• Enrollment Policies

• The Zoo

• Biometric Sample Quality Control

• Training

• Enrollment Is System Training


Introduction

• Biometric enrollment asks an individual to give out private information.

• Enrollment is a process directed by some enrollment policy, which needs to be acceptable to the public.

• Positive enrollment: under enrollment policy EM, select trusted individuals and store machine representations of these m enrolled members in a verification database M.


Introduction

• Negative enrollment: for criminal identification systems, under enrollment policy EN, determine the undesirable individuals and store machine representations of the n selected individuals in the screening database N.

• Because of error and fraud, there are fake and duplicate identities in legacy databases.


Introduction

- A fake identity can be one of two cases, a created identity or a stolen identity:

1. Created identity: some subject d enrolls in M as d’K using documents for a nonexistent identity, either fake documents or a fake ID.

2. Stolen identity: some subject d’K falsely enrolls as an existing subject dK, whose identity is thereby stolen.

- A duplicate identity: subject A, already enrolled under identity IA, is enrolled again under a second identity IB.


Enrollment policies

- Positive enrollment: the process of registering M trusted subjects dm in database M. The enrollment could be based on some already enrolled population W.

- Negative enrollment: the process of registering N questionable subjects dn by storing machine descriptions of these subjects in database N, which contains much more specific and detailed descriptions.


Enrollment policies

• Social issues

- How to make biometric authentication work without creating additional security loopholes, and without damaging civil liberties?

- Who will administer and maintain databases of authorized subjects?

- How will the data integrity of these databases be protected?


The zoo

• Animal names are applied to subject categories, depending on whether a subject is easy to authenticate or not.

- Sheep: The group of subjects that dominates the population. They are easy to authenticate because their real-world biometric is very distinctive and stable.

- Goats: The group of subjects that are particularly difficult to authenticate because of a poor real-world biometric that is not distinctive, perhaps due to physical damage to body parts or due to large spurious variability in the biometric measurements over time.

This is the portion of the population that generates the majority of False Rejects.


The zoo

- Lambs: These are the enrolled subjects who are easy to imitate. Lambs are the cause of most False Accepts because they are imitated by wolves.

- Wolves: These are subjects that are particularly good at imitating, impersonating, or forging a particular biometric.

- Chameleons: These are the subjects who are both easy to imitate and good at imitating others. They are a source of passive False Accepts when enrolled and of active False Accepts when being authenticated.



Biometric sample quality control

• Many random False Rejects/Accepts occur because of adverse signal acquisition situations.

- two solutions


Biometric sample quality control

- for example, apply image enhancement or suggest that subjects present the biometric in a different, “better” way.

- Failure to Enroll (FTE)

Strict input quality control → higher FTE rates

Accepting low-quality samples → lower FTE rates

- Relationship with ROC

lower FTE → higher FAR and FRR
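The trade-off between quality control and FTE can be illustrated with a small sketch. It assumes each enrollment sample has a scalar quality score in [0, 1] and that enrollment fails when the score falls below a threshold. The function name, the scores, and the thresholds are hypothetical and do not come from the slides.

```python
def failure_to_enroll_rate(quality_scores, threshold):
    """Fraction of enrollment samples rejected because quality < threshold."""
    if not quality_scores:
        raise ValueError("need at least one quality score")
    rejected = sum(1 for q in quality_scores if q < threshold)
    return rejected / len(quality_scores)


# Hypothetical quality scores for ten enrollment attempts (assumed data).
scores = [0.95, 0.80, 0.72, 0.65, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]

# A stricter threshold rejects more samples, so the FTE rate rises.
for threshold in (0.3, 0.5, 0.7):
    print(f"threshold={threshold}: FTE={failure_to_enroll_rate(scores, threshold)}")
```

Lowering the threshold reduces FTE but lets poorer samples into the database, which is the route by which lower FTE leads to higher FAR and FRR.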



Training

• Why does a biometric system need to be trained?

- Compute match score s(B’, B).

- The goal is to make the average difference between match scores and mismatch scores as high as possible.

• There are two aspects to training

- Enrollment policies and authentication protocols
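The training goal stated on this slide, a large average gap between match and mismatch scores, can be sketched as a simple objective. This is a minimal illustration: the function name and the score values are assumptions, and a real system would obtain scores from its matcher s(B’, B).

```python
from statistics import mean


def score_separation(match_scores, mismatch_scores):
    """Average match score minus average mismatch score."""
    return mean(match_scores) - mean(mismatch_scores)


# Hypothetical scores from comparing samples of the same subject (match)
# and of different subjects (mismatch).
match_scores = [0.90, 0.80, 0.85]
mismatch_scores = [0.20, 0.30, 0.10]

print(score_separation(match_scores, mismatch_scores))
```

Under this reading, training steps that raise the value are improvements. The mean gap alone ignores the spread of the two score distributions, so it is only a rough proxy for error rates.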


Training

1. Enrollment of subjects: During enrollment one or more samples B of a subject’s biometric β are acquired and biometric samples or templates derived from the samples B are stored in some database M.

2. Protocols: A biometric authentication system itself needs to be trained, by refining and enhancing the signal or image to match the user population characteristics and incrementally improving the match engine.



Enrollment is system training

• Build database M by selecting subjects d from the world population W and assigning an identifier ID to each subject.


Enrollment is system training

• Three possibilities:

1. Correctly “linked”, ID = k

2. Subject dk is in reality a subject dj, with j < k, i.e., dk is a “duplicate” of the already enrolled subject dj. As a result, IDj and IDk are duplicates, representing the same individual.

3. Subject dk is in reality a subject dj, with j > k, i.e., dk is faking unenrolled subject dj. As a result, IDk corresponds to a “fake” identity.


Enrollment is system training

• We have non-zero probabilities

- PD is the probability that some subject d ∈ M is also enrolled under a different ID number

- PF is the probability that subject d ∈ M is a fake identity

• Database integrity

- Integrity: how well the database reflects the truth data of the seed documents (birth certificates, proofs of citizenship, and passports) used for enrollment


Enrollment is system training

• The database integrity when it comes to duplicates is determined by PD, the probability of duplicates

- PDEA (Double Enroll Attack) refers to the probability that an already enrolled subject dj wishes to re-enroll in the database as a different identity dk.

- FNMRE is the probability that a match between two samples of the same biometric is not detected, i.e., is missed.

- The number of duplicates in M is PD * m, with m the number of entities in M


Enrollment is system training

• The enrollment integrity is further determined by PF, the probability of a fake enroll as dk

- FMRE is the probability that a match between two different biometric samples is falsely declared during enrollment

- PIA is the probability of impersonation attack

- The number of fake identities in M equals PF * m
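The expected counts PD * m and PF * m can be worked through numerically. The sketch below assumes that PD is the product PDEA × FNMRE (a double enroll attack is attempted and the match is missed) and that PF is the product PIA × FMRE. These product forms are an assumption, since the formulas did not survive in this transcript. All numbers are hypothetical.

```python
def expected_duplicates(p_dea, fnmr_e, m):
    """Expected duplicates in M, assuming PD = PDEA * FNMRE (assumption)."""
    p_d = p_dea * fnmr_e
    return p_d * m


def expected_fakes(p_ia, fmr_e, m):
    """Expected fake identities in M, assuming PF = PIA * FMRE (assumption)."""
    p_f = p_ia * fmr_e
    return p_f * m


# Hypothetical values: one million enrolled entities.
m = 1_000_000
print(expected_duplicates(p_dea=0.001, fnmr_e=0.05, m=m))
print(expected_fakes(p_ia=0.0005, fmr_e=0.01, m=m))
```

Even small attack probabilities and error rates leave a non-trivial number of bad records once m is large, which is why database integrity depends on matcher accuracy at enrollment time.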


Enrollment is system training

• Probabilistic enrollment

- build an access control list of subjects di, i = 1,…,m of some database M.

- association between di and the corresponding biometric βi

- compute likelihood

it expresses how well a subject’s biometric βi matches his or her template Bi

- the probability can only be computed if there exists some machine representation of the real-world biometrics βi; let these representations be another set of templates and write


Enrollment is system training

where, for simplicity, we assume that the match score is the likelihood that di is the true subject, given Bi

• Modeling the world

- Prob (di | Bi) can be approximated by match score si only under very unrealistic circumstances.

- more realistic approximations will have to involve the modeling of other subjects dk enrolled in M; more generally, compute Prob (di |O), the likelihood of subject di given the biometric data O collected at enrollment time


Enrollment is system training

- Prob (O) is the prior probability that this particular observation will occur (which cannot be computed exactly)

- assume Prob (di) = Pd is constant

- evaluating Prob (O|di) is a matter of fitting model di to the data O and determining how well this can be done.

- evaluating the rest of this expression, Prob (O|dk) for k = j+1,…, m, is impossible, because these subjects are not available upon enrollment of dj
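The Bayes computation described on this slide can be sketched as follows. It assumes Prob (di |O) = Prob (O|di) Prob (di) / Prob (O), with Prob (O) expanded as a sum over subjects and a constant prior Pd that cancels. Because subjects enrolled later are unavailable, the sum here covers only subjects enrolled so far, which is a simplifying assumption. The function name and the likelihood values are hypothetical.

```python
def posterior_over_enrolled(likelihoods):
    """Map subject id -> Prob(d_k | O), given Prob(O | d_k) for each subject.

    Assumes a constant prior Prob(d_k) = Pd, which cancels between the
    numerator and the denominator.
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    return {subject: lik / total for subject, lik in likelihoods.items()}


# Hypothetical likelihoods Prob(O | d_k) for three already enrolled subjects.
likelihoods = {"d1": 0.02, "d2": 0.06, "d3": 0.12}
print(posterior_over_enrolled(likelihoods))
```

The denominator is the part that cannot be evaluated exactly in practice, which motivates approximating it with a model of the rest of the population.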


Enrollment is system training

• Modeling the rest of the world — cohorts

- the most difficult issue in training a biometric authentication system is the modeling of data from unknown people.

- voice verification methods not only use a model describing the speaker’s biometric machine representation, but also a model describing all other speakers.

- two techniques to approximate the denominator of (9.7)


Enrollment is system training

1. World modeling

- reduce the set M to one fictitious model subject D, trained on a pool of data from many different speakers, who represent the “world” W of possible speakers.

- factor , so that the denominator reflects the whole population D + di


Enrollment is system training

2. Cohort modeling

- approximate the set M by a subset Mi that resembles subject di. For each subject di, a set of approximate forgeries is computed and stored. We denote this set by Di; it is called the set of cohorts of speaker i.

- factor i = ci, the number of cohorts for di
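The two ways of approximating the denominator can be contrasted in a short sketch. The exact expressions are not preserved in this transcript, so both forms below are one plausible reading rather than the original formulas. The world model replaces all other subjects with a single weighted likelihood Prob (O|D). The cohort model replaces them with the average likelihood over the cohort set Di, weighted by the number of cohorts ci. Function names, the default weight, and the numbers are hypothetical.

```python
from statistics import mean


def world_normalized(lik_subject, lik_world, weight=1.0):
    """Normalize Prob(O | d_i) by the subject plus one weighted world model D."""
    return lik_subject / (lik_subject + weight * lik_world)


def cohort_normalized(lik_subject, cohort_likelihoods):
    """Normalize Prob(O | d_i) by the subject plus its cohort set D_i.

    The cohort term is the mean cohort likelihood weighted by the number
    of cohorts c_i (an assumed form).
    """
    c_i = len(cohort_likelihoods)
    if c_i == 0:
        raise ValueError("need at least one cohort")
    return lik_subject / (lik_subject + c_i * mean(cohort_likelihoods))


# Hypothetical likelihood values.
print(world_normalized(0.08, 0.02))
print(cohort_normalized(0.08, [0.01, 0.03, 0.02]))
```

The world model stores one shared background model, while the cohort model stores a separate set of look-alikes per subject. The second costs more storage but compares each subject against its closest competitors.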


Enrollment is system training

• Updating the probabilities

- denote Prob (di |O) with Pi

- during operation of the authentication system, data from subjects is collected and likelihood Pi could be updated.

- upon authentication of subject di, a new biometric sample is acquired that we denote here as O′.

- compute Prob (di |O, O′)


Enrollment is system training

- what needs to be evaluated is the denominator Prob (O)

- set Prob (di) = Pi
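The update step can be sketched as a Bayes update in which the current likelihood Pi plays the role of the prior. The sketch assumes only two hypotheses (the sample comes from di, or from someone else) and assumes the new sample is conditionally independent of the enrollment data given the identity. Neither assumption is stated on the slides. The function name and the numbers are hypothetical.

```python
def update_likelihood(prior_pi, lik_given_subject, lik_given_others):
    """Return the updated P_i after observing a new authentication sample.

    prior_pi          current P_i = Prob(d_i | O), used as the prior
    lik_given_subject probability of the new sample if it comes from d_i
    lik_given_others  probability of the new sample if it comes from someone else
    """
    # Denominator: total probability of the new sample under both hypotheses.
    evidence = prior_pi * lik_given_subject + (1.0 - prior_pi) * lik_given_others
    if evidence <= 0:
        raise ValueError("the new sample has zero probability under both hypotheses")
    return prior_pi * lik_given_subject / evidence


# Hypothetical values: a sample that fits d_i well raises P_i.
print(update_likelihood(prior_pi=0.9, lik_given_subject=0.5, lik_given_others=0.05))
```

Repeating the update after each authentication lets Pi track how consistently the subject's samples fit the enrolled model.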
