Upload: pierce-goldwyn

Post on 01-Apr-2015

Copyright 2009 bSolv. All rights reserved

Citizen360 Identity Resolution

Introduction to the Identity Resolution (IR) processes

Version 1.0

You should see the system overview before you run this presentation. Click here to launch the overview presentation.

Citizen360 – Introduction to Identity Resolution

Identity Resolution (IR)

This is the process by which computer records are analyzed to find those records which represent the same physical person and to subsequently merge or link those records.

Citizen360 IR Approach

IR Sweep - this batch program identifies possible citizen matches
– Built with a high degree of parallelism - up to 10 instances can run in parallel
– Can be configured to run against customized citizen models
– Runs a configurable IR Algorithm
– Can set the confidence-threshold level (e.g., 70%) below which match results are not reported

IR Algorithm – this is the algorithm, used by the IR Sweep program, that calculates the “match confidence level” between two citizens
– The algorithm results in a single confidence result expressed as a percentage, e.g., 83%
– The algorithm is made up of three major components:
    – Identifier match, e.g., SSN
    – Personal Demographic Data (PDD) match, e.g., names, ages, and gender
    – Location match, e.g., phones, emails, and addresses

Record “Merge”
– This moves the different citizen detail-records under the same “citizen header”
– Although called a “merge” it is really a “link”. The source systems are not forced to be the same
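
To make the role of the confidence threshold concrete, here is a minimal Python sketch of a sweep that scores every pair of citizens and reports only pairs at or above a threshold. It is an illustration, not the Citizen360 implementation: the names `sweep`, `toy_confidence`, and `FIELDS`, the sample data, and the rule that a score exactly equal to the threshold is reported are all assumptions. The stand-in scorer simply counts equal fields, and the sketch runs in a single process rather than in parallel instances.

```python
from itertools import combinations

# Hypothetical field list for the stand-in scorer below.
FIELDS = ("ssn", "last_name", "date_of_birth", "gender")


def toy_confidence(a, b):
    """Stand-in scorer: percentage of FIELDS with identical values."""
    same = sum(1 for f in FIELDS if a[f] == b[f])
    return 100.0 * same / len(FIELDS)


def sweep(citizens, score_pair, threshold=70.0):
    """Score every pair and keep only pairs at or above the threshold."""
    reported = []
    for a, b in combinations(citizens, 2):
        confidence = score_pair(a, b)
        if confidence >= threshold:  # assumption: a score equal to the threshold is reported
            reported.append((a["citizen_id"], b["citizen_id"], confidence))
    # Highest-confidence candidates first.
    return sorted(reported, key=lambda r: r[2], reverse=True)


# Made-up sample data.
citizens = [
    {"citizen_id": "111111", "ssn": "123-45-6789", "last_name": "Smith",
     "date_of_birth": "07/14/1965", "gender": "M"},
    {"citizen_id": "222222", "ssn": "123-45-6789", "last_name": "Smith",
     "date_of_birth": "07/13/1965", "gender": "M"},
    {"citizen_id": "333333", "ssn": "987-65-4321", "last_name": "Jones",
     "date_of_birth": "01/01/1980", "gender": "M"},
]

matches = sweep(citizens, toy_confidence)
print(matches)  # [('111111', '222222', 75.0)]
```

Passing the scorer in as a function mirrors the idea of a configurable IR Algorithm: the sweep itself does not need to know how the confidence is calculated.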

CitizenId: 111111, Master Index: 10001340057
Date of Birth: 07/14/1965, Source: DHS

CitizenId: 222222, Master Index: 10001340065
Date of Birth: 07/14/1965, Source: DSS

CitizenId: 333333, Master Index: 1000130073
Date of Birth: 07/13/1965, Source: DOH

Identity Resolution Process matches: 83%, 82%

Master Index History values: 10001340057, 10001340065, 1000130073

Merging does not change the data – it is still held by the source system

We can continue to use any of the original/historic “master index” values to reference the citizen

Based on Identity Resolution processes we may decide to merge other records…

The data is still unique by source system - but we now know that it is for a common citizen
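
The linking behaviour described on this slide can be sketched in Python as follows. This is a hypothetical model, not the Citizen360 data structure: `DetailRecord`, `CitizenHeader`, `CitizenIndex`, and the choice of CitizenId 111111 as the surviving header are assumptions made for illustration. The identifiers and dates are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DetailRecord:
    source: str          # owning source system, e.g. "DSS"
    date_of_birth: str   # kept exactly as the source reported it


@dataclass
class CitizenHeader:
    citizen_id: str
    master_index: str
    details: List[DetailRecord] = field(default_factory=list)
    history: List[str] = field(default_factory=list)  # master index history


class CitizenIndex:
    """Hypothetical lookup from any master index value to a citizen header."""

    def __init__(self) -> None:
        self._by_master_index: Dict[str, CitizenHeader] = {}

    def add(self, citizen_id: str, master_index: str,
            detail: DetailRecord) -> CitizenHeader:
        header = CitizenHeader(citizen_id, master_index, [detail], [master_index])
        self._by_master_index[master_index] = header
        return header

    def lookup(self, master_index: str) -> CitizenHeader:
        return self._by_master_index[master_index]

    def link(self, survivor: CitizenHeader, other: CitizenHeader) -> None:
        """Move detail records under one header without altering them."""
        survivor.details.extend(other.details)
        other.details = []
        for value in other.history:
            survivor.history.append(value)
            # Historic master index values keep resolving to the citizen.
            self._by_master_index[value] = survivor


index = CitizenIndex()
dhs = index.add("111111", "10001340057", DetailRecord("DHS", "07/14/1965"))
dss = index.add("222222", "10001340065", DetailRecord("DSS", "07/14/1965"))
doh = index.add("333333", "1000130073", DetailRecord("DOH", "07/13/1965"))

index.link(dhs, dss)
index.link(dhs, doh)

citizen = index.lookup("1000130073")
print(citizen.citizen_id, [(d.source, d.date_of_birth) for d in citizen.details])
```

Note that the DOH date of birth stays 07/13/1965 after linking: the records are grouped under a common header, but each source system's value is left as it was.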

IR Algorithm Configuration

Data elements that are compared are given “grades”:

– None: A confirmed non-match

– Approximate: A less exact match, or quite often one or more values are absent (null)

– Close: For example, SSNs that have some digits swapped, dates of birth that are 1 day apart, a name that “sounds like” another name

– Exact: A confirmed exact match

Each data element grade type is given a score between 0 and 1, where 1 corresponds to an Exact match

– The grade scoring is configurable through the user interface

Each data element grade is weighted and applied to the overall score

– The data element weighting is configurable through the user interface
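
One plausible way to combine grade scores and element weights into a single percentage is a normalized weighted average, sketched below in Python. The slides do not give the actual formula, so the combination rule, the grade scores in `GRADE_SCORES`, the weights in `WEIGHTS`, and the function name `match_confidence` are all assumptions chosen for illustration.

```python
# Hypothetical grade scores (configurable in the real system via the UI).
GRADE_SCORES = {"None": 0.0, "Approximate": 0.3, "Close": 0.7, "Exact": 1.0}

# Hypothetical data element weights (also configurable in the real system).
WEIGHTS = {"ssn": 0.5, "name": 0.2, "date_of_birth": 0.2, "address": 0.1}


def match_confidence(grades, grade_scores=GRADE_SCORES, weights=WEIGHTS):
    """Return a confidence percentage from per-element grades.

    `grades` maps a data element name to its grade, e.g. {"ssn": "Exact"}.
    Assumption: the overall score is the weighted average of grade scores.
    """
    if not grades:
        raise ValueError("at least one graded data element is required")
    total_weight = sum(weights[element] for element in grades)
    weighted = sum(weights[element] * grade_scores[grade]
                   for element, grade in grades.items())
    return 100.0 * weighted / total_weight


grades = {
    "ssn": "Exact",
    "name": "Close",
    "date_of_birth": "Close",
    "address": "Approximate",
}
print(round(match_confidence(grades), 1))  # 81.0
```

With these made-up numbers the result is 0.5 × 1.0 + 0.2 × 0.7 + 0.2 × 0.7 + 0.1 × 0.3 = 0.81, i.e. 81%. Changing either table changes the result without touching the function, which is the effect the configurable scoring and weighting would have.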

IR Algorithm Sophistication – a few examples

Can select the preferred phonetic algorithms for different fields, e.g., Soundex, Metaphone, Double Metaphone, Phonex, NYSIIS

The Double Metaphone phonetics comparison is generally the best for names:
– Much more powerful than Soundex
– Can properly handle Eastern European names, e.g., Budjinski
– Considers correct and incorrect pronunciations of names such as “Juan”, e.g., “hwahn” and “jewann”
– Can handle silent B in Bomb and Dumb, etc.

The address-comparison converts the addresses into the best-fit standardized post office address names

– Full and partial address matches

Dates are considered Close matches if they are within a range, only have a single digit difference, or the format is possibly different (US standard - mm/dd/yyyy, compared to US INS - dd/mm/yyyy)

Emails with the same name but different domains are considered Close matches, e.g., [email protected] and [email protected]

Number fields (e.g., SSNs) are considered Close if they just have digits swapped, if they are the same except a digit is missing from one, etc.
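
The Close-match rules for numbers, emails, and dates listed above can be sketched as small Python predicates. These are illustrative readings of the bullets, not the product's rules: the function names, the one-day default for the date range, the restriction to adjacent swapped digits, and the sample values are assumptions.

```python
from datetime import datetime


def digits_transposed(a: str, b: str) -> bool:
    """True if the strings differ only by one swapped pair of adjacent characters."""
    if len(a) != len(b) or a == b:
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]])


def one_digit_missing(a: str, b: str) -> bool:
    """True if deleting one character from the longer string gives the shorter."""
    short, long_ = sorted((a, b), key=len)
    if len(long_) - len(short) != 1:
        return False
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def number_is_close(a: str, b: str) -> bool:
    return digits_transposed(a, b) or one_digit_missing(a, b)


def email_is_close(a: str, b: str) -> bool:
    """Same name before the @, different domain after it."""
    local_a, _, domain_a = a.lower().rpartition("@")
    local_b, _, domain_b = b.lower().rpartition("@")
    return bool(local_a) and local_a == local_b and domain_a != domain_b


def date_is_close(a: str, b: str, max_days: int = 1) -> bool:
    """Compare two mm/dd/yyyy strings; max_days is a hypothetical range."""
    date_a = datetime.strptime(a, "%m/%d/%Y")
    date_b = datetime.strptime(b, "%m/%d/%Y")
    if date_a == date_b:
        return False  # identical dates are an Exact match, not a Close one
    if abs((date_a - date_b).days) <= max_days:
        return True
    if len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
        return True  # single digit difference
    try:
        # Re-read b as dd/mm/yyyy in case the formats differ.
        return datetime.strptime(b, "%d/%m/%Y") == date_a
    except ValueError:
        return False


print(number_is_close("123456789", "123465789"))               # True (digits swapped)
print(number_is_close("123456789", "12346789"))                # True (digit missing)
print(email_is_close("jsmith@example.com", "jsmith@example.org"))  # True
print(date_is_close("07/13/1965", "07/14/1965"))               # True (1 day apart)
print(date_is_close("03/04/1965", "04/03/1965"))               # True (day/month swapped)
```

Each predicate returns False for identical inputs so that an Exact grade can be decided separately from a Close one.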

Copyright 2009 bSolv. All rights reserved. Confidential and Proprietary

THANK YOU

[email protected]

www.bsolv.com

3330 Cumberland Boulevard, Suite 500, Atlanta, GA 30339

Office: +1 678.638.6692