How Does the Postal Service Sort Mail? (sections.maa.org/mddcva/MeetingFiles/Spring2012Meeting/...)

US Postal Service
Bayesian Networks
OCR: Factors
Constructing an Inference Engine

How Does the Postal Service Sort Mail?

Gwyn Whieldon

Hood College

April 14, 2012

Gwyn Whieldon How Does the Postal Service Sort Mail?


Why Automate?
Sorting Through the Mail

The US Postal Service: Facts and Figures

Postal Service Statistics

The Postal Service employs over 574,000 people (making it the second-largest civilian employer in the United States).

The USPS delivers approximately 700 million pieces of mail per day, on average (which is less than used to be sent).

This works out to around 1,200 pieces of mail processed per employee per day, which would be impossible to sort by hand.

This is where technology will come in!
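The per-employee figure can be checked directly from the two numbers quoted on the slide. The short sketch below only reproduces that division; the variable names are illustrative.

```python
# Figures quoted on the slide
employees = 574_000             # "over 574,000 people"
pieces_per_day = 700_000_000    # "approximately 700 million pieces of mail per day"

# Average daily volume per employee
pieces_per_employee = pieces_per_day / employees
print(round(pieces_per_employee))  # about 1220, i.e. "around 1200"
```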


Where’s This Letter Go?

Since 1965, the USPS has been using something called Optical Character Recognition, or OCR for short.

This is where they scan an image of the delivery address on the envelope and convert that address into text.


After reading this address with a machine called a multiline optical character reader (MLOCR), the destination address will be looked up in their database.

With this in hand, the letter is stamped with a printed barcode, which allows it to be automatically sorted, all the way to the delivery person!
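As a rough illustration of the read, look up, and barcode steps, here is a minimal sketch. The slides do not say which barcode is printed or how the database is organized; the dictionary `ADDRESS_DB` and both function names are hypothetical, and POSTNET (an older USPS barcode in which each digit is five bars, two of them tall) is used purely as an example encoding.

```python
# Hypothetical stand-in for the address database: recognized text -> ZIP code.
ADDRESS_DB = {
    "HOOD COLLEGE, 401 ROSEMONT AVE., FREDERICK, MD": "21701",
}

# POSTNET digit patterns: 1 = tall bar, 0 = short bar, two tall bars per digit.
POSTNET_DIGITS = {
    "0": "11000", "1": "00011", "2": "00101", "3": "00110", "4": "01001",
    "5": "01010", "6": "01100", "7": "10001", "8": "10010", "9": "10100",
}

def lookup_zip(ocr_text):
    """Normalize the OCR output and look it up; returns None if unknown."""
    key = " ".join(ocr_text.upper().split())
    return ADDRESS_DB.get(key)

def postnet(zip_code):
    """Encode a ZIP code as POSTNET bars, with check digit and frame bars."""
    # The check digit makes the digit sum a multiple of 10.
    check = (10 - sum(int(d) for d in zip_code) % 10) % 10
    digits = zip_code + str(check)
    return "1" + "".join(POSTNET_DIGITS[d] for d in digits) + "1"

zip_code = lookup_zip("Hood College, 401 Rosemont Ave., Frederick, MD")
bars = postnet(zip_code)
print(zip_code, bars)
```

A sorting machine downstream only needs to read the bars, not the handwriting, which is why the OCR step has to happen just once.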


The Math Behind the Magic

We’d like an algorithm to perform the following task:

Input: Picture/scan of text
Output: Content of text

Hood College
401 Rosemont Ave.
Frederick, MD 21701

We’ll use something called a Bayesian network for the task.


Definitions
Toy Examples

Bayesian Networks: A Definition

Definition (Bayesian Network)

A Bayesian network (also called a directed acyclic graphical model) is a directed, acyclic graph with a node for each random variable, and a directed edge X → Y if Y has a conditional dependence on X.
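A standard consequence of this definition, given here as background rather than as something stated on the slide, is that the joint distribution factors into one conditional probability per node. The notation Pa(X_i), for the set of parents of X_i in the graph, is introduced only for this formula.

```latex
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

A node with no incoming edges contributes an unconditional term P(X_i).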


Toy Bayesian Network: Medical Diagnoses

A → S, A → C

F → S, F → C, F → B

Random Variables, Symptoms:

(S)neezing [0,1]

(C)oughing [0,1]

(B)legh-ing [0,1]

Random Variables, Illnesses:

(A)llergies [0,1]

(F)lu [0,1]
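To see how such a network answers a question like "how likely is flu, given these symptoms?", here is a minimal sketch that sums the joint distribution over all assignments. The edges are the ones listed above; every probability in the tables is a hypothetical number chosen for illustration, not a value from the talk.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the talk).
P_A = 0.20   # P(Allergies = 1)
P_F = 0.05   # P(Flu = 1)
P_S = {(0, 0): 0.05, (0, 1): 0.60, (1, 0): 0.80, (1, 1): 0.90}  # P(S=1 | A, F)
P_C = {(0, 0): 0.05, (0, 1): 0.70, (1, 0): 0.30, (1, 1): 0.80}  # P(C=1 | A, F)
P_B = {0: 0.02, 1: 0.60}                                        # P(B=1 | F)

def bern(p_one, value):
    """Probability that a 0/1 variable equals `value` when P(value = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(a, f, s, c, b):
    """Joint probability for the network A -> S, A -> C, F -> S, F -> C, F -> B."""
    return (bern(P_A, a) * bern(P_F, f) * bern(P_S[(a, f)], s)
            * bern(P_C[(a, f)], c) * bern(P_B[f], b))

def posterior(query, evidence):
    """P(query = 1 | evidence), by summing the joint over all 2^5 assignments."""
    names = ["a", "f", "s", "c", "b"]
    num = den = 0.0
    for values in product([0, 1], repeat=5):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # this assignment contradicts what was observed
        p = joint(**world)
        den += p
        if world[query] == 1:
            num += p
    return num / den

print(posterior("f", {"s": 1, "c": 1}))          # sneezing and coughing
print(posterior("f", {"s": 1, "c": 1, "b": 1}))  # the same, plus "blegh"
```

Brute-force enumeration is fine for five binary variables; larger networks need smarter inference, since the number of assignments doubles with every added variable.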


Singleton Factors
Pairwise Factors
Triplet Factors
Other Inference Engine Bits

Bayesian Networks and OCR

When we’re trying to convert images of text into the text itself, we’re going to make a simplifying assumption: that we’ve already broken up the text into characters.

Then the simplest form of our Bayesian network looks like:
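The figure for this slide is not reproduced in the transcript. Assuming the simplest network ties each character only to its own image (the "singleton factors" case), decoding reduces to picking the best letter for each image independently. The scores below are hypothetical; in practice they would come from an image classifier.

```python
# Hypothetical per-image scores P(letter | image) for a three-character word.
singleton_scores = [
    {"q": 0.6, "g": 0.4},
    {"a": 0.55, "u": 0.45},
    {"i": 0.7, "l": 0.3},
]

def decode_singleton(scores):
    """Pick the most probable letter for each image, ignoring its neighbors."""
    return "".join(max(s, key=s.get) for s in scores)

print(decode_singleton(singleton_scores))  # -> qai
```

Because each position is decided in isolation, nothing stops the decoder from producing an unlikely letter sequence such as "qa".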


In reality though, not all pairs are created equal.

[Figure: two character nodes, X1 → q and X2 → {a, u}.]

P(“qu”) > P(“qa”), and our conditional probability should reflect this!
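One way to let the pair probabilities influence the result is to score whole strings and keep the best one. The sketch below does this with the Viterbi algorithm, which is a choice made for this example; the talk does not say which inference method was used, and all numbers are hypothetical.

```python
import math

# Hypothetical image scores P(letter | image) for two characters.
singleton_scores = [
    {"q": 0.9, "g": 0.1},
    {"a": 0.55, "u": 0.45},
]
# Hypothetical pairwise factor P(next letter | previous letter); "qu" beats "qa".
pairwise = {
    ("q", "u"): 0.95, ("q", "a"): 0.05,
    ("g", "u"): 0.40, ("g", "a"): 0.60,
}

def decode_pairwise(scores, pair):
    """Most probable string under singleton plus pairwise factors (Viterbi)."""
    # best[letter] = (log-score of the best string ending in letter, that string)
    best = {c: (math.log(p), c) for c, p in scores[0].items()}
    for position in scores[1:]:
        new_best = {}
        for c, p in position.items():
            new_best[c] = max(
                (prev_score + math.log(pair[(prev, c)]) + math.log(p), text + c)
                for prev, (prev_score, text) in best.items()
            )
        best = new_best
    return max(best.values())[1]

print(decode_pairwise(singleton_scores, pairwise))  # -> qu
```

With these numbers the second image alone slightly favors "a", but the pairwise factor outweighs it and the decoder returns "qu".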


We can go one step further and consider triplet factors. For an alphabet of 26 letters though, this gives 26^3 = 17,576 different conditional probabilities we’d have to record per triple of letters in a word, which is not desirable!

Take the top 2000 instead.
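A minimal sketch of that pruning step: count how many letter triples are possible, then keep only the most frequent ones seen in some text sample. The function name `top_trigrams` and the four-word corpus are hypothetical; the talk does not describe how its top 2000 were chosen.

```python
from collections import Counter

ALPHABET_SIZE = 26
print(ALPHABET_SIZE ** 3)  # number of possible letter triples

def top_trigrams(words, k=2000):
    """Count letter triples in a word list and keep the k most frequent."""
    counts = Counter(
        word[i:i + 3] for word in words for i in range(len(word) - 2)
    )
    return dict(counts.most_common(k))

# Tiny hypothetical corpus; a real table would use a large text sample.
corpus = ["quick", "quiet", "quilt", "equip"]
print(top_trigrams(corpus, k=2))
```

Triples outside the kept set can then share a single small default probability instead of each needing its own entry.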


Constructing an Inference Engine

We can see that this still didn’t guarantee 100% accuracy. However, this was a fairly simplistic model, and our inference engine wasn’t optimized for our “handwriting”.

We can add “similarity factors”, which increase the probability that similarly written characters will be given the same values.

Our character and word accuracy for each of these was given by:

                    charAcc   wordAcc
singletonFactors    0.767     0.220
pairwiseFactors     0.792     0.260
tripletFactors      0.800     0.340
similarityFactors   0.816     0.370
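The slides do not define the two metrics. A plausible reading, assumed here, is that charAcc is the fraction of individual characters recognized correctly and wordAcc is the fraction of words with every character correct. The sketch below uses made-up predictions only to show why word accuracy is so much lower than character accuracy.

```python
def char_accuracy(predicted, truth):
    """Fraction of characters, across all words, that were predicted correctly."""
    correct = sum(
        p == t for pw, tw in zip(predicted, truth) for p, t in zip(pw, tw)
    )
    total = sum(len(tw) for tw in truth)
    return correct / total

def word_accuracy(predicted, truth):
    """Fraction of words predicted with every character correct."""
    return sum(pw == tw for pw, tw in zip(predicted, truth)) / len(truth)

# Hypothetical predictions; each word has the same length as its true value,
# matching the assumption that the text is already split into characters.
truth = ["hood", "mail", "sort"]
predicted = ["hood", "mall", "sart"]
print(char_accuracy(predicted, truth), word_accuracy(predicted, truth))
```

A single wrong character spoils the whole word, so a model can get most characters right while still missing most words, as in the table above.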


Wrap-Up
Thanks

Constructing an Inference Engine

You can buy programs which “train” themselves to read your writing perfectly.

OCR systems typically use different methods for handwriting vs. printed block text, à la Google Books.

For tablet writing, they often add in “stroke analysis”, meaning that how you write a character is as important as what you write.


Thanks!

Thanks to the Organizers for the opportunity to speak!

Acknowledgements: This talk came out of a programming assignment in the Stanford online course:

“Probabilistic Graphical Models” by Daphne Koller

While I coded the factor constructions, the overall code structure and inference engine are from her course materials.

I would highly recommend this course to anyone interested in this material!
