OCRdroid: A Framework to Digitize Text Using Mobile Phones

Authors: Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme
University of Southern California

Presenter: Mi Zhang


Post on 14-Jan-2016



TRANSCRIPT

Page 1: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

OCRdroid: A Framework to Digitize Text Using Mobile Phones

Authors: Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme
University of Southern California

Presenter: Mi Zhang

Page 2: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Outline

What is OCRdroid?

Related Work

Design Considerations

System Architecture

Experimental Results

Summary

Page 3: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

What is OCRdroid?

Why?
  Huge demand for recognizing text in camera-captured pictures
  Mobile phones are ubiquitous and powerful

What?
  OCRdroid = OCR + Mobile Phone
  Two applications:
    PocketPal: Personal Receipt Management Tool
    PocketReader: Personal Mobile Screen Reader

Page 4: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Related Work

Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen

Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel

A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi

Page 5: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Design Considerations

Real-Time Processing

Lighting Conditions

Text Skew

Perception Distortion (Tilt)

Text Misalignment

Blur (Out Of Focus)

Page 6: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Real-Time Processing

Issues :
  Limited memory
  Relatively low processing power
  Quick response required

Our Solutions :
  Multi-threaded system architecture
  Image compression
  Computationally efficient algorithms
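
The multi-threaded idea can be illustrated with a minimal sketch. It is not the OCRdroid implementation: the slides do not describe the thread layout, so the single worker thread and the names `run_pipeline` and `process` are assumptions made for illustration. The point is only that slow work runs off the calling thread, so handing over a frame returns immediately.

```python
import queue
import threading


def run_pipeline(frames, process):
    """Run `process` on every frame in a worker thread (hypothetical helper)."""
    jobs = queue.Queue()
    results = []

    def worker():
        while True:
            frame = jobs.get()
            if frame is None:  # sentinel: no more frames will arrive
                break
            results.append(process(frame))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for frame in frames:
        jobs.put(frame)  # returns at once; the caller is never blocked by `process`
    jobs.put(None)
    thread.join()  # only this sketch waits here; a UI thread would not
    return results
```

With a single worker and a FIFO queue, results come back in the order the frames were submitted.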

Page 7: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Lighting Conditions

Issues : Uneven Lighting (Shadows, Reflection, Flooding, etc.)

Page 8: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Lighting Conditions

Our Solution : Local Binarization : Fast Sauvola’s Algorithm
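
Sauvola's method thresholds each pixel against its own neighbourhood: T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The sketch below shows one common way to make this fast, using summed-area tables so that every window costs constant time. The slides do not give the window size or constants, so `radius`, `k` and `R` are assumed values, and this is not necessarily how the authors implemented their fast variant.

```python
import math


def integral(img, power=1):
    """Summed-area table of img**power, padded with a zero row and column."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x] ** power
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def window_sum(table, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 in constant time."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def sauvola(img, radius=7, k=0.2, R=128.0):
    """Binarize a grayscale image (list of rows): 1 = text, 0 = background."""
    h, w = len(img), len(img[0])
    s1, s2 = integral(img, 1), integral(img, 2)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = window_sum(s1, y0, x0, y1, x1) / n
            var = max(0.0, window_sum(s2, y0, x0, y1, x1) / n - mean * mean)
            threshold = mean * (1 + k * (math.sqrt(var) / R - 1))
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

Because the threshold follows the local mean, a shadowed region and a brightly lit region are each judged against their own background level, which is what makes local binarization suited to uneven lighting.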

Page 9: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Skew

Issues : When perspective is not fixed, text lines may get skewed from their original orientation

Page 10: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Skew

Our Solution : Branch-and-Bound text line finding algorithm + Auto-rotation
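
The branch-and-bound line finder itself is not described in the slides, so the sketch below swaps in a much simpler stand-in to show the auto-rotation step only: it fits a least-squares line through points assumed to lie on one text baseline, reads off the skew angle, and rotates by the opposite angle. The functions `skew_angle` and `rotate` are hypothetical names, and the baseline points are assumed to be given.

```python
import math


def skew_angle(points):
    """Angle in degrees of the least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


def rotate(points, degrees):
    """Rotate (x, y) points about the origin by the given angle in degrees."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def deskew(points):
    """Rotate the points so that their fitted baseline becomes horizontal."""
    return rotate(points, -skew_angle(points))
```

A real pipeline would apply the same rotation to the whole image rather than to a handful of points.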

Page 11: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Perception Distortion (Tilt)

Issues :
  Occurs when the text plane is not parallel to the imaging plane
  Mobile phones are susceptible to tilts
  Even a small perception distortion causes OCR to fail

Page 12: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Perception Distortion (Tilt)

Our Solution :
  Use embedded orientation sensor (pitch and roll)
  Calibration
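
A minimal sketch of how pitch and roll readings could gate image capture. The class `TiltGuard` and its methods are hypothetical, and the readings are assumed to arrive in degrees. The default tolerance of 10 degrees is borrowed from the perception-distortion figure reported in the experimental results; whether the application uses that value as its threshold is an assumption.

```python
class TiltGuard:
    """Allow capture only while the phone stays near its calibrated pose."""

    def __init__(self, tolerance_deg=10.0):
        self.tolerance_deg = tolerance_deg
        self.pitch0 = 0.0
        self.roll0 = 0.0

    def calibrate(self, pitch_deg, roll_deg):
        """Store the reading taken while the phone is held parallel to the text."""
        self.pitch0 = pitch_deg
        self.roll0 = roll_deg

    def is_level(self, pitch_deg, roll_deg):
        """True when both angles are within tolerance of the calibrated pose."""
        return (abs(pitch_deg - self.pitch0) <= self.tolerance_deg
                and abs(roll_deg - self.roll0) <= self.tolerance_deg)
```

Calibration matters because a sensor rarely reads exactly zero when the phone is flat, so the check compares against the stored baseline instead of against zero.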

Page 13: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Misalignment

Issues :
  Camera screen covers only part of the text region
  Irregular shapes of text characters

Page 14: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Misalignment

Our Solution : Step #1 : Modified version of Sauvola’s algorithm

[Figure: image with its top, right, left, and bottom borders labeled]

Page 15: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Misalignment

Our Solution : Step #1 (cont.) : Routes to perform Sauvola’s algorithm

Page 16: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Text Misalignment

Our Solution : Step #2 : Noise Reduction

[Figure: image with its top, right, left, and bottom borders labeled; a dimension W is marked]
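
The two steps can be pictured with a simplified sketch: once the border regions are binarized, text that reaches into a border strip suggests the text has been cut off, and a minimum pixel count filters out isolated noise. This is an illustration of the idea, not the authors' algorithm. The function name `cut_borders`, the strip `width`, and the `min_pixels` noise threshold are all assumptions, and the input is assumed to be a binary image with 1 marking text pixels.

```python
def cut_borders(binary, width=2, min_pixels=3):
    """Return the borders whose strip holds at least `min_pixels` text pixels."""
    h, w = len(binary), len(binary[0])
    strips = {
        "top": [binary[y][x] for y in range(width) for x in range(w)],
        "bottom": [binary[y][x] for y in range(h - width, h) for x in range(w)],
        "left": [binary[y][x] for y in range(h) for x in range(width)],
        "right": [binary[y][x] for y in range(h) for x in range(w - width, w)],
    }
    # A strip with only a few stray pixels is treated as noise, not as text.
    return [name for name, strip in strips.items() if sum(strip) >= min_pixels]


def is_aligned(binary, width=2, min_pixels=3):
    """True when no border strip appears to contain text."""
    return not cut_borders(binary, width, min_pixels)
```

Only the four strips are inspected, which keeps the check cheap compared with processing the whole frame.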

Page 17: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Blur (Out Of Focus)

Issues : OCR needs sharp edge response

Page 18: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Blur (Out Of Focus)

Our Solution : Android autofocus mechanism

Page 19: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

System Architecture

[Diagram components: Android Phone, Internet, Web Server, OCR Engine (Tesseract)]

1. Photo of a receipt
2. Front-end processing
3. Upload image
4. Perform back-end processing & OCR
5. Return OCR results
6. Results returned
7. Information extraction

Page 20: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Front-End Architecture

[Diagram components: Camera Preview, Orientation Handler, Alignment Checker, Image Upload, OCR Data Receiver, Information Extraction, Mobile Database, Internet]

[Diagram edge labels: Capture, Improper Alignment Detected, Proper Alignment Detected]

Page 21: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Back-End Architecture

[Diagram components: Internet, Store Image, Skew Detection & Auto-rotation, Binarization, Tesseract OCR Engine, OCR Text Output, Sends Results back to Mobile Device]

Page 22: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Experimental Results

Test Corpus
  Ten distinct black & white images

Three distinct lighting conditions
  Normal: Adequate light
  Poor: Dim light
  Flooding: Light source focused on a particular portion of the image

Performance Metrics
  Character Accuracy
  Word Accuracy
  Timing
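
The slides do not define how character and word accuracy were computed. One common definition, assumed here, is 1 minus the edit distance between the OCR output and the ground truth, divided by the ground-truth length, counted over characters or over words. The function names below are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def accuracy(truth, ocr):
    """1 - errors / len(truth), floored at 0."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1 - edit_distance(truth, ocr) / len(truth))


def char_accuracy(truth, ocr):
    """Accuracy counted over characters."""
    return accuracy(truth, ocr)


def word_accuracy(truth, ocr):
    """Accuracy counted over whitespace-separated words."""
    return accuracy(truth.split(), ocr.split())
```

A single misread character costs little in character accuracy but invalidates the whole word, so word accuracy is usually the stricter of the two.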

Page 23: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Experimental Results

Binarization (measured by character accuracy):
  Normal: Around 97%
  Poor: Around 60%
  Flooding: Around 60%

Skew tolerance: Up to 30 degrees

Perception Distortion: Up to 10 degrees

Page 24: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Experimental Results

Misalignment Detection:

Timing Performance:
  Misalignment Detection: Less than 6 seconds
  Overall Process: Less than 11 seconds

Page 25: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

More Information

Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php
  Test Cases & Results
  Demo Video
  Paper
  Presentation Slide
  Tools Information (Mobile Phone + Software)

Page 26: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Summary

OCRdroid – A Generic Framework for Developing OCR-based Applications on Mobile Phones

Six Design Considerations & Our Solutions
  In particular, we propose a new real-time, computationally efficient algorithm for text misalignment detection

Experimental Results

Page 27: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Questions?

Page 28: OCRdroid : A Framework to Digitize  Text Using Mobile Phones

Thank You