OCRdroid: A Framework to Digitize Text Using Mobile Phones
Authors: Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme
University of Southern California
Presenter: Mi Zhang
Outline
What is OCRdroid?
Related Work
Design Considerations
System Architecture
Experimental Results
Summary
What is OCRdroid?
Why?
  Huge demand for recognizing text in camera-captured pictures
  Mobile phones are ubiquitous and powerful
What?
  OCRdroid = OCR + Mobile Phone
  Two Applications:
    PocketPal: Personal Receipt Management Tool
    PocketReader: Personal Mobile Screen Reader
Related Work
Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen
Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel
A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi
Design Considerations
Real-Time Processing
Lighting Conditions
Text Skew
Perception Distortion (Tilt)
Text Misalignment
Blur (Out of Focus)
Real-Time Processing
Issues:
  Limited memory
  Relatively low processing power
  Quick response required
Our Solutions:
  Multi-Thread System Architecture
  Image Compression
  Computationally Efficient Algorithms
Lighting Conditions
Issues: Uneven Lighting (Shadows, Reflection, Flooding, etc.)
Our Solution: Local Binarization using Fast Sauvola's Algorithm
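To make the local binarization idea concrete, here is a minimal sketch of Sauvola thresholding in pure Python. Each pixel is compared against a threshold T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the surrounding window. The slides do not say how the "fast" variant is implemented; this sketch assumes the common integral-image approach, and the window size, k, and R values are illustrative assumptions rather than the settings used in OCRdroid.

```python
import math

def sauvola_binarize(gray, window=15, k=0.2, R=128.0):
    """Binarize a grayscale image given as rows of ints in 0..255.

    Returns rows of 0 (text/black) and 1 (background/white).
    window, k and R are illustrative defaults, not OCRdroid's settings.
    """
    h, w = len(gray), len(gray[0])
    # Integral images of values and squared values, padded with a zero row/column,
    # so any window sum costs four lookups regardless of window size.
    S = [[0] * (w + 1) for _ in range(h + 1)]
    S2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = row_sq = 0
        for x in range(w):
            v = gray[y][x]
            row_sum += v
            row_sq += v * v
            S[y + 1][x + 1] = S[y][x + 1] + row_sum
            S2[y + 1][x + 1] = S2[y][x + 1] + row_sq

    half = window // 2
    out = [[1] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            total = S[y1][x1] - S[y0][x1] - S[y1][x0] + S[y0][x0]
            total_sq = S2[y1][x1] - S2[y0][x1] - S2[y1][x0] + S2[y0][x0]
            mean = total / n
            std = math.sqrt(max(total_sq / n - mean * mean, 0.0))
            threshold = mean * (1.0 + k * (std / R - 1.0))
            out[y][x] = 0 if gray[y][x] <= threshold else 1
    return out
```

Because the threshold follows the local mean, a shadowed region and a brightly lit region each get their own cutoff, which is the property that matters for uneven lighting.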
Text Skew
Issues: When the perspective is not fixed, text lines may get skewed from their original orientation
Our Solution: Branch-and-Bound text line finding algorithm + Auto-rotation
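The auto-rotation step can be illustrated with a much simpler stand-in for the line finder. The sketch below does not implement the branch-and-bound text line finding algorithm named above; it assumes baseline points for one text line are already available, fits a least-squares line through them to estimate the skew angle, and rotates by the negative of that angle. The function names are hypothetical.

```python
import math

def estimate_skew_degrees(points):
    """Angle (degrees) of the least-squares line through baseline points [(x, y), ...].

    A simple stand-in for a real text-line finder, not the branch-and-bound method.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))

def rotate_points(points, degrees):
    """Apply the standard 2-D rotation matrix about the origin to each point."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]
```

Rotating by `-estimate_skew_degrees(points)` brings the fitted baseline back to horizontal; a full implementation would apply the same rotation to the image pixels.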
Perception Distortion (Tilt)
Issues:
  Occurs when the text plane is not parallel to the imaging plane
  Mobile phones are susceptible to tilts
  Even a small Perception Distortion causes OCR to fail
Our Solution:
  Use Embedded Orientation Sensor (Pitch and Roll)
  Calibration
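A tilt check based on pitch and roll can be as small as the sketch below. The function name, the calibration offsets, and the 10-degree default are assumptions for illustration: the default borrows the tolerance reported in the experimental results, and the slides do not state what threshold the application itself enforces.

```python
def is_within_tilt_tolerance(pitch_deg, roll_deg,
                             pitch_offset=0.0, roll_offset=0.0,
                             max_tilt_deg=10.0):
    """Return True when the calibrated pitch and roll are both within tolerance.

    pitch_offset / roll_offset model a calibration step: the sensor readings
    recorded while the phone lies flat (hypothetical parameters).
    """
    pitch = pitch_deg - pitch_offset
    roll = roll_deg - roll_offset
    return abs(pitch) <= max_tilt_deg and abs(roll) <= max_tilt_deg
```

A camera preview could call this on every sensor update and only enable capture while it returns True.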
Text Misalignment
Issues:
  Camera screen covers only part of the text region
  Irregular shapes of text characters
Our Solution:
  Step 1: Modified version of Sauvola's algorithm
  [Figure: top, right, bottom, and left border regions of the image]
  Step 1 (cont.): Routes to perform Sauvola's algorithm
  Step 2: Noise Reduction
  [Figure: top, right, bottom, and left border regions, labeled W, with scattered noise pixels]
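One way to read the two steps above is: binarize only the border strips, then report misalignment when a strip still contains text pixels after noise is discounted. The sketch below assumes that reading. It takes an already binarized image, treats W as the border width, and stands in for noise reduction with a simple minimum pixel count; the function names and the `min_black_pixels` parameter are hypothetical, and the slides do not specify the actual noise reduction method.

```python
def count_black(binary, y0, y1, x0, x1):
    """Count black (0) pixels in the rectangle rows y0..y1-1, columns x0..x1-1."""
    return sum(1 for y in range(y0, y1) for x in range(x0, x1) if binary[y][x] == 0)

def misaligned_borders(binary, border_width, min_black_pixels=3):
    """Return the sorted names of border strips that appear to contain text.

    binary: rows of 0 (black/text) and 1 (white/background).
    A strip is flagged when it holds at least min_black_pixels black pixels,
    a crude stand-in for the noise reduction step.
    """
    h, w = len(binary), len(binary[0])
    bw = min(border_width, h, w)
    counts = {
        "top": count_black(binary, 0, bw, 0, w),
        "bottom": count_black(binary, h - bw, h, 0, w),
        "left": count_black(binary, 0, h, 0, bw),
        "right": count_black(binary, 0, h, w - bw, w),
    }
    return sorted(name for name, c in counts.items() if c >= min_black_pixels)
```

An empty result means no text touches the borders, so the frame can be captured; otherwise the returned names indicate which way the user should move the camera.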
Blur (Out of Focus)
Issues: OCR needs a sharp edge response
Our Solution: Android autofocus mechanism
System Architecture
[Diagram: Android Phone, Internet, Web Server, OCR Engine (Tesseract)]
1. Photo of a receipt
2. Front-end processing
3. Upload image
4. Perform back-end processing & OCR
5. Return OCR results
6. Results returned
7. Information extraction
Front-End Architecture
[Diagram: Camera Preview, Orientation Handler, Alignment Checker, Capture, Image Upload, OCR Data Receiver, Information Extraction, Mobile Database, Internet; transitions labeled "Improper Alignment Detected" and "Proper Alignment Detected"]
Back-End Architecture
[Diagram: Internet, Store Image, Skew Detection & Auto-rotation, Binarization, Tesseract OCR Engine, OCR Text Output, Sends Results back to Mobile Device]
Experimental Results
Test Corpus:
  Ten distinct black & white images
  Three distinct lighting conditions
    Normal: Adequate light
    Poor: Dim
    Flooding: Light source focused on a particular portion of the image
Performance Metrics:
  Character Accuracy
  Word Accuracy
  Timing
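The slides name the accuracy metrics without defining them. A common convention, assumed here, is to compute accuracy as (n - errors) / n, where n is the length of the ground truth and errors is the edit distance between ground truth and OCR output, counted over characters for character accuracy and over whitespace-separated words for word accuracy. The function names below are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]

def accuracy(truth, ocr):
    """(n - errors) / n, clamped at 0, where n is the ground-truth length."""
    if len(truth) == 0:
        return 1.0 if len(ocr) == 0 else 0.0
    return max(0.0, (len(truth) - edit_distance(truth, ocr)) / len(truth))

def character_accuracy(truth_text, ocr_text):
    return accuracy(truth_text, ocr_text)

def word_accuracy(truth_text, ocr_text):
    return accuracy(truth_text.split(), ocr_text.split())
```

A single misread character lowers character accuracy only slightly but costs a whole word in word accuracy, which is why the two metrics are usually reported together.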
Binarization (measured by Character Accuracy):
  Normal: Around 97%
  Poor: Around 60%
  Flooding: Around 60%
Skew tolerance: Up to 30 degrees
Perception Distortion: Up to 10 degrees
Misalignment Detection:
Timing Performance:
  Misalignment Detection: Less than 6 seconds
  Overall Process: Less than 11 seconds
More Information
Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php
  Test Cases & Results
  Demo Video
  Paper
  Presentation Slides
  Tools Information (Mobile Phone + Software)
Summary
OCRdroid: A Generic Framework for Developing OCR-based Applications on Mobile Phones
Six Design Considerations & Our Solutions
  In particular, we present a new real-time, computationally efficient algorithm for text misalignment detection
Experimental Results
Questions?
Thank You