BrailleOCR: An Open Source Document to Braille Converter Application
DESCRIPTION
This presentation is about BrailleOCR, an open source application that converts scanned documents to Braille to help the visually impaired. What is its use in real life? BrailleOCR is currently the only app that integrates optical character recognition and Braille translation, and it should eventually help convert many important documents to Braille.
IJCA Paper: http://www.ijcaonline.org/archives/volume68/number16/11664-7254
Project site: https://code.google.com/p/brailleocr/
The app uses a four-step process. The input is a scanned RGB image. The first step, pre-processing, converts the RGB image to grayscale. The second step performs character recognition using the Tesseract engine. Because recognition may introduce errors, the third step, post-processing, corrects them. The final and most important step is Braille conversion.
TRANSCRIPT
An Open Source Tesseract based Tool for Extracting Text from Images with Application
in Braille Translation
Pijush Chakraborty [Roll: 32]
Calcutta Institute of Engineering and Management
CS681 Seminar
Introduction
Contribution of the Application in real life:
◦ Our application integrates an OCR engine with Braille Translation.
◦ BrailleOCR is currently the only application that supports conversion of an image document to Braille format.
◦ It will help in converting large documents to Braille format and eventually help many visually impaired people.
◦ Project site: code.google.com/p/brailleocr
◦ DOI IJCA Paper reference: 10.5120/11664-7254
Open Source APIs used:
◦ Tesseract Engine [open-source OCR engine]
◦ Tess4J API [JNA wrapper for using Tesseract with Java]
◦ JOrtho API [Java open-source spell checking API]
◦ Swing Graphics API
Introduction: Use of our Application
Introduction: BrailleOCR GUI
Methodology
Conversion of an Image Document to Braille consists of the following steps:
Methodology: Steps to be Followed
Fig. 1. Steps to be Followed
Pre Processing Step
Pre Processing Steps:
◦ Conversion to grayscale
◦ Conversion of grayscale image to binary
◦ The second sub-step is handled by Tesseract using adaptive threshold.
Reason for Grayscale conversion:
◦ Increases the accuracy in the Recognition step as stated in Ref. [2].
◦ Table 1 gives the Accuracy rate for certain input images.
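The grayscale-to-binary sub-step can be illustrated with a minimal local-mean adaptive threshold. This is only a sketch in Python for illustration: Tesseract performs this step internally and its actual method may differ, and the function name, the `window` size and the `offset` parameter are assumptions, not part of BrailleOCR.

```python
def adaptive_threshold(gray, window=3, offset=0):
    """Binarize a grayscale image given as a list of rows of ints (0..255).

    A pixel becomes 1 (ink) when it is darker than the mean of its local
    window minus `offset`, otherwise 0 (background). Illustrative only.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    radius = window // 2
    binary = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window at the image borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            total = sum(gray[j][i] for j in ys for i in xs)
            mean = total / (len(ys) * len(xs))
            row.append(1 if gray[y][x] < mean - offset else 0)
        binary.append(row)
    return binary
```

Because the threshold is computed per neighborhood rather than once for the whole page, a dark character pixel can still be separated from its background when the scan is unevenly lit.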
Pre Processing: Image Type
Input Image        No. of Images    Accuracy
Color Image        10               89%
Grayscale Image    10               93%
Table 1: Accuracy of Tesseract
Different Algorithms available:
◦ Averaging
◦ Luminosity method
Luminosity method Benefits:
◦ Human perception is more sensitive to green than to red, and more sensitive to red than to blue.
◦ The weight of the green color component is highest, followed by red and then blue, i.e. weight of color channel ∝ sensitivity.
Algorithm Used:
The color image can be represented as a discrete function f(xi,yj), 0<=i<N, 0<=j<M, where N is the height of the image and M is the width of the image.
for i=0 to N-1
    for j=0 to M-1
        gr(xi,yj) = 0.299*r(xi,yj) + 0.587*g(xi,yj) + 0.114*b(xi,yj)
Here gr(xi,yj) is the grayscale image pixel, r(xi,yj) is the red channel, g(xi,yj) is the green channel and b(xi,yj) is the blue channel.
Pre Processing: Grayscale Conversion
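The luminosity method above can be sketched in a few lines. This is an illustrative Python version, not the application's own code; the function name and the representation of the image as a list of rows of (r, g, b) tuples are assumptions.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0..255) to rows of gray levels.

    Uses the luminosity weights from the slide: 0.299 R, 0.587 G, 0.114 B.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# Pure green maps to a brighter gray than pure red, and red brighter than blue.
sample = [[(0, 255, 0), (255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(sample)
```

The three weights sum to 1, so white stays white and black stays black, while the ordering green > red > blue mirrors the sensitivity argument on the slide.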
Pre Processing: Implementing the Algorithm
Fig. 2. Scanned Image
Fig. 3. Grayscale Image
Text Extraction Step
What is Optical Character Recognition?
◦ Conversion of Scanned Image document to Machine Encoded Text.
◦ Useful in keeping backup of important documents as text format.
Brief History:
◦ 1929-1975: OCR without Electronic computers
◦ 1985-2000: Development in OCR for computers
◦ 2000-2013: Developments of industrial standard OCR
Text Extraction: What is OCR?
Fig. 4. OCR implementation
Tesseract is currently regarded as one of the most accurate Open Source OCR Engines.
Developed at HP between 1984 and 1994.
Released as open source in 2005; since then Google has taken over the Project.
Google recently launched Tesseract v3.0.
Used with Java Applications through the JNA wrapper Tess4J.
Project site: code.google.com/p/tesseractocr
Text Extraction: Tesseract History
Get outlines by connected component analysis.
Organize outlines into Blobs.
Organize Blobs into Text Lines.
Characters are chopped and features are extracted.
Text Extraction: Tesseract Architecture
Fig. 5. Architecture
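As an illustration of the first stage, connected component analysis on a binary image can be sketched as a flood fill that labels each group of touching ink pixels. This is a generic Python sketch, not Tesseract's implementation (which works on outlines and is considerably more elaborate); the function name and the choice of 8-connectivity are assumptions.

```python
from collections import deque


def connected_components(binary):
    """Label 8-connected groups of 1-pixels in a list-of-rows binary image.

    Returns (count, labels), where labels has the same shape as the input
    and holds 0 for background or the component number (1..count).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                # Breadth-first flood fill over the 8 neighbors.
                while queue:
                    cy, cx = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < height and 0 <= nx < width
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return count, labels
```

Each labeled component corresponds roughly to one blob candidate, which later stages would group into text lines.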
Features are extracted using polygonal approximation.
Features are matched against prototypes to find matching patterns.
The image is processed in two passes; the adaptive classifier uses what it learned in the first pass to get a better result in the second.
Text Extraction: Tesseract Character Recognition
Fig. 6. Prototype Matching
Post Processing Step
Why Post Processing?
◦ Corrects errors in the previous step
◦ Gives error free text for Braille Conversion
◦ Spell checking systems provide the best results for post processing step.
JOrtho API
◦ JOrtho is an open source Java spell checking API that gives suggestions for commonly misspelled words in the text.
◦ The key algorithms include phonetic matching algorithms such as Soundex
◦ Project site: jortho.sourceforge.net
Post Processing: Correcting the Text
Soundex Code:
◦ The Soundex code of a word is a letter followed by 3 digits, computed using the algorithm below.
Algorithm:
◦ Retain the first letter of the name and drop all other occurrences of a, e, i, o, u, y, h, w.
◦ Replace consonants with digits as follows (after the first letter):
  b, f, p, v = 1
  c, g, j, k, q, s, x, z = 2
  d, t = 3
  l = 4
  m, n = 5
  r = 6
◦ Two adjacent letters with the same number are coded as a single number. Two letters with the same number separated by 'h' or 'w' are also coded as a single number.
◦ Keep only the first 3 digits; if there are fewer than 3, pad with zeros.
Post Processing: Soundex Algorithm
Example: “Metacalt” and “Metacalf” return the same code, M324, as they are phonetically similar.
Fig. 7. Spell Checking
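The Soundex rules listed above can be sketched as follows. This is an illustrative Python version of the standard algorithm, not JOrtho's code; the function name is an assumption.

```python
def soundex(word):
    """Return the 4-character Soundex code (letter + 3 digits) of a word."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for letter in letters:
            codes[letter] = digit

    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are dropped and do not separate equal codes.
            continue
        code = codes.get(ch, "")  # vowels and y have no code
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so a repeat is coded again
    # Truncate to three digits, or pad with zeros if there are fewer.
    return (result + "000")[:4]


print(soundex("Metacalf"), soundex("Metacalt"))  # M324 M324
```

Since both spellings map to the same code, a spell checker can propose the dictionary word as a suggestion for the misrecognized one.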
Braille Translation Step
History of Braille:
◦ Invented by Louis Braille in the 19th century
◦ Accepted throughout the world as a form of written communication for blind individuals
◦ There have been some modifications to the Braille system, such as the inclusion of contractions (contracted words).
Use of Braille:
◦ Braille is the primary reading and writing system used by the visually impaired.
◦ Helps in increasing literacy among the visually impaired.
◦ In the modern world, Braille technologies are supported by various electronic devices.
Braille Cell:
◦ Braille cells are 6-dot cells in which each dot is either raised or flat.
◦ 64 possible combinations (2^6).
◦ Used in Refreshable Braille Displays
What is Braille?
Fig. 9. six-dot Braille cell
Fig. 8. Braille Refreshable Display
Braille Details:
◦ Grade 1 and Grade 2 are the most commonly used.
◦ Grade 1 Braille includes single letters and numbers, while Grade 2 Braille also includes contractions for words such as for, with, you, etc.
◦ Digits (1 to 9, then 0) are denoted by the letters (a to i, then j) preceded by the number sign cell.
◦ Letter groups and common words (e.g. and, with, wh, the, th…) have separate Braille representations.
◦ Uppercase letters are preceded by a Braille cell denoting a capital letter.
Braille: Braille Types
Fig. 10. Braille representations
Braille ASCII:
◦ Subset of the ASCII character set.
◦ Contains all 64 Braille representations (6-dot cell).
◦ Maps ASCII input one-to-one to Braille code.
◦ Widely supported by Braille embossers.
◦ It uses ASCII codes to send information to Braille displays.
Braille Patterns:
◦ Braille Patterns are Unicode characters that represent Braille cells.
◦ Consists of 256 combinations of the 8-dot Braille cell. We require only 64.
◦ Braille embossers and Braille displays have recently been upgraded to support Unicode Braille.
◦ The Unicode Braille set ranges from U+2800 to U+28FF, though we need only U+2800 to U+283F.
◦ In our application, we have focused on Unicode Braille representation.
Braille Translation: Electronic Braille
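The one-to-one relationship between Braille ASCII and the 6-dot part of the Unicode Braille Patterns block can be sketched as a 64-entry lookup. In this Python illustration the table lists the Braille ASCII characters in order of their Unicode offset from U+2800; the constant and function names are assumptions, and characters outside the table are passed through unchanged as a simplification.

```python
# Braille ASCII characters ordered by Unicode Braille offset:
# the character at index i corresponds to code point U+2800 + i.
BRAILLE_ASCII = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="


def braille_ascii_to_unicode(text):
    """Convert Braille ASCII text to Unicode Braille Patterns (6-dot cells)."""
    out = []
    for ch in text.upper():
        index = BRAILLE_ASCII.find(ch)
        # Unknown characters are kept as-is (illustrative simplification).
        out.append(chr(0x2800 + index) if index >= 0 else ch)
    return "".join(out)
```

Every character in the table lands in the range U+2800 to U+283F, which is why only the first 64 code points of the block are needed for 6-dot Braille.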
Braille Code Example:
String: “6 dot Braille Cells for 64 combinations”
Braille: [Braille output not preserved in this transcript]
The flowchart below gives the entire algorithm of translation.
Braille Translation: Algorithm
Fig. 11. Flow Chart for Translation
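A simplified version of the translation logic can be sketched as follows, covering only the Grade 1 rules stated earlier (letters, capital indicator, number sign). This Python sketch is an illustration under stated assumptions and not the application's actual flowchart: all names are hypothetical, Grade 2 contractions are omitted, punctuation is passed through unchanged, and no letter sign is emitted when a letter directly follows a number.

```python
CAPITAL_SIGN = chr(0x2820)  # dot 6
NUMBER_SIGN = chr(0x283C)   # dots 3-4-5-6
BLANK = chr(0x2800)         # empty cell, used for a space


def cell(dots):
    """Unicode Braille character for a string of dot numbers, e.g. '145'."""
    return chr(0x2800 + sum(1 << (int(d) - 1) for d in dots))


# Dot patterns for a-j; k-t add dot 3; u, v, x, y, z add dots 3 and 6.
FIRST_DECADE = ["1", "12", "14", "145", "15", "124", "1245", "125", "24", "245"]
LETTERS = {}
for letter, dots in zip("abcdefghij", FIRST_DECADE):
    LETTERS[letter] = cell(dots)
for letter, dots in zip("klmnopqrst", FIRST_DECADE):
    LETTERS[letter] = cell(dots + "3")
for letter, dots in zip("uvxyz", FIRST_DECADE[:5]):
    LETTERS[letter] = cell(dots + "36")
LETTERS["w"] = cell("2456")

# Digits 1-9 and 0 reuse the cells of the letters a-i and j.
DIGITS = {digit: LETTERS[letter] for digit, letter in zip("1234567890", "abcdefghij")}


def to_grade1_braille(text):
    """Translate text to Unicode Braille using simplified Grade 1 rules."""
    out = []
    in_number = False
    for ch in text:
        if ch in DIGITS:
            if not in_number:
                out.append(NUMBER_SIGN)  # one number sign per run of digits
                in_number = True
            out.append(DIGITS[ch])
            continue
        in_number = False
        if ch == " ":
            out.append(BLANK)
        elif ch.lower() in LETTERS:
            if ch.isupper():
                out.append(CAPITAL_SIGN)
            out.append(LETTERS[ch.lower()])
        else:
            out.append(ch)  # punctuation is not translated in this sketch
    return "".join(out)


print(to_grade1_braille("6 dot Braille Cells for 64 combinations"))
```

For input made of letters, digits and spaces, every output character falls in U+2800 to U+283F, matching the range the slides say the application needs.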
Implementation
Extracting Text and correcting errors.
Implementation: BrailleOCR
Fig. 12. Extracting Text and Correcting Errors
Translation to Braille
Implementation: Braille Conversion
Fig. 13. Converting Text to Braille
Conclusion
We have shown the process of integrating the Tesseract OCR Engine with Braille Translation.
Our future plans are to make it multilingual so that it can also support Bharati Braille, which covers Bengali, Hindi, Gujarati and other Indian languages.
We will also provide better support for Grade 2 Braille, as Grade 2 Braille is common nowadays.
Project Site: code.google.com/p/brailleocr
Conclusion and Future Plans
[1] Tesseract Project Site: code.google.com/p/tesseractocr
[2] Chirag Patel, Atul Patel, Dharmendra Patel, Optical Character Recognition using Tool Tesseract: A Case Study, IJCA, October 2012
[3] Pijush Chakraborty and Arnab Mallik, An Open Source Tesseract based Tool for Extracting Text from Images with Application in Braille Translation for the Visually Impaired, IJCA, April 2013
[4] R. Smith, An Overview of the Tesseract OCR Engine, Proc. Ninth Int. Conference on Document Analysis and Recognition, IEEE Computer Society (2007)
[5] Ray Smith, Tesseract OCR Engine, OSCON 2007
[6] Tess4J Project Site: http://tess4j.sourceforge.net/
[7] JOrtho Project Site: http://jortho.sourceforge.net/
[8] Soundex Reference: http://en.wikipedia.org/wiki/Soundex
[9] The Rules of Unified English Braille, International Council on English Braille (ICEB), June 2001
[10] Braille ASCII: http://en.wikipedia.org/wiki/Braille_ASCII
[11] BrailleOCR Project Site: code.google.com/p/brailleocr
References:
Questions?
Thank You!