Text Extraction From Digital Image
DESCRIPTION
Text extraction is the process of converting a printed document, scanned page, or image containing text into ASCII characters that a computer can recognize.
TRANSCRIPT
![Page 1: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/1.jpg)
Prepared By:
Amit Bhoraniya (7022)
Kaushik Godhani (7009)
Mayur Halai (7016)
Vikram Ghunsar (7039)
Text Extraction From Image
Guided By:
Mr. Udesang Jaliya
Mr. Kirti Sharma
![Page 2: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/2.jpg)
What is Text Extraction?
Text extraction is the process of converting a printed document, scanned page, or image containing text into ASCII characters that a computer can recognize.
![Page 3: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/3.jpg)
Goal Of Project
GENERAL APTITUDE
Computer Science
Electronics & Communication Engineering
![Page 4: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/4.jpg)
How Will We Achieve That Goal?
1. Preprocessing
2. Segmentation
3. Recognition
![Page 5: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/5.jpg)
1 Pre-Processing
![Page 6: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/6.jpg)
Pre-Processing
1. Gray Scale
2. Noise Removal
3. Thresholding
![Page 7: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/7.jpg)
Gray Scale
![Page 8: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/8.jpg)
Noise Removal
Noise removal is used to enhance the image. For enhancement, we have used a median filter.
FilteredImage = MedianFilter(OriginalImage, FilterSize)
We have used FilterSize [5,5].
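The median filtering step can be sketched in plain Python as follows. This is a minimal illustration, not the project's own code: the function name `median_filter`, the list-of-rows image representation, and the border handling (edge pixels are replicated) are all assumptions, since the slides only give the filter size.

```python
from statistics import median

def median_filter(image, size=5):
    """Return a copy of a 2-D grayscale image (list of rows) after a size x size median filter."""
    height, width = len(image), len(image[0])
    half = size // 2
    filtered = []
    for r in range(height):
        out_row = []
        for c in range(width):
            # Gather the neighbourhood, clamping coordinates at the border.
            # Border replication is an assumption; the slides do not specify it.
            window = [
                image[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered

# A flat gray image with a single bright "salt" pixel in the middle.
noisy = [[100] * 7 for _ in range(7)]
noisy[3][3] = 255
clean = median_filter(noisy, size=5)
```

Because the median ignores isolated outliers, the single bright pixel is replaced by the surrounding gray level while the flat background is left unchanged.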
![Page 9: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/9.jpg)
Thresholding
• Edge Detection
• Dilate Image
• Detect Text Area Using Histogram
• Personal Thresholding to Text Area
![Page 10: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/10.jpg)
Edge Detection using Canny
![Page 11: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/11.jpg)
Dilate
![Page 12: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/12.jpg)
Text Area Using Histogram
![Page 13: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/13.jpg)
Algorithm
• Row Histogram
• Separate Region by (no. of Pixel > 60)
• For Each Row
  – Separate Region by (no. of Pixel > Height of (Row/4))
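One possible reading of this algorithm is sketched below in plain Python. It assumes a binary image (a list of rows of 0/1 values, as produced by edge detection and dilation), interprets the first threshold as "rows with more than 60 foreground pixels belong to a text band", and interprets the second as "within a band, columns with more than a quarter of the band height in foreground pixels belong to a text region". The helper names `runs_above` and `find_text_regions` are hypothetical.

```python
def runs_above(counts, threshold):
    """Return (start, end) pairs, end exclusive, of consecutive entries above threshold."""
    runs, start = [], None
    for i, value in enumerate(counts):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(counts)))
    return runs

def find_text_regions(binary, row_threshold=60):
    """Return (top, bottom, left, right) boxes, end exclusive, for a 0/1 image."""
    # Row histogram: number of foreground pixels in each row.
    row_hist = [sum(row) for row in binary]
    regions = []
    for top, bottom in runs_above(row_hist, row_threshold):
        band_height = bottom - top
        # Column histogram restricted to the current row band.
        col_hist = [
            sum(binary[r][c] for r in range(top, bottom))
            for c in range(len(binary[0]))
        ]
        # Assumed interpretation: threshold is one quarter of the band height.
        for left, right in runs_above(col_hist, band_height / 4):
            regions.append((top, bottom, left, right))
    return regions

# Synthetic 20 x 200 image with two filled blocks on rows 5..9.
image = [[0] * 200 for _ in range(20)]
for r in range(5, 10):
    for c in list(range(10, 90)) + list(range(120, 150)):
        image[r][c] = 1
boxes = find_text_regions(image)
```

In this synthetic example the two blocks share one row band and are then separated by the column histogram.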
![Page 14: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/14.jpg)
2 Segmentation
![Page 15: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/15.jpg)
Segmentation
1. Line Segmentation
2. Word Segmentation
3. Character Segmentation
![Page 16: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/16.jpg)
The image above is segmented into separate lines. Below is an example for a single line.
TEXT SEGMENTATION
![Page 17: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/17.jpg)
Find all the words, then convert the text area into one image.
Segmentation
Characters are separated from the word.
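The slides do not state how lines, words, and characters are split. A common approach, assumed here purely for illustration, is projection profiles: lines are separated at empty rows, words at wide runs of empty columns, and characters at any empty column. The function names and the `word_gap` value are hypothetical.

```python
def split_on_gaps(profile, min_gap=1):
    """Split a projection profile into (start, end) runs of non-zero entries (end exclusive).

    Runs separated by fewer than min_gap zero entries are merged into one run.
    """
    runs, start, gap = [], None, 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs

def segment_lines(page):
    """Return (top, bottom) row spans of text lines in a 0/1 page image."""
    return split_on_gaps([sum(row) for row in page], min_gap=1)

def segment_line(line, word_gap=3):
    """Return one list of character (left, right) spans per word in a 0/1 line image."""
    col_profile = [sum(row[c] for row in line) for c in range(len(line[0]))]
    words = split_on_gaps(col_profile, min_gap=word_gap)
    return [
        [(left + a, left + b) for a, b in split_on_gaps(col_profile[left:right])]
        for left, right in words
    ]

# Two "characters", a wide gap, then one more "character".
row = [1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1]
blank = [0] * len(row)
page = [row, row, blank, row]

lines = segment_lines(page)
words = segment_line(page[0:2])
```

Here `lines` holds the row spans of the two text lines, and `words` groups the character spans of the first line into two words.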
![Page 18: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/18.jpg)
3 Recognition
![Page 19: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/19.jpg)
Recognition
1. Feature Extraction
2. Classifier
3. Text Document
![Page 20: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/20.jpg)
• Feature Extraction
  – Binary Code Method
  – Chain Code Method
  – PCA (Principal Component Analysis)
  – LDA (Linear Discriminant Analysis)
• Classifier
  – Artificial Neural Network
  – Support Vector Machine
Recognition
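As one illustration of the feature extraction options on this slide, the chain code method can be sketched as a Freeman 8-direction chain code followed by a direction histogram. The slides do not say which chain code variant was used, so the direction numbering (0 = right, counted counter-clockwise on screen), the closed-boundary assumption, and the histogram feature are all assumptions. The input is an already-traced, ordered list of 8-connected boundary pixels; contour tracing itself is not shown.

```python
# Freeman 8-direction codes keyed by (row step, column step); rows grow downward.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}

def chain_code(boundary):
    """Encode an ordered, closed, 8-connected boundary [(row, col), ...] as direction codes."""
    codes = []
    # Pair each point with its successor, wrapping around to close the contour.
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:] + boundary[:1]):
        codes.append(DIRECTIONS[(r1 - r0, c1 - c0)])
    return codes

def chain_code_histogram(codes):
    """Return the normalised frequency of each of the 8 directions as a feature vector."""
    total = len(codes)
    return [codes.count(d) / total for d in range(8)]

# Boundary of a 2 x 2 pixel square, traced clockwise from the top-left pixel.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
codes = chain_code(square)
features = chain_code_histogram(codes)
```

The fixed-length histogram is the kind of vector that could then be passed to a classifier such as a neural network or a support vector machine.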
![Page 21: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/21.jpg)
Applications
• Banking (To read Credit Card)
• Libraries (To convert Scanned Page to Image)
• Govt. Sector (Form Processing)
• Used in Car Number Plate Recognition System
• Undesirable Text removal from images.
![Page 22: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/22.jpg)
![Page 23: Text extraction From Digital image](https://reader033.vdocuments.site/reader033/viewer/2022061114/545ccea5b0af9fa92c8b4aa6/html5/thumbnails/23.jpg)
Thank You