omr and ocr


Upload: arslan-arshad

Post on 18-Aug-2015


TRANSCRIPT

Page 1: Omr and ocr

Get Ready::

Page 2: Omr and ocr

OMR, OCR and ICR

Optical Mark Recognition, Optical Character Recognition and Intelligent Character Recognition.

Page 3: Omr and ocr

Definition/Concept of OMR

A technology that allows an input device (e.g. imaging scanner) to read hand-drawn marks such as small circles or rectangles on specially designed paper.

Often used for test, survey, or questionnaire answer sheets.

The process of capturing data by contrasting reflectivity at predetermined positions on a page.

Sometimes referred to as Optical Mark Reader.
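The idea of contrasting reflectivity at predetermined positions can be sketched in a few lines. This is only an illustration: the toy grayscale "page", the bubble coordinates, the `is_marked` helper and the threshold of 128 are all assumptions, not part of any real OMR product.

```python
from statistics import mean

def is_marked(image, top, left, size, threshold=128):
    """Return True if the square region is dark enough to count as a mark.

    image is a 2D list of grayscale values (0 = black, 255 = white).
    The threshold is an assumed value; real systems calibrate it.
    """
    region = [image[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    return mean(region) < threshold

# A 4x8 toy "page": the left bubble is filled in, the right one is blank.
page = [
    [20, 30, 25, 20, 250, 250, 250, 250],
    [25, 20, 30, 25, 250, 240, 250, 250],
    [30, 25, 20, 30, 250, 250, 245, 250],
    [20, 30, 25, 20, 250, 250, 250, 250],
]

# Predetermined (top, left) positions of each bubble (hypothetical layout).
bubbles = {"A": (0, 0), "B": (0, 4)}

marks = {name: is_marked(page, top, left, size=4)
         for name, (top, left) in bubbles.items()}
print(marks)
```

Only the known bubble positions are inspected; nothing else on the page is interpreted, which is why no recognition engine is needed.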

Page 4: Omr and ocr

OMR Forms

OMR works with a specialized document that contains timing tracks along one edge of the form. The timing tracks, which look like black boxes along the top or bottom of the form, tell the scanner where to read for marks.

OMR "reads" mark information from forms as numbers or letters and passes it to the computer.

Page 5: Omr and ocr

OMR Forms

Timing tracks indicate where to read for marks and where to clip images.

Timing Tracks

Page 6: Omr and ocr

OMR Scanners and Software

OMR scanners have specifically placed LEDs (light-emitting diodes).

LEDs sense marks in certain columns once a timing track is detected.

Software interprets the output from the scan and translates it to the desired format (e.g. ASCII).
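A minimal sketch of that translation step, assuming the scanner reports one row of true/false column readings per timing track. The five-choice layout, the blank and multiple-mark symbols, and the function names are hypothetical.

```python
def row_to_char(row, choices="ABCDE"):
    """Translate one row of column readings into a single ASCII character."""
    marked = [choices[i] for i, hit in enumerate(row) if hit]
    if not marked:
        return " "   # no mark detected in this row
    if len(marked) > 1:
        return "*"   # more than one mark: flag for manual review
    return marked[0]

def sheet_to_ascii(rows):
    """Translate all rows of a sheet into one ASCII record."""
    return "".join(row_to_char(row) for row in rows)

# One row per timing track, one boolean per LED column.
scan = [
    [False, True, False, False, False],   # B
    [True, False, False, False, False],   # A
    [False, False, False, False, False],  # left blank
    [False, False, True, True, False],    # two marks
]

record = sheet_to_ascii(scan)
print(repr(record))
```

Flagging blank and multiple-mark rows with their own symbols keeps the record a fixed width, so later checks can find the problem questions by position.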

Scanner Characteristics:

~130 pages per minute (e.g. Kodak i 830)

~85 pages per minute (e.g. Axiome AXM 980 or Kodak 3000 Series)

INSIGHT 4ES (3,000/hour)

Kodak i 830

Page 7: Omr and ocr

OMR Scanners and Software

OMR software is used to capture data from OMR sheets (e.g. Remark Office OMR, Smartshoot OMR).

Software Characteristics:

Performs specific imaging functions such as:

- image acquisition,

- file conversion,

- data extraction, and

- file read/write commands (e.g. ISIS)

Axiome AXM 980

Remark Office OMR

Page 8: Omr and ocr

OMR Storage Characteristics

Storage:

Barcodes: Identification of forms.

OMR Marks and Barcodes are read and moved directly into a database management system (e.g. SQL) then to a census database.

Images are not normally scanned and stored.

However, the capability to save the scanned image exists.
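Moving marks and barcodes into a database can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database, and the table name, column names and barcode value are hypothetical.

```python
import sqlite3

# In-memory database for the sketch; a real system would use its own DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses (form_id TEXT, question INTEGER, answer TEXT)"
)

def store_form(conn, barcode, answers):
    """Store one row per question, keyed by the form's barcode."""
    rows = [(barcode, number, answer)
            for number, answer in enumerate(answers, start=1)]
    conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows)
    conn.commit()

# The barcode identifies the form; the string holds the marks that were read.
store_form(conn, "FORM-000123", "BACD")

count = conn.execute(
    "SELECT COUNT(*) FROM responses WHERE form_id = ?", ("FORM-000123",)
).fetchone()[0]
print(count)
```

Only the decoded values are stored here, which matches the point that images are not normally kept.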

Storage of Scanned Images (Recent Mainstream Capability)

Increasingly critical for validating results

Images can be used for correcting poorly filled out forms

Images can be used for validating results

Comprehensive image database of forms

Page 9: Omr and ocr

OMR Accuracy

To achieve high accuracy, well-structured design and good-quality printing of these forms are critical.

If the timing track and the bubbles on the form are not in the exact columns where the LEDs in the read head can detect them, the scanner cannot read the marks. This misalignment is referred to as skew and float.
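A simple way to picture the alignment requirement is a tolerance check on where timing marks are found. This is a sketch under assumptions: the positions in millimetres, the 0.5 mm tolerance and the function name are hypothetical, and real read heads work differently in detail.

```python
def misaligned_tracks(expected, detected, tolerance):
    """Return the indices of timing marks lying outside the tolerance."""
    return [i for i, (e, d) in enumerate(zip(expected, detected))
            if abs(e - d) > tolerance]

# Nominal positions of four timing marks, in mm (hypothetical).
expected = [10.0, 20.0, 30.0, 40.0]

# Detected positions: the error grows down the page, as on a skewed sheet.
detected = [10.1, 20.3, 30.6, 40.9]

bad = misaligned_tracks(expected, detected, tolerance=0.5)
print(bad)
```

Marks at the listed indices fall outside the window the read head expects, so the rows they belong to could not be read reliably.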

Page 10: Omr and ocr

OMR Advantages

OMR is a data collection technology that does not require a recognition engine. Therefore:

It is fast, using minimum processing power to process forms

Costs are predictable and defined

OMR capture speeds are around 4,000 forms per hour
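As a rough planning aid, the quoted capture speed can be turned into an estimate of scanning time. The workload of 1,000,000 forms is a hypothetical figure, and the calculation ignores loading, paper jams and verification time.

```python
def scanning_hours(total_forms, forms_per_hour=4000):
    """Hours of pure scanning time for a batch at a steady capture speed."""
    return total_forms / forms_per_hour

hours = scanning_hours(1_000_000)   # hypothetical workload
print(f"{hours:.0f} scanner-hours")
```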

Page 11: Omr and ocr

OMR Disadvantages

OMR cannot recognize hand-printed or machine-printed characters.

With OMR, images of forms are not normally captured by scanners, so electronic retrieval is usually not possible.

Tick boxes may not be suitable for all types of questions

Page 12: Omr and ocr

OMR Challenges/Issues

The entire process must be tested: information capture, recognition and verification of results

Questionnaire design and preparation are critical

Forms must be readable to the scanner when collected

Field operators must take particular care in filling out questionnaires

Completeness and consistency checks must be in place

Care must be taken over the condition of the questionnaire (dust, humidity, transportation, etc.)
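Completeness and consistency checks can be as simple as a list of required fields plus a few rules applied to each captured record. The sketch below is illustrative: the field names, the age rule and the `check_record` helper are hypothetical.

```python
def check_record(record, required, rules):
    """Return a list of human-readable problems found in one captured record."""
    problems = [f"missing: {field}" for field in required
                if record.get(field) in (None, "")]
    for description, rule in rules:
        if not rule(record):
            problems.append(f"inconsistent: {description}")
    return problems

# Hypothetical census-style fields and rule, for illustration only.
required = ["age", "sex", "marital_status"]
rules = [
    ("married respondents should be at least 15",
     lambda r: not (r.get("marital_status") == "married"
                    and r.get("age", 0) < 15)),
]

record = {"age": 7, "sex": "", "marital_status": "married"}
problems = check_record(record, required, rules)
for problem in problems:
    print(problem)
```

Records that return a non-empty list would be routed to an operator for review rather than loaded directly.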

Page 13: Omr and ocr

Price of OMR

Today, the most economical Chinese-made hardware scanners cost at least Rs. 180,000 per scanner. Reasonably acceptable models reach Rs. 250,000 and beyond.

Software prices depend on use. Several packages are available, such as Remark Office OMR (costing nearly 5,000 per year).

The average cost is around 0.25 per sheet.

Page 14: Omr and ocr

Major Commercial Suppliers

Pearson NCS - UK Company with US manufacturing base (http://www.ncspearson.com)

Scantron - US Company with US manufacturing base (http://www.scantron.com)

Sekonic - Japanese Company with Japanese manufacturing base (http://www.sekonic.co.jp)

Axiome - Swiss Company with Swiss Manufacturing base (http://www.axiome.ch)

Page 15: Omr and ocr

What is OCR and ICR?

OCR:

“Gives scanning and imaging systems the ability to turn images of machine printed characters into machine readable characters.”

Images of the machine printed characters are extracted from a bitmap of the scanned image.

ICR:

“Gives scanning and imaging systems the ability to turn images of hand written characters into machine readable characters.”

Images of the hand written characters are extracted from a bitmap of the scanned image.

Page 16: Omr and ocr

Forms

OCR/ICR is more flexible since:

No timing tracks are required

The image can float on a page

The use of drop-out color reduces the size of the scanner's output and enhances accuracy

ICR/OCR technology often uses registration marks on the four corners of a document to locate the image during recognition.
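One way registration marks can help is by showing how far a scanned page is rotated and shifted before fields are located. The sketch below uses only the two top marks and assumes pixel coordinates with x increasing to the right and y increasing downward; the function name and the sample coordinates are hypothetical.

```python
import math

def estimate_alignment(top_left, top_right, expected_top_left=(0.0, 0.0)):
    """Estimate page rotation (degrees) and shift from two registration marks.

    Each mark is an (x, y) position found in the scanned image.
    """
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    angle_deg = math.degrees(math.atan2(dy, dx))
    offset = (top_left[0] - expected_top_left[0],
              top_left[1] - expected_top_left[1])
    return angle_deg, offset

# A page that is shifted but not rotated (hypothetical coordinates).
angle, offset = estimate_alignment((12.0, 8.0), (1012.0, 8.0))
print(angle, offset)
```

Because the page can be located this way, the image is allowed to float, which is what removes the need for timing tracks.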

Page 17: Omr and ocr
Page 18: Omr and ocr

OCR/ICR Scanner

Forms are scanned, and the recognition engine of the OCR/ICR system then interprets the images, turning images of handwritten or printed characters into ASCII data (machine-readable characters).

Speeds range from 85 to 160 sheets/min (dependent on the recognition engine)
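To make the role of a recognition engine concrete, here is a toy version that matches each character bitmap against stored templates and picks the closest one. It is a deliberately simplified stand-in: the 3x3 templates, the pixel-difference measure and the function names are assumptions, and commercial engines use far more sophisticated methods.

```python
# Tiny 3x3 templates: "1" is an inked pixel, "0" is background (hypothetical).
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}

def distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognize(glyph):
    """Return the character whose template differs least from the glyph."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))

noisy_t = ("111", "010", "011")   # a "T" with one stray pixel
glyphs = [noisy_t, TEMPLATES["I"], TEMPLATES["L"]]
text = "".join(recognize(g) for g in glyphs)
print(text)
```

Comparing every glyph against every template is part of why recognition costs more processing time than simply checking for marks.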

Page 19: Omr and ocr

OCR/ICR Software

Plenty of free software is available on the market:

1. Microsoft OneNote 2007

2. MS Office Document Imaging

3. SimpleOCR

4. TopOCR

5. FreeOCR

These programs are free and easily available. They use OCR algorithms to recognize letters in an image.

Premium versions support automatic scanning directly from a scanner and are a bit faster than the free software. Prices lie between 5,000 and 20,000.

Office Imaging App

Page 20: Omr and ocr

OCR/ICR Storage Characteristics

Storage/Retrieval

Images are scanned and stored and maintained electronically

There is no need to store the paper forms as long as you safeguard the electronic files

With OCR/ICR technologies, images can be scanned, indexed, and written to optical media

Page 21: Omr and ocr

Ideal OCR/ICR Accuracy Thresholds

Accuracy:

The accuracy achieved by data entry clerks (~99.5%) is approximately equal to that of a perfectly tuned OCR/ICR system (~99.5%)

Up to 99.9% accuracy with editing (like OMR)

The recognition engine must be tuned, tested and validated very carefully
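A short calculation shows what a per-character accuracy of ~99.5% means in practice. The sketch assumes that character errors are independent, which is a simplification, and the field length of 10 and volume of 1,000 characters are hypothetical.

```python
def expected_errors(char_accuracy, total_chars):
    """Expected number of wrong characters in a batch."""
    return (1 - char_accuracy) * total_chars

def field_accuracy(char_accuracy, chars_per_field):
    """Chance that every character of a field is right (errors independent)."""
    return char_accuracy ** chars_per_field

errors = expected_errors(0.995, 1000)
whole_field = field_accuracy(0.995, 10)
print(f"about {errors:.0f} wrong characters per 1,000")
print(f"a 10-character field is fully correct about {whole_field:.1%} of the time")
```

This is why editing and validation steps matter even when the per-character figure looks very high.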

Page 23: Omr and ocr

OCR/ICR Advantages

Recognition engines used with imaging can capture highly specialized data sets

OCR/ICR recognizes machine-printed or hand-printed characters.

Scanning and recognition allow efficient management and planning for the rest of the processing workload

Quick retrieval for editing and reprocessing

Page 24: Omr and ocr

OCR/ICR Disadvantages

May require significant manual intervention.

Additional workload for data collectors.

ICR has severe limitations when it comes to human handwriting.

Characters must be hand-printed or machine-printed as separate characters in boxes.

Ineffective when dealing with cursive characters.
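The requirement for separate characters in boxes makes segmentation simple, because the field image can be cut at fixed box boundaries. The sketch below assumes a field stored as rows of "0"/"1" pixels and a known box width; both the data and the function name are hypothetical.

```python
def split_into_boxes(field_rows, box_width):
    """Cut a field image into per-character bitmaps at fixed box boundaries.

    field_rows is a list of equal-length strings, one per pixel row.
    """
    width = len(field_rows[0])
    return [tuple(row[x:x + box_width] for row in field_rows)
            for x in range(0, width, box_width)]

# A two-box field, three pixels per box: a "T" followed by an "I".
field = [
    "111010",
    "010010",
    "010010",
]

boxes = split_into_boxes(field, box_width=3)
for box in boxes:
    print(box)
```

Cursive writing has no such fixed boundaries, which is one reason it is so much harder to handle.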

Page 25: Omr and ocr

OMR-OCR/ICR Compared

Page 26: Omr and ocr

OCR/ICR Challenges/Issues

Shares the corresponding issues of OMR

Algorithm development (Preparation of memory dictionary)

Processing time considerations due to recognition engine

Development costs
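If the "memory dictionary" is read as a list of valid values that recognized text is checked against (an assumption, since the slides do not define it), the idea can be sketched with the standard library. The lexicon entries, the cutoff and the `correct` helper are hypothetical.

```python
import difflib

def correct(word, lexicon, cutoff=0.6):
    """Replace a recognized word with its closest lexicon entry, if any."""
    matches = difflib.get_close_matches(word.upper(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# Hypothetical lexicon of valid answers for one field.
lexicon = ["LAHORE", "KARACHI", "MULTAN"]

fixed = correct("LAH0RE", lexicon)    # digit zero misread for the letter O
unknown = correct("XYZ", lexicon)     # nothing similar: left unchanged
print(fixed, unknown)
```

Building and maintaining such dictionaries for every field is part of the development cost listed above.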

Page 27: Omr and ocr

Price of OCR/ICR

OCR is less costly than OMR.

A scanner for OCR costs between 6,000 and 80,000.

The price depends on the scanning speed and the quality provided by the scanner.

On the other hand, many free OCR programs are available (Microsoft OneNote 2007, SimpleOCR, TopOCR, FreeOCR and MS Office Document Imaging).

Page 28: Omr and ocr

Major Commercial Suppliers

Top Image Systems (TIS) (http://www.topimagesystems.com)

ReadSoft (http://www.readsoft.com)

Teleform (http://www.intelliscan.com/TeleForm1.htm)

Scanner Suppliers: Fujitsu, Canon, Bell & Howell, Kodak

Page 29: Omr and ocr

Technology Evolution

[Figure: technology evolution from OCR through ICR to Intelligent Recognition. Text styles, from easiest to hardest: Machine Print, Constrained Handprint, Unconstrained Handprint, Bad quality machine print, Cursive. Form types range from forms specially designed for automatic recognition (constraining boxes or combs, drop-out ink for preprinted text and boxes) to forms with no special design (no constraining boxes or combs, condensed strings, dirty and noisy forms, bad quality paper, legacy forms).]

Illustration: Conference on Technology Options for 2011 Census

Page 30: Omr and ocr

THANK YOU!