Text Detection in Video

Min Cai

2002.3.13

Background

Video OCR: Text detection, extraction and recognition

Detection Target: Artificial text

Text detection:
  Detect the text region in a single frame
  Refine the region by combining consecutive frames

Existing Work

Feature extracted    Text detection method based on the feature
Color                Connected-component
Texture              Texture segmentation
Edge                 Top-down, bottom-up

Connected-component-based methods

Basic idea:
  Treat text as having a uniform color (color level) and classify each pixel as text or non-text according to its color value.
  Combine connected text pixels into connected components.
  Group collinear connected components into a text string.

Advantage:
  Can detect text of arbitrary orientation, provided it has a similar color throughout and lies on a simple background.

Disadvantages:
  Sensitive to color variance
  Lossy video compression introduces color bleeding
  Struggles with complex backgrounds
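
The three steps of the basic idea can be sketched as follows. This is only an illustration under stated assumptions, not the method of any particular paper: the image is a grayscale list of lists, the text color `text_level` and the `tolerance` are assumed to be known, and "collinear" is simplified to "on the same horizontal line". All function and parameter names are hypothetical.

```python
from collections import deque

def text_components(image, text_level, tolerance=10):
    """Return bounding boxes (top, left, bottom, right) of 4-connected text pixels."""
    height, width = len(image), len(image[0])
    # Step 1: classify each pixel as text or non-text by its gray level.
    is_text = [[abs(image[y][x] - text_level) <= tolerance for x in range(width)]
               for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    # Step 2: flood-fill connected text pixels into components.
    for y in range(height):
        for x in range(width):
            if not is_text[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and is_text[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def group_into_strings(boxes, max_gap=3):
    """Step 3, simplified: chain boxes that overlap vertically and are close horizontally."""
    strings = []
    for box in sorted(boxes, key=lambda b: b[1]):
        for string in strings:
            last = string[-1]
            overlaps = box[0] <= last[2] and last[0] <= box[2]
            if overlaps and box[1] - last[3] - 1 <= max_gap:
                string.append(box)
                break
        else:
            strings.append([box])
    return strings
```

The fixed `tolerance` makes the listed disadvantages visible: color bleeding from compression or a background close to `text_level` changes which pixels pass step 1, and every later step inherits that error.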

Texture Segmentation method

Basic idea:
  Treat text as a type of texture
  Use texture segmentation algorithms to detect text:
    Gabor filters
    Gaussian derivatives

Advantage:
  Can efficiently separate text areas from graphic areas on a simple background. It is usually used in document analysis.

Disadvantages:
  Time-consuming
  Does not handle text embedded in varied backgrounds well.

Bottom-Up method

Basic idea:
  A seed region is defined as a small region with high edge density.
  Grow each seed region into successively larger components until all seed regions in the image have been reached.

Advantage:
  It is a generic method for detecting homogeneous objects of various shapes: it can detect not only rectangular objects but other shapes as well.

Disadvantages:
  Sensitive to noise.
  Cannot handle a large range of font sizes.
  Sensitive to stroke density (which differs between languages).
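
A minimal sketch of seed-and-grow, assuming a binary edge map (list of lists of 0/1) that is cut into square tiles. The tile size and the two density thresholds `seed_min` and `grow_min` are hypothetical values chosen for illustration; the slides do not specify them.

```python
def block_density(edges, block):
    """Fraction of edge pixels in each block x block tile of a binary edge map."""
    rows, cols = len(edges) // block, len(edges[0]) // block
    return [[sum(edges[r * block + dy][c * block + dx]
                 for dy in range(block) for dx in range(block)) / (block * block)
             for c in range(cols)]
            for r in range(rows)]

def grow_regions(density, seed_min=0.5, grow_min=0.25):
    """Grow each high-density seed tile into 4-connected neighbors above grow_min."""
    rows, cols = len(density), len(density[0])
    taken = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            # Only unclaimed tiles with high edge density may start a region.
            if density[r][c] < seed_min or taken[r][c]:
                continue
            region, stack = [], [(r, c)]
            taken[r][c] = True
            while stack:
                cr, cc = stack.pop()
                region.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols and not taken[nr][nc]
                            and density[nr][nc] >= grow_min):
                        taken[nr][nc] = True
                        stack.append((nr, nc))
            regions.append(sorted(region))
    return regions
```

Because the tile size is fixed, a font much larger or smaller than the tile changes the measured density, which is one way to read the font-size and stroke-density disadvantages above.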

Top-Down method

Basic idea:
  Based on the run-length smoothing algorithm
  Analyze horizontal and vertical projection profiles

Advantages:
  Can detect the boundary of a horizontally aligned text string quickly and correctly
  Insensitive to noise

Disadvantages:
  Cannot handle diagonally aligned text.
  One pass of horizontal and vertical projection cannot handle a complex layout.
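
The two ingredients can be sketched as follows, assuming a binary edge map as a list of lists of 0/1. The gap limit `max_gap` and the choice to fill only gaps that lie between two foreground pixels are assumptions of this sketch; run-length smoothing implementations differ on both.

```python
def rlsa_row(row, max_gap):
    """Run-length smoothing: fill runs of 0s no longer than max_gap lying between 1s."""
    out = list(row)
    last_one = None
    for i, value in enumerate(row):
        if value:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out

def projection_runs(profile, min_count=1):
    """Return inclusive (start, end) ranges where the profile is >= min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

def detect_text_lines(edges, max_gap=2):
    """One pass: horizontal projection finds rows, vertical projection finds columns."""
    smoothed = [rlsa_row(row, max_gap) for row in edges]
    horizontal = [sum(row) for row in smoothed]  # one count per image row
    boxes = []
    for top, bottom in projection_runs(horizontal):
        vertical = [sum(smoothed[y][x] for y in range(top, bottom + 1))
                    for x in range(len(smoothed[0]))]  # one count per column
        for left, right in projection_runs(vertical):
            boxes.append((top, left, bottom, right))
    return boxes
```

A single pass like `detect_text_lines` only separates regions that a full-width row cut followed by a column cut can isolate, which is the complex-layout limitation named above.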

Analysis (1)

A certain contrast against the background
  Artificial text strings are designed to be read easily
A certain stroke density
Text strings always appear horizontally
Spatial cohesion
  Characters of the same text string have similar heights, orientation, and spacing
Size constraint
  Text strings have certain size restrictions
A text string appears in multiple consecutive frames at a similar position.

Analysis (2)

Problem                                          Resolution
How to extract more useful edges?                Local thresholding
How to highlight text areas?                     Text area recovery
How to detect text regions fast and correctly?   Coarse-to-fine detection

Single Threshold

Local threshold (1)

Scan the whole image with a small kernel (red).
In a larger window (gray) surrounding the kernel, calculate the local threshold from the window's local histogram.

[Figure: a. Window movement. b. Local threshold selection: a histogram of count versus edge strength, with axis marks 0, MIN, T-local, and MAX, divided into a low half and a high half.]
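
A sketch of the scanning scheme, assuming an edge-strength map stored as a list of lists. The slides do not state how T-local is chosen from the histogram, so this sketch assumes, purely for illustration, the midpoint between the window's minimum and maximum edge strength. The names `kernel` and `margin` are hypothetical.

```python
def local_threshold_edges(strength, kernel=2, margin=2):
    """Binarize an edge-strength map with a threshold computed per kernel position."""
    height, width = len(strength), len(strength[0])
    out = [[0] * width for _ in range(height)]
    for ky in range(0, height, kernel):
        for kx in range(0, width, kernel):
            # Larger window surrounding the kernel, clipped to the image.
            y0, y1 = max(0, ky - margin), min(height, ky + kernel + margin)
            x0, x1 = max(0, kx - margin), min(width, kx + kernel + margin)
            window = [strength[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # ASSUMPTION: T-local is the midpoint of the local range (MIN + MAX) / 2.
            t_local = (min(window) + max(window)) / 2
            # Only the pixels under the kernel are decided with this threshold.
            for y in range(ky, min(height, ky + kernel)):
                for x in range(kx, min(width, kx + kernel)):
                    out[y][x] = 1 if strength[y][x] > t_local else 0
    return out
```

In the test row used to check this sketch, the weak edges (strength 30 against a background of 10) survive next to strong edges (200 against 100), whereas a single global midpoint threshold of 105 would keep only the strong ones.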

Local threshold (2)

Text-like area recovery (1)

[Figure: text-like areas before recovery and after recovery]

Text-like area recovery (2)

[Figure: text-like areas before recovery and after recovery]

High pass filter

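Coarse-to-fine detection

Uses a top-down scheme to detect text-like areas.

Flowchart:
  Initial: add the whole image to the processing array.
  Take the first region from the processing array.
  Apply horizontal projection and vertical projection to the region.
  Can the region be divided?
    Yes: add the resulting regions to the processing array.
    No: add the region to the result array.
  Repeat until the processing array is empty.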
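
The loop over the processing array can be sketched as below, assuming a binary edge map as a list of lists of 0/1. In this sketch a region "can be divided" whenever a projection profile contains an empty row or column. A practical detector would likely require a minimum gap or smooth the profile first, so that the gaps between characters do not split a text line; that refinement is left out here.

```python
from collections import deque

def split_runs(profile):
    """Inclusive (start, end) ranges of consecutive non-zero profile entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

def coarse_to_fine(edges):
    """Divide regions (top, left, bottom, right) until no projection can split them."""
    height, width = len(edges), len(edges[0])
    processing = deque([(0, 0, height - 1, width - 1)])  # initial: the whole image
    results = []
    while processing:
        region = processing.popleft()  # the first region from the array
        top, left, bottom, right = region
        # Horizontal projection: one edge count per row of the region.
        rows = [sum(edges[y][left:right + 1]) for y in range(top, bottom + 1)]
        parts = [(top + a, left, top + b, right) for a, b in split_runs(rows)]
        if parts == [region]:
            # Rows did not divide the region; try the vertical projection.
            cols = [sum(edges[y][x] for y in range(top, bottom + 1))
                    for x in range(left, right + 1)]
            parts = [(top, left + a, bottom, left + b) for a, b in split_runs(cols)]
        if parts == [region]:
            results.append(region)    # cannot divide: add to result array
        else:
            processing.extend(parts)  # can divide: add to processing array
    return results
```

Repeating the projections on each sub-region is what lets this scheme handle layouts that a single horizontal and vertical pass cannot. A region with no edge pixels yields no parts and is simply dropped.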

Detect text-like areas

[Figure: panels 1) to 4) showing the detection of text-like areas; one panel is captioned "b. Coarse vertical projection"]

Refinement

Combine neighboring text areas of similar height

Use size constraints to remove areas that do not satisfy them
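
Both refinement steps can be sketched on bounding boxes of the form (top, left, bottom, right). Every threshold here (`max_gap`, `height_ratio`, `min_height`, `max_height`, `min_aspect`) is a hypothetical value for illustration; the slides give no concrete numbers.

```python
def merge_similar_neighbors(boxes, max_gap=5, height_ratio=0.8):
    """Merge horizontally neighboring boxes whose heights are similar."""
    merged = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        for i, current in enumerate(merged):
            h_box = box[2] - box[0] + 1
            h_cur = current[2] - current[0] + 1
            similar = min(h_box, h_cur) / max(h_box, h_cur) >= height_ratio
            overlap = box[0] <= current[2] and current[0] <= box[2]
            close = box[1] - current[3] - 1 <= max_gap
            if similar and overlap and close:
                # Replace the stored box by the union of the two boxes.
                merged[i] = (min(current[0], box[0]), current[1],
                             max(current[2], box[2]), max(current[3], box[3]))
                break
        else:
            merged.append(box)
    return merged

def apply_size_constraints(boxes, min_height=8, max_height=60, min_aspect=1.5):
    """Keep boxes whose height and width-to-height ratio are plausible for text."""
    kept = []
    for top, left, bottom, right in boxes:
        height, width = bottom - top + 1, right - left + 1
        if min_height <= height <= max_height and width / height >= min_aspect:
            kept.append((top, left, bottom, right))
    return kept
```

Merging before filtering matters in this sketch: a single character box may fail the aspect-ratio test on its own but pass once it is joined with its neighbors.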

Multi-frame analysis

Text region matching
  Find all the regions corresponding to the same text
Text region enhancement
  Enhance the text image quality by multi-frame integration
Repetitive text elimination
  Record the text only at its first appearance.
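
A sketch of the three steps, under assumptions the slides do not confirm: regions are matched between consecutive frames by box overlap (intersection over union with a hypothetical `min_iou`), and integration is a plain pixel-wise average of gray-level patches. Other matching criteria or integration rules are equally possible.

```python
def iou(a, b):
    """Intersection over union of two inclusive boxes (top, left, bottom, right)."""
    height = min(a[2], b[2]) - max(a[0], b[0]) + 1
    width = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if height <= 0 or width <= 0:
        return 0.0
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    inter = height * width
    return inter / (area_a + area_b - inter)

def first_appearances(frames_boxes, min_iou=0.5):
    """Match boxes between consecutive frames; report each text once, at its first frame."""
    first_seen, previous = [], []
    for index, boxes in enumerate(frames_boxes):
        for box in boxes:
            # A box with no match in the previous frame is treated as new text.
            if not any(iou(box, prev) >= min_iou for prev in previous):
                first_seen.append((index, box))
        previous = boxes
    return first_seen

def integrate(patches):
    """Average same-sized gray-level patches of one text region, pixel by pixel."""
    count = len(patches)
    rows, cols = len(patches[0]), len(patches[0][0])
    return [[sum(patch[y][x] for patch in patches) / count for x in range(cols)]
            for y in range(rows)]
```

Matching relies on the earlier observation that a text string appears in consecutive frames at a similar position, while averaging tends to suppress a changing background behind static text.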

Thank you!

End