
EE591f Digital Video Processing

Roadmap

Introduction

Intra-frame coding
– Review of JPEG

Inter-frame coding
– Conditional Replenishment (CR)
– Motion Compensated Prediction (MCP)

Object-based and scalable video coding*
– Motion segmentation, scalability issues


Introduction to Video Coding

Lossless vs. lossy data compression
– Source entropy H(X)
– Rate-Distortion function R(D) or D(R)

Probabilistic modeling is at the heart of data compression
– What is P(X) for video source X?
– Modeling moving pictures is more difficult than modeling still images due to temporal dependency


Shannon’s Picture

[Figure: rate-distortion plot with axes Rate (bps) and Distortion, showing the operating curves of Coder A and Coder B]

Which coder wins, A or B?


Distortion Measures

Objective
– Mean Square Error (MSE)
– Peak Signal-to-Noise Ratio (PSNR)
– Measure the fidelity to original video

Subjective
– Human Vision System (HVS) based
– Emphasize visual quality rather than fidelity

We only discuss objective measures in this course
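The two objective measures can be sketched in a few lines. This is a minimal illustration, assuming 8-bit samples (peak value 255) and frames stored as lists of rows; the function names `mse` and `psnr` are chosen here for illustration.

```python
import math

def mse(original, decoded):
    """Mean square error between two equally sized frames (lists of rows)."""
    total, count = 0.0, 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    return total / count

def psnr(original, decoded, peak=255):
    """PSNR in dB; peak=255 is an assumption (8-bit samples)."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / err)

frame = [[100, 110], [120, 130]]
noisy = [[101, 109], [122, 130]]   # sample errors: -1, 1, -2, 0
print(mse(frame, noisy))           # 1.5
print(round(psnr(frame, noisy), 2))
```

A lower MSE always gives a higher PSNR, so the two measures rank coders identically on a given frame.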


Roadmap

Intra-frame coding
– Review of JPEG

Scalable video coding
– 3D subband/wavelet coding and recent trend


A Tour of JPEG Coding Standard

Key Components

Transform
– 8×8 DCT
– boundary padding

Quantization
– uniform quantization
– DC/AC coefficients

Coding
– Zigzag scan
– run length/Huffman coding


JPEG Baseline Coder

Tour Example (8×8 block of pixel values)

183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 169 132 169
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169


Step 1: Transform
• DC level shifting: subtract 128 from each pixel
• 2D DCT

[Figure: the 8×8 tour-example block, the same block after level shifting (-128), and its 8×8 DCT coefficients; the DC coefficient is 313]
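The transform step can be reproduced with a short sketch. The pixel block below is the tour example as read from the slide (its row ordering is a reconstruction from the extracted values), and the orthonormal DCT-II normalization is an assumption; with that normalization the DC coefficient comes out as 313.

```python
import math

BLOCK = [
    [183, 160,  94, 153, 194, 163, 132, 165],
    [183, 153, 116, 176, 187, 166, 130, 169],
    [179, 168, 171, 182, 179, 170, 131, 167],
    [177, 177, 179, 177, 179, 165, 131, 167],
    [178, 178, 179, 176, 182, 164, 130, 171],
    [179, 180, 180, 179, 183, 169, 132, 169],
    [179, 179, 180, 182, 183, 170, 129, 173],
    [180, 179, 181, 179, 181, 170, 130, 169],
]

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block, computed as C * X * C^T."""
    n = len(block)
    # c[k][i]: k-th DCT basis vector evaluated at sample i
    c = [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
          * math.cos((2 * i + 1) * k * math.pi / (2 * n))
          for i in range(n)] for k in range(n)]
    # tmp = C * X (transform along the vertical direction)
    tmp = [[sum(c[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    # out = tmp * C^T (transform along the horizontal direction)
    return [[sum(tmp[k][j] * c[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]

shifted = [[p - 128 for p in row] for row in BLOCK]   # DC level shifting
coeffs = dct_2d(shifted)
print(round(coeffs[0][0]))   # 313
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared level-shifted pixels; only the distribution of energy across coefficients changes.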


Step 2: Quantization

Q-table

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Quantized coefficients (DCT coefficients after Q)

20   5  -3   1   3  -2   1   0
-3  -2   1   2   1   0   0   0
-1  -1   1   1   1   0   0   0
-1   0   0   1   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0

Why do the Q-table entries increase from top-left to bottom-right?
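As a small sketch of uniform quantization with a Q-table entry as the step size: the DC value 313 and the step sizes 16, 10 and 61 come from the tour example, while the two AC values (-27 and 4) and the rounding rule (nearest integer, ties away from zero) are illustrative assumptions.

```python
import math

def quantize(coeff, step):
    """Uniform quantization: index of the nearest multiple of `step`."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + 0.5)

def dequantize(index, step):
    """Decoder side: map the index back to a reconstruction level."""
    return index * step

# (coefficient, Q-table step): a low-frequency, a mid and a high-frequency entry
samples = [(313, 16), (-27, 10), (4, 61)]
indices = [quantize(c, q) for c, q in samples]
print(indices)                 # [20, -3, 0]
print(dequantize(20, 16))      # 320
```

The larger the step, the more likely a coefficient is mapped to zero, which is what produces the long zero runs exploited by the entropy coder.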


Step 3: Entropy Coding

Zigzag Scan

[Figure: zigzag scan path over the quantized 8×8 block]

(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)

EOB (End Of the Block): all following coefficients are zero
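The scan can be sketched as follows. The quantized block is the one implied by the zigzag output of the tour example; the helper names are hypothetical, and the sketch stops at the symbol sequence (run-length/Huffman coding, and the separate DC handling of baseline JPEG, are not shown).

```python
QUANTIZED = [
    [20,  5, -3,  1,  3, -2,  1,  0],
    [-3, -2,  1,  2,  1,  0,  0,  0],
    [-1, -1,  1,  1,  1,  0,  0,  0],
    [-1,  0,  0,  1,  0,  0,  0,  0],
    [0] * 8, [0] * 8, [0] * 8, [0] * 8,
]

def zigzag_order(n=8):
    """(row, col) positions along the anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]  # row increasing
        if s % 2 == 0:
            diag.reverse()   # even diagonals run from bottom-left to top-right
        order.extend(diag)
    return order

def zigzag_scan(block):
    """Scan the block in zigzag order and replace the trailing zeros by EOB."""
    seq = [block[r][c] for r, c in zigzag_order(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]

print(zigzag_scan(QUANTIZED))
```

Running the scan on this block yields the 28 values listed on the slide followed by EOB, so 36 trailing zeros are represented by a single symbol.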








Conditional Replenishment

Based on motion detection rather than motion estimation

Partition the current frame into “still areas” and “moving areas”
– Replenishment is applied to moving regions only
– Repetition is applied to still regions

Need to transmit the location of moving areas as well as new (replenishment) information
– No motion vectors transmitted
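A minimal sketch of the idea, under several assumptions not stated on the slide: motion is detected per block by thresholding the sum of absolute frame differences, the block size and threshold are arbitrary, and replenishment sends raw pixels without further coding.

```python
def conditional_replenishment(prev, curr, block=2, threshold=10):
    """Return (moving, recon): the replenished blocks and the decoder's frame.

    moving is a list of (row, col, pixels) entries, i.e. the location of each
    moving block plus its new content; no motion vectors are produced.
    """
    h, w = len(curr), len(curr[0])
    recon = [row[:] for row in prev]          # repetition: start from previous frame
    moving = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            rows = range(r, min(r + block, h))
            cols = range(c, min(c + block, w))
            # motion detection: accumulated absolute frame difference
            diff = sum(abs(curr[i][j] - prev[i][j]) for i in rows for j in cols)
            if diff > threshold:
                pixels = [curr[i][c:c + block] for i in rows]
                moving.append((r, c, pixels))
                for i, line in zip(rows, pixels):
                    recon[i][c:c + len(line)] = line   # replenishment
    return moving, recon

prev = [[10] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
for i in (0, 1):
    for j in (2, 3):
        curr[i][j] = 60                        # one 2x2 area changes
moving, recon = conditional_replenishment(prev, curr)
print(moving)   # [(0, 2, [[60, 60], [60, 60]])]
```

Only one of the four blocks is transmitted here; the other three are repeated from the previous frame at no cost beyond the location signalling.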


Conditional Replenishment

Motion Detection


From Replenishment to Prediction

Replenishment can be viewed as a degenerate case of prediction
– Only zero motion vector is considered
– Discard the history

A more powerful approach of exploiting temporal dependency is prediction
– Locate the best match from the previous frame
– Use the history to predict the current frame


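Differential Pulse Code Modulation

[Figure: DPCM encoder and decoder block diagrams, each with a delay element D. Encoder: the prediction $\hat{x}_{n-1}$ is subtracted from $x_n$ to give $y_n$, which is quantized by Q to $\hat{y}_n$. Decoder: $\hat{y}_n$ is added to $\hat{x}_{n-1}$ to give $\hat{x}_n$.]

$x_n, y_n$: unquantized samples and prediction residues

$\hat{x}_n, \hat{y}_n$: decoded samples and quantized prediction residues

$x_n - \hat{x}_n = y_n - \hat{y}_n$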
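A minimal closed-loop DPCM sketch for a 1D sequence. The previous decoded sample serves as the predictor, as in the diagram; the uniform quantizer, its step size and the initial predictor value 0 are assumptions made for illustration.

```python
def quantize(y, step):
    """Uniform quantizer returning the reconstructed (dequantized) residue."""
    sign = -1 if y < 0 else 1
    return sign * int(abs(y) / step + 0.5) * step

def dpcm_encode(samples, step):
    """Predict each sample from the previous decoded sample, quantize the residue."""
    residues, x_hat_prev = [], 0
    for x in samples:
        y = x - x_hat_prev              # prediction residue y_n
        y_hat = quantize(y, step)       # quantized residue
        residues.append(y_hat)
        x_hat_prev = x_hat_prev + y_hat # decoded sample, identical to the decoder's
    return residues

def dpcm_decode(residues):
    decoded, x_hat_prev = [], 0
    for y_hat in residues:
        x_hat_prev = x_hat_prev + y_hat
        decoded.append(x_hat_prev)
    return decoded

samples = [100, 104, 109, 107, 90, 95]
step = 4
decoded = dpcm_decode(dpcm_encode(samples, step))
errors = [x - xh for x, xh in zip(samples, decoded)]
print(decoded)   # [100, 104, 108, 108, 88, 96]
print(errors)    # [0, 0, 1, -1, 2, -1]
```

Because the encoder predicts from decoded samples, each reconstruction error equals the quantization error of the current residue and stays within half a step; it does not accumulate from sample to sample.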


Motion-Compensated Predictive Coding

A Closer Look


Key Components

Motion Estimation/Compensation
– At the heart of MCP-based coding

Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic

Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size

Rate-Distortion optimization


Block-based Motion Model

Block size
– Fixed vs. variable

Motion accuracy
– Integer-pel vs. fractional-pel

Number of hypotheses
– Overlapped Block Motion Compensation (OBMC)
– Multi-frame prediction


Quadtree Representation of Motion Field with Variable Blocksize


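Rate-Distortion Optimized BMA

Distortion alone: choose the motion vector that minimizes the prediction distortion D

Rate and Distortion: choose the motion vector that minimizes D + λR, where the rate R is obtained by counting bits using a VLC table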
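The difference between the two criteria can be sketched with a full-search block matcher. The distortion measure (SAD), the rate model (length of a signed Exp-Golomb codeword per motion vector component) and the test frames are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(size) for j in range(size))

def mv_bits(v):
    """Hypothetical rate model: bits of a signed Exp-Golomb codeword for one MV component."""
    symbol = 2 * v if v > 0 else -2 * v + 1      # 0, 1, -1, 2, ... -> 1, 2, 3, 4, ...
    return 2 * (symbol.bit_length() - 1) + 1

def rd_block_match(cur, ref, bx, by, size, search, lam):
    """Full search minimizing J = D + lam * R; returns (J, (dx, dy), D, R)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= h and
                      0 <= bx + dx and bx + dx + size <= w)
            if not inside:
                continue
            d = sad(cur, ref, bx, by, dx, dy, size)
            r = mv_bits(dx) + mv_bits(dy)
            j = d + lam * r
            if best is None or j < best[0]:
                best = (j, (dx, dy), d, r)
    return best

ref = [[10 * i + j for j in range(6)] for i in range(6)]
cur = [[row[max(j - 1, 0)] for j in range(6)] for row in ref]   # content moved right by one pel

print(rd_block_match(cur, ref, 2, 2, 2, 1, lam=0)[1])    # (-1, 0): distortion alone
print(rd_block_match(cur, ref, 2, 2, 2, 1, lam=10)[1])   # (0, 0): rate and distortion
```

With lam=0 the search returns the true displacement; with a large multiplier the cheaper zero vector wins because its small extra distortion costs less than the extra motion bits.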


Experimental Results

Cited from G. Sullivan and L. Baker, “Rate-Distortion optimized motion compensation for video compression using fixed or variable size blocks”, Globecom’1991


Fractional-pel BMA

Recall the tradeoff between spending bits on motion and spending bits on MCP residues

Intuitively speaking, going from integer-pel to fractional-pel helps because it dramatically reduces the variance of MCP residues for some video sequences.

The gain quickly saturates as the motion accuracy is refined


Example

MCP residue comparison for the first two frames of Mobile sequence
– 8-by-8 block, integer-pel, var(e)=220.8
– 8-by-8 block, half-pel, var(e)=123.8


Fractional-pel MCP


Multi-Hypothesis MCP

Using one block from one reference frame represents a single-hypothesis MCP

It is possible to formulate multiple hypotheses by considering
– Overlapped blocks
– More than one reference frame

Why multi-hypothesis?
– The benefit of reducing variance of MCP residues outweighs the increased overhead on motion


Example: B-frame

[Figure: frames $f_{n-1}$, $f_n$, $f_{n+1}$, with $f_n$ predicted from both neighbors]

$\hat{f}_n(x,y) = a\,f_{n-1}(x,y) + (1-a)\,f_{n+1}(x,y), \quad a = 0 / 0.5 / 1$
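The weighted prediction can be written directly from the formula. This sketch applies it pixel by pixel without any motion compensation, exactly as the formula is stated; the function name and the toy frames are illustrative.

```python
def bidirectional_prediction(prev_frame, next_frame, a):
    """Predict frame n as a * f_{n-1} + (1 - a) * f_{n+1}, pixel by pixel."""
    return [[a * p + (1 - a) * q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(prev_frame, next_frame)]

f_prev = [[100, 100]]
f_next = [[120, 80]]
print(bidirectional_prediction(f_prev, f_next, 0.5))   # [[110.0, 90.0]]
```

The three choices of a correspond to backward-only (a=0), averaged (a=0.5) and forward-only (a=1) prediction.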


Generalized B-frame

[Figure: frames $f_{n-2}$, $f_{n-1}$, $f_n$, $f_{n+1}$, $f_{n+2}$, with $f_n$ predicted from all the others]

$\hat{f}_n(x,y) = \sum_{k \neq n} a_k\, f_k(x,y)$


Multi-Hypothesis MCP





Motion Vector Coding

2D lossless DPCM
– Spatially (temporally) adjacent motion vectors are correlated
– Use causal neighbors to predict the current one
– Code Motion Vector Difference (MVD) instead of MVs

Entropy coding techniques
– Variable length codes (VLC)
– Arithmetic coding


MVD Example

[Figure: the current motion vector MV and its causal neighbors MV1, MV2, MV3]

$MVD = MV - \mathrm{median}(MV_1, MV_2, MV_3)$

Due to smoothness of MV field, MVD usually has a smaller variance than MV
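A small sketch of median prediction for motion vectors. Taking the median separately on the x and y components is an assumption here (the slide does not say how the median of vectors is defined), and the neighbor values are made up.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]

def mv_difference(mv, mv1, mv2, mv3):
    """MVD = MV - median(MV1, MV2, MV3), with the median taken per component."""
    pred = (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))
    return (mv[0] - pred[0], mv[1] - pred[1])

print(mv_difference((3, -1), (2, -1), (4, 0), (3, -2)))   # (0, 0)
print(mv_difference((5, 2), (1, 1), (4, 2), (6, 0)))      # (1, 1)
```

In a smooth motion field most differences are zero or close to it, which is what makes the short codewords of a VLC table effective.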


VLC Example

MVx/MVy   symbol   codeword
 0        1        1
 1        2        010
-1        3        011
 2        4        00100
-2        5        00101
 3        6        00110

Exponential Golomb Codes: $\underbrace{0 \cdots 0}_{m}\, 1\, \underbrace{x \cdots x}_{m}$
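The table can be generated with a short sketch: a symbol is written in binary and prefixed with one zero fewer than its number of bits. The mapping from signed values to symbols (0, 1, -1, 2, -2, ... to 1, 2, 3, 4, 5, ...) follows the table; the function names are hypothetical.

```python
def signed_to_symbol(v):
    """Map 0, 1, -1, 2, -2, 3, ... to symbols 1, 2, 3, 4, 5, 6, ..."""
    return 2 * v if v > 0 else -2 * v + 1

def exp_golomb(symbol):
    """m zeros, then the (m+1)-bit binary form of the symbol (which starts with 1)."""
    bits = bin(symbol)[2:]
    return "0" * (len(bits) - 1) + bits

def exp_golomb_decode(code):
    """Inverse of exp_golomb for a single codeword."""
    m = code.index("1")               # number of leading zeros
    return int(code[m:2 * m + 1], 2)

for v in (0, 1, -1, 2, -2, 3):
    print(v, signed_to_symbol(v), exp_golomb(signed_to_symbol(v)))
```

The codeword length grows only logarithmically with the symbol index, and no table needs to be stored at the decoder.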





MCP Residue Coding

Transform → Quantization → Coding

Conceptually similar to JPEG

Transform: unitary transform

Quantization: Deadzone quantization

Coding: Run-length coding


Transform

Unitary matrix: A is real, $A^{-1} = A^T$

Unitary transform: A is unitary, $Y = A X A^T$

Examples
– 8-by-8 DCT
– 4-by-4 integer transform

$A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$
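A quick check of the 4-by-4 integer transform, assuming the sign pattern of the matrix is the one used by the H.264 core transform (the signs are not legible in the extracted slide). The sketch computes $A A^T$ and applies $Y = A X A^T$ to a flat block.

```python
A = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform(x):
    """Y = A X A^T using integer arithmetic only."""
    return matmul(matmul(A, x), transpose(A))

print(matmul(A, transpose(A)))              # diagonal: 4, 10, 4, 10
flat = [[5] * 4 for _ in range(4)]
print(forward_transform(flat)[0][0])        # 80: all energy in the DC term
```

The rows are mutually orthogonal but not of unit norm, so a per-coefficient scaling is still needed before the transform can be treated as unitary.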


Deadzone Quantization

[Figure: deadzone quantizer on the real line, with the deadzone centered at 0 and the codewords marked on either side]
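One common form of deadzone quantizer is sketched below, as an assumption about the variant meant on the slide: the index is the magnitude divided by the step and truncated toward zero, so the zero bin is twice as wide as the others. Midpoint reconstruction is likewise just one possible choice.

```python
import math

def deadzone_quantize(x, step):
    """Index = sign(x) * floor(|x| / step); every |x| < step falls in the deadzone."""
    sign = -1 if x < 0 else 1
    return sign * math.floor(abs(x) / step)

def deadzone_dequantize(index, step):
    """Reconstruct at the midpoint of the bin (0 for the deadzone)."""
    if index == 0:
        return 0.0
    sign = -1 if index < 0 else 1
    return sign * (abs(index) + 0.5) * step

step = 10
print([deadzone_quantize(x, step) for x in (-25, -9, 0, 9, 10, 25)])   # [-2, 0, 0, 0, 1, 2]
print(deadzone_dequantize(2, step))                                      # 25.0
```

Small residues, which dominate after good motion compensation, are all mapped to zero, which lengthens the zero runs seen by the run-length coder.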





Lagrangian Multiplier Method

$J = D + \lambda R$

Motion estimation: $J = D_{dfd} + \lambda_{Motion} R_{Motion}$

Mode selection: $J = D_{REC} + \lambda_{Mode} R_{REC}$

$\lambda_{Motion} = \sqrt{\lambda_{Mode}}, \quad \lambda_{Mode} = c \cdot QUANT^2$

QUANT: a user-specified parameter controlling quantization stepsize


Summary

How does MCP coding work?
– The predictive model captures the slowly varying trend of the samples {f_n}
– The modeling of prediction residues {e_n} is easier than that of original samples {f_n}

Fundamental weakness
– Quantization error will propagate unless the memory of predictor is refreshed
– Not suitable for scalable coding applications








Scalable vs. Multicast

What is scalable coding?

Multicast: foreman.yuv is encoded into four separate bitstreams, foreman128k.cod, foreman256k.cod, foreman512k.cod and foreman1024k.cod

Scalable coding: foreman.yuv is encoded into a single bitstream, foreman.cod, that can be decoded at 128, 256, 512 or 1024


Spatial scalability

[Figure: a single embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0, decoded at successive truncation points]


Temporal scalability

[Figure: a single embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0, decoded at three frame rates]
– Frame 0,1,2,3,4,5,…: 30Hz
– Frame 0,2,4,6,8,…: 15Hz
– Frame 0,4,8,12,…: 7.5Hz


SNR (Rate) scalability

[Figure: a single embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0, decoded at three quality levels: PSNRavg=30dB, PSNRavg=35dB, PSNRavg=40dB]

$PSNR_{avg} = \frac{1}{N} \sum_{i=1}^{N} PSNR_i$

$PSNR_i$: PSNR of frame i


Scalability via Bit-Plane Coding

$A = \pm(a_0 + a_1 2 + a_2 2^2 + \dots + a_7 2^7)$

$a_0$: Least Significant Bit (LSB); $a_7$: Most Significant Bit (MSB); the sign is carried by a separate sign bit

Example: A=129, sign=+, a0a1a2…a7=10000001

Example: sign=-, a0a1a2…a7=00110011, A=-(4+8+64+128)=-204
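The sign-magnitude decomposition and its use for scalability can be sketched as follows. The bit ordering (a0 first) matches the two examples; decoding from only the most significant planes is shown as one illustrative way a truncated bitstream yields a coarser value.

```python
def to_bitplanes(value, nbits=8):
    """Return (sign, [a0, a1, ..., a7]) with a0 the least significant bit."""
    sign = "-" if value < 0 else "+"
    mag = abs(value)
    return sign, [(mag >> k) & 1 for k in range(nbits)]

def from_bitplanes(sign, bits, planes_received=None):
    """Rebuild the value from the `planes_received` most significant planes (all by default)."""
    n = len(bits)
    keep = n if planes_received is None else planes_received
    mag = sum(bits[k] << k for k in range(n - keep, n))
    return -mag if sign == "-" else mag

print(to_bitplanes(129))     # ('+', [1, 0, 0, 0, 0, 0, 0, 1])
print(to_bitplanes(-204))    # ('-', [0, 0, 1, 1, 0, 0, 1, 1])
sign, bits = to_bitplanes(-204)
print([from_bitplanes(sign, bits, p) for p in (2, 5, 8)])   # [-192, -200, -204]
```

Each additional bit plane refines the value, so cutting the stream after any plane still gives a usable approximation.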


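Why Is DPCM Bad for Scalability?

[Figure: layered prediction structures over frame numbers 1, 2, 3, …, with a base layer (Ibase followed by P frames), Enhancement Layer 1 (Ienh1 followed by P frames) and Enhancement Layer 2 (Ienh2 followed by P frames). Two alternatives are shown: one suffers from the drifting problem, the other suffers from coding efficiency loss.]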


3D Wavelet/Subband Coding

[Figure: video volume with spatial axes x, y and temporal axis t]

2D spatial WT + 1D temporal WT


Motion-Adaptive 3D Wavelet Transform

Recall Haar transform

$s(n) = \frac{1}{2}\,(x(2n) + x(2n+1)), \quad d(n) = x(2n+1) - x(2n)$

Lifting-based implementation

$d(n) = x(2n+1) - x(2n), \quad s(n) = x(2n) + \frac{1}{2}\,d(n)$

Motion-adaptive Haar transform

$d_n = f_{2n+1} - W[f_{2n}], \quad s_n = f_{2n} + \frac{1}{2}\,W^{-1}[d_n]$

$W, W^{-1}$: forward and backward motion vector
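A minimal sketch of the lifting-based Haar transform along one dimension. It assumes the convention d(n) = x(2n+1) - x(2n) followed by s(n) = x(2n) + d(n)/2, and it omits the motion warping W entirely (W is taken as the identity), so it only illustrates the predict/update structure and its invertibility.

```python
def haar_lifting_forward(x):
    """Predict step gives the differences d, update step gives the averages s."""
    half = len(x) // 2
    d = [x[2 * n + 1] - x[2 * n] for n in range(half)]   # predict odd from even
    s = [x[2 * n] + d[n] / 2 for n in range(half)]       # update even with d
    return s, d

def haar_lifting_inverse(s, d):
    """Undo the lifting steps in reverse order."""
    x = []
    for sn, dn in zip(s, d):
        even = sn - dn / 2     # undo update
        odd = even + dn        # undo predict
        x.extend([even, odd])
    return x

x = [10, 14, 20, 16]
s, d = haar_lifting_forward(x)
print(s, d)                           # [12.0, 18.0] [4, -4]
print(haar_lifting_inverse(s, d))     # [10.0, 14.0, 20.0, 16.0]
```

Each lifting step is undone by subtracting what was added, so the transform remains invertible even when the predict and update operators are replaced by motion-compensated ones.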
