
Kai-Chao Yang

Hierarchical Prediction Structures in H.264/AVC

Outline

Analysis of Hierarchical B Pictures and MCTF (ICME 2006)

Multiple Description Video Coding using Hierarchical B Pictures (ICME 2007)

Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding (ISCAS 2006)

All related research

Heiko Schwarz, Detlev Marpe, and Thomas Wiegand

ICME 2006

Analysis of Hierarchical B Pictures and MCTF

Hierarchical B-Pictures (1/2)

Key pictures

Hierarchical prediction structures
o Dyadic structure
o Non-dyadic structure

[Figure: a sequence that starts with an IDR picture followed by I/P key pictures; consecutive key pictures bound one GOP each, and the pictures inside every GOP are coded with hierarchical prediction.]
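The dyadic structure can be illustrated with a short sketch. It assumes a power-of-two GOP size, treats the picture at display index gop_size as the key picture (level 0), and lists the B pictures level by level. The function name and the level-by-level ordering are illustrative choices, not taken from the slides; an encoder that wants the smallest DPB may prefer a different (depth-first) coding order.

```python
def dyadic_hierarchy(gop_size):
    """Return (display_index, temporal_level) pairs in one valid coding order.

    Display index 0 (the previous key picture) is assumed to be coded already.
    Display index gop_size is the key picture that closes the GOP (level 0).
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    order = [(gop_size, 0)]  # key picture first
    level = 1
    step = gop_size
    while step > 1:
        half = step // 2
        # Pictures midway between two already-coded pictures form the next level.
        for mid in range(half, gop_size, step):
            order.append((mid, level))
        step = half
        level += 1
    return order


if __name__ == "__main__":
    for index, level in dyadic_hierarchy(8):
        print(f"display index {index}: level {level}")
```

For a GOP of 8 pictures this yields four hierarchy levels (0 to 3), with four of the eight pictures on the highest level.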

Hierarchical B-Pictures (2/2)

Coding delay
o Minimum coding delay = hierarchy levels - 1

Memory requirement
o Maximum decoded picture buffer (DPB): 16
o Reference picture buffering type: sliding window or adaptive memory control
o Memory management control operation (MMCO)
    0: end MMCO loop
    1: mark a short-term frame as “unused”
    2: mark a long-term frame as “unused”
    3: assign a long-term index to a frame
    4: specify the maximum long-term frame index
    5: reset
o Minimum DPB size = hierarchy levels

[Figure: pictures labelled with coding order 0 to 5, and a frame buffer with positions 0 … N that holds short-term frames (ordered from new to old, the oldest being replaced) and long-term frames.]

Thomas Wiegand, “Joint Committee Draft (CD),” Joint Video Team, JVT-C167, 6-10 May, 2002
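The sliding-window buffering type can be pictured as a first-in, first-out queue of short-term frames. The sketch below is a deliberately simplified model: it ignores long-term frames, MMCO commands, and field coding, and the class and method names are hypothetical.

```python
from collections import deque


class SlidingWindowBuffer:
    """Toy model of sliding-window marking for short-term reference frames."""

    def __init__(self, max_frames):
        if max_frames < 1:
            raise ValueError("max_frames must be at least 1")
        self.max_frames = max_frames
        self.short_term = deque()  # oldest frame on the left

    def store(self, frame_num):
        """Store a new reference frame; return the frame that was dropped, if any."""
        dropped = None
        if len(self.short_term) == self.max_frames:
            # Buffer full: the oldest short-term frame is replaced.
            dropped = self.short_term.popleft()
        self.short_term.append(frame_num)
        return dropped


if __name__ == "__main__":
    buffer = SlidingWindowBuffer(max_frames=3)
    for n in range(5):
        print(n, "dropped:", buffer.store(n))
```

Adaptive memory control exists precisely because this first-in, first-out rule is not always what a hierarchical structure needs: a key picture may have to stay in the buffer longer than more recently coded B pictures.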

Coding Efficiency of Hierarchical B-Pictures

QP_k = QP_{k-1} + (k = 1 ? 4 : 1)

Problem: PSNR fluctuations
o High spatial detail and slow regular motion
o Fast and complex motion
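The QP rule above can be evaluated level by level. The sketch below assumes that k indexes the hierarchy level with k = 0 for key pictures, and it clamps the result to 51, the largest QP value in H.264/AVC for 8-bit video. The function name and the clamping are illustrative additions.

```python
def cascaded_qp(qp_key, num_levels, qp_max=51):
    """Return the QP for each hierarchy level, starting from the key-picture QP."""
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    qps = [qp_key]
    for k in range(1, num_levels):
        delta = 4 if k == 1 else 1  # QP_k = QP_{k-1} + (k = 1 ? 4 : 1)
        qps.append(min(qps[-1] + delta, qp_max))
    return qps


if __name__ == "__main__":
    print(cascaded_qp(28, 4))
```

With a key-picture QP of 28 and four levels, the levels receive QP 28, 32, 33, and 34. Coarser quantization at higher levels is one source of the PSNR fluctuations named on the slide.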

Visual Quality

Comparison of visual quality
o Larger GOP sizes preserve finer detail in background regions.

[Figure: visual comparison of IBBP coding and hierarchical B-pictures with GOP size 16.]

MCTF Versus Hierarchical B-Pictures

Drawbacks of MCTF
o Open-loop encoder control
o Significant cost in update stage

Minglei Liu and Ce Zhu

ICME 2007

Multiple Description Video Coding using Hierarchical B Pictures

Concept of Multiple Description Coding

Multiple bit-streams are generated from one source signal and transmitted over separate channels

[Figure: an MDC encoder turns the source signal into two streams, S1 and S2, sent over Channel 1 and Channel 2. In the MDC decoder, Decoders 1 to 3 output the decoded signal from S1, from S1 and S2, and from S2.]

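The proposed architecture for MDC

GOP size = 8

Two output streams (S1, S2) are generated

[Figure: GOP structures of the two streams and their combination. One stream is labelled with pictures i and i+8, the other with pictures i+1 and i+9; the combination contains pictures i, i+1, i+2, …, i+9.]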

Coding Efficiency (1/2)

Improvement of coding efficiency
o Increasing QP values for higher layers
o Transmitting MVs only for higher layers
o Skipping frames at higher layers

Coding Efficiency (2/2)

[Figure: two plots, each showing central distortion and side distortion.]

Max. QP = 51 for highest level

Huifeng Shen, Xiaoyan Sun, Feng Wu, and Shipeng Li

ISCAS 2006

Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding

Rate Reduction Transcoding (1/3)

Cascaded pixel-domain transcoding structure
o Fully decoding the original signal, and then re-encoding it

A. Vetro, C. Christopoulos, and H. Sun, "Video transcoding architectures and techniques: an overview," IEEE Signal Processing Magazine, March 2003.

Rate Reduction Transcoding (2/3)

Open-loop transcoding in coded domain
o Partially decoding the original signal and re-quantizing DCT coefficients
o Introduces drift
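Re-quantization in the coded domain can be sketched as follows. The example assumes a plain uniform scalar quantizer with round-to-nearest, which is simpler than the integer transform and quantizer that H.264/AVC actually uses; the function name and step sizes are hypothetical.

```python
def requantize(levels, step_in, step_out):
    """Map quantized coefficient levels from step_in to a coarser step_out."""
    if step_in <= 0 or step_out <= 0:
        raise ValueError("quantizer steps must be positive")
    result = []
    for level in levels:
        value = level * step_in  # reconstruct the coefficient value
        magnitude = int(abs(value) / step_out + 0.5)  # round to nearest level
        result.append(magnitude if value >= 0 else -magnitude)
    return result


if __name__ == "__main__":
    print(requantize([10, -3, 1, 0], step_in=4, step_out=10))
```

Because the encoder's reference pictures were built from the original coefficients, the decoder's references no longer match after this step, and the mismatch accumulates along the prediction chain as drift.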


Rate Reduction Transcoding (3/3)

Closed-loop transcoding with drift compensation
o Partially decoding the original signal, and then compensating for the drift introduced by re-quantization


Hierarchical B Pictures Transcoding

Open-loop transcoding method can be used
o Motion information is unchanged; DCT coefficients are truncated, re-quantized, or partially discarded
o Drift inside a GOP will not propagate to other GOPs

However, motion is more important in the hierarchical B-picture structure
o At low bit-rates, most bits are spent on motion information

Proposed RDO model: combination of texture RDO and motion RDO

Traditional Rate-Distortion Model

RD model

$$I^{*} = \arg\min_{I} J(S, I \mid \lambda)$$

$$J(S, I \mid \lambda) = D_{total}(S, I) + \lambda R_{total}(S, I)$$

o S = (S_1, …, S_k) denotes k MBs
o I = (I_1, …, I_k) denotes k coding parameters of S

Fully decoding and re-encoding is needed!
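The Lagrangian selection in the traditional RD model amounts to evaluating J = D + λR for every candidate coding parameter and keeping the smallest cost. The sketch below assumes that distortion and rate have already been measured for each candidate; the mode names and numbers are made up for illustration.

```python
def best_parameter(candidates, lam):
    """Pick the candidate with the smallest Lagrangian cost J = D + lam * R.

    candidates: iterable of (parameter, distortion, rate) tuples.
    Returns (parameter, cost).
    """
    best = None
    best_cost = float("inf")
    for parameter, distortion, rate in candidates:
        cost = distortion + lam * rate
        if cost < best_cost:
            best, best_cost = parameter, cost
    if best is None:
        raise ValueError("no candidates given")
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical (mode, distortion, rate) measurements for one macroblock.
    modes = [("skip", 100.0, 1.0), ("inter16x16", 40.0, 20.0), ("intra4x4", 10.0, 90.0)]
    for lam in (0.1, 1.0, 10.0):
        print(lam, best_parameter(modes, lam))
```

Obtaining the distortion and rate of every candidate is what forces a full decode and re-encode, which is the cost the slide points out.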

Proposed Rate-Distortion Model (1/4)

Proposed RD model

$$\min_{I} J(S, I \mid \lambda) = \min_{I}\bigl(J_{motion}(S, I \mid \lambda)\bigr) + \min_{I}\bigl(J_{texture}(S, I \mid \lambda)\bigr)$$

$$J_{motion}(S, I \mid \lambda) = D_{motion}(S, I) + \lambda R_{motion}(S, I)$$

$$J_{texture}(S, I \mid \lambda) = D_{texture}(S, I) + \lambda R_{texture}(S, I)$$

Claim

$$D_{total} = D_{texture} + D_{motion}$$

$$R_{total} = R_{texture} + R_{motion}$$

o R_texture: rate spent in coding quantized DCT coefficients
o R_motion: rate spent in coding MB modes, block modes, and MVs
o D_texture: distortion caused by downscaled texture with unchanged MVs
o D_motion: distortion caused by motion adjustment relative to the unchanged motion case

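Proposed Rate-Distortion Model (2/4)

Texture RDO model

To minimize the RD function, let

$$\frac{\partial J_{texture}}{\partial R_{texture}} = \frac{\partial D_{texture}}{\partial R_{texture}} + \lambda = 0 \quad\Rightarrow\quad \lambda = -\frac{\partial D_{texture}}{\partial R_{texture}}$$

With the Cauchy-density-based rate and distortion models

$$R = a Q^{-\alpha}, \qquad D = b Q^{\beta}$$

the slope becomes

$$\lambda = -\frac{\partial D}{\partial R} = -\frac{\partial D / \partial Q}{\partial R / \partial Q} = c\,Q^{\alpha+\beta}, \qquad c = \frac{b\beta}{a\alpha}$$

$$\lg \lambda = \lg c + (\alpha + \beta) \lg Q$$

[Figure: fitted relation between λ and Q; the values 2.54, -5.35, 54.2 and 41 appear on the slide.]

N. Kamaci, Y. Altunbasak, and R.M. Mersereau, "Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models," IEEE Trans. on CSVT, Vol. 15, No. 8, Aug. 2005.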
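The texture multiplier can be checked numerically. The sketch below assumes power-law models R = a·Q^(-α) and D = b·Q^β, the form used in the cited Cauchy-density-based work, and compares the closed-form slope λ = (bβ / aα)·Q^(α+β) with a finite-difference estimate of -dD/dR. All parameter values are hypothetical.

```python
def rate(q, a, alpha):
    """Assumed rate model R = a * Q^(-alpha)."""
    return a * q ** (-alpha)


def distortion(q, b, beta):
    """Assumed distortion model D = b * Q^beta."""
    return b * q ** beta


def texture_lambda(q, a, alpha, b, beta):
    """Closed-form slope -dD/dR for the two power-law models."""
    return (b * beta) / (a * alpha) * q ** (alpha + beta)


def numeric_lambda(q, a, alpha, b, beta, eps=1e-4):
    """Central-difference estimate of -dD/dR around quantizer step q."""
    d_dist = distortion(q + eps, b, beta) - distortion(q - eps, b, beta)
    d_rate = rate(q + eps, a, alpha) - rate(q - eps, a, alpha)
    return -d_dist / d_rate


if __name__ == "__main__":
    params = dict(a=5000.0, alpha=1.2, b=0.5, beta=1.5)  # hypothetical values
    for q in (10.0, 20.0, 40.0):
        print(q, texture_lambda(q, **params), numeric_lambda(q, **params))
```

Since lg λ is linear in lg Q under these models, the exponent and constant can be estimated from a straight-line fit on a log-log plot.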

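Proposed Rate-Distortion Model (3/4)

Motion RDO model

R_motion can be easily computed, but D_motion is unknown

D_motion can be approximated by the MV mean-square error

$$D_{motion} \approx G_{l}\, D_{mv}$$

[Equations: definitions of D_mv, built from the squared differences of the x and y motion-vector components, and of the gain G_l, an integral over ω involving the spectrum S(ω) and a level-dependent term n_l. The full expressions did not survive extraction.]

A. Secker and D. Taubman, "Highly scalable video compression with scalable motion coding," IEEE Trans. on Image Processing, Vol. 13, No. 8, August 2004.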
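The approximation of D_motion by the MV mean-square error can be sketched as below. The example assumes that motion vectors are given as (x, y) pairs and that the scaling gain is supplied by the caller; the gain value used here is a placeholder, not the expression from the slide.

```python
def mv_mse(original, adjusted):
    """Mean-square error between two equally long lists of (x, y) motion vectors."""
    if not original or len(original) != len(adjusted):
        raise ValueError("need two non-empty MV lists of equal length")
    total = 0.0
    for (ox, oy), (ax, ay) in zip(original, adjusted):
        total += (ox - ax) ** 2 + (oy - ay) ** 2
    return total / len(original)


def motion_distortion(original, adjusted, gain):
    """Approximate D_motion as gain * D_mv (gain is a caller-supplied assumption)."""
    return gain * mv_mse(original, adjusted)


if __name__ == "__main__":
    before = [(4, 0), (2, 2)]
    after = [(3, 0), (2, 0)]
    print(mv_mse(before, after))
    print(motion_distortion(before, after, gain=2.0))
```

The appeal of this approximation is that it needs only the motion vectors themselves, so the transcoder can estimate the effect of a motion adjustment without reconstructing pixels.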

Proposed Rate-Distortion Model (4/4)

Motion adjustment

[Figure: motion vectors before (Original) and after (Adjustment) the motion adjustment.]

Simulation results

All related research

o Rate control optimization
o Bit allocation
o Trade-off between coding efficiency and delay
o Multi-view
o Temporal scalable coding in SVC
o Elimination of PSNR fluctuation?
o More efficient hierarchical structures?
