Kai-Chao Yang
Hierarchical Prediction Structures in H.264/AVC
Outline
Analysis of Hierarchical B Pictures and MCTF (ICME 2006)
Multiple Description Video Coding using Hierarchical B Pictures (ICME 2007)
Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding (ISCAS 2006)
All related research
Heiko Schwarz, Detlev Marpe, and Thomas Wiegand
ICME 2006
Analysis of Hierarchical B Pictures and MCTF
Hierarchical B-Pictures (1/2)
Key pictures
Hierarchical prediction structures
Dyadic structure
Non-dyadic structure
[Figure: key pictures (IDR, I/P, I/P, I/P) bound consecutive GOPs; the pictures between them are coded with hierarchical prediction]
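The dyadic structure named on this slide can be sketched in a few lines. The following is a minimal illustration, assuming a GOP whose size is a power of two and whose last picture is the key picture; the function names `temporal_level` and `coding_order` are hypothetical, and the coding order shown (coarsest level first) is only one valid order, not necessarily the one that minimizes delay.

```python
def temporal_level(poc: int, gop_size: int) -> int:
    """Temporal level of a picture in a dyadic GOP (0 = key picture)."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    max_level = gop_size.bit_length() - 1          # log2(gop_size)
    offset = poc % gop_size
    if offset == 0:
        return 0                                   # key picture
    trailing_zeros = (offset & -offset).bit_length() - 1
    return max_level - trailing_zeros


def coding_order(gop_size: int) -> list[int]:
    """Display indices 1..gop_size sorted by level, then by display order."""
    pictures = range(1, gop_size + 1)
    return sorted(pictures, key=lambda p: (temporal_level(p, gop_size), p))


if __name__ == "__main__":
    print([temporal_level(p, 8) for p in range(9)])
    print(coding_order(8))
```

With a GOP size of 8 this gives four hierarchy levels (0 to 3), and every B picture is coded after the two lower-level pictures that surround it.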
Hierarchical B-Pictures (2/2)
Coding delay
Minimum coding delay = hierarchy levels - 1
Memory requirement
Maximum decoded picture buffer (DPB): 16 frames
Reference picture buffering type
Sliding window
Adaptive memory control
Memory management control operation (MMCO)
0: end MMCO loop
1: mark a short-term frame as "unused"
2: mark a long-term frame as "unused"
3: assign a long-term index to a frame
4: specify the maximum long-term frame index
5: reset
Minimum DPB size = hierarchy levels
[Figure: coding order of the pictures (0 to 5), and a frame buffer with slots 0 to N holding short-term and long-term frames, in which a new frame replaces an old one]
Thomas Wiegand, "Joint Committee Draft (CD)," Joint Video Team, JVT-C167, 6-10 May, 2002
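The sliding-window buffering type can be pictured as a first-in, first-out queue of short-term frames. This is a simplified sketch under stated assumptions: it ignores long-term frames and MMCO commands entirely, and `sliding_window` and `max_short_term` are hypothetical names rather than anything defined by the standard.

```python
from collections import deque


def sliding_window(frame_nums, max_short_term):
    """Return the short-term buffer contents after each frame is stored."""
    # A deque with maxlen drops its oldest entry when a new one is appended
    # to a full buffer, which mimics "new replaces old" in the figure.
    buffer = deque(maxlen=max_short_term)
    history = []
    for frame_num in frame_nums:
        buffer.append(frame_num)
        history.append(list(buffer))
    return history


if __name__ == "__main__":
    for state in sliding_window([0, 1, 2, 3], max_short_term=2):
        print(state)
```

Adaptive memory control exists precisely because this fixed policy cannot keep an old frame that a hierarchical structure still needs as a reference.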
Coding Efficiency of Hierarchical B-Pictures
QP_k = QP_{k-1} + (k = 1 ? 4 : 1)
Problem: PSNR fluctuations
High spatial detail and slow regular motion
Fast and complex motion
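The QP cascading rule on this slide is easy to tabulate. The sketch below applies it level by level, with level 0 as the key-picture level; the function name `cascaded_qps` is hypothetical, and the clamp to 51 is an added assumption based on the H.264 QP range, not something stated on the slide.

```python
def cascaded_qps(base_qp: int, num_levels: int, max_qp: int = 51) -> list[int]:
    """QP per hierarchy level using QP_k = QP_{k-1} + (4 if k == 1 else 1)."""
    qps = [base_qp]
    for k in range(1, num_levels):
        step = 4 if k == 1 else 1
        # Clamp to the largest QP that H.264 allows (assumption of this sketch).
        qps.append(min(qps[-1] + step, max_qp))
    return qps


if __name__ == "__main__":
    print(cascaded_qps(28, 4))
```

Because higher levels are quantized more coarsely, neighboring pictures differ in quality, which is one way to read the PSNR fluctuation problem noted above.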
Visual Quality
Comparison of visual quality
Larger GOP sizes preserve finer detail in background regions.
[Figure: IBBP versus GOP size 16]
MCTF Versus Hierarchical B-Pictures
Drawbacks of MCTF
Open-loop encoder control
Significant cost in update stage
Minglei Liu and Ce Zhu
ICME 2007
Multiple Description Video Coding using Hierarchical B Pictures
Concept of Multiple Description Coding
Multiple bit-streams are generated from one source signal and transmitted over separate channels
[Figure: the MDC encoder turns the source signal into streams S1 and S2, sent over Channel 1 and Channel 2; in the MDC decoder, Decoders 1, 2, and 3 output the signal decoded from S1, from S1 and S2, and from S2, respectively]
The proposed architecture for MDC
GOP size = 8
Two output streams (S1, S2) are generated
[Figure: GOP structures of S1 and S2 and their combination; labeled pictures: i, i+8 / i+1, i+9 / i, i+8, i+1, i+9, i+3, i+5, i+7, i+6, i+4, i+2]
Coding Efficiency (1/2)
Improvement of coding efficiency
Increasing QP values for higher layers
Transmitting MVs only for higher layers
Skipping frames at higher layers
Coding Efficiency (2/2)
[Figure: two plots of central distortion and two plots of side distortion]
Max. QP = 51 for highest level
Huifeng Shen, Xiaoyan Sun, Feng Wu, and Shipeng Li
ISCAS 2006
Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding
Rate Reduction Transcoding (1/3)
Cascaded pixel-domain transcoding structure
Fully decoding the original signal, and then re-encoding it
A. Vetro, C. Christopoulos, and H. Sun, "Video transcoding architectures and techniques: an overview", IEEE Signal processing magazine, March 2003.
Rate Reduction Transcoding (2/3)
Open-loop transcoding in coded domain
Partially decoding the original signal and re-quantizing DCT coefficients
Subject to drift
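The re-quantization step of open-loop transcoding can be illustrated with a plain uniform scalar quantizer. This is a minimal sketch, not the H.264 integer quantizer: `requantize`, `q_in`, and `q_out` are hypothetical names, and the coefficient levels are made-up values.

```python
def requantize(levels, q_in, q_out):
    """Map quantized levels from step size q_in to a coarser step size q_out."""
    out = []
    for level in levels:
        coeff = level * q_in                       # dequantize with the input step
        magnitude = int(abs(coeff) / q_out + 0.5)  # round half away from zero
        out.append(magnitude if coeff >= 0 else -magnitude)
    return out


if __name__ == "__main__":
    print(requantize([10, -3, 1, 0], q_in=4, q_out=8))
```

Motion vectors and reference pictures are left untouched, so the rounding error introduced here is never fed back into the prediction loop; that mismatch is the source of drift.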
Rate Reduction Transcoding (3/3)
Closed-loop transcoding with drift compensation
Partially decoding the original signal, and then compensating the re-quantized drift data
Hierarchical B Pictures Transcoding
Open-loop transcoding method can be used
Motion information is unchanged; DCT coefficients are truncated, re-quantized, or partially discarded
Drift inside a GOP will not propagate to other GOPs
However, motion is more important in the hierarchical B-pictures structure
At low bit rates, most bits are spent on motion information
Proposed RDO model: combination of texture RDO and motion RDO
Traditional Rate-Distortion Model
RD model
$$J(S, I \mid \lambda) = D_{total}(S, I) + \lambda R_{total}(S, I)$$
$$I^* = \arg\min_I J(S, I \mid \lambda)$$
S = (S1, …, Sk) denotes k MBs
I = (I1, …, Ik) denotes k coding parameters of S
Fully decoding and re-encoding is needed!
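The traditional model amounts to picking, for each macroblock, the coding parameter with the smallest Lagrangian cost D + λR. The sketch below shows that selection on made-up numbers; `best_parameter`, the candidate names, and the distortion and rate values are all hypothetical and serve only to show how λ shifts the choice.

```python
def best_parameter(candidates, lam):
    """Return the parameter whose cost D + lam * R is smallest.

    candidates: iterable of (parameter, distortion, rate) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


if __name__ == "__main__":
    # Hypothetical per-macroblock measurements: (mode, distortion, rate in bits).
    modes = [
        ("skip", 120.0, 2.0),
        ("inter16x16", 60.0, 20.0),
        ("intra4x4", 30.0, 90.0),
    ]
    for lam in (0.1, 1.0, 10.0):
        print(lam, best_parameter(modes, lam))
```

Obtaining the distortion and rate of every candidate is what forces full decoding and re-encoding, which is the cost the slide points out.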
Proposed Rate-Distortion Model (1/4)
Proposed RD model
$$\min_I J(S, I \mid \lambda) \approx \min_I \bigl(J_{motion}(S, I \mid \lambda)\bigr) + \min_I \bigl(J_{texture}(S, I \mid \lambda)\bigr)$$
$$J_{motion}(S, I \mid \lambda) = D_{motion}(S, I) + \lambda R_{motion}(S, I)$$
$$J_{texture}(S, I \mid \lambda) = D_{texture}(S, I) + \lambda R_{texture}(S, I)$$
Claim
$$D_{total} = D_{texture} + D_{motion}$$
$$R_{total} = R_{texture} + R_{motion}$$
Rtexture: rate spent in coding quantized DCT coefficients
Rmotion: rate spent in coding MB modes, block modes, and MVs
Dtexture: distortion caused by downscaled texture with unchanged MVs
Dmotion: distortion caused by motion adjustment relative to the unchanged motion case
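The step from the two claimed sums to a split cost function is short. As a sketch, substituting the total distortion and total rate into the Lagrangian cost gives:

```latex
\begin{aligned}
J(S, I \mid \lambda) &= D_{total}(S, I) + \lambda R_{total}(S, I) \\
  &= \bigl(D_{texture} + D_{motion}\bigr) + \lambda \bigl(R_{texture} + R_{motion}\bigr) \\
  &= J_{texture}(S, I \mid \lambda) + J_{motion}(S, I \mid \lambda)
\end{aligned}
```

Minimizing the two terms separately matches the joint minimum only when the parameters that control texture and motion can be chosen independently; the slides do not state which assumption is made, so the separate minimization is best read as an approximation.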
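Proposed Rate-Distortion Model (2/4)
Texture RDO model
To minimize the RD function, let
$$\frac{\partial J_{texture}}{\partial R_{texture}} = \frac{\partial D_{texture}}{\partial R_{texture}} + \lambda = 0 \quad\Rightarrow\quad \lambda = -\frac{\partial D_{texture}}{\partial R_{texture}}$$
With the rate and distortion models
$$R = aQ^{-\alpha}, \qquad D = bQ^{\beta}$$
the multiplier becomes
$$\lambda = -\frac{\partial D_{texture}}{\partial R_{texture}} = -\frac{\partial D / \partial Q}{\partial R / \partial Q} = cQ^{\gamma}$$
$$\lg \lambda = \lg c + \gamma \lg Q = 2.54 \lg Q - 5.35$$
$$\lambda \approx \frac{1}{41} Q^{2.54}$$
(lg is the base-2 logarithm)
N. Kamaci, Y. Altunbasak, and R.M. Mersereau, "Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models", IEEE Trans. on CSVT, Vol. 15, No. 8, Aug. 2005.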
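The relation between λ and Q can be checked numerically. The sketch below assumes the power-law forms R = aQ^(-α) and D = bQ^β from the cited rate and distortion models, derives λ = -dD/dR = (bβ / (aα)) Q^(α+β) by the chain rule, and compares it with a finite-difference estimate. The constants `a`, `alpha`, `b`, and `beta` are hypothetical values chosen for illustration, not values from the paper.

```python
A, ALPHA = 1000.0, 1.2   # hypothetical rate-model constants
B, BETA = 0.5, 1.3       # hypothetical distortion-model constants


def rate(q):
    """Assumed rate model R = a * Q^(-alpha)."""
    return A * q ** (-ALPHA)


def distortion(q):
    """Assumed distortion model D = b * Q^beta."""
    return B * q ** BETA


def lambda_analytic(q):
    """lambda = -dD/dR = (b*beta / (a*alpha)) * Q^(alpha + beta)."""
    return (B * BETA) / (A * ALPHA) * q ** (ALPHA + BETA)


def lambda_numeric(q, h=1e-4):
    """Central-difference estimate of -dD/dR along the curve traced by Q."""
    d_dist = distortion(q + h) - distortion(q - h)
    d_rate = rate(q + h) - rate(q - h)
    return -d_dist / d_rate


if __name__ == "__main__":
    for q in (10.0, 20.0, 40.0):
        print(q, lambda_analytic(q), lambda_numeric(q))
```

Under these assumptions λ is itself a power law in Q, which is why a straight-line fit of lg λ against lg Q is a natural way to estimate it.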
Proposed Rate-Distortion Model (3/4)
Motion RDO model
Rmotion can be easily computed, but Dmotion is unknown
Dmotion can be approximated by the MV mean-square error
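[Equations: Dmotion ≈ G_l · D_mv, where D_mv is the mean-square error between the original and adjusted MVs and G_l is a gain factor computed from an integral over the spectrum S(ω)]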
A. Secker and D. Taubman, "Highly scalable video compression with scalable motion coding", IEEE Trans. on Image Processing, Vol. 13, No.8, August 2004.
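The MV mean-square error that stands in for Dmotion can be computed directly from the two motion fields. This is a minimal sketch under stated assumptions: the error sums the squared differences of both MV components, and `gain` is a hypothetical placeholder for the sensitivity factor that scales MV error into picture distortion; its value here is made up.

```python
def mv_mse(original, adjusted):
    """Mean-square error between two equally long lists of (x, y) MVs."""
    if len(original) != len(adjusted) or not original:
        raise ValueError("MV lists must be non-empty and of equal length")
    total = sum(
        (ox - ax) ** 2 + (oy - ay) ** 2
        for (ox, oy), (ax, ay) in zip(original, adjusted)
    )
    return total / len(original)


def motion_distortion(original, adjusted, gain):
    """Approximate motion distortion as gain times the MV mean-square error."""
    return gain * mv_mse(original, adjusted)


if __name__ == "__main__":
    orig = [(0, 0), (2, 2)]
    adj = [(0, 0), (0, 2)]
    print(mv_mse(orig, adj), motion_distortion(orig, adj, gain=3.0))
```

The appeal of this approximation is that it needs only the motion vectors, so no picture has to be reconstructed to estimate the effect of a motion adjustment.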
Proposed Rate-Distortion Model (4/4)
Motion adjustment
[Figure: motion adjustment, with panels labeled "Original" and "Adjustment"]
Simulation results
All related research
Rate control optimization
Bit allocation
Trade-off between coding efficiency and delay
Multi-view
Temporal scalable coding in SVC
Elimination of PSNR fluctuation?
More efficient hierarchical structures?