barcelona keynote web
Video Coding For Compression
. . . and Beyond
Bernd Girod
Information Systems Laboratory
Department of Electrical Engineering
Stanford University
Bernd Girod: Video Coding for Compression and Beyond 2
Bit Consumption of US Households
Total for 70M households ~230 Exabyte/year
Television 94%
Radio 1.7%
Recorded Music 0.4%
Newspaper 0.0003%
Books 0.0002%
Magazines 0.0002%
Home video 3.3%
Video games 0.6%
Internet 0.0003%
[Source: UC Berkeley: How much Information]
Bit equivalent, assuming state-of-the-art compression, year 2000
Desirable Compression Ratios
ITU-R 601: 166 Mbps (other source formats shown: CIF, QCIF)
SDTV broadcasting: ~2 Mbps, ~100 : 1
DSL: ~200 kbps, ~1,000 : 1
Dial-up modem, wireless link: ~20 kbps, ~10,000 : 1
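As a rough check of these ratios, the sketch below divides the 166 Mbps ITU-R 601 rate by each channel rate. The function and dictionary names are illustrative assumptions; the rates are the approximate figures from the slide.

```python
# Source and channel rates in bits per second (approximate slide figures).
SOURCE_RATE_BPS = 166e6  # ITU-R 601

CHANNEL_RATES_BPS = {
    "SDTV broadcasting": 2e6,
    "DSL": 200e3,
    "Dial-up modem, wireless link": 20e3,
}

def compression_ratio(source_bps, channel_bps):
    """Return how many source bits must fit into one channel bit."""
    return source_bps / channel_bps

ratios = {name: compression_ratio(SOURCE_RATE_BPS, rate)
          for name, rate in CHANNEL_RATES_BPS.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:,.0f} : 1")
```

The exact quotients (83, 830, and 8,300) are consistent with the order-of-magnitude targets of 100, 1,000, and 10,000 on the slide.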
Outline
Video compression – state-of-the-art
Beyond compression
– Rate-scalable video
– Wavelet video coding
– Error-resilient video transmission
– Unequal error protection
– Optimal scheduling for packet networks
– Distributed video coding
“It has been customary in the past to transmit successive complete images of the transmitted picture.” [...] “In accordance with this invention, this difficulty is avoided by transmitting only the difference between successive images of the object.”
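The idea in the quotation, sending only the difference between successive images, can be sketched in a few lines. This is a lossless toy with frames as flat lists of pixel values; it has no quantization, transform, or motion compensation, and the function names are illustrative assumptions.

```python
def encode(frames):
    """Send the first frame whole, then only frame-to-frame differences."""
    previous = [0] * len(frames[0])
    residuals = []
    for frame in frames:
        # Difference between the current frame and the previous one.
        residuals.append([c - p for c, p in zip(frame, previous)])
        previous = frame
    return residuals

def decode(residuals):
    """Rebuild each frame by adding its difference to the previous frame."""
    previous = [0] * len(residuals[0])
    frames = []
    for residual in residuals:
        previous = [p + d for p, d in zip(previous, residual)]
        frames.append(previous)
    return frames

frames = [[10, 10, 10, 10], [10, 11, 10, 10], [10, 11, 12, 10]]
residuals = encode(frames)
```

When little changes between frames, most residual values are zero, which is what makes the difference signal cheaper to code than complete images.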
Motion-Compensated Hybrid Coding
[Block diagram: Video in → Transform/Quantizer → Entropy Coding, with an embedded Decoder loop (Deq./Inv. Transform, Motion-Compensated Predictor, Intra/Inter switch), a Motion Estimator, and Coder Control. Coded outputs: control data, quantized transform coefficients, motion data.]
Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC
¼-pixel accuracy
Adaptive block sizes. . .
Multiple Past Reference Frames
Generalized B-Frames
Rate-Distortion Optimized Coder Control
Minimize Lagrangian cost function
$J = D + \lambda R = \sum_i D_i + \lambda \sum_i R_i = \sum_i J_i$, with $J_i = D_i + \lambda R_i$
where $D$ is the total distortion, $R$ the total bit-rate, and $D_i$, $R_i$, $J_i$ the distortion, rate, and Lagrangian cost for block $i$
Strategy: minimize $J_i$ for each block $i$ separately, using a common Lagrange multiplier $\lambda$
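A minimal sketch of this strategy, assuming each block has a small set of candidate coding modes with known distortion and rate. The mode names and all numbers are hypothetical and chosen only for illustration; `lam` stands for the common Lagrange multiplier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mode:
    name: str
    distortion: float  # D_i for this mode, e.g. sum of squared errors
    rate: float        # R_i for this mode, in bits

def best_mode(candidates, lam):
    """Pick the candidate minimizing J_i = D_i + lam * R_i for one block."""
    return min(candidates, key=lambda m: m.distortion + lam * m.rate)

def encode_blocks(blocks, lam):
    """Minimize each block separately with a common Lagrange multiplier."""
    choices = [best_mode(candidates, lam) for candidates in blocks]
    total_d = sum(m.distortion for m in choices)
    total_r = sum(m.rate for m in choices)
    return choices, total_d, total_r, total_d + lam * total_r

# Hypothetical candidates for two blocks.
blocks = [
    [Mode("skip", 900.0, 1.0), Mode("inter", 200.0, 40.0), Mode("intra", 150.0, 120.0)],
    [Mode("skip", 50.0, 1.0), Mode("inter", 30.0, 35.0), Mode("intra", 25.0, 110.0)],
]
choices, D, R, J = encode_blocks(blocks, lam=5.0)
print([m.name for m in choices], D, R, J)
```

Because the total cost is a sum of per-block costs, the independent per-block minima also minimize the total for the given multiplier, as long as the blocks' distortions and rates do not depend on each other. In a real hybrid coder that independence only holds approximately.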
Multiple Reference Frames in H.264/AVC
Mobile & Calendar (CIF, 30 fps)
[Plot: PSNR Y [dB], 26 to 38, versus R [Mbit/s], 0 to 4. Curves: PBB... with generalized B pictures; PBB... with classic B pictures; PPP... with 5 previous references; PPP... with 1 previous reference. Annotation: ~15%]
[Same plot of Multiple Reference Frames in H.264/AVC, Mobile & Calendar (CIF, 30 fps). Annotation: >25%]
[Same plot of Multiple Reference Frames in H.264/AVC, Mobile & Calendar (CIF, 30 fps). Annotation: ~40%]
Surprising Success of ITU-T Rec. H.263
What H.263 was developed for . . .
Analog videophone
. . . and what it was used for:
Internet video streaming
Internet Video Streaming
[Diagram: a media server streams over the Internet to streaming clients connected via DSL, dial-up modem, and wireless links]
How to accommodate heterogeneous bit-rates?
How to react to network congestion?
How to mitigate late or lost packets?
Fine Granular Scalability (FGS)
H.264 with/without FGS option
Foreman sequence (5 fps)
Base layer: 20 kbps
Enhancement layer: variable bit-rate
Efficiency gap: ~2 dB
Wavelet Video Coder
[Block diagram: original video frames (0 to 7) → temporal wavelet transform (temporal subbands LLL, LLH, LH, H) → spatial wavelet transform → embedded quantization & entropy coding]
[Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
Lifting
[Diagram. Analysis: even frames and odd frames pass through predict (P) and update (U) steps with motion compensation, followed by scaling factors G0 and G1, giving the low band and the high band. Synthesis: the inverse scaling and the P and U steps in reverse order recover the even and odd frames.]
[Secker & Taubman, 2001] [Popescu & Bottreau, 2001]
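The predict and update steps can be illustrated with a temporal Haar lifting stage on one pair of frames. This sketch assumes no motion compensation and omits the scaling factors; frames are flat lists of pixel values and the function names are illustrative.

```python
def analyze(even, odd):
    """One temporal Haar lifting stage on a pair of frames."""
    # Predict step: the odd frame is predicted from the even frame.
    high = [o - e for o, e in zip(odd, even)]
    # Update step: the even frame is updated with half the residual.
    low = [e + h / 2 for e, h in zip(even, high)]
    return low, high

def synthesize(low, high):
    """Invert the lifting steps in reverse order with opposite signs."""
    even = [l - h / 2 for l, h in zip(low, high)]
    odd = [h + e for h, e in zip(high, even)]
    return even, odd

even = [10, 20, 30, 40]
odd = [12, 20, 28, 44]
low, high = analyze(even, odd)
rec_even, rec_odd = synthesize(low, high)
```

Synthesis subtracts exactly what analysis added, so reconstruction is perfect whatever the predict and update operators are. That structural property is what lets motion compensation be placed inside the lifting steps without losing invertibility.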
MC Wavelet Coding vs. H.264/AVC
Sequence: Mobile CIF
[Plot: luminance PSNR (dB), 20 to 38, versus bit-rate (Mbps), 0.2 to 2.0. Curves: scalable MC 5/3 wavelet; non-scalable H.264/AVC]
H.264/AVC
• high complexity RD control
• CABAC
• PBBPBBP . . .
• 5 prev/3 future reference frames
• data courtesy of M. Flierl
[Taubman & Secker, VCIP 2003], courtesy D. Taubman
Wavelet Synthesis with Lossy Motion Vector
[Block diagram: Video in → Motion Estimator and MC Wavelet Transform. The wavelet coefficients and the motion vectors d each pass through Embedded Encoding and a Decoder; both feed the Inverse Wavelet Transform → Video out. Objective: minimize $J = D + \lambda R$]
[Taubman & Secker, ICIP03]
R-D Performance with Lossy Motion Vector
CIF Foreman
[Plot: video PSNR (dB), 24 to 40, versus bit-rate (kbps), 0 to 1200. Curves: embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion; non-embedded, single-rate]
[Taubman & Secker, VCIP 2003], courtesy D. Taubman
Priority Encoding Transmission (PET)
[Diagram: a block of packets sent over a packet network. Each Reed-Solomon codeword of length N spans the block of packets, with K information symbols and N-K redundancy symbols. The information symbols come from the base layer and the enhancement layer.]
[Albanese, Blömer, Edmonds, Luby, Sudan, 1996] [Davis & Danskin, 1996]
[Horn, Stuhlmuller, Link, Girod, 1999] [Puri, Ramchandran, 1999]
[Mohr, Riskin, Ladner, 2000] [Stankovic, Hamzaoui, Xiong, 2002]
[Chou, Wang, Padmanabhan, 2003] . . . and many more . . .
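A small sketch of why the choice of K matters. An (N, K) Reed-Solomon code can recover its K information symbols from any K of the N packets, so a layer coded with a smaller K survives more losses. The layer-to-K assignment below and the independent-loss model are assumptions for illustration and are not taken from the slide.

```python
from math import comb

def decodable(received_packets, k):
    """An (N, K) erasure code such as Reed-Solomon recovers its K
    information symbols from any K of the N packets."""
    return received_packets >= k

def prob_decodable(n, k, loss_prob):
    """Probability that at least k of n packets arrive, assuming
    independent packet losses (a simplifying assumption)."""
    return sum(comb(n, r) * (1 - loss_prob) ** r * loss_prob ** (n - r)
               for r in range(k, n + 1))

# Hypothetical layering: stronger protection (smaller K) for the base layer.
N = 10
layers = {"base": 6, "enhancement": 9}

for name, k in layers.items():
    print(name, round(prob_decodable(N, k, loss_prob=0.1), 4))
```

With 7 of 10 packets received, the base layer in this example still decodes while the enhancement layer does not, which is the graceful degradation that unequal error protection aims for.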
Packet Delay Jitter and Loss
[Plots: packet delay, with lost packets marked; loss probability as a function of lead-time]
Smart Prefetching
Idea: Send more important packets earlier to allow for more retransmissions
[Message sequence between server and client over the Internet: request stream, rate-distortion preamble, packet schedule, video packets, followed by repeated updated packet schedules]
[Podolsky, McCanne, Vetterli 2000] [Miao, Ortega 2000] [Chou, Miao 2001]
Rate-Distortion Preamble
Each media packet n is labeled by
− B_n: size [in bits] of data unit n
− d_n: distortion reduction if n is decoded
− t_n: decoding deadline for n
[Diagram: sequence of I, P, and B frame data units]
For video: d_n must be made “state-dependent” to accurately capture concealment
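The three labels map naturally onto a small record type. The sketch below shows one hypothetical way a sender could use them: discard units whose deadline has passed, then rank the rest by distortion reduction per bit. This greedy ranking is an illustration only, not the scheduling algorithm of the cited papers, and it ignores both frame dependencies and the state-dependence of d_n noted above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataUnit:
    n: int            # index of the data unit
    size_bits: int    # B_n
    gain: float       # d_n, distortion reduction if unit n is decoded
    deadline: float   # t_n, decoding deadline in seconds

def send_order(units, now):
    """Hypothetical heuristic: drop units past their deadline, then
    send by distortion reduction per bit, most valuable first."""
    live = [u for u in units if u.deadline > now]
    return sorted(live, key=lambda u: u.gain / u.size_bits, reverse=True)

# Hypothetical preamble entries.
units = [
    DataUnit(0, 8000, 80.0, 1.0),   # e.g. an I frame
    DataUnit(1, 2000, 12.0, 1.1),   # e.g. a P frame
    DataUnit(2, 1000, 2.0, 1.2),    # e.g. a B frame
    DataUnit(3, 2000, 10.0, 0.2),   # deadline already missed at now = 0.5
]
order = [u.n for u in send_order(units, now=0.5)]
```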
Markov Decision Tree for One Packet
N transmission opportunities before deadline
[Decision tree over the times tcurrent, tcurrent+t, tcurrent+2t, ...: at each transmission opportunity the action is send (1) or do not send (0), and the observation is ack received (1) or not (0)]
“Policy” minimizing $J = D + \lambda R$
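A simplified sketch of choosing a policy by Lagrangian cost, with `lam` as the Lagrange multiplier weighting rate against distortion. It considers only policies of the form "send at every opportunity until an ack is seen, at most m times" and assumes independent losses with acks that arrive before the next opportunity. Both are simplifying assumptions, and the full decision tree admits more policies than this.

```python
def policy_cost(attempts, p_fwd, p_back, gain, size_bits, lam):
    """Expected Lagrangian cost of 'send until acked, at most `attempts` times'."""
    # An attempt is acknowledged only if packet and ack both get through.
    ack_prob = (1 - p_fwd) * (1 - p_back)
    # The packet never arrives only if every forward transmission is lost.
    error_prob = p_fwd ** attempts
    # Attempt j+1 happens only if none of the first j attempts was acked.
    expected_sends = sum((1 - ack_prob) ** j for j in range(attempts))
    distortion = gain * error_prob       # expected loss of distortion reduction
    rate = size_bits * expected_sends    # expected transmitted bits
    return distortion + lam * rate

def best_attempts(n_opportunities, p_fwd, p_back, gain, size_bits, lam):
    """Number of attempts (0..N) with the smallest expected cost."""
    return min(range(n_opportunities + 1),
               key=lambda m: policy_cost(m, p_fwd, p_back, gain, size_bits, lam))

# Hypothetical packet: 1000 bits, distortion reduction 10, 20 % loss each way.
m_star = best_attempts(4, p_fwd=0.2, p_back=0.2, gain=10.0, size_bits=1000, lam=0.002)
```

Raising `lam` makes retransmissions more expensive and shrinks the chosen number of attempts. At `lam = 0` every opportunity is used.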
R-D Optimized Streaming Performance
[Plot: PSNR [dB], 24 to 31, versus bit-rate [kbps], 40 to 140. Curves: R-D Optimized; Prioritized ARQ. Annotation: ~50 %]
Foreman, 120 frames, 10 fps, I-P-P-…
H.263+ 2 Layer SNR scalable
20 frame GOP
Copy Concealment
20 % loss forward and back
Γ-distributed delay
– κ = 10 ms
– μ = 50 ms
– σ = 23 ms
Pre-roll 400 ms
Naive Coding Questions
1. To achieve graceful degradation in case of channel error for a digitally encoded signal, is an embedded signal representation (aka layers, aka data partitioning) always needed?
2. Can one, in general, send refinement information for an analog (i.e. uncoded) signal transmission over a noisy channel?
Digitally Enhanced Analog Transmission
Forward error protection of the signal waveform
Information-theoretic bounds [Shamai, Verdu, Zamir, 1998]
“Systematic lossy source-channel coding”
[Block diagram: the signal is sent over the analog channel (uncoded) and, in parallel, through a Wyner-Ziv encoder and a digital channel. The Wyner-Ziv decoder uses the analog channel output as side information.]
Forward Error Protection of Compressed Video
[Block diagram: S → any old video encoder → error-prone channel → video decoder with error concealment → S’. This path is labeled “Analog channel (uncoded)”. In parallel, Wyner-Ziv Encoder A feeds Wyner-Ziv Decoder A, giving S*, and Wyner-Ziv Encoder B feeds Wyner-Ziv Decoder B, giving S**.]
Graceful degradation without a layered signal representation
[Aaron, Rane, Girod, ICIP 2003]
Wyner-Ziv MPEG Codec
[Block diagram. Main path: S → MPEG Encoder (main) → Channel → decoder (ED, Q-1, T-1, MC) → S’, used as side information. Wyner-Ziv Encoder: MPEG Encoder (coarse) applied to the reconstructed frame at the encoder, followed by an R-S Encoder as Slepian-Wolf Encoder. Wyner-Ziv decoding: MPEG Encoder (coarse), R-S Decoder, then ED, q-1, T-1, MC → S*.]
[Rane, Aaron, Girod, VCIP 2004]
Graceful Degradation with Forward Error Protection
Main Stream @ 1.092 Mbps, FEC (n,k) = (40,36), FEC bitrate = 120 kbps, Total = 1.2 Mbps
WZ Stream @ 270 kbps, FEP (n,k) = (52,36), WZ bitrate = 120 kbps, Total = 1.2 Mbps
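The 120 kbps figures follow from the code parameters if only the n - k parity symbols count as overhead. The check below assumes this reading, including for FEP, where the systematic part of the Wyner-Ziv stream would then not be transmitted. The function name is illustrative.

```python
def parity_rate_kbps(payload_kbps, n, k):
    """Bit-rate of the n - k parity symbols added per k payload symbols."""
    return payload_kbps * (n - k) / k

fec_parity = parity_rate_kbps(1092, 40, 36)  # conventional FEC on the main stream
fep_parity = parity_rate_kbps(270, 52, 36)   # parity of the Wyner-Ziv stream

print(f"FEC parity: {fec_parity:.1f} kbps, FEP parity: {fep_parity:.1f} kbps")
```

Both schemes spend about the same overhead, roughly 120 kbps on top of the main stream, so the comparison is made at an equal total rate of about 1.2 Mbps.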
Visual Comparison of Degradation at Same PSNR
Foreman 50 CIF frames @ symbol error rate = 4 × 10^-4
With FEC: 1 Mbps + 120 kbps (38.32 dB)
With FEP: 1 Mbps + 120 kbps (38.78 dB)
Superior Robustness of FEP
Foreman 50 CIF frames @ symbol error rate = 10^-3
With FEC: 1 Mbps + 120 kbps (33.03 dB)
With FEP: 1 Mbps + 120 kbps (38.40 dB)
Lossy Compression with Side Information
[Two block diagrams: source X → encoder → decoder → X’. In one system the side information Y is available only at the decoder. In the other it is available at both encoder and decoder.]
[Wyner, Ziv, 1976] For MSE distortion and Gaussian statistics, the rate-distortion functions of the two systems are the same.
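For reference, the Gaussian case of this result can be written out explicitly. The symbols below are not on the slide and are introduced here for illustration: $R^{\mathrm{WZ}}_{X|Y}(D)$ is the rate-distortion function with Y at the decoder only, $R_{X|Y}(D)$ is the one with Y at both encoder and decoder, and $\sigma^2_{X|Y}$ is the conditional variance of X given Y.

```latex
R^{\mathrm{WZ}}_{X|Y}(D) \;=\; R_{X|Y}(D) \;=\; \max\!\left\{0,\ \tfrac{1}{2}\log_2\frac{\sigma^2_{X|Y}}{D}\right\}
```

For general sources and distortion measures only the inequality $R^{\mathrm{WZ}}_{X|Y}(D) \ge R_{X|Y}(D)$ holds.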
Ultra-Low-Complexity Video Coding
[Block diagram: intraframe encoder, interframe decoder. Key frames K: conventional intraframe coding → conventional intraframe decoding → K’. WZ frames X: scalar quantizer → turbo encoder → buffer → turbo decoder → reconstruction → X’. The turbo encoder, buffer, and turbo decoder form the Slepian-Wolf codec, and the decoder sends “request bits” to the buffer. Side information Y is obtained by interpolation / extrapolation from K’.]
[Aaron, Zhang, Girod, Asilomar 2002] [Aaron, Rane, Zhang, Girod, DCC 2003]
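A toy sketch of the quantizer and reconstruction stages, assuming the Slepian-Wolf codec has already delivered the quantization index without error. The clipping rule is one simple choice and is an assumption here, not necessarily the reconstruction used in the cited papers. The function names and step size are illustrative.

```python
def quantize(x, step):
    """Uniform scalar quantizer: return the index of the bin containing x."""
    return int(x // step)

def reconstruct(q, y, step):
    """Keep side information y if it lies in bin q, else clip y to the bin."""
    lo, hi = q * step, (q + 1) * step
    return min(max(y, lo), hi)

step = 16
x = 100                    # original pixel value of the WZ frame
q = quantize(x, step)      # index 6, bin [96, 112)
for y in (103, 90, 120):   # side information of varying quality
    print(y, "->", reconstruct(q, y, step))
```

Good side information passes through unchanged. Poor side information is pulled to the bin edge, so the reconstruction error stays within one quantizer step.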
R-D Performance of Ultra-Low-Complexity Video Coder
[Plot annotations: 8 dB, 3 dB]
Sequence: Foreman
WZ frames: even frames
Key frames: odd frames
Side information: motion compensated interpolation of key frames
Ultra-Low-Complexity Video Coder
H.263+ Intraframe Coding: 330 kbps, 32.9 dB
Wyner-Ziv Codec: 274 kbps, 39.0 dB
Ultra-Low-Complexity Video Coder
H.263+ I-B-I-B: 276 kbps, 41.8 dB
Wyner-Ziv Codec: 274 kbps, 39.0 dB
Stanford Camera Array
Courtesy Marc Levoy, Stanford Computer Graphics Lab
Light Field Compression
Compared codecs: Wyner-Ziv (pixel-domain) and JPEG-2000
Rate: 0.11 bpp, PSNR 39.9 dB
Rate: 0.11 bpp, PSNR 37.4 dB
Conclusions
Video compression is very important . . . but there is more to video coding than compression
Rate-scalable video representations: MC lifting breakthrough
Robust video transmission
– Virtual priority mechanisms by packet scheduling
– RD gains easily larger than from super-clever compression
Distributed video coding: radically different approach
– Graceful degradation w/o layers
– Ultra-low-complexity coders
Ubiquitous $J = D + \lambda R$
Acknowledgments
Anne M. Aaron
Jacob Chakareski
Philip A. Chou
Markus Flierl
Sang-eun Han
Mark Kalman
Marc Levoy
Yi Liang
Shantanu Rane
David Rebollo-Monedero
Andrew Secker
David Taubman
Thomas Wiegand
Xiaoqing Zhu
Rui Zhang
Progress is a wonderful thing, if only it would stop . . .
Robert Musil