Understanding the Internet Low Bit Rate Coder
Jan Linden Vice President of Engineering
Global IP Sound
Presented by
Jan Skoglund
Sr. Research Scientist
Global IP Sound
iLBC – Background info
• Development started in summer 2000
• Contributed to IETF as an Internet-Draft in Feb 2002
• Accepted as work item in IETF AVT group, Mar 2002
• Contributed to CableLabs RFP in June 2002
• Improved version to IETF, fall 2002
• ECR submitted in May 2003
• Support for 20 ms frames, spring 2003
• Successful interoperability events
• Passed Working Group Last Call in IETF, Jan 2004
• April 2004: added as a mandatory codec in PacketCable 1.1
• December 2004: IETF process finalized (became Experimental RFC 3951 and 3952)
Design Principles
• Free of 3rd party IPR
  o extensive experience in speech coding patents by design team
  o patent and research situation monitored since 2000
  o has been public in IETF since March 2002 and reviewed by independent speech coding researchers
• Packet independence
  o no coding interdependency between frames
  o increased packet loss robustness
  o suitable for IP networks
• Linear Predictive Coding
  o well-known, highly successful coding model
  o novel coding techniques of residual signal
iLBC Features
• Sampling rate: 8 kHz
• Supports 30 ms and 20 ms speech frame modes
• Bitrate
  o 13.3 kbps (399 bits, packetized in 50 bytes) for 30 ms frames
  o 15.2 kbps (303 bits, packetized in 38 bytes) for 20 ms frames
• Computational complexity (TI C54x)
  o 30 ms frames: approx. 18 MIPS/channel
  o 20 ms frames: approx. 15 MIPS/channel
• Memory
  o 400 words/channel state memory (RAM)
  o less than 4 kWords table memory (ROM)
  o stack and program memory requirements similar to other low bit rate codecs (e.g. G.729A)
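The bitrate figures above follow from simple arithmetic on bits per frame and frame duration. A minimal sketch, assuming only the numbers quoted on this slide; the function names are illustrative and not part of any iLBC API. It also shows that the quoted 15.2 kbps corresponds to the 38-byte packet (304 bits on the wire) rather than to the 303 coded bits.

```python
from fractions import Fraction

def coded_bitrate_kbps(bits_per_frame, frame_ms):
    """Bits per frame divided by frame duration in ms gives kbit/s."""
    return Fraction(bits_per_frame, frame_ms)

def payload_bytes(bits_per_frame):
    """Whole bytes needed to carry one frame (ceiling division by 8)."""
    return (bits_per_frame + 7) // 8

def packet_bitrate_kbps(bits_per_frame, frame_ms):
    """Rate on the wire once the frame is padded to whole bytes."""
    return Fraction(payload_bytes(bits_per_frame) * 8, frame_ms)

# (frame duration in ms, coded bits per frame) as quoted on the slide
for frame_ms, bits in ((30, 399), (20, 303)):
    print(
        f"{frame_ms} ms: {bits} bits, {payload_bytes(bits)} bytes, "
        f"coded {float(coded_bitrate_kbps(bits, frame_ms)):.3f} kbit/s, "
        f"packetized {float(packet_bitrate_kbps(bits, frame_ms)):.3f} kbit/s"
    )
```

Exact fractions are used so that the comparison between coded and packetized rates is not blurred by floating-point rounding.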
The Core iLBC method
• Start state encoding
• Gain-shape waveform matching forward in time
• Gain-shape waveform matching backward in time
• Pitch enhancement
• Packet loss concealment
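The gain-shape waveform matching steps listed above can be illustrated with a generic gain-shape search. This is a minimal sketch, assuming a small fixed list of candidate shapes and an unquantized gain. The name `gain_shape_match` and the toy codebook are hypothetical, and this is not the codebook search specified for iLBC.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def gain_shape_match(target, codebook):
    """Return (index, gain) minimizing ||target - gain * shape||^2.

    For a fixed shape the optimal gain is <target, shape> / <shape, shape>,
    and the error reduction it achieves is <target, shape>^2 / <shape, shape>.
    """
    best_index, best_gain, best_score = None, 0.0, -1.0
    for index, shape in enumerate(codebook):
        energy = dot(shape, shape)
        if energy == 0.0:
            continue  # an all-zero shape cannot match anything
        corr = dot(target, shape)
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain

# Toy data (hypothetical): the target is exactly 2x the second shape.
codebook = [[1.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
target = [2.0, 4.0, 6.0]
index, gain = gain_shape_match(target, codebook)
print(index, gain)  # prints: 1 2.0
```

Matching "forward in time" and "backward in time" would apply the same kind of search to segments on either side of the start state. How those segments and their codebooks are formed is not described on this slide.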
iLBC Encoding
[Block diagram: from "Incoming speech" to "Packets to network"]
iLBC Decoding
[Block diagram: from "Packets from network" to "Decoded speech"]
20 ms vs 30 ms sub-blocks
• 20 ms frame size mode - 4 sub-blocks with the total length of 160 samples
• 30 ms frame size mode - 6 sub-blocks with the total length of 240 samples
0        39        79       119       159
+---------------------------------------+
|    1    |    2    |    3    |    4    |
+---------------------------------------+
               20 ms frame

0        39        79       119       159       199       239
+-----------------------------------------------------------+
|    1    |    2    |    3    |    4    |    5    |    6    |
+-----------------------------------------------------------+
                         30 ms frame
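The sub-block layout above can be reproduced from the 8 kHz sampling rate and the 40-sample sub-block length implied by the diagram. A minimal sketch; the constant and function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate stated in the features list
SUBBLOCK_LEN = 40       # samples per sub-block, read off the diagram

def subblock_ranges(frame_ms):
    """Return inclusive (first, last) sample indices for each sub-block."""
    frame_len = SAMPLE_RATE_HZ * frame_ms // 1000
    if frame_len % SUBBLOCK_LEN != 0:
        raise ValueError("frame length is not a whole number of sub-blocks")
    return [
        (start, start + SUBBLOCK_LEN - 1)
        for start in range(0, frame_len, SUBBLOCK_LEN)
    ]

for frame_ms in (20, 30):
    ranges = subblock_ranges(frame_ms)
    print(f"{frame_ms} ms frame: {len(ranges)} sub-blocks, {ranges}")
```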
20 ms vs 30 ms mode – bit allocation
240 samples encoded to 399 bits = 13.3 kbit/s (50 octets)

Parameter               Bits
LPC                       40
Start state position       4
Start state scale          6
Start state samples      174
Shapes                   115
Gains                     60
Total                    399
160 samples encoded to 303 bits = 15.2 kbit/s (38 octets)

Parameter               Bits
LPC                       20
Start state position       3
Start state scale          6
Start state samples      171
Shapes                    67
Gains                     36
Total                    303
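As a quick consistency check, the per-parameter bit counts can be summed and compared with the stated totals and octet counts. A minimal sketch, assuming the parameter-to-bits pairing is read off the two tables in order; the dictionary layout and function names are illustrative.

```python
# Bits per parameter, keyed by frame length in ms (values from the tables).
ALLOCATION = {
    30: {
        "LPC": 40,
        "Start state position": 4,
        "Start state scale": 6,
        "Start state samples": 174,
        "Shapes": 115,
        "Gains": 60,
    },
    20: {
        "LPC": 20,
        "Start state position": 3,
        "Start state scale": 6,
        "Start state samples": 171,
        "Shapes": 67,
        "Gains": 36,
    },
}

def total_bits(frame_ms):
    """Sum of all parameter bits for one frame."""
    return sum(ALLOCATION[frame_ms].values())

def octets(frame_ms):
    """Whole octets needed to carry one frame."""
    return (total_bits(frame_ms) + 7) // 8

def share(frame_ms, parameter):
    """Fraction of the frame's bits spent on one parameter."""
    return ALLOCATION[frame_ms][parameter] / total_bits(frame_ms)

for frame_ms in (30, 20):
    print(
        f"{frame_ms} ms: {total_bits(frame_ms)} bits, {octets(frame_ms)} octets, "
        f"start state samples use {share(frame_ms, 'Start state samples'):.1%}"
    )
```

In both modes the start state samples take the largest single share of the bits.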
Advantage over CELP
[Figure: panels labelled original, iLBC, g729, g723; annotations: "PLC", "State recovery"]
iLBC Performance vs G.729A & G.723.1
(old version from Winter 2002)
Source: Dynastat
iLBC Performance
Equivalent or slightly lower performance than G.729E in clean conditions.
Improved robustness to packet loss compared to G.729E.
iLBC performed better than G.728 in other testing.
Implementation
Floating Point Source to Fixed Point Source
• Significant signal processing skills necessary
• Quality / efficiency trade-off
• ~ 6 months
Fixed Point Source to DSP Source
• Optimization skills
• ~ 4 months
iLBC Specifications
• Available in floating point, fixed point ANSI C, TIc54x, TIc55x, TIc64x, …
• Supports 20 and 30 ms speech frames
• Algorithmic delay: same as frame size
• Sampling rate: 8 kHz
• Bit rate: 13.333 kbps for 30 ms and 15.2 kbps for 20 ms
Complexity (max)

Product            Frame size   Encoder     Decoder
GIPS iLBC TIc54x   20 ms        11.5 MIPS   4.1 MIPS
GIPS iLBC TIc54x   30 ms        13.5 MIPS   4.4 MIPS
GIPS iLBC TIc55x   20 ms         7.5 MIPS   3.0 MIPS
GIPS iLBC TIc55x   30 ms         8.8 MIPS   3.1 MIPS

Memory in kWord16

Product            Program Memory   Data Memory Static (Fix)   Data Memory Static (Per channel)   Data Memory Dynamic
GIPS iLBC TIc54x   17.6             2.4                        1.4                                2.3
GIPS iLBC TIc55x   15.6             2.4                        1.4                                2.3
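The peak MIPS figures above can be turned into a rough channel-density estimate. A minimal sketch, assuming each channel runs one encoder plus one decoder at peak load and that a hypothetical 100 MIPS budget is available for the codec alone. The budget, the dictionary, and the function name are illustrations, not figures from the presentation. The encoder plus decoder sums for the TIc54x (15.6 and 17.9 MIPS) are consistent with the approx. 15 and 18 MIPS/channel quoted earlier.

```python
from decimal import Decimal

# (encoder, decoder) peak MIPS per product and frame size, from the table.
PEAK_MIPS = {
    ("TIc54x", 20): (Decimal("11.5"), Decimal("4.1")),
    ("TIc54x", 30): (Decimal("13.5"), Decimal("4.4")),
    ("TIc55x", 20): (Decimal("7.5"), Decimal("3.0")),
    ("TIc55x", 30): (Decimal("8.8"), Decimal("3.1")),
}

def channel_mips(product, frame_ms):
    """Peak MIPS for one full-duplex channel (encoder + decoder)."""
    encoder, decoder = PEAK_MIPS[(product, frame_ms)]
    return encoder + decoder

def max_channels(budget_mips, product, frame_ms):
    """Whole channels that fit in the budget at peak load."""
    return int(Decimal(budget_mips) // channel_mips(product, frame_ms))

BUDGET_MIPS = 100  # hypothetical budget, not taken from the slides
for (product, frame_ms) in PEAK_MIPS:
    print(
        f"{product} {frame_ms} ms: {channel_mips(product, frame_ms)} MIPS/channel, "
        f"{max_channels(BUDGET_MIPS, product, frame_ms)} channels in {BUDGET_MIPS} MIPS"
    )
```

Decimal arithmetic keeps the one-decimal MIPS figures exact, so the floor division is not affected by binary floating-point rounding.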