MPEG AUDIO (Pages: 214-256)


Page 1: MPEG AUDIO - · PDF fileMPEG AUDIO REFERENCES 1. CD 11172-3. ISO/IEC JTCISC 29N 071, December 6, 1991, "Coding of moving pictures and associated audio for digital storage media at


MPEG ALGORITHMS (Audio)

4 algorithms clustered from 14 proposals

1. Transform coding with overlapping blocks: AT&T, DTB, FHG, France Telecom (ASPEC)

2. Transform coding with nonoverlapping blocks: Fujitsu, Sony, JVC, NEC (ATAC)

3. Subband coding with < 8 subbands: BTRL, NTT (GB, J) (SB/DPCM)

4. Subband coding with > 8 subbands: CCETT, IRT, Philips, Matsushita (MUSICAM)

Based on evaluation of these four algorithms at Swedish Radio in Stockholm, a combination of algorithms (1) and (4) was proposed for further study. (Tested in July 1990.)

Coding stereo quality wideband audio at bitrates of 128-256 Kbps. Monophonic audio at bitrates of 64, 96, 128, and 192 Kbps.

Sampling frequency: 32 KHz (FM broadcasting), 44.1 KHz (CD), and 48 KHz (DAT), with 16 bit uniform quantization.

Near CD quality at 128 Kbps.

192 and 128 Kbps: Contribution (studio) quality.

96 and 64 Kbps: Distribution quality.

High quality digital stereo sound indistinguishable from CD quality = 256 Kbps.


ASPEC: Audio Spectral Perceptual Entropy Coding

ATAC: Adaptive Transform Aliasing Cancellation

SB/DPCM: Subband Coding with 8 subbands.

MUSICAM: Masking Pattern Universal Subband Integrated Coding and Multiplexing

Of these four, two coding algorithms, ASPEC (transform coding) and MUSICAM (subband coding), were extensively tested at Swedish Broadcasting.

SCORING OF SUBJECTIVE AND OBJECTIVE TESTS

Algorithm    Subjective tests    Objective tests    Total
ASPEC        3272                4557               7829

The ASPEC and MUSICAM groups collaborated and prepared a draft standard combining the most efficient components of the two algorithms, leading to the International Standard.


MPEG AUDIO

Digital representation of a sound signal in studio format: fs = 48 KHz, 16 bit PCM (48 x 16 = 768 KBPS).

768 KBPS for monophonic sound signal

1.536 MBPS for stereophonic sound signal

Stereo sound signal of a CD = 2 x 706 KBPS = 1.412 MBPS (fs = 44.1 KHz, 16 bit PCM)

Storage capacities:

CD-ROM: 640 Mbytes (74 minutes of play on a single CD)

HDCD: 3.7 Gbytes

SD DVD: 9.6 Gbytes

Also digital audio broadcasting system (DAB)

1.15 MBPS for video signal + 0.256 MBPS for stereo sound signal (2 x 128 KBPS) = 1.406 MBPS, which is close to the 1.412 MBPS rate of a CD.

Compression of audio to 2 x 64 KBPS leads to distribution of digital stereo sound signals via an ISDN Basic Access 2 x 64 KBPS (2B + D).
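The bit-rate figures above follow from sampling rate x bits per sample x channels. The sketch below repeats that arithmetic; the function names are illustrative and not taken from the slides.

```python
def pcm_bit_rate_kbps(sample_rate_khz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in kbit/s."""
    return sample_rate_khz * bits_per_sample * channels


def compression_ratio(pcm_kbps, coded_kbps):
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_kbps / coded_kbps


studio_mono = pcm_bit_rate_kbps(48, 16, 1)     # studio format, one channel
studio_stereo = pcm_bit_rate_kbps(48, 16, 2)   # studio format, two channels
cd_stereo = pcm_bit_rate_kbps(44.1, 16, 2)     # CD format, two channels

# Ratio needed to fit CD stereo into the two 64 kbit/s ISDN B channels.
isdn_ratio = compression_ratio(cd_stereo, 2 * 64)
```

Under these figures, carrying CD stereo over ISDN Basic Access calls for roughly an 11:1 reduction, while the 2 x 128 KBPS case needs about 5.5:1.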

. b

Applications

1. Radio sound-programme emission (terrestrial and satellite, stationary and mobile reception),

2. Television sound emission, EDTV and HDTV,

3. Contribution links (music, commentary and reporting circuits),

4. Distribution links,

5. Production (tapeless studio, editing system, post-processing) and

6. Storage (studio and consumer).

7. Audio synchronized with video in cable TV head-end and set tops, direct broadcast satellite set tops, home TV receive-only set tops, wireless cable TV set tops, video laser-discs, CD-I, CDTV; stand-alone audio use in digital audio broadcasting, digital compact cassettes, digital VCRs, studio quality digital audio transmission, digital audio workstations and multimedia cards.

8. Other uses such as video on demand, karaoke and games.


MPEG-1 AUDIO REQUIREMENTS

Input sampling rate: The algorithm must function at 32, 44.1 and 48 KHz.

Output sampling rate: This should match the input sampling rate.

Input resolution: 16 bit uniform.

Bit Rate: The coder/decoder pair must work at 64, 96, 128 and 192 KBPS. A stereo coder should use twice the bit rate of a monophonic coder.

Access unit length: < 100 milliseconds

Addressable unit length: < 1/30 second

Total System Delay: Total encoding, transmission and decoding delay must be < 80 ms at a bit rate of 2 x 128 KBPS and a sampling rate of 48 KHz.

Access unit: Smallest part of the encoded bit stream that can be decoded by itself (fully reconstructed sound).

Addressable unit: Maximum distance between closest entry points in the encoded bit stream, measured in milliseconds.


HIGHLIGHTS OF MPEG AUDIO CODING

1. Properties of human sound perception.

2. Spectral and temporal masking effects of the ear.

3. The quantizer is designed such that the resulting quantization noise is just below the masking threshold.

4. Details of the signal below the masking threshold are not coded.

5. Division of the broadband audio signal spectrum into 32 subbands and digital frames.

6. In each frame, the maximum magnitude attained by each subband signal, called the 'scale factor', is quantized and transmitted.

7. For each of the 32 subbands, using an FFT, the minimum of the masking threshold, i.e., the maximum level of the unnoticeable quantizing noise, is calculated.

8. Dynamic bit allocation. This is based on masking thresholds and assigns bits as needed to quantize the subbands.

9. Side information: Scale factors and dynamic bit allocation.

10. Those subbands which are completely masked by more important components of the adjacent subbands are not coded.

11. Error detection and correction of important bits such as side information and MSBs of subbands at low frequencies.
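Item 6 can be illustrated with a minimal sketch: take the largest magnitude in each subband block as the scale factor and normalize the block by it. This is a simplification under stated assumptions. The quantization of the scale factor itself against the standard's table is not reproduced here, and the function names are hypothetical.

```python
def scale_factors(subband_blocks):
    """Largest magnitude in each subband block (one block per subband)."""
    return [max(abs(s) for s in block) for block in subband_blocks]


def normalize(subband_blocks, factors):
    """Divide every sample by its subband's scale factor (silent bands stay 0)."""
    return [
        [s / f if f > 0 else 0.0 for s in block]
        for block, f in zip(subband_blocks, factors)
    ]


# Two toy subbands with three samples each (a real frame has 32 subbands).
blocks = [[0.1, -0.5, 0.2], [0.0, 0.0, 0.0]]
factors = scale_factors(blocks)
normalized = normalize(blocks, factors)
```

After normalization every sample lies in [-1, 1], so the quantizer can spend its bits on the shape of the block while the scale factor travels as side information.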


MPEG-1 AUDIO

Input: 16 bit PCM, sampling rates 32, 44.1 and 48 KHz.

Output: Compressed bit stream (in discrete steps)

32-448 KBPS  Layer I

32-384 KBPS  Layer II

32-320 KBPS  Layer III

Two filter banks: Equally spaced subbands: equal number of samples in the original signal and in the subbands. Layers I and II have 32 subbands. In each subband, 12 or 36 samples are grouped for processing. The Layer III filter bank has signal dependent resolution: 6 x 32 or 18 x 32 frequency bands.

[Figure: encoder block diagram; the psychoacoustic model supplies the Signal-to-Mask Ratio (SMR).]


Psychoacoustic Model

Calculates the just noticeable noise level for each subband based on the minimum masking threshold. This noise level is used in the bit or noise allocation to determine the actual quantizers and quantizer levels. There are two psychoacoustic models. In practice, Model 1 is used for Layers I and II, and Model 2 for Layer III.

Output is the Signal-to-Mask Ratio (SMR) for each band (Layers I and II) or group of bands (Layer III).

Bit or Noise Allocation

Bit allocation (Layers I and II) and noise allocation (Layer III) are controlled by the filter bank output samples and the SMRs. This allocation satisfies both bit rate and masking requirements. In Layers I and II, the bit allocation is the number of bits assigned to each sample (or group of samples) in each subband. Quantization parameters and quantized output samples are sent to the bit stream formatter.
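A greedy loop is one way to picture how an allocation can satisfy both the bit rate and the masking requirement. The sketch below is not the standard's procedure: it assumes roughly 6.02 dB of signal-to-noise ratio per quantizer bit instead of the standard's quantizer tables, and the names `allocate_bits`, `bit_pool` and `DB_PER_BIT` are hypothetical.

```python
DB_PER_BIT = 6.02  # assumption: rule-of-thumb SNR gain per quantizer bit


def allocate_bits(smr_db, bit_pool, samples_per_band=12, max_bits=15):
    """Give one more bit per sample to the worst-masked subband until the pool runs out.

    smr_db: signal-to-mask ratio of each subband in dB.
    bit_pool: bits available for the subband samples of one frame.
    """
    bits = [0] * len(smr_db)
    while bit_pool >= samples_per_band:
        # Mask-to-noise ratio (SNR minus SMR) of every band that can still grow.
        candidates = [
            (bits[b] * DB_PER_BIT - smr_db[b], b)
            for b in range(len(smr_db))
            if bits[b] < max_bits
        ]
        if not candidates:
            break
        _, band = min(candidates)      # band whose noise is most audible
        bits[band] += 1
        bit_pool -= samples_per_band   # one extra bit for each sample in the band
    return bits


allocation = allocate_bits([20.0, 5.0, -10.0], bit_pool=48)
```

In this toy case the band with the highest SMR receives most of the bits, and the band whose signal already lies below the mask receives none.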

Bit Stream Formatter

Encodes and formats the quantized filter bank output, the bit allocation (Layers I and II) or noise allocation (Layer III) and other side information. In Layer II quantized samples may be grouped. In Layer III Huffman codes (VLC) are used. The bit stream formatter varies from layer to layer. In Layers I and II a fixed PCM code is used for each subband sample.


ENCODING

The encoder processes the digital audio signal and produces the compressed audio bitstream for storage. The encoder algorithm is not standardized. The encoder output, however, must be such that a decoder conforming to the specifications produces audio suitable for the intended application. Various means for encoding, such as estimation of the auditory masking threshold, quantization and scaling, can be used.


FIG. Block diagram of the MUSICAM encoder (above) and decoder (below). Recoverable block labels: filtering, determination of scale factors, determination of masking threshold, dynamic bit allocation, data reduction, coding; decoding, inverse data reduction, inverse filtering.


Structure of MPEG-1 audio encoder and decoder, Layer III.

Encoder blocks: PCM input; analysis filterbank; MDCT with dynamic windowing; scaler and quantizer; Huffman coding; coding of side information; mux; psychoacoustic model (FFT) supplying the masking thresholds; rate and distortion control loop; output to the digital channel.

Decoder blocks: demux; decoding of side information; Huffman decoding; dequantizer and descaler; inverse MDCT with dynamic windowing; synthesis filterbank; PCM output.


[Figure: General Stereo Decoder Flow Chart]


Figure 3-A.2 Layer I, II encoder flow chart. Recoverable steps: subband analysis; scalefactor calculation; quantization of sub-band samples.


Figure 3-A.1. Layer I and II decoder flow chart

BEGIN

INPUT ENCODED BIT STREAM

DECODING OF BIT ALLOCATION

DECODING OF SCALEFACTORS

REQUANTIZATION OF SAMPLES

SYNTHESIS SUBBAND FILTER


Bark - Unit of critical band rate

Critical Band Rate - Psychoacoustic measure in the spectral domain which corresponds to the frequency selectivity of the human ear.

Critical band - Part of the spectral domain which corresponds to a width of one Bark.

Frame (Audio) - A part of the audio signal that corresponds to a fixed number of audio PCM samples.

Granules (Layer II) - 96 subband samples, 3 consecutive subband samples for all 32 subbands that are considered together before quantisation.

Granules (Layer III) - 576 frequency lines that carry their own side information.

Intensity Stereo - A method of exploiting stereo irrelevance or redundancy in stereophonic audio programmes based on retaining at high frequencies only the energy envelope of the right and left channels.

Joint Stereo Coding - Any method that exploits stereophonic irrelevance or stereophonic redundancy.

Joint Stereo Mode - A mode of the audio coding algorithm using joint stereo coding.

Masking Threshold (Audio) - A function in frequency and time below which an audio signal cannot be perceived by the human auditory system.

Masking - Property of the human auditory system by which an audio signal cannot be perceived in the presence of another audio signal.

MS Stereo - A method of exploiting stereo irrelevance or redundancy in stereophonic audio programmes based on coding the sum and difference signal instead of the left and right channels.

Non-tonal components - A noise-like component of an audio signal.

Psychoacoustic model - A mathematical model of the masking behaviour of the human auditory system.

Tonal Component - A sinusoid-like component of an audio signal.
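The MS Stereo entry in the glossary above can be illustrated with a small sum and difference transform. This is a sketch only: the 1/sqrt(2) normalization is an assumption chosen so that the same scaling inverts the transform, and the function names are hypothetical.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)  # assumption: energy-preserving scaling


def ms_encode(left, right):
    """Replace left/right by mid (sum) and side (difference) signals."""
    mid = [(l + r) * INV_SQRT2 for l, r in zip(left, right)]
    side = [(l - r) * INV_SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right from the mid and side signals."""
    left = [(m + s) * INV_SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) * INV_SQRT2 for m, s in zip(mid, side)]
    return left, right


left = [0.5, -0.25, 0.125]
right = [0.5, -0.25, 0.0]
mid, side = ms_encode(left, right)
left_out, right_out = ms_decode(mid, side)
```

When the two channels are nearly identical the side signal is close to zero and costs few bits, which is the redundancy the method exploits.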


LAYERS

Depending on the application, different layers of the coding system with increasing encoder complexity and performance can be used. An ISO MPEG Audio Layer N decoder is able to decode bit stream data which has been encoded in Layer N and all layers below N.

Layer I:

This layer contains the basic mapping of the digital audio input into 32 subbands, fixed segmentation to format the data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and quantization using block companding and formatting. Applications: consumer uses such as digital home recording on tapes and discs.

Layer II:

This layer provides additional coding of bit allocation, scalefactors and samples. Different framing is used, along with more precise quantization. Applications: consumer and professional studio uses such as audio broadcasting, TV, recording, telecommunication, multimedia, audio workstations and storage media like Winchester disks or magneto-optical disks.


Layer III:

Layer III consists of the basic modules of the Layer I and II coding schemes in combination with a hybrid filterbank. Additional frequency resolution is provided by the use of the hybrid filterbank. Every subband is thereby further split into higher resolution frequency lines by a linear transform that operates on either 6 or 18 subband samples (switched transform) in each subband. Nonuniform quantization, adaptive segmentation and entropy coding of the quantized values are employed for better coding efficiency. The application of this layer is expected to be telecommunication, in particular with narrowband ISDN, and audio applications with requirements for very low bitrates.

Joint Stereo coding can be added as an additional feature to any of the layers.

Storage

Various streams of encoded video, encoded audio, synchronization data, systems data and auxiliary data may be stored together on a storage medium. Access to storage may involve remote access over a communication system. Access is assumed to be controlled by a functional unit other than the audio decoder itself. This control unit accepts user commands, reads and interprets data base structure information, reads the stored information from the media, demultiplexes non-audio information, and passes the stored audio bitstream to the audio decoder at the required rate.


BEGIN

For i = 511 down to 32 do $X_i = X_{i-32}$

For i = 31 down to 0 do $X_i$ = next_input_audio_sample

Window by 512 coefficients. Produce vector Z: for i = 0 to 511 do $Z_i = C_i X_i$

Partial calculation: for i = 0 to 63 do $Y_i = \sum_{j=0}^{7} Z_{i+64j}$

Calculate 32 samples by matrixing: for i = 0 to 31 do $S_i = \sum_{k=0}^{63} M_{ik} Y_k$

Output 32 subband samples

END

Figure 3-C.1 Analysis subband filter flow chart.

The filter divides the broadband signal with sampling frequency $f_s$ into 32 subbands with sampling frequencies $f_s/32$.


32 equally spaced subbands

Initially: $X_0, X_1, \ldots, X_{511}$

Delete $X_{480}$ thru $X_{511}$

Rearrange

Add 32 new audio samples

For $C_i$ see Table 3.C.1 (some have negative values). $C_i$ = coefficients of the analysis window.

$M_{ik} = \cos\left[\frac{(2i+1)(k-16)\pi}{64}\right], \quad i = 0, \ldots, 31, \quad k = 0, \ldots, 63$
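The analysis filter steps above (shift, insert, window, partial sums, matrixing) can be sketched in plain Python. The matrixing coefficient is written in the (k - 16) form of ISO/IEC 11172-3. The window passed in below is a flat placeholder, not the values of Table 3.C.1, and the class and function names are hypothetical.

```python
import math


def matrixing_coefficients():
    """M[i][k] = cos((2i + 1)(k - 16) * pi / 64) for i = 0..31, k = 0..63."""
    return [
        [math.cos((2 * i + 1) * (k - 16) * math.pi / 64) for k in range(64)]
        for i in range(32)
    ]


class AnalysisFilterBank:
    """One pass of the flow chart: 32 PCM samples in, 32 subband samples out."""

    def __init__(self, window):
        if len(window) != 512:
            raise ValueError("the analysis window needs 512 coefficients")
        self.c = list(window)
        self.x = [0.0] * 512  # FIFO buffer X[0..511]
        self.m = matrixing_coefficients()

    def process(self, samples):
        if len(samples) != 32:
            raise ValueError("expected 32 new audio samples")
        # Shift: X[i] = X[i - 32] for i = 511 down to 32 (drops X[480..511]).
        self.x[32:] = self.x[:480]
        # Insert: the first new sample lands in X[31], the last in X[0].
        self.x[:32] = [float(s) for s in reversed(samples)]
        # Window: Z[i] = C[i] * X[i].
        z = [c * x for c, x in zip(self.c, self.x)]
        # Partial calculation: Y[i] = sum of Z[i + 64 j] for j = 0..7.
        y = [sum(z[i + 64 * j] for j in range(8)) for i in range(64)]
        # Matrixing: S[i] = sum of M[i][k] * Y[k] for k = 0..63.
        return [sum(row[k] * y[k] for k in range(64)) for row in self.m]


# Placeholder window for demonstration only; NOT the values of Table 3.C.1.
placeholder_window = [1.0 / 512] * 512
bank = AnalysisFilterBank(placeholder_window)
subbands = bank.process(list(range(32)))
```

Each call consumes 32 input samples and emits one sample per subband, which is why every subband runs at 1/32 of the input sampling frequency.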


Input 32 new subband samples $S_i$, $i = 0, \ldots, 31$

Shifting (move these by 64 locations): for i = 1023 down to 64 do $V_i = V_{i-64}$

Matrixing: for i = 0 to 63 do $V_i = \sum_{k=0}^{31} N_{ik} S_k$, where $N_{ik} = \cos\left[\frac{(16+i)(2k+1)\pi}{64}\right]$

Build a 512 values vector U: for i = 0 to 7 do, for j = 0 to 31 do $U_{64i+j} = V_{128i+j}$ and $U_{64i+32+j} = V_{128i+96+j}$

Window by 512 coefficients. Produce vector W: for i = 0 to 511 do $W_i = U_i D_i$

Calculate 32 samples: for j = 0 to 31 do $S_j = \sum_{i=0}^{15} W_{j+32i}$

Output 32 reconstructed PCM samples

(For $D_i$ see Table 3.B.3, $i = 0, 1, \ldots, 511$. $|D_i|$ ranges from 0 to 1.14499.)


Valid for Layers I and II

Layer I: A new bit allocation is calculated for each block of 12 subband samples (12 x 32 = 384 input PCM samples).

Layer II: A new bit allocation is calculated for three blocks totaling 36 subband samples (36 x 32 = 1152 input PCM samples).

Bit allocation for the 32 subbands is based on the SMRs of all the subbands. For each subband find the maximum signal level and the minimum masking threshold which is derived by the FFT of the input PCM signal followed by a psychoacoustic model calculation. The calculation of the signal-to-mask-ratio is based on the following steps:

Step 1 : - Calculation of the FFT for time to frequency conversion.

Step 2:

$ - Determination of the sound pressure level in each subband.

Step 3: - Determination of the threshold in quiet (absolute threshold).

Step 4: - Finding of the tonal (more sinusoid-like) and non-tonal (more noise-Lke)

components of the audio signal.

Stsp 5: ( Decimation of tonal and nontonal masking components. ) - Decimation of the maskers, to obtain only the relevant maskers. To reduce the # of maskers which influence the global masking threshold.

Step 6: - Calculation of the individual masking thresholds.

step 7: - Determination of the global masking threshold.

smp a: - Determination of the minimum masking threshold in each subband. 'I

-to-mask ratio in each subband
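The last step and the block arithmetic above can be illustrated with a short sketch. It assumes levels are already expressed in dB, so the signal-to-mask ratio of a subband is the signal level minus the minimum masking threshold. The function names are hypothetical.

```python
def signal_to_mask_ratios(signal_level_db, min_mask_threshold_db):
    """Per-subband SMR in dB: signal level minus minimum masking threshold."""
    if len(signal_level_db) != len(min_mask_threshold_db):
        raise ValueError("one level and one threshold are needed per subband")
    return [level - mask for level, mask in zip(signal_level_db, min_mask_threshold_db)]

def pcm_samples_per_allocation(subband_samples, num_subbands=32):
    """Input PCM samples covered by one bit allocation."""
    return subband_samples * num_subbands
```

A positive SMR means the signal rises above the masking threshold in that subband, so quantization noise there would be audible unless enough bits are allocated. A negative SMR means the subband is masked.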


FFT: 512-point for Layer I, 1024-point for Layers II and III.

FFT parameters (in the FFT domain):

                              Layer I      Layer II
Transform length N            512          1024
Window size at 48 kHz         10.67 ms     21.3 ms
Window size at 44.1 kHz       11.6 ms      23.2 ms
Window size at 32 kHz         16 ms        32 ms
Frequency resolution          f_s/512      f_s/1024

Hann window:

$$h(i) = \sqrt{\frac{8}{3}} \cdot \frac{1}{2}\left[1 - \cos\left(\frac{2\pi i}{N}\right)\right], \quad 0 \le i \le N-1$$

Power density spectrum, where $s(l)$ is the input PCM signal:

$$X(k) = 10 \log_{10} \left| \frac{1}{N} \sum_{l=0}^{N-1} h(l)\, s(l)\, e^{-j 2\pi k l / N} \right|^2 \ \text{dB}, \quad k = 0, \ldots, N/2$$

The masking threshold is derived from an estimate of the power density spectrum.

Psychoacoustic model II: Layer III is based on this model.
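The windowed power density spectrum can be checked numerically with a direct DFT. The sketch below is illustrative only: the function names are hypothetical, a plain O(N²) DFT stands in for the FFT, and the small `floor` that guards against taking the logarithm of zero is an assumption of this sketch.

```python
import cmath
import math

def hann_window(n):
    """h(i) = sqrt(8/3) * 0.5 * (1 - cos(2*pi*i/N)) for i = 0..N-1."""
    return [math.sqrt(8.0 / 3.0) * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
            for i in range(n)]

def power_density_spectrum_db(samples, floor=1e-30):
    """X(k) in dB for k = 0..N/2, computed with a direct DFT."""
    n = len(samples)
    h = hann_window(n)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(h[l] * samples[l] * cmath.exp(-2j * math.pi * k * l / n)
                  for l in range(n))
        power = abs(acc / n) ** 2
        spectrum.append(10.0 * math.log10(max(power, floor)))
    return spectrum

def window_duration_ms(n, sampling_rate_hz):
    """Duration of an N-sample analysis window in milliseconds."""
    return 1000.0 * n / sampling_rate_hz
```

`window_duration_ms(512, 48000)` gives about 10.67 ms and `window_duration_ms(1024, 32000)` gives 32 ms, in line with the window sizes listed for Layers I and II.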


2.4.2 Semantics for the Audio Bitstream Syntax

2.4.2.1 Audio Sequence General

Frame - Layer I and Layer II

Part of the bitstream that is decodable by itself. In Layer I it contains information for 384 samples and in Layer II for 1152 samples.

Layer III

Part of the bitstream that is decodable with the use of previously acquired side and main information. In Layer III it contains information for 1152 samples.

2.4.2.2 Audio Frame

header - part of the bitstream containing synchronization and state information.

error_check - part of the bitstream containing information for error detection.

audio_data - part of the bitstream containing information on the audio samples.

ancillary_data - part of the bitstream that may be used for ancillary data.

2.4.2.3 Header

The first 32 bits (four bytes) are header information which is common to all layers.

syncword - the bit string '1111 1111 1111'.

ID - one bit to indicate the ID of the algorithm. Equals '1' for MPEG audio, '0' is reserved.

Layer - 2 bits to indicate which layer is used:

'11' Layer I
'10' Layer II
'01' Layer III
'00' reserved

protection_bit - one bit to indicate whether redundancy has been added in the audio bitstream to facilitate error detection and concealment.

bitrate_index - indicates the bit rate (numbers are in KBPS):

bitrate_index   Layer I   Layer II   Layer III
'0000'          free      free       free
'0001'          32        32         32
'0010'          64        48         40
'0011'          96        56         48
'0100'          128       64         56
'0101'          160       80         64
'0110'          192       96         80
'0111'          224       112        96
'1000'          256       128        112
'1001'          288       160        128
'1010'          320       192        160
'1011'          352       224        192
'1100'          384       256        224
'1101'          416       320        256
'1110'          448       384        320


sampling_frequency - indicates the sampling frequency:

'00' 44.1 kHz
'01' 48 kHz
'10' 32 kHz
'11' reserved

padding_bit - if this bit equals '1' the frame contains an additional slot to adjust the mean bitrate to the sampling frequency, otherwise this bit will be '0'. Padding is only necessary with a sampling frequency of 44.1 kHz.

private_bit - bit for private use. This bit will not be used in the future by ISO.

mode - indicates the mode:

'00' stereo
'01' joint_stereo (intensity_stereo and/or ms_stereo)
'10' dual_channel
'11' single_channel

mode_extension - these bits are used in joint_stereo mode. In Layers I and II they indicate which subbands are in intensity_stereo. All other subbands are coded in stereo.

'00' subbands 4-31 in intensity_stereo, bound==4
'01' subbands 8-31 in intensity_stereo, bound==8
'10' subbands 12-31 in intensity_stereo, bound==12
'11' subbands 16-31 in intensity_stereo, bound==16

In Layer III they indicate which type of joint stereo coding method is applied:

        intensity_stereo   ms_stereo
'00'    off                off
'01'    on                 off
'10'    off                on
'11'    on                 on

copyright - if this bit equals '0' there is no copyright on the coded bitstream, '1' means copyright protected.

original/home - this bit equals '0' if the bitstream is a copy, '1' if it is an original.

emphasis - indicates the type of de-emphasis that shall be used:

'00' no emphasis
'01' 50/15 microsec. emphasis
'10' reserved
'11' CCITT J.17
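The 32-bit header layout described above (12-bit syncword, then ID, layer, protection bit, bit rate index, sampling frequency, padding bit, private bit, mode, mode extension, copyright, original/home, emphasis) can be unpacked with shifts and masks. The sketch below is one possible reading of those fields. The function name and dictionary keys are hypothetical, and the choice to reject index '1111', which the bit rate table does not list, is an assumption of this sketch.

```python
BITRATES_KBPS = {
    "I":   [None, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448],
    "II":  [None, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384],
    "III": [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
}
LAYERS = {0b11: "I", 0b10: "II", 0b01: "III"}
SAMPLING_KHZ = {0b00: 44.1, 0b01: 48.0, 0b10: 32.0}
MODES = {0b00: "stereo", 0b01: "joint_stereo", 0b10: "dual_channel", 0b11: "single_channel"}

def parse_header(header_bytes):
    """Decode the first four bytes of an audio frame into named fields."""
    if len(header_bytes) < 4:
        raise ValueError("a header is four bytes long")
    word = int.from_bytes(header_bytes[:4], "big")
    if word >> 20 != 0xFFF:
        raise ValueError("syncword '1111 1111 1111' not found")
    layer_code = (word >> 17) & 0x3
    if layer_code not in LAYERS:
        raise ValueError("reserved layer code '00'")
    layer = LAYERS[layer_code]
    bitrate_index = (word >> 12) & 0xF
    if bitrate_index == 0xF:
        raise ValueError("bitrate_index '1111' is not in the table")
    return {
        "id": (word >> 19) & 0x1,
        "layer": layer,
        "protection_bit": (word >> 16) & 0x1,
        # None stands for the 'free' entry at index '0000'.
        "bitrate_kbps": BITRATES_KBPS[layer][bitrate_index],
        # None stands for the reserved code '11'.
        "sampling_khz": SAMPLING_KHZ.get((word >> 10) & 0x3),
        "padding_bit": (word >> 9) & 0x1,
        "private_bit": (word >> 8) & 0x1,
        "mode": MODES[(word >> 6) & 0x3],
        "mode_extension": (word >> 4) & 0x3,
        "copyright": (word >> 3) & 0x1,
        "original": (word >> 2) & 0x1,
        "emphasis": word & 0x3,
    }
```

For example, the bytes `FF FB 90 00` decode as Layer III, 128 KBPS, 44.1 kHz, stereo.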

2.4.2.4 Error Check

crc_check - a 16 bit parity-check word used for optional error detection within the encoded bitstream.


2.4.2.5 Audio data, Layer I

allocation[sb] - indicates the number of bits used to code the samples in subband sb. Valid for single-channel subbands and for subbands in intensity-stereo mode (valid for both channels).

Code       Bits
0          0
1 to 14    code + 1 (2 to 15)
15         invalid
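As a small worked example, the Layer I allocation code can be mapped to a bit count in one function. The mapping used here (code 0 means no bits, codes 1 to 14 mean code + 1 bits, code 15 is invalid) is taken from the Layer I definition in the standard rather than from the partly legible table, so treat it as an assumption. The function name is hypothetical.

```python
def layer1_bits_per_sample(allocation_code):
    """Bits used per sample in a subband for a 4-bit Layer I allocation code."""
    if not 0 <= allocation_code <= 15:
        raise ValueError("allocation is a 4-bit field")
    if allocation_code == 15:
        raise ValueError("allocation code 15 is invalid")
    # Code 0 means the subband carries no samples; otherwise code + 1 bits.
    return 0 if allocation_code == 0 else allocation_code + 1
```

With 12 samples per subband in a Layer I frame, a subband with allocation code 14 would occupy 12 × 15 = 180 sample bits.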


MPEG/Audio Standards: Phase 2, ISO/IEC 13818-3

D. Pan, "An overview of the MPEG/audio compression algorithm," IS&T/SPIE, vol. 2187, pp. 260-273, San Diego, CA, Feb. 1994.

International Standard, Nov. 1994.

Provision for visually impaired and hearing impaired.

1. Multichannel audio support: 5 high fidelity audio channels plus a low frequency enhancement channel (15-120 Hz), also known as 5.1 channels.

2. Multilingual audio support: up to 8 commentary channels.

3. Lower compressed audio bit rates: additional lower compressed bit rates down to 16 Kbps.

4. Lower audio sampling rates: besides 32, 44.1 and 48 KHz, sampling rates of 16, 22.05 and 24 KHz.
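One consequence of the lower sampling rates is a longer audio frame, since a frame carries a fixed number of samples. The sketch below works this out for Layers I and II, using the 384 and 1152 samples per frame given in the frame definition. The function name is hypothetical, and Layer III is left out because its frame layout at the lower rates is not covered in these notes.

```python
SAMPLES_PER_FRAME = {"I": 384, "II": 1152}

def frame_duration_ms(layer, sampling_rate_hz):
    """Duration of one audio frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME[layer] / sampling_rate_hz

if __name__ == "__main__":
    for rate in (16000, 22050, 24000, 32000, 44100, 48000):
        print(rate, round(frame_duration_ms("I", rate), 2),
              round(frame_duration_ms("II", rate), 2))
```

Halving the sampling rate doubles the frame duration: a Layer II frame lasts 24 ms at 48 KHz and 48 ms at 24 KHz.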

The new standard is compatible with the present MPEG/audio standard. Phase 2 MPEG/audio decoders can decode all present MPEG/audio bitstreams. Backward compatibility is achieved by combining suitably weighted versions of each of the up to 5.1 channels into "down-mixed" left and right channels. These two channels fit into the audio data framework of a phase 1 MPEG/audio bitstream. Information needed to recover the original left, right, and remaining channels fits into the ancillary data portion of a phase 1 MPEG/audio bitstream. Compatibility with the phase 1 standard compromises the audio quality of the multichannel coder. There should be an addendum to the phase 2 standard that specifies a non-backward compatible (NBC) multichannel coding mode.

Subjective listening tests on the MPEG-2 multichannel audio standard, which is backward compatible with MPEG-1 audio, were conducted at Telekom and BBC during Jan. - March 1994. Two NBC codecs from AT&T and Dolby were also used in these tests. At low bit rates (320/384 Kbps), the average performance of most codecs was satisfactory, but all codecs showed deviations from transparency for some of the test items.

LFE: Low Frequency Enhancement channel (audio signals in the range 15-120 Hz)

To enable listeners (optional) to extend the low frequency content of the reproduced program in terms of both frequency and level. This is the same as the LFE channel proposed by the industry for its digital sound systems.

13818-7: MPEG-2 (NBC) Audio


MPEG-2 Audio

ISO/IEC 13818-3 IS, Nov. 1994.

Three additional sampling rates for ISO/IEC 11172 Layers I, II and III.

Duration of audio frame

Syntax and semantics of MPEG-1 audio are maintained except for a new definition of the sampling frequency field, bit rate index field and bit allocation tables.

A universal and compatible multichannel audio system for satellite/terrestrial television broadcasting, DAB, and other non-broadcasting media as well.

Multilingual capability. Multichannel stereo performance and bilingual programs or multilingual commentaries. Example: bilingual 2/0 stereo program, or one 2/0, 3/0 stereo sound plus accompanying services (clean dialog for the hard of hearing, commentary for the visually impaired, multilingual commentary, etc.).

The syntax, semantics and coding techniques of MPEG-1 audio are maintained except for a new definition of sampling rates, bit rates and bit allocation tables.


3/2 Surround Sound

L: Left, C: Center, R: Right, LS: Left Surround, RS: Right Surround

Figure 11.31: The possible configurations of multichannel audio for one program or two programs. (Panels: MPEG-1 Audio; Multichannel Audio.)


Figure 11.32: MPEG-2 audio codec compatible with MPEG-1 audio. (Block diagram: in the MPEG-2 encoder, the inputs L, C, R, LS and RS pass through a matrix that produces T1 to T5, which feed the MPEG-1 encoder and the MPEG-2 extension encoder. After the channel, the MPEG-1 decoder and the MPEG-2 extension decoder deliver T1' to T5' to an inverse matrix, which outputs L', C', R', LS' and RS'.)

$$L_o = L + 0.707\,C + 0.707\,LS$$

$$R_o = R + 0.707\,C + 0.707\,RS$$

Multichannels

Channels compatible with MPEG-1 audio

Extension channels

Low frequency enhancement channel, capable of handling signals in the range 15-120 Hz.

To enhance the low frequency content of the program in terms of both frequency and level


Backward Compatibility

$L_o$ and $R_o$: left and right stereo channels of MPEG-1 audio.

L, C, R, LS and RS: multichannels of MPEG-2 audio.

x and y are nearly $1/\sqrt{2}$.

jS is derived from LS and RS by calculating the mono component, bandwidth limitation to the 100-7000 Hz range, half Dolby B-type encoding and 90° phase shifting (Prologic® surround matrixing). Dolby and Prologic are registered trademarks.
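The compatible down-mix can be written out as a short sketch. It assumes the matrix $L_o = L + xC + yLS$ and $R_o = R + xC + yRS$ with both weights set to 0.707 (about $1/\sqrt{2}$). The function name and argument order are hypothetical, and the Prologic variant using jS is not modelled.

```python
def compatible_downmix(left, center, right, left_surround, right_surround,
                       x=0.707, y=0.707):
    """Return (Lo, Ro), the stereo pair an MPEG-1 decoder would reproduce."""
    lo = left + x * center + y * left_surround
    ro = right + x * center + y * right_surround
    return lo, ro
```

Because the weights sum to more than one, a practical encoder would also need to scale the result to avoid overload. That scaling is outside this sketch.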


Data format of MPEG audio bit streams:

(a) MPEG-1 bit stream: MPEG-1 header | MPEG-1 audio signal | MPEG-1 ancillary data

(b) Backwards-compatible MPEG-2 bit stream: MPEG-1 header | MPEG-1 audio signal | MPEG-2 header | MPEG-2 extension signal | MPEG-2 ancillary data


MPEG AUDIO

C-Cube Microsystems, 1778 McCarthy Blvd., Milpitas, CA 95035. Phone: 408-944-6300. Fax: 408-944-6314.
Product: MPEG audio encoder. Layers 1 and 2 (32-384 KBPS). Input 32, 44.1 and 48 KHz, 16 bit PCM stereo, analog or digital input. Can multiplex audio and video into an MPEG system bitstream (TMS 320C31).
Source: VideoRISC Real time MPEG 1 Encoder Development Station (Oct. 1993).

Vista Com Inc., 20395 Pacifica Drive, Suite 109, Cupertino, CA 95014. Phone: 408-253-5165. Fax: 408-253-5170.
Product: VCI/AC16 universal audio codec board for 16-64 KBPS / 3.1-7 KHz: 16 KBPS / 3.1 KHz (G.728), 48-64 KBPS / 7 KHz (G.722).
Source: Company brochure.

Analog Devices, DSP Devices, One Technology Way, P.O. Box 9106, Norwood, MA 02062-9016. Phone: 617-461-3881.
Product: AD 1846 and AD 1847: complete digital audio "systems on a chip". Implements the CCITT Rec. G.728 audio coding algorithm. Programmable multi-function PC sound card chip set.
Source: Company brochure, Winter 1994; DSPx Expo. & Symposium, San Francisco, CA, June 1994.

Crystal Semiconductor Corp., P.O. Box 17847, 4210 S. Industrial Dr., Austin, TX 78744. Phone: 512-445-7222, 1-800-888-5016. Fax: 512-445-7581.
Product: CS 4920 multi-standard audio decoder (single chip). Programmable DSP supporting industry standard and proprietary DSP algorithms. Software for MPEG audio layers 1 and 2 and AC-2 is available. Sampling rates fs = 32, 44.1 and 48 KHz. Accepts MPEG-2 PES layer and MPEG packet layer.
Source: Company brochure, 1993.


British Telecom, 81 Newgate Street, London EC1A 7AJ, U.K.
Product: PC Videophone kit. Application software, VC 8000 multimedia communication card, video camera, connection unit and a dedicated audio unit. Video at QCIF (H.261). Audio G.728, G.711 and G.722 compliant. Up to 128 KBPS.
Source: Company brochure.

SGS-Thomson Microelectronics, 17 Ave des Martyrs, BP 217, 38019 Grenoble, France. Phone: 76-58-56-10. Also 1000 East Bell Road, Phoenix, AZ 85022. Phone: 602-867-6279.
Product: MPEG 1 audio decoder IC. Layers I and II at all sampling rates, all modes (mono, dual, stereo, joint stereo) and all bit rates.
Source: Single chip ISO/IEC 11172-3 decoder, STi 4500/4510.

Optivision, 1477 Drew Avenue, Suite 102, Davis, CA 95616. Phone: 1-800-562-8934, 916-757-4850, 916-756-1309.
Product: MPEG real-time audio/video encoder and decoder. Flexible user control up to 5 Mbps (one board each for encoder and decoder).
Source: Company brochure, March 1994.

LSI Logic, 1551 McCarthy Blvd., Milpitas, CA 95035. Phone: 408-433-8000. Fax: 408-434-8000.
Product: Single chip MPEG-1 audio decoder. Layers I and II and MUSICAM. Output sampling 32, 44.1 and 48 KHz. Decodes audio data from 32 to 192 Kbps mono and 64-384 Kbps stereo. Supports up to 15 Mbps sustained channel data rate.
Source: Brochure on L64111 MPEG audio decoder.

Graphics Communication Engineering, Column-Minami Aoyama Bldg. 6F, 7-1-5 Minami-Aoyama, Minato-Ku, Tokyo, Japan 107. Ph: 81-3-3498-7141. Fax: 81-3-3498-7543.
Product: Single chip MPEG-1 real time decoder. Decodes audio data from 64-192 Kbps mono and 128-384 Kbps stereo. Output sampling 32, 44.1 and 48 KHz. Soft muting and demuting of output data.
Source: Brochure on DA 7190 MPEG audio decoder chip.


Texas Instruments, P.O. Box 655303, Dallas, TX 75265. Phone: 214-917-3881.
Product: Single chip MPEG audio decoder. Layers I and II. Decodes mono, dual, stereo and joint stereo mode. Output sampling 32, 44.1 and 48 KHz.
Source: Brochure, TMS320AV110 MPEG audio decoder.

Ariel, 433 River Road, Highland Park, NJ 08904. Phone: 908-249-2900. Fax: 908-249-2900.
Product: Real time MPEG audio coding/decoding. AT&T VCOS development platform (Multimedia on the PC).
Source: Company brochure.

Atlanta Signal Processors Inc., 1375 Peachtree St., NE, Suite 690, Atlanta, GA 30309-3115. Phone: 404-892-7265. Fax: 404-892-2512.
Product: MPEG audio encoder/decoder. Layers I, II and III; mono, stereo, joint stereo and dual channel modes. Elf DSP platform for AT compatible PCs together with Elf Coprocessor board and digital audio interface board.
Source: Company brochure.

CLI, 2860 Junction Ave., San Jose, CA 95134. Ph: 408-435-3000, 1-800-538-7542.
Product: MPEG audio encoder/decoder. Layers I and II, 56-192 Kbps mono, 112-384 Kbps dual or joint stereo.
Source: Company brochure.

Hyundai Electronics America, 166 Baypointe Parkway, San Jose, CA 95134.
Product: HDM8211M single chip operates on ISO 11172 (MPEG-1) or 13818 (MPEG-2) bit streams. It demultiplexes, decompresses and synchronizes audio and video and produces digital output ready for D/A conversion. Decodes Main Profile @ Main Level video streams.
Source: Company brochure.

Matrox Electronics Systems Inc., 1055 St. Regis Blvd., Canada H9P 2T4. Phone: 514-685-2630, 1-800-361-4903.
Product: MPEG Layers I and II decoder. 32, 44.1 and 48 KHz sampling rates.
Source: Company brochure.


Microware System Corp., 1900 NW 114th Street, Des Moines, Iowa 50325. Phone: 515-224-1929. Fax: 515-224-1352. Internet: info@microware.com
Product: Microware's OS-9 operating system together with Motion Picture File Manager can decode MPEG audio/video files for playback in real time.
Source: Company brochure.

Optibase Inc., 5000 Quorum Dr., Suite 700, Dallas, TX 75240. Phone: 1-800-451-5101, 214-239-5242. Fax: 214-239-1273.
Product: MPEG audio support available. MPEG LAB Pro: real-time digital video/audio compression. Also decompression. MPEG Layer I 32-448 KBPS, Layer II 32-384 KBPS. fs = 32, 44.1, 48 KHz.
Source: Company brochure.

Philips Semiconductors, 811 E. Arques Ave., P.O. Box 3409, Sunnyvale, CA 94088. Phone: 818-880-6304.
Product: Demultiplexes MPEG audio bit stream for Layers I and II and decodes. Broadband audio output signal at 16, 18, 20, or 22 bits.
Source: Brochure, SAA 2500 (Aug. 1993).

Xing Technology Corp., 1540 West Branch Street, Arroyo Grande, CA 93420. Phone: 805-473-0145, 1-800-2-XINGIT. Fax: 805-473-0147.
Product: MPEG audio compression software. Real time CD-quality compression and playback for users of 16 bit 44 KHz sound cards with DSP chips from Analog Devices (other options). Xing Sound: real time MPEG-compatible audio compression software for desktop PCs.
Source: Company brochure.

Sigma Design Inc., 4790 Bayside Parkway, Fremont, CA 94538. Phone: 510-770-0100. Fax: 510-770-2640.
Product: MPEG Layers I and II audio decoder. 44.1 KHz sampling rate.
Source: Company brochure.

IIT, 2445 Mission College Blvd., Santa Clara, CA 95054.
Product: IIT VCP single chip video codec and multimedia communications processor. VCP can decode the MPEG (both I and II) system protocol and decode MPEG I and II video. MPEG audio is processed externally and passed to the VCP through a serial audio bus.
Source: Company brochure.


North V[illegible], 15262 NW Greenbrier Parkway, Beaverton, OR 97006. Phone: 503-236-6121. Fax: 503-690-2320.
Product: MPEG-1 encoder, includes audio layers 1 and [illegible], bit rates and audio channels.

Zoran Corp., 1705 Wyatt Dr., Santa Clara, CA 95054. Phone: 408-986-1314. Fax: [illegible].
Product: ZR 38511 two channel MPEG-1 audio processor can decode Layers I and II at various sampling rates, bit rates and audio channels. ZR 38000 and [illegible] can be used for MPEG Layers I and II encoding and decoding in a single processor. Also Dolby AC-3 Digital Surround audio encoder/decoder.

[Company name illegible], 2933 Bunker Hill Lane, Suite 202, Santa Clara, CA 95054. Phone: 408-970-1780. Fax: 408-982-9877.
Product: [illegible] (MPEG audio Layers I and II) audio/video decoder board.

Winbond Electronics Corp., No. 4 Creation Road III, Science-Based Industrial Park, Hsinchu 30077, Taiwan, ROC. Ph: 886-35-770066. Fax: 886-35-78947.
Product: W9910 MPEG-1 audio decoder, Layers 1 and 2 (32, 44.1 and 48 KHz sampling rates).

GAO Research & Consulting Ltd., 85 Dundalk Drive, Scarborough, Ontario, Canada M1P 4V1. Ph: 416-292-0038. Fax: 416-292-2364.
Product: MPEG audio codec.


LOW BIT RATE AUDIO CODING

Digital Voice Systems Inc., One Kendall Square, Bldg 300, Cambridge, MA 02139. Ph: 617-494-6549. Fax: 617-494-5101.
Product: VC-20, Voice Code Module. Speech coder, 2.4 to 9.6 KBPS.

Atlanta Signal Processors Inc., 1375 Peachtree Street NE, Suite 690, Atlanta, GA 30309-3115. Ph: 404-892-7265.
Products (Jan. '94 brochure), all running on a single TMS 320 C3x DSP chip:
US Federal Standard 1016 CELP (4.8 or 7.2 KBPS)
North American Digital Cellular VSELP (7.95 KBPS)
MELP (1.6 to 2.4 KBPS)
LPC-10 (2.4 KBPS)
CCITT G.723/1 (24, 32, 40 KBPS)
G.722 (48, 56, 64 KBPS)
G.728 (16 KBPS)
G.711 (64 KBPS)
CVSD (16 KBPS), subband coder
Also noted: MPEG audio encoding, G.729 (CS-ACELP), G.726 ADPCM, G.723.

DSP Software Engineering Inc., Suite 206, 165 Middlesex Turnpike, Bedford, MA 01730. Ph: 617-275-3773. Fax: 617-275-4323.
Product: G.711, G.722, G.726, G.728, LPC-10e vocoder, 1016 CELP vocoder, IS-54 VSELP vocoder, LS-CELP 8 kb/s low delay vocoder (TMS 320 C3x based audio coders, 2.4 to 64 KBPS). Single chip fixed point G.728


for audio/video conferencing applications, single 40 MIP TMS 320 C5x implementation.

Dr. Jean-Pierre Adoul, Univ. of Sherbrooke. Fax: 819-821-7937.
Product: Algebraic CELP (ACELP) 16 kbps coder. Also at 56 kbps (G.722); 9.6 kbps speech coder; 4.8 kbps, 8 kbps (transparent toll quality). (TMS 320 C30/C31 based coders.)

Analog Devices, DSP Division, One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106. Ph: 617-461-3672. Fax: 617-461-3010.
Product: G.728 (LD-CELP) 16 kbps speech coder. AD1846 and AD1847: single-chip sigma-delta stereo digital audio codecs. Programmable multifunction PC sound card chip set.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

TI, TI Customer Response Center (800-336-5236, Ext. 3510), P.O. Box 809066, Dallas, TX 75380-9066.
Product: TMS 320 SS16 transcoder (low bit rate toll-quality speech): 64 kbps PCM, 32 kbps ADPCM (G.721), 16 kbps subband. TMS 320 C3x processors: G.722 and G.723 coders. TMS320C80: G.728 and LPC CELP.

NEC Electronics Inc. & Philips Kommunikations Industrie AG
Product: A fixed-point single-chip implementation of the CCITT G.728 using a next generation version of NEC's µPD77016 16-bit DSP.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

Crystal Semiconductor Corp. Ph: 512-445-7222.
Product: A single-chip DSP with D/A converter for audio decoding.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.


Analogical Systems, 299 California Ave. [remainder of address illegible]
Product: Real-time implementation of the CCITT G.728 fixed-point 16 Kbps low-delay CELP speech compression algorithm. Also noted: G.723.1, G.728.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

ULSI Systems Development Labs, NEC Corp.
Product: 4 Kb/s multimode learned (ML-) CELP implementation on the µPD77016.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

Deutsche Bundespost Telekom, Forschungs- und Technologiezentrum
Product: ITU standardization process to define an 8 Kb/s speech codec with wireline quality.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

U.S. Dept. of Defense
Product: The new government standard 2400 bps speech coder.
Source: DSPx Expo & Symp, San Francisco, CA, June 1994.

GC Technology Corp., Column-Minami Aoyama Bldg. 6F, 7-1-5 Minami-Aoyama, Minato-Ku, Tokyo, Japan 107. Phone: 81-3-3498-7141. Fax: 81-3-3498-7543.
Product: G.711/G.722 chipset, bit rate 48/56/64 Kbps. Also noted: G.723.1.

Sonitech Intl. Inc., 14 Mica Lane, Wellesley, MA 02181. Phone: 617-235-6824. Fax: 617-235-2531.
Product: 4.8 Kbps voice coder (SPIRIT-CELP). Complies with US Federal Standard 1016.


[Company name illegible], 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025-7022. Ph: 415-496-5705. Fax: 415-854-8740.
Product: Implements JPEG baseline system including color conversion, i.e., RGB or CYMK to Y, Cb, Cr.
Source: "[title partly illegible] compression hardware implementation with extensions for fixed-rate and compressed image editing applications," IS&T/SPIE Symp., vol. 2187, San Jose, CA, Feb. 1994.

Vistacom Inc., 20395 Pacifica Drive, Suite 109, Cupertino, CA 95014. Ph: 408-253-5165. Fax: 408-253-5170.
Product: Several boards (some PC based) that can implement the H-series (H.261 etc.) network interfaces and communication processors for CIF at rates from 56 Kbps to 2 Mbps.
Source: Company brochures.

PAA-TSA, France Telecom, 38-40 rue du General Leclerc, 92131 Issy-les-Moulineaux, France. Ph: 33-1-45-29-38-06. Fax: 33-1-45-29-54-35.
Product: DSP based video codec for low bitrate video telephony communication on a PC. 14.4 to 28.8 Kbit/s (includes speech, video and data). To provide video telephony on ISDN, PSTN or local networks.
Source: G. Eude and J.C. Schmitt, "Optimized hybrid transform coding for very low bitrate: Videotelephony communication on PC," IS&T/SPIE Symp. on Electronic Imaging: Science and Technology, vol. 2187, San Jose, CA, Feb. 1994.


TI Inc., Literature Response Center, P.O. Box 17228, Denver, CO 80217. TI SC Product Literature: 1-800-477-8924, ext. 2001.
Product: TCS320IS54B chipset. 3 chip digital cellular baseband chipset. Includes TLV320AC3X voice band audio processor (IS-54C Standard).

Vistacom, Inc., 20395 Pacifica Drive, Suite 109, Cupertino, CA 95014. Phone: 408-253-5165. Fax: 408-253-5170.
Product: VCOM CL (VCI-10 CL). H.32P/MPEG-4 (pre-standard methods) for PSTN, 9.6 to 28.8 Kbps using regular modems (auxiliary equipment includes camera, microphone, speaker and cables).

COMSAT Labs, 22300 COMSAT Drive, Clarksburg, MD 20871. Phone: 301-428-4553. Fax: 301-428-4534. http://www.comsat.com
Product: 1200 bps voice codec (performance equivalent to 2400 bits/s DOD LPC-10e).

SkyWave Electronics

Qualcomm VLSI Products, 6455 Lusk Blvd., San Diego, CA 92121-2779.

Other products listed on this page:
Universal audio codecs for desktop applications (includes echo cancellation), G.722 and G.728 standards.
Digital voice coders based on TMS 320C25: 2.4, 4.8, 9.6, 14.4 or 16.0 KBPS.


Workstation Technologies. Phone: 714-250-8983. Fax: 714-250-8969.
Product: Compression codec for TMS320C31 that converts G.711 (64 Kbps) signals to U.S. Federal Standard 1016 (4.8 Kbps CELP).

AT&T Microelectronics, 555 Union Blvd., Allentown, PA 18103 (800-372-2447).
Product: DSP audio modules for G.711, G.722, G.728 and acoustic echo cancellation. AVP video/audio processor, AV4400A. Programmable variety of audio/video compression/decompression algorithms including H.261, H.263, G.723, JPEG, motion JPEG and MPEG-1.

Castleton Network Systems Corp. Phone: +604-293-0039. E-mail: [email protected] Web: http://www.castleton.com
Product: G.729 ITU voice compression on the TMS 320 C5x DSP. Toll quality full-duplex operation.

Will Strauss, "Speech and Audio Coders: Analysis and Evaluation," Forward Concepts, 1575 W. Univ. Drive, Tempe, AZ 85281. http://www.fwdconcepts.com Ph: 602-968-3759. Fax: 602-968-7145. [email protected]

Digital Voice Systems Inc., One Van de Graaff Drive, Burlington, MA 01803. Ph: 1-617-270-1030. Fax: 1-617-270-0136.
Product: Advanced multiband excitation (AMBE) vocoder. 2 way voice communication at 2400 bps.

PictureTel, 100 Minuteman Road, Andover, MA 01810. www.pictel.com
Product: G.723.1, G.729, G.711, G.722 and G.728 audio codecs with videoconferencing systems.

Signals & Software [name partly illegible]. Ph: 44(0) 181 426-9533.
Product: TI TMS 320C5X family of DSPs. G.711, G.722 and G.728 and acoustic echo canceller.