NOTE TO USERS

Page(s) not included in the original manuscript are unavailable from the author or university. The manuscript was microfilmed as received.

iii-iv

This reproduction is the best copy available.

UMI

MPEG-2 Transport over ATM Networks with Best Effort Service

Song Pu
School of Computer Science
McGill University
Montréal, Québec, Canada

A Thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science

© Song Pu, 1998


To my lovely wife

and my parents

CONTENTS

RÉSUMÉ
ABSTRACT
ACKNOWLEDGMENTS

1 OUTLINE AND MOTIVATION

2 MPEG-2 STANDARD: A REVIEW
  2.1 History
  2.2 Color Representation
  2.3 Digital Video Format
    2.3.1 CCIR-601 Recommendation
    2.3.2 Source Input Format (SIF) and Common Interchange Format (CIF)
  2.4 MPEG Coding
    2.4.1 Coding Principles
    2.4.2 Discrete Cosine Transform Coding
    2.4.3 Quantization
    2.4.4 Entropy Coding
    2.4.5 Motion-Compensated Inter-Frame Prediction
    2.4.6 Picture Types in MPEG
  2.5 MPEG-2 Video Standard
    2.5.1 Differences between MPEG-2 and MPEG-1
    2.5.2 Scalability and Data Partition
    2.5.3 MPEG-2 Systems Layer

3 SERVICE CLASSIFICATION AND ADAPTATION LAYER OF ATM
  3.1 Classification of Services in ATM Networks
    3.1.1 How These Services Work Together
  3.2 ATM Adaptation Layer (AAL)
    3.2.1 Common Part Convergence Sublayer (CPCS) of AAL-5
    3.2.2 Segmentation and Re-Assembly Sublayer
    3.2.3 Error Detection

4 ISSUES IN MPEG-2 OVER ATM
  4.1 Service Class Selection
  4.2 Choice of Adaptation Layer
  4.3 Transport Stream Encapsulation
  4.4 Factors Affecting Picture Quality
    4.4.1 Data Losses Due to Cell Errors
    4.4.2 Data Losses Due to Burstiness and Excessive Delays
  4.5 Congestion Control and Switch Discarding Scheme
    4.5.1 Priority Assignation Scheme
  4.6 Error Correction and Concealment
    4.6.1 Forward Error Correction (FEC) in AAL Layer
  4.7 Related Work
    4.7.1 FEC-Service Specific Convergence Sublayers
    4.7.2 Switch Discarding Schemes
    4.7.3 Priority Assignation Scheme

5 VIDEO QUALITY OF SERVICE CONTROL FRAMEWORK
  5.1 Dynamic Extended Priority Assignation Scheme (Dex-PAS)
  5.2 Slice-Based MPEG-2 TS Packets Encapsulation Strategy
  5.3 AAL-5 Service Specific Convergence Sublayer with FEC Support
    5.3.1 Requirement of SSCS with FEC Support
    5.3.2 Behavior of Sender and Receiver Entities
  5.4 Selective and Adaptive Partial Slice Discard Scheme (SA-PSD)
    5.4.1 The Algorithm Introduction
    5.4.2 SA-PSD Parameters
    5.4.3 SA-PSD Operation Modes

6 EXPERIMENT AND RESULT
  6.1 Simulation Environment
    6.1.1 The NIST ATM Simulator
    6.1.2 Network Model
  6.2 MPEG-2 Trace File
  6.3 Several Assumptions
  6.4 Parameters

7 DISCUSSION
  7.1 Results at Cell-Level
  7.2 Results at Slice-Level
  7.3 Distance Effect
  7.4 Redundancy Vs Data Ratio

8 CONCLUSION
  8.1 Conclusions
  8.2 Future Work

REFERENCES

FIGURES AND TABLES

FIGURES

2.1 Basic objects defined in MPEG-2
2.2 DCT coefficients in a coding block is scanned in zig-zag order
2.3 Motion compensation and motion estimation
2.4 MPEG encoder
2.5 Example of MPEG video sequence
2.6 Scope of MPEG-2 systems specifications
2.7 PES encapsulation using fixed length packet

3.1 Service bandwidth allocation
3.2 Traffic classes and AAL types
3.3 Structure of the convergence sublayer
3.4 AAL-5 CPCS-PDU Header

4.1 Mapping of MPEG-2 transport packets

5.1 Slice-based PES encapsulation using variable length packet
5.2 AAL-5 multi-level FEC-SSCS using grouping mode 1
5.3 Control block structure used in FEC scheme
5.4 Buffer thresholds

6.1 Network topology used in the simulation
6.2 Number of ATM cells per slice
6.3 Number of cells per slice after multiplexing of all sources with time shift

7.1 Cell loss ratio (Aggregate)
7.2 Cell loss ratio (I-frame)
7.3 Cell loss ratio (P-frame)
7.4 Cell loss ratio (B-frame)
7.5 Mean cell transfer delay
7.6 Buffer occupancy in No-RD
7.7 Buffer occupancy in Ex-SCD
7.8 Buffer occupancy in Ex-PSD
7.9 Buffer occupancy in Dex-SA-PSD
7.10 Slice loss ratio (Aggregate)
7.11 Slice loss ratio (I-frame)
7.12 Slice loss ratio (P-frame)
7.13 Slice loss ratio (B-frame)
7.14 CLR(Aggregate) with different distance
7.15 CLR(I-frame) with different distance
7.16 CLR(P-frame) with different distance
7.17 CLR(B-frame) with different distance
7.18 SLR(Aggregate) with different Distance
7.19 SLR(I-frame) with different Distance
7.20 SLR(P-frame) with different Distance
7.21 SLR(B-frame) with different Distance
7.22 CLR(Aggregate) with different redundancy
7.23 CLR(I-frame) with different redundancy
7.24 CLR(P-frame) with different redundancy
7.25 CLR(B-frame) with different redundancy
7.26 SLR(Aggregate) with different redundancy
7.27 SLR(I-frame) with different redundancy
7.28 SLR(P-frame) with different redundancy
7.29 SLR(B-frame) with different redundancy

TABLES

2.1 Comparison of YUV various ratio formats
2.2 CCIR-601 video frame scanning parameters
2.3 SIF and CIF video frame scanning parameters

3.1 ATM Layer Service Categories
3.2 Support Operations for AAL Classes

5.1 New Ex-CLP Field Mapping

6.1 Statistics data of MPEG-2 trace file used in simulation
6.2 Data unit definitions
6.3 Performance Parameters Definitions

RÉSUMÉ

With growing interest in the transmission of audio-visual applications (for example MPEG-2) over ATM "best effort" services, efficient video-oriented control mechanisms for improving video quality in the presence of loss must be designed. In this thesis, we propose and evaluate a new Quality of Service (QoS) control framework for the modified Unspecified Bit Rate service (UBR+).

We studied a number of problems related to the coding and control of MPEG-2 data streams transmitted over ATM networks, analyzed the network factors that affect the Quality of Service of real-time video applications, and showed how the proposed video-oriented QoS control framework improves the performance of this type of service.

The framework presented here consists of four components: a video-oriented discard scheme, which adjusts the discard level adaptively and selectively according to the switch buffer, the video payload types and the discard tolerance of the Forward Error Correction (FEC); a frame-level priority data partition mechanism based on the MPEG data structure and network feedback; an enhanced ATM Adaptation Layer type 5 (AAL5) with a new slice-based MPEG-2 encapsulation strategy; and an FEC mechanism, implemented in the AAL5 service specific convergence sublayer to provide error detection and correction capability.

This "best effort" video delivery framework is evaluated with simulated and real MPEG video data. The overall objective of this QoS control framework is twofold. First, to ensure a graceful degradation of picture quality by minimizing the cell loss probability for critical video data while guaranteeing a bounded cell transfer delay. Second, to optimize throughput by reducing the transmission of useless data.

In comparison with previous approaches, the performance evaluation showed a significant reduction of the bad throughput and a minimization of losses of Intra- and Predictive-coded frames at the video slice level.

ABSTRACT

With increasing interest in the transmission of audio-visual applications (e.g. MPEG-2) over ATM best effort services, such as Available Bit-Rate (ABR) and Unspecified Bit-Rate (UBR), efficient video-oriented control mechanisms for improving the video quality in the presence of loss have to be designed. In this thesis, we proposed and evaluated a new quality of service control framework for use with a modified Unspecified Bit-Rate service.

We surveyed a number of issues related to the coding and control of MPEG-2 video data streams transmitted over ATM networks, analyzed the network factors affecting the quality of service of real-time video applications and showed how the proposed video-oriented QoS control framework improves the performance of such services.

The presented framework relies on four components: a dynamic frame-level priority data partition mechanism based on the MPEG-2 data structure and feedback from the network; an enhanced ATM Adaptation Layer type 5 (AAL-5) associated with a new slice-based MPEG-2 encapsulation strategy; a forward error correction (FEC) mechanism, implemented at the AAL-5 service specific convergence sublayer to provide error detection and recovery capability; and a video-oriented cell discarding scheme, which adaptively and selectively adjusts the discard level according to switch buffer occupancy, video cell payload types and FEC drop tolerance.

This best-effort video delivery framework is evaluated using simulation and real MPEG-2 video data. The overall objective of the proposed framework is twofold: first, to ensure a graceful picture quality degradation by minimizing the cell loss probability for critical video data while guaranteeing a bounded cell transfer delay; second, to optimize the effective network throughput by reducing the transmission of non-useful data.

In comparison to previous approaches, the performance evaluation has shown a significant reduction of the bad throughput and a minimization of losses of Intra- and Predictive-coded frames at the video slice layer.

ACKNOWLEDGMENTS

To my wife Tao, for her love and support despite the many lonely hours.

To my supervisor Professor Nathan Friedman and Karim El Guemhioui, for their guidance and support throughout the course of my studies.

To Ahmed Mehaoua, with whom I have worked throughout this project, for his aid in every aspect of this work, from experimental design to data analysis.

To Raouf Boutaba, who led me into this field and gave me a great deal of guidance at the start of this project.

To Eric Leung-Tack, Linda Gu, Mingchen Zhang, Yang Ling, Zijun Hu, Bonnie Wu, Song Hu, Yanmei Zhang and Adel Ghlamallah, for their constant support in this project, and for their friendship.

To Michael R. Izquierdo, for providing us with the MPEG-2 trace file for the simulation study.

To the system administrators at CRIM, Yves Belanger and Daniel Choiniere, for their assistance with computer system support.

To Franca Cianci, Judy Kenigsberg and Erica Huber, for their help with my studies.

1 OUTLINE AND MOTIVATION

Asynchronous Transfer Mode (ATM) is an emerging technology for broadband networks that allows a wide range of traffic types, from real-time video (e.g. MPEG-2 applications) to best-effort data (such as e-mail), to be multiplexed in a single physical network. A key benefit of ATM technology is its ability to provide quality-of-service (QoS) guarantees to applications with different traffic characteristics. These QoS guarantees take the form of bounds on end-to-end delay, delay jitter and packet loss rate. Several classes of service have been defined in the context of ATM networks to satisfy the QoS needs of various applications. Among them, the Constant Bit-Rate (CBR) and real-time Variable Bit-Rate (VBR-rt) service classes provide upper bounds on delay, jitter, and loss rate. These classes are intended for real-time applications that require low delay and jitter. The non-real-time Variable Bit-Rate (VBR-nrt) service class is intended for applications where no jitter control is needed, but a delay guarantee is still required. The Available Bit-Rate (ABR) service class is intended for delay-tolerant best-effort applications and uses a feedback approach to regulate source bit rates and avoid potential congestion. The Unspecified Bit-Rate (UBR) service does not offer any service guarantees and thus has the lowest priority among all the classes.

This thesis studies the transport of real-time traffic generated by MPEG-2 applications in an ATM network using a modified UBR service (UBR+). MPEG-2 is an emerging standard for audio and video compression. By exploiting both spatial and temporal redundancies, it achieves compression ratios of up to 200:1 and can encode a video or audio source to almost any level of quality. The MPEG-2 standard offers two ways to multiplex elementary audio, video or private streams to form a program: the MPEG-2 Program Stream and the MPEG-2 Transport Stream formats.

The transport of MPEG-2 over ATM introduces several issues that must be addressed in order to solve the problem on an end-to-end basis. These include the selection of service type, the choice of adaptation layer, the method of encapsulating MPEG-2 packets in adaptation layer packets, the scheduling algorithms used in the ATM network to control delay and jitter, and the error control scheme.

The ATM adaptation layer (AAL) is responsible for making the network behavior transparent to the application. There are four types of adaptation layers currently defined for ATM networks: AAL-1, AAL-2, AAL-3/4 and AAL-5. Each of these is designed to support specific services and has different functionality. The choice of an adaptation layer involves a number of tradeoffs [1]. For instance, a circuit-emulation type of adaptation layer (AAL-1) would eliminate the various synchronization problems associated with MPEG-2 but can be used only with constant bit-rate MPEG-2 streams. In the more general variable bit-rate (VBR) case, such an adaptation layer cannot be used. An alternative is AAL-5, which was initially proposed to carry data traffic over ATM networks. Its drawback is that it is too simple to provide a reliable connection for some multimedia applications. In this thesis, AAL-5 is selected, since it is currently the most commonly used adaptation layer in industry and can support variable bit-rate MPEG-2 traffic. Some modifications are proposed to improve its reliability for the transport of real-time MPEG-2 video data.

Different proposals have been made for selecting the type of service under which MPEG-2 is to be transported over ATM [2, 3, 4]. For constant bit-rate MPEG-2 streams, the CBR class of service is the natural choice. For the variable bit-rate case, three main approaches have been proposed. The first, VBR service with rate renegotiation, tries to maximize the multiplexing gain by capturing the VBR nature of MPEG-2 [2, 5]. According to this approach, the effective bandwidth of the source during a specific interval is used to allocate resources in the network. If enough resources are not available, the quality is degraded. The rate is renegotiated over the long run, and the way the renegotiation points are selected depends on the exact algorithm. The second approach, ABR service, usually uses feedback information to change the coding rate at the output of the MPEG-2 encoder to suit the available bandwidth [6, 3, 4]. In this approach, the service is considered best effort with some minimum guarantees. The last approach, UBR service, provides statistical service without any guarantees, like the one used in the Internet today. The overall quality relies entirely on the load of the network; thus, no QoS can be guaranteed at all. The last two approaches, ABR and UBR, are primarily designed for data traffic, which has bursty, unpredictable behavior. However, since these best effort services will be widely available in the future and are based on the excess bandwidth in the network with a lower usage cost, it is predicted that they will also support a non-negligible part of the multimedia traffic.

Therefore, in this thesis, we propose a quality of service control framework for the delivery of best effort video applications over UBR service. The aim of this framework is twofold: first, to minimize loss for critical video data with bounded end-to-end delay for the arriving cells; second, to reduce the bad throughput crossing the network.

In order to ensure acceptable end-to-end quality, each component along the transport path must be designed to provide the desired level of service. Optimizing only specific components in the path may not be adequate for ensuring the desired quality for the application. For example, designing a good forward error recovery scheme for the adaptation layer while using a poor cell discarding algorithm (e.g. random discarding) for the switch will not be sufficient to maintain the end-to-end performance of the video application at the receiver. Therefore, the adaptation layer, encapsulation scheme, scheduling discipline in the ATM switches and error recovery mechanisms at the receiver must all be designed to provide the desired level of quality at the receiver. Consequently, the proposed framework relies on four schemes: an intelligent video data partition and prioritization mechanism located at the source, a slice-based MPEG-2 packet encapsulation strategy, an AAL sublayer with forward error correction control capability, and finally, an efficient switch scheduling strategy with an adaptive discarding technique.

The rest of this thesis is organized as follows. First, we introduce some fundamental concepts of video coding, compression, and the MPEG-2 standard, followed by a discussion of the functionality of the system layer of MPEG-2. The current types of services in ATM networks and the ATM adaptation layer, together with the principal traffic and QoS control approaches for traditional packet-oriented applications over ATM, are presented in Chapter 3. In Chapter 4, issues of MPEG-2 video traffic over ATM best effort services are addressed, followed by a discussion of the insufficiency of some approaches that have been proposed in the literature. Chapter 5 is devoted to the description of the four components of the proposed best effort video delivery framework. In Chapter 6, we present the network model, the characteristics of the MPEG-2 trace file we use, and the investigated performance parameters. We discuss the experimental results in Chapter 7. Finally, Chapter 8 gives the conclusion and proposes areas for future research.

2 MPEG-2 STANDARD: A REVIEW

In this chapter we give an overview of digital video coding and compression, and of the MPEG-2 standard. We begin with a short history of MPEG standards and proceed to discuss the video standard, which includes principles of MPEG-2 coding and compression such as quantization and inter-frame prediction. We conclude with a description of the MPEG-2 systems standard, in which the main components of the transport stream are explained, including elementary streams, packetized elementary streams, and program specific information.

2.1 History

Two important standardization efforts related to digital video coding were started in the late 1980s. One is the ITU-T standard for video conferencing and video telephony, known as H.261. The other came under the name of MPEG (Moving Pictures Experts Group) from ISO/IEC, with the aim of defining a video coding algorithm for digital storage and transmission applications. In addition, audio coding was added and the scope of the targeted applications was extended to cover almost all applications, from multimedia systems to high definition television.

MPEG's first effort led to the MPEG-1 standard, which was published in 1993 as ISO/IEC 11172. It is divided into three parts: audio compression, video compression, and system level multiplexing for applications that need video and audio to be played back in close synchronization. MPEG-1 is used in a variety of applications. For example, CD-I and Video-CD technology use MPEG-1 as the compression algorithm for video and audio. It was designed to support video coding up to 1.5 Mbps with VHS quality and audio coding at 192 kbps/channel (stereo CD quality), and is optimized for non-interlaced video signals.

MPEG's second effort started in 1990. The main objective was to design a compression standard capable of different qualities depending on the bit-rate, from TV broadcast to studio quality, and to enable the transmission of video and audio in broadband networks. This work led to the MPEG-2 standard, which is based on MPEG-1 but is more sophisticated and optimized for interlaced pictures. Furthermore, it is designed to cope with lossy communication media. The MPEG-2 standard is capable of coding standard TV at about 4 to 9 Mbps, and HDTV at 15 to 25 Mbps. In the audio part of the standard, it supports multi-channel surround sound coding while being backward compatible with the MPEG-1 audio definition.

Since this thesis is mainly concerned with the transport of MPEG data in a network environment, we use only MPEG-2 data in our network simulation. The MPEG standard is discussed in Section 2.4 with an emphasis on MPEG-2.

2.2 Color Representation

To understand the process of video compression, it is a good idea to start with color representation.

Color is the perceptual result of light in the visible region of the spectrum, with wavelengths from 400 nm to 700 nm, incident upon the retina. Because there are exactly three types of color photo-receptors in the eye, three numerical components are necessary and sufficient to describe a color, provided that appropriate spectral weighting functions are used. This is the concern of the science of colorimetry. Usually, color is represented by the intensity distribution of the three primary colors, Red, Green and Blue (R,G,B), or equivalently by one luminance component (Y) and two chrominance components (U,V or Cr,Cb). The RGB color space is widely used in computer environments, while YUV and YCrCb find their place in television. The video compression technique used in MPEG-1 and MPEG-2 works with the latter.

It is believed that the human eye can detect as many as 4 million colors. Anything more than that is potentially wasted. RGB uses a specific number of bits for each color component of a picture element. These are usually referred to as a ratio such as 5:5:5 or 8:8:8, meaning that each component of the pixel (R, G, or B) has 5 or 8 bits of color respectively. 5:5:5 R,G,B is also referred to as 15 bit color and gives a total of 32,768 colors for every pixel on the screen. Common R,G,B color resolutions are 8 bits/pixel (256 colors), 16 bits/pixel (65,536 colors) and 24 bits/pixel (16.7 million colors). So, to deliver an R,G,B video image that meets the needs of the human eye, 24 bits/pixel graphics are required. This is often referred to as True-Color.

YUV and YCrCb define and display color in a slightly different way. Because the human eye can detect brightness (luminance) better than it can detect color (chrominance), the video community developed the Y,U,V component format to reduce the amount of data required to deliver full fidelity color while retaining maximum brightness. This is done by limiting the rate at which chrominance changes relative to the rate at which luminance changes, which allows the Y,U,V format to reduce the amount of color information per pixel while retaining the maximum brightness per pixel.

Format  Component  Pixel 1  Pixel 2              Pixel 3              Pixel 4
4:1:1   Y          8 bits   8 bits               8 bits               8 bits
4:1:1   U (Cr)     8 bits   Shared with pixel 1  Shared with pixel 1  Shared with pixel 1
4:1:1   V (Cb)     8 bits   Shared with pixel 1  Shared with pixel 1  Shared with pixel 1
4:2:2   Y          8 bits   8 bits               8 bits               8 bits
4:2:2   U (Cr)     8 bits   Shared with pixel 1  8 bits               Shared with pixel 3
4:2:2   V (Cb)     8 bits   Shared with pixel 1  8 bits               Shared with pixel 3
4:4:4   Y          8 bits   8 bits               8 bits               8 bits
4:4:4   U (Cr)     8 bits   8 bits               8 bits               8 bits
4:4:4   V (Cb)     8 bits   8 bits               8 bits               8 bits

Table 2.1: Comparison of YUV various ratio formats

Y,U,V video color resolutions are also referred to in a ratio format. Typical Y,U,V formats are 4:1:1 (4:2:0), 4:2:2, and 4:4:4. The numbers in the format refer to the amount of color data that is shared between groups of 2 or 4 pixels. Assuming 8 bits per sample, a 4:1:1 Y,U,V format means that within a 4 pixel group there are 8 bits of Y for each pixel but only 8 common bits each of U and V shared over the 4 pixels. The 4:2:0 format is a special case of 4:1:1, where the chrominance values are calculated and therefore represent a value that is offset from the luminance samples. A 4:2:2 Y,U,V format means that within a 4 pixel group there are 8 bits of Y for each pixel but only 8 common bits each of U and V shared over each 2 pixel pair. A 4:4:4 Y,U,V format means that each pixel within a 4 pixel group has 8 bits of Y, 8 bits of U, and 8 bits of V. Essentially, 4:4:4 is identical in color depth to 24 bit R,G,B. Table 2.1 shows the relative comparison of 4:1:1, 4:2:2 and 4:4:4 and how their data is sampled and spread across pixels.
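The sharing rules described above translate directly into an average number of bits per pixel. The following is a minimal sketch, assuming 8 bits per sample and a group of 4 horizontally adjacent pixels; the names `CHROMA_SAMPLES_PER_GROUP` and `bits_per_pixel` are illustrative and do not come from the thesis.

```python
# Average bits per pixel for the Y,U,V ratio formats discussed in the text.
# Assumptions (not from the thesis): 8 bits per sample, 4-pixel groups.

BITS_PER_SAMPLE = 8
GROUP_SIZE = 4  # pixels per group

# Number of U samples (and, equally, of V samples) stored per 4-pixel group.
CHROMA_SAMPLES_PER_GROUP = {"4:1:1": 1, "4:2:2": 2, "4:4:4": 4}


def bits_per_pixel(fmt: str) -> float:
    """Return the average number of bits stored per pixel for a ratio format."""
    luma_bits = GROUP_SIZE * BITS_PER_SAMPLE  # one Y sample per pixel
    chroma_bits = 2 * CHROMA_SAMPLES_PER_GROUP[fmt] * BITS_PER_SAMPLE  # U and V
    return (luma_bits + chroma_bits) / GROUP_SIZE


if __name__ == "__main__":
    for fmt in CHROMA_SAMPLES_PER_GROUP:
        print(fmt, bits_per_pixel(fmt))
```

Under these assumptions 4:4:4 averages 24 bits per pixel, which matches the statement that it is identical in color depth to 24 bit R,G,B, while 4:2:2 and 4:1:1 average 16 and 12 bits per pixel.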

2.3 Digital Video Format

2.3.1 CCIR-601 Recommendation


The video signal is obtained through a process known as scanning. Scanning can be either progressive or interlaced. Progressive scanning scans all the horizontal lines to form the complete frame and is used by the computer industry. Interlaced scanning is used by the TV industry. CCIR Recommendation 601 (now known as ITU-R BT.601) defines the digital video format that serves as a standard for the TV industry. Table 2.2 shows the scanning parameters defined in CCIR-601.

PARAMETER             CCIR-601 NTSC  CCIR-601 PAL/SECAM
Active Pixels/Line    720            720
Active Lines/Picture  485            576
Sampling Structure    4:2:2 (4:4:4)  4:2:2 (4:4:4)
Temporal Rate         60 fields/s    50 fields/s
Frame/Sec.            30             25
Aspect Ratio          4:3            4:3
Interlacing           2:1            2:1

Table 2.2: CCIR-601 video frame scanning parameters

2.3.2 Source Input Format (SIF) and Common Interchange Format (CIF)

However, the number of picture elements (pixels) in CCIR-601 is too high to be coded at reasonable bit-rates. For example, for a CCIR-601 NTSC video signal at a frame rate of 30 Hz and a resolution of 24 bits/pixel, the requirement is approximately 250 Mbps for the 4:2:2 sampling ratio. For a two-hour movie, the storage requirement translates to an astronomical 225 Gbytes! Thus, to accommodate slower communication channels and computer buses, the SIF (Source Input Format) has been defined as the MPEG encoder input signal (MPEG-1 accepts only SIF, whereas MPEG-2 accepts both SIF and CCIR-601).
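The figures quoted above follow from simple arithmetic. The sketch below reproduces them in Python using the 24 bits/pixel quoted in the text. The SIF line is added for comparison and assumes an average of 12 bits/pixel for 4:2:0 sampling; the function name is illustrative.

```python
def raw_bit_rate(pixels_per_line, lines, frames_per_s, bits_per_pixel):
    """Uncompressed video bit-rate in bits per second."""
    return pixels_per_line * lines * frames_per_s * bits_per_pixel


# CCIR-601 NTSC with the 24 bits/pixel figure quoted in the text
ccir_rate = raw_bit_rate(720, 485, 30, 24)
# Storage for a two-hour movie, in bytes
ccir_storage = ccir_rate * 2 * 3600 / 8

# SIF NTSC, assuming an average of 12 bits/pixel (4:2:0 sampling)
sif_rate = raw_bit_rate(352, 240, 30, 12)

print(f"CCIR-601: {ccir_rate / 1e6:.1f} Mbit/s, {ccir_storage / 1e9:.1f} Gbytes")
print(f"SIF:      {sif_rate / 1e6:.1f} Mbit/s")
```

This gives about 251 Mbit/s and 226 Gbytes for CCIR-601, in line with the approximate values above, against about 30 Mbit/s for SIF.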

PARAMETER               SIF NTSC       SIF PAL/SECAM    CIF
Active Pixels/Line      352            352              352
Active Lines/Picture    240            288              288
Sampling Structure      4:2:0          4:2:0            4:2:0
Temporal Rate           60 fields/s    50 fields/s      60
Frames/Sec.             30             25               30
Aspect Ratio            4:3            4:3              4:3

Table 2.3: SIF and CIF video frame scanning parameters

Briefly, the CCIR 601 to SIF conversion is done by three distinct operations: reducing the picture size by a factor of two, removing the interlacing (e.g. removing odd fields from CCIR 601), and a 2-to-1 decimation of the chrominance information, since it is less important to the human visual system.

CIF (Common Interchange Format) was developed in the ITU-T H.261 recommendation in order to have a common format to which PAL- and NTSC-based frames could be converted. The parameters of SIF and CIF are both summarized in Table 2.3.

2.4 MPEG Coding

To understand the coding and compression process, it is important to recognize the different redundancies present in the video signal data: spatial, temporal, psycho-visual and coding. Spatial redundancy occurs because neighboring pixels in each individual frame of a video signal are related; in other words, they have some degree of correlation. The pixels in consecutive frames of a signal are also correlated, leading to substantial temporal redundancy. In addition, the human visual system does not treat all the visual information with equal sensitivity. This leads to psycho-visual redundancy. For example, as we mentioned in Section 2.3, the eye perceives changes to a greater extent in the luminance than in the chrominance. The eye is also less sensitive to high frequencies. Finally, not all parameters occur with the same probability in an image. As a result, they do not require an equal number of bits to code them, leading to coding redundancy. For any compression algorithm to be effective, it must exploit these redundancies.

2.4.1 Coding Principles

The basic MPEG algorithm consists of the following stages: a motion compensation stage, a transformation stage, a lossy quantization stage and a final lossless coding stage. The motion compensation stage takes the difference between the current image and a shifted view of the previous one. The transformation stage then tries to concentrate the information energy into the first transform coefficients; MPEG uses the Discrete Cosine Transform (DCT) for this purpose. The quantization step that follows causes a loss of information that takes into account the psycho-visual limitations of the human eye, and the last coding stage is nothing more than an entropy coding process that further compresses the data.

MPEG defines a number of basic objects that are used to structure video information (see Figure 2.1). Before discussing the coding algorithm, let us introduce the basic units that are used in the MPEG algorithm:

Block: A block is the smallest coding unit in the MPEG algorithm. It is made up of 8 x 8 pixels and can be one of three types: luminance (Y), red chrominance (Cr) and blue chrominance (Cb), as described in Section 2.2. The block is the basic unit in intra-frame DCT coded frames.

Macro-block: A macro-block is the basic coding unit in the MPEG algorithm. It consists of a 16 x 16 pixel segment. Since MPEG's video main profile uses the 4:2:0 chroma format, a macro-block consists of four Y, one Cr and one Cb block. It is the motion-compensation unit.

Slice: A slice is a horizontal strip within a frame and it is the main processing unit in MPEG. Coding of blocks and macro-blocks is feasible only when all the pixels of a slice are available. In addition, a slice is coded independently of its adjacent slices, making it an autonomous unit. Thus, slices serve as resynchronization units.

Figure 2.1: Basic objects defined in MPEG-2. (Labels: video sequence, group of pictures, slice, macroblock, block.)

Picture: A picture in MPEG is a single frame in a video sequence.

Group-of-pictures: The group-of-pictures (GOP) is simply a small sequence of pictures in which random access is provided. Typical values are 12 and 15 pictures/group. The GOP concept was mandatory in the MPEG-1 standard, whereas it is optional in the MPEG-2 standard.

Sequence: The sequence consists of a series of pictures (or a series of GOPs if these are used).
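To give a feeling for the sizes of these units, the sketch below counts the macro-blocks and blocks in one picture. It is only an illustration: the function name is hypothetical, the picture size is that of SIF NTSC from Table 2.3, and six blocks per macro-block corresponds to the 4:2:0 format described above.

```python
MACROBLOCK_SIZE = 16         # luminance pixels per side of a macro-block
BLOCKS_PER_MACROBLOCK = 6    # four Y + one Cr + one Cb block (4:2:0)


def macroblock_grid(width, height):
    """Number of macro-block columns and rows needed to cover a picture."""
    # Ceiling division: a partially covered macro-block at the picture
    # edge still has to be coded as a whole macro-block.
    cols = -(-width // MACROBLOCK_SIZE)
    rows = -(-height // MACROBLOCK_SIZE)
    return cols, rows


cols, rows = macroblock_grid(352, 240)   # SIF NTSC picture
macroblocks = cols * rows
blocks = macroblocks * BLOCKS_PER_MACROBLOCK
print(cols, rows, macroblocks, blocks)
```

For a 352 x 240 picture this yields a grid of 22 x 15 macro-blocks, i.e. 330 macro-blocks and 1980 blocks of 8 x 8 samples.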

2.4.2 Discrete Cosine Transform Coding

Usually, the video energy of an image has low spatial frequency that varies very slowly, and so a transformation from the space domain to the frequency domain can concentrate the energy in very few coefficients. For this transformation, the actual image is divided into blocks to decrease the complexity. Every block (8 x 8) is transformed according to a two-dimensional Discrete Cosine Transform, which can be thought of as a one-dimensional DCT on the columns and a one-dimensional DCT on the rows. Each coefficient is associated with a specific function of horizontal and vertical frequencies, and its value (after the transformation) indicates the contribution of these frequencies in the image block. An explicit formula for the 8 x 8 two-dimensional DCT can be written in terms of the pixel values f(i,j) and the frequency domain transform coefficients F(u, v):

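\[
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{i=0}^{7} \sum_{j=0}^{7} f(i,j) \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}
\]

where

\[
C(x) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } x = 0 \\ 1 & \text{otherwise} \end{cases}
\]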

However, this transformation itself does not reduce the number of bits required for the block representation. The reduction follows from the observation that the distribution of the coefficients F(u, v) is non-uniform. The transformation concentrates as much of the video energy as possible into the low frequencies, leading to many coefficients being zero or almost zero. The compression is achieved by skipping all those near-zero coefficients (quantizing) and variable-length coding the remaining ones, as described in the following.
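The energy compaction property can be checked numerically with a direct evaluation of the 8 x 8 DCT formula. The following Python sketch is written for clarity rather than speed (practical encoders use fast separable algorithms); the function names are illustrative.

```python
import math

N = 8  # block size


def c(x):
    """Normalization factor C(x) of the DCT formula."""
    return 1 / math.sqrt(2) if x == 0 else 1.0


def dct_8x8(f):
    """Direct evaluation of the 8 x 8 two-dimensional DCT of block f."""
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (f[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / 16)
                              * math.cos((2 * j + 1) * v * math.pi / 16))
            F[u][v] = 0.25 * c(u) * c(v) * total
    return F


# A perfectly flat block: all the energy ends up in the DC coefficient.
flat = [[100] * N for _ in range(N)]
F = dct_8x8(flat)
print(round(F[0][0], 6))
```

For a block in which every pixel equals 100, F(0, 0) is 800 (eight times the pixel value) and all other coefficients are zero up to rounding error, which is the extreme case of the concentration described above.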


2.4.3 Quantization

The quantization stage comes after the DCT transformation stage. The idea here is to transmit the DCT coefficients in a way that minimizes the bit-rate. To achieve this, we can reduce the precision of the DCT coefficients (i.e. reduce the number of bits required). As an example, we could skip a few least significant bits of these coefficients and transmit only the rest. This is based on observations showing that the numerical precision of the DCT coefficients may be sacrificed without affecting image quality significantly. Moreover, this stage takes into consideration the impact of the transform on human vision. In practice, high-frequency coefficients are more coarsely quantized than low-frequency ones, because the human eye is less sensitive to the former.

MPEG is a lossy compression scheme because of this stage: the coded data loses some precision, so the reconstructed picture is not identical to the original. However, without this loss the compression ratio would be very low (compared to the 100:1 that is typical in MPEG), since the least significant bits of each color component become progressively more random and thus harder to code.
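The effect of frequency-dependent quantization can be illustrated with a small Python sketch. The step-size rule used here (a step that grows linearly with u + v) is a hypothetical choice made only for illustration; it is not the quantization matrix of the MPEG standard.

```python
def step_size(u, v):
    """Hypothetical quantizer step: coarser for higher frequencies."""
    return 8 + 4 * (u + v)


def quantize(F):
    """Divide each DCT coefficient by its step and round to an integer."""
    return [[round(F[u][v] / step_size(u, v)) for v in range(8)]
            for u in range(8)]


def dequantize(Q):
    """Decoder side: multiply the transmitted levels by the step again."""
    return [[Q[u][v] * step_size(u, v) for v in range(8)]
            for u in range(8)]


# A block with a large DC value and two small AC coefficients.
F = [[0.0] * 8 for _ in range(8)]
F[0][0] = 800.0   # DC coefficient
F[0][1] = 31.0    # low-frequency AC coefficient
F[7][7] = 30.0    # highest-frequency AC coefficient

Q = quantize(F)
R = dequantize(Q)
print(Q[0][0], Q[0][1], Q[7][7])
print(R[0][0], R[0][1], R[7][7])
```

With this rule the low-frequency coefficient survives with a small error (31 is reconstructed as 36), whereas the high-frequency coefficient of similar magnitude is quantized to zero. This is the loss of information described above.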

2.4.4 Entropy Coding

The final compression stage starts with the serialization of the quantized DCT coefficients and attempts to exploit any redundancy left. The way the serialization is done affects the final compression. The DCT coefficients are rearranged in a zig-zag manner, as shown in Figure 2.2. The scanning starts from the coefficient with the lowest frequency (the DC coefficient) and follows the zig-zag pattern until it reaches the last coefficient. In MPEG-2 there is an alternate scan pattern that is more efficient with interlaced video signals. The sequence of coefficients is then entropy-coded using a variable length code (VLC). The VLC allocates code lengths according to the probability with which the symbols are expected to occur; these codes could be obtained by using the Huffman algorithm.

Figure 2.2: DCT coefficients in a coding block are scanned in zig-zag order (regular and alternate).
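The regular zig-zag order can be generated by walking the anti-diagonals of the block and alternating the direction on each one. The sketch below is an illustration in Python. It assumes the conventional regular scan whose first step goes from position (0, 0) to (0, 1), with positions written as (row, column), and it does not cover the alternate scan.

```python
def zigzag_order(n=8):
    """List of (row, column) positions in regular zig-zag order."""
    order = []
    for s in range(2 * n - 1):          # s = row + column of a diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()          # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Serialize a square block of coefficients in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Quantized block with only a few non-zero low-frequency coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 100, 3, -2

coeffs = zigzag_scan(block)
print(zigzag_order()[:6])
print(coeffs[:4], "followed by", coeffs[4:].count(0), "zeros")
```

Because the non-zero coefficients cluster at the low frequencies, the serialized sequence starts with the significant values and ends in a long run of zeros, which is what makes the subsequent variable length coding effective.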

2.4.5 Motion-Compensated Inter-Frame Prediction

Inter-frame prediction is used in order to exploit the temporal redundancies found in the video sequence. The idea is to check the displacement of the various macro-blocks and to encode the best resulting difference (see Figure 2.3). The MPEG syntax specifies how to represent the motion information: one or two motion vectors per 16 x 16 macro-block of the picture, depending on the type of motion compensation (forward or backward-predicted, see next section). However, the method used in computing the motion vectors is not specified in the standard. This can be done either exhaustively or using different techniques depending on many parameters. For example, in stationary scenes the predictor may use the same block from the reference frame. If the scene is not stationary, then one way to compute motion vectors is to find the difference between the current block and a block that is shifted appropriately in the reference frame. "Block-matching" techniques are likely to be used for this purpose [7]. The actual way of computing the motion vectors is left to the implementers. The whole encoding process is shown in Figure 2.4 in the block diagram of a hypothetical MPEG encoder.

Figure 2.3: Motion compensation and motion estimation. (Labels: motion vector (Mh, Mv), co-located macroblocks.)
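Since the standard leaves motion estimation to the implementer, the following Python sketch shows only one possible approach: an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). The function names, the search range and the SAD criterion are assumptions of this example. The vector returned points from the position of the current block to the best matching block in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size=16, search=7):
    """Exhaustive search over all displacements within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic 16 x 16 reference frame and a current frame whose content
# is the reference shifted by 2 pixels horizontally and 1 vertically.
ref = [[(7 * x + 13 * y) % 256 for x in range(16)] for y in range(16)]
cur = [[ref[(y + 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]

mv = best_motion_vector(cur, ref, bx=4, by=4, size=4, search=3)
print(mv)
```

In this synthetic example the search recovers the displacement (2, 1) with a SAD of zero. An encoder would then code the motion vector and the (here empty) prediction error instead of the block itself.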

2.4.6 Picture Types in MPEG

In MPEG (both MPEG-1 and MPEG-2) there are three types of pictures that are defined [8]:

Intra-frames or I-frames: These are pictures that are coded autonomously, without the need of a reference to another picture. Temporal redundancy is not taken into account. Moderate compression is achieved by exploiting the other three redundancies. An I-frame is always an access point in the video sequence.

Figure 2.4: MPEG encoder. (DCT: Discrete Cosine Transform; Q: Quantization; IQ: Inverse Quantization; IDCT: Inverse DCT; MCP: Motion-Compensated Prediction; VLC: Variable Length Coder.)

Predictive or P-frames: These frames are coded with respect to a previous I- or P-frame using a motion-compensated prediction mechanism. The coding process here exploits all kinds of redundancies.

Bidirectionally-predicted or B-frames: The B-frames use both previous and future I- or P-frames as a reference for motion estimation and compensation. This achieves the highest compression ratios. However, because they reference both past and future frames, the coder has to reorder the pictures that are involved in this process so that each B-frame is produced after all the frames it references. This introduces a reordering delay which depends on the interval between consecutive B-frames.

A typical MPEG video sequence is shown in Figure 2.5. The I-frame is coded first, then the next P-frame and then the interpolated B-frames between the two. The process repeats with the next P-frame and B-frames.
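The reordering described above can be sketched as follows. The picture labels and the function name are illustrative; the rule implemented is simply that each run of B-frames is emitted after the I- or P-frame that follows it in display order.

```python
def coding_order(display_order):
    """Reorder pictures from display order to coding order.

    Each picture is a label whose first character is its type
    ('I', 'P' or 'B'). B-frames are held back until the next I- or
    P-frame, which they reference, has been emitted.
    """
    out = []
    pending_b = []
    for pic in display_order:
        if pic[0] == "B":
            pending_b.append(pic)
        else:
            out.append(pic)          # reference picture goes first
            out.extend(pending_b)    # then the B-frames that needed it
            pending_b = []
    out.extend(pending_b)            # trailing B-frames without a later reference
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
```

For the display order I0 B1 B2 P3 B4 B5 P6 the coding order becomes I0 P3 B1 B2 P6 B4 B5, so the encoder has to buffer the B-frames until the following reference picture is available, which is the source of the reordering delay.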

2.5 MPEG-2 Video Standard

The MPEG-2 standard is similar to MPEG-1 but has extensions to cover a wider range of applications. The primary application targeted during the MPEG-2 definition process was the all-digital transmission of broadcast-quality video at coded bit-rates between 4 and 9 Mbits/sec. However, the MPEG-2 standard also proved to be efficient for other applications that need higher rates, such as HDTV.

Figure 2.5: Example of inter-dependence among various picture types in an MPEG video sequence. (Labels: I-frame, P-frame, B-frame; horizontal axis: time.)

2.5.1 Differences between MPEG-2 and MPEG-1

Some of the important differences between the MPEG-2 and MPEG-1 standards are summarized below:

1. MPEG-2 is optimized for interlaced pictures and can also represent progressive video sequences, whereas MPEG-1's syntax is strictly meant for progressive sequences and was optimized for CD-ROM or applications at about 1.5 Mbit/sec.

2. The second main improvement of MPEG-2 in comparison to MPEG-1 is the possibility to efficiently transmit video over networks rather than only from a local CD-ROM player.

3. MPEG-2 has more profiles and levels and supports scalable profiles. This is an important feature that could be taken advantage of in the network environment, as described later in this section.

4. Additional prediction modes for motion compensation were introduced, as well as more chroma formats.

5. Several other more subtle enhancements (e.g. adaptive quantization, 10-bit DCT DC precision, non-linear quantization, VLC tables, improved mismatch control) were introduced that improved the coding efficiency even for progressive pictures.

2.5.2 Scalability and Data Partition

As listed above, an important difference between MPEG-2 and MPEG-1 is that MPEG-2 can achieve scalability by using its syntax structure. Four scalable compression modes are defined in the MPEG-2 toolkit [7]. These coding techniques subdivide MPEG-2 video into several layers (base, middle, and high layers), mostly for prioritized transmissions [8]. At the destination, the lower-priority bitstreams, referred to as enhancement layers, can be added to the base layer to display a higher quality picture. A brief summary of these different modes is presented below.

• Spatial Scalability: this mode codes a base layer at lower sampling dimensions (i.e. resolution) than the upper layers. This is useful in simulcasting, where a standard TV set needs only to decode the CCIR-601 720 x 480 base channel and can ignore the higher-resolution HDTV 1440 x 960 data.

• Temporal Scalability: the higher priority bitstream codes video at a lower frame rate (e.g. 15 Hz), and the intermediate frames are coded in a second bitstream to achieve the full frame rate (e.g. 30 Hz).

• SNR Scalability: the layers are coded with differing picture quality by using different quantization step sizes.

• Data Partitioning: this is a frequency domain method that breaks the block of 64 quantized DCT coefficients into two bitstreams. The first, higher priority bitstream contains the lowest frequency coefficients and side information (such as motion vectors, macroblock headers, etc.). The second, lower priority bitstream carries the remaining higher frequency AC coefficients.

One application of the scalable syntax concept might be the following: one layer contains the video information for a standard (PAL or NTSC resolution) TV program. This layer, called the "base layer" in MPEG-2, could then be combined with another information stream, the "enhancement layer", which contains additional video information to obtain HDTV quality video. This is very useful when data are transmitted over a resource-limited network environment.

A similar idea is used for data partitioning. In this case, the most important syntax elements could be transmitted with a higher priority and the less important elements with a lower priority. Then, when transmitted in a best-effort service class such as UBR, the elements with lower priority would be discarded first if network congestion happens. This could preserve the critical data of the application and thus ensure graceful end-to-end degradation.
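The principle of data partitioning can be sketched in a few lines of Python. This is a simplified illustration that operates only on the 64 zig-zag ordered coefficients of one block. The split position (named `split_at` here) and the function names are assumptions of this example, and the side information that also belongs to the high priority partition is left out.

```python
def partition_coefficients(zigzag_coeffs, split_at):
    """Split zig-zag ordered coefficients into a high priority part
    (lowest frequencies) and a low priority part (remaining AC terms)."""
    if not 1 <= split_at <= len(zigzag_coeffs):
        raise ValueError("split_at must lie within the block")
    return zigzag_coeffs[:split_at], zigzag_coeffs[split_at:]


def reconstruct(high_priority, low_priority=None, total=64):
    """Rebuild the coefficient list at the receiver. If the low priority
    partition was discarded by the network, its coefficients are
    replaced by zeros."""
    if low_priority is None:
        low_priority = [0] * (total - len(high_priority))
    return list(high_priority) + list(low_priority)


coeffs = [100, 3, -2, 1, 0, 0, 1] + [0] * 57   # one quantized block
high, low = partition_coefficients(coeffs, split_at=4)

complete = reconstruct(high, low)    # both partitions arrived
degraded = reconstruct(high)         # low priority partition lost
print(degraded[:8])
```

When the low priority partition is lost, only the finer detail carried by the higher frequency coefficients is missing, so the block can still be decoded at reduced quality.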

2.5.3 MPEG-2 Systems Layer

The MPEG-2 standard also defines a systems layer specification that describes how more than one stream (video or audio) should be multiplexed together to form an actual program. A program is considered a single broadcast entity service. For example, "The 11 o'clock CTV news" is considered a program that has individual streams of video, audio and maybe other data such as caption text. The standard defines the way the different streams are multiplexed. Figure 2.6 shows the scope of the MPEG-2 Systems part in relation to the video and audio part and the network equipment.

Figure 2.6: Scope of MPEG-2 systems specifications. (Labels: video data, video encoder; audio data, audio encoder; network equipment; scope of MPEG-2 Systems.)

Two schemes are used in the MPEG-2 standard for the multiplexing process.

• Program Stream: This is analogous to the MPEG-1 Systems layer. It is a grouping of video, audio and data elementary streams that have a common time base and are grouped together for delivery in a specific environment. Each program stream consists of only one program. The program stream is often called Program Stream Multiplex.

• Transport Stream: The transport stream combines one or more programs into a single stream. The programs may or may not have a common time base. This type of multiplexing is used in environments where errors are likely and is the default choice for transport over a computer network. The transport stream is often called Transport Stream Multiplex.

The program stream is mainly intended for CD-ROM and hard-disc media; thus, it uses long data structures to transport video and audio data. This can only be done in a "low-error environment", since a loss of any of these structures could result in serious problems with the quality of the video information transferred. The transport stream is used in the networked environment; it uses fixed-length, relatively short data structures that can be processed well in a network environment. Since this thesis concerns MPEG-2 video in ATM networks, we will only focus on the transport stream.

The transport stream layer deals with some special entities. The whole process starts from the uncompressed data, which comes directly from the actual video sequence; each uncompressed frame is called a "presentation unit". The encoder compresses each frame according to the standard, and each compressed frame is then called an "access unit". The stream produced by the access units is called an "elementary stream" in MPEG terminology. This process is shown in Figure 2.7.

After the creation of the elementary stream, the next step is its packetization. The resulting stream is now called the "packetized elementary stream" and the packets are called "PES packets". The way the PES packets are formed is independent of the actual multiplexing procedure.

A PES packet consists of a header and a payload. The payload is nothing more than data bytes taken sequentially from the original elementary stream. There is no specific format for encapsulating data bytes in a PES packet, i.e. there is no requirement to align the start of the access units and the start of the PES packets. This means that an access unit may start at any point within a PES packet, as shown in Figure 2.7. In addition, more than one access unit may be present in one PES packet. The way this packetization is done, however, can significantly affect the nature of the actual packetized stream. For example, if each PES packet contains exactly one video frame (in the case of a video elementary stream), the decoder can determine the start and end of a frame easily. Similarly, network transport and control policies can take advantage of this structure to offer a guaranteed packet-oriented service. This, however, requires the use of variable-size packets and increases the complexity of processing in the encoder. On the other hand, if the PES packets are of fixed length, then the packetization process at the encoder is simpler.

Figure 2.7: PES encapsulation using fixed length packet. (Labels: presentation units, uncompressed video stream; access units, elementary stream; PES packets, packetized elementary stream (PES); TS packets with fixed-length payload (184 bytes), adaptation field (used for stuffing), transport stream (TS).)

As shown in Figure 2.7, the transport stream consists of short, fixed-length packets. A transport packet has a length of 188 bytes. It comprises a 4-byte header followed by an "adaptation field" or a payload or both. The PES packets from the various elementary streams are each divided among the payload parts of a number of transport packets. However, there are two constraints:

1. The first byte of a PES packet must be the first byte of a transport packet payload.

2. Each transport packet must contain data from only one PES packet, i.e. a single transport packet refers to a specific PES stream and thus to a specific elementary stream.

Because of the two constraints stated above, a transport packet may not be completely full, since it is unlikely that a PES packet will fit exactly into an integer number of transport packets. The stuffing bytes that are needed to fill the packet are placed in the adaptation field (see Figure 2.7). The amount of this stuffing can be minimized by careful selection of the PES packet length. Usually, long PES packets are better in terms of bandwidth efficiency, but are more prone to synchronization problems.
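The relation between the PES packet length, the number of transport packets and the amount of stuffing can be made concrete with a short calculation. The Python sketch below is a simplification: it assumes that every transport packet offers the full 184 bytes after the 4-byte header, and that the adaptation field is used only for stuffing (other optional adaptation field content is ignored). The function names are illustrative.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE   # 184 bytes


def ts_packets_for_pes(pes_length):
    """Number of transport packets needed for one PES packet and the
    number of stuffing bytes required in the last transport packet."""
    packets = -(-pes_length // TS_PAYLOAD_SIZE)      # ceiling division
    stuffing = packets * TS_PAYLOAD_SIZE - pes_length
    return packets, stuffing


def efficiency(pes_length):
    """Fraction of the transmitted transport stream bytes that carry
    PES data."""
    packets, _ = ts_packets_for_pes(pes_length)
    return pes_length / (packets * TS_PACKET_SIZE)


for length in (1000, 1840, 18400):
    packets, stuffing = ts_packets_for_pes(length)
    print(length, packets, stuffing, round(efficiency(length), 4))
```

A 1000-byte PES packet needs 6 transport packets and 104 stuffing bytes, whereas a PES packet whose length is a multiple of 184 bytes needs no stuffing at all, which illustrates why the PES packet length should be selected carefully.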

3 SERVICE CLASSIFICATION AND ADAPTATION LAYER OF ATM

Asynchronous Transfer Mode (ATM) is a cell-based switching and multiplexing technology designed to be a general-purpose, connection-oriented transfer mode for a wide range of services. It is used in both WAN and LAN environments, and in public and private networks, as specified by the ATM Forum.

The primary unit in ATM is the cell. The ATM standard defines the cell with a fixed length of 53 octets (bytes), comprising a 5-octet header and a 48-octet payload. The fixed cell size simplifies the switching and multiplexing process and enables implementations of these at very high speeds. The fixed cell size also eliminates the problem of short packets being delayed behind larger ones. This allows ATM to provide good service to applications such as voice and video, where large transmission time variation is unacceptable.
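The price of the fixed cell format is a constant header overhead, which the following Python sketch quantifies. The link rate used in the example (155.52 Mbit/s) is a hypothetical value chosen for illustration and is not taken from the text; the function names are likewise illustrative.

```python
CELL_SIZE = 53                              # octets
HEADER_SIZE = 5                             # octets
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE      # 48 octets


def header_overhead():
    """Fraction of every cell that is taken up by the header."""
    return HEADER_SIZE / CELL_SIZE


def payload_rate(cell_rate_bps):
    """Bit-rate left for cell payloads when cells are sent at the
    given bit-rate."""
    return cell_rate_bps * PAYLOAD_SIZE / CELL_SIZE


def cell_time(cell_rate_bps):
    """Time in seconds needed to transmit one cell."""
    return CELL_SIZE * 8 / cell_rate_bps


LINK_RATE = 155.52e6   # hypothetical cell stream rate in bit/s
print(f"overhead:     {header_overhead():.2%}")
print(f"payload rate: {payload_rate(LINK_RATE) / 1e6:.2f} Mbit/s")
print(f"cell time:    {cell_time(LINK_RATE) * 1e6:.2f} microseconds")
```

The header costs about 9.4% of the cell stream regardless of the link rate. Because every cell has the same short transmission time, a cell is never delayed behind a long packet.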

The vision of ATM is that an entire network can be constructed using ATM Adaptation Layers (AALs), together with switching and multiplexing principles, to support a wide range of services. In this manner, ATM provides multiple QoS classes for differing application requirements on delay and loss performance.


3.1 Classification of Services in ATM Networks

As mentioned above, a large number of services can be provided in ATM networks. These include low-speed services such as telemetry, tele-control, tele-alarm, voice and tele-fax, medium-speed ones like Hi-Fi sound and video telephony, and high-speed ones like high-quality video distribution. In addition, the conventional "best-effort" applications will also be included, giving a large variety of services provided.

These different services are based on a variety of desired communication attributes (see Table 3.1), such as cell loss rate (CLR), cell transfer delay (CTD), cell delay variation (CDV), sustainable cell rate (SCR), peak cell rate (PCR), minimum cell rate (MCR), and whether or not flow control is applied. By taking different combinations of these attributes, we have four basic service classes:

• Constant Bit-Rate (CBR): Used for emulating circuit switching, where the required bandwidth is constant and known in advance (e.g. voice and television). This service provides guarantees on both delay and delay variation.

QoS Attribute    CBR    VBR (real-time | non-real-time)    ABR    UBR
CLR              Specified    Specified    Unspecified
CTD              Specified    Unspecified
CDV              Specified | Unspecified    Unspecified
SCR              N.A.    Specified    N.A.
PCR              Specified    Specified
MCR              N.A.    Specified    N.A.
Controlled       No    Yes    No

Table 3.1: ATM Layer Service Categories

• Variable Bit-Rate (VBR): Allows users to send at a variable rate that can be characterized in advance (e.g. video conferencing). The traffic is described in terms of the PCR, SCR, and MCR. VBR has two sub-categories: real-time (VBR-rt) and non-real-time (VBR-nrt) [9]. The former needs specific quality-of-service guarantees from the network, since it carries traffic with a fixed timing relationship between samples. VBR-nrt is intended to carry variable bit-rate traffic in which there is no timing relationship between samples, but a guarantee of end-to-end delay is still needed.

• Available Bit-Rate (ABR): Designed for classical data traffic that cannot be predicted (or is hard to predict) in advance and is not time sensitive. It offers a guaranteed minimum rate and uses a rate-based feedback approach to control congestion in the network.

• Unspecified Bit-Rate (UBR): Designed for those data applications that want to use any leftover bandwidth and are not sensitive to cell loss or delay. This service does not offer any service guarantees and thus has the lowest priority among all the classes.

3.1.1 How These Services Work Together

First, a given amount of bandwidth is guaranteed to CBR and VBR connections. Although the entire guaranteed bandwidth is not always used by the connection, the connection has access to all of its reserved bandwidth, if necessary (Figure 3.1). ABR then helps fill in this otherwise wasted bandwidth with regular data traffic. UBR was created to do approximately the same thing, except that it has no MCR and is more of a send-and-pray protocol. To deal with data loss and errors in networks, one way is for the receiver to time out after a given period of time and then to ask for a re-send of a missing cell. However, in a delay-sensitive flow (like MPEG-2 video data), retransmission causes a long delay, which is not acceptable. The alternative is to use a forward error correction mechanism to recover from the error, which avoids the extra delay caused by retransmission.

Figure 3.1: Service bandwidth allocation. (Labels: available for ABR (UBR) services, allocated to CBR services, used by CBR services; horizontal axis: time.)

3.2 ATM Adaptation Layer (AAL)

Another important concept is the ATM Adaptation Layer (AAL). The function of this layer is to provide generalized inter-working across the ATM network. Generally speaking, it is divided into two sub-layers: the Segmentation and Re-assembly (SAR) sublayer and the Convergence Sublayer (CS) (see Figure 3.3).

The SAR is responsible for the segmentation of the outgoing Protocol Data Units (PDUs) into ATM cells and the re-assembly of ATM cells back into PDUs. In the case of data, for example, the AAL takes frames (blocks) of data delivered to it, breaks them up into cells and adds the necessary header information to allow rebuilding of the original frame at the receiver.

The function of the convergence sublayer covers the generation and recovery of timing information; e.g. it can compensate for the effects of cell delay variation, it takes care of cell misinsertion, cell loss and cell mis-sequencing, and it also flags possible error conditions to the upper layer.

The AAL is designed to cope with the different requirements of various kinds of traffic. The ITU-T has defined four generic classes of network traffic that need to be treated differently by an ATM network. These classes are designated from class A to class D with regard to the following operations [10]:

• Timing between sender and receiver (present or not present)

• Bit rate (variable or constant)

• Connectionless or connection-oriented sessions between sender and receiver.

These four traffic classes are summarized in Table 3.2.

                   Class A                Class B                Class C                Class D
Timing             Synchronous            Synchronous            Asynchronous           Asynchronous
Bit Transfer       Constant               Variable               Variable               Variable
Connection Mode    Connection-Oriented    Connection-Oriented    Connection-Oriented    Connectionless

Table 3.2: Support Operations for AAL Classes

Originally, there were four different AAL types proposed, one for each traffic class. This changed during the standards definition process as the problem came to be better understood. The current associations of AAL types with traffic classes are summarized in Figure 3.2.

Figure 3.2: Traffic classes and AAL types. (Labels: Class A, Class B, Class C, Class D; AAL-1, AAL-2, AAL-3/4, AAL-5; ATM Adaptation Layer, ATM Networking Layer, Physical Layer.)

As we can see, there are now four AAL types:

• AAL-1 provides the function for class A.

• AAL-2 provides the required function for variable-rate service class B. As yet there are no defined standards in this area.

• AAL-3/4 provides service for both class C and D. AAL 3 and 4 were combined during the standards definition process as it was realized that the same process could perform both functions. This type is quite complex and regarded by some as over-designed.

• AAL-5 also provides functions for both class C and D, but is significantly simpler (it is also less functional, however).

The internal logical structure of the four AAL types is shown in Figure 3.3. As shown, the convergence sublayer has been divided for the type 3/4 and 5 traffic. The two sublayers are the service specific convergence sublayer (SSCS) and the common part convergence sublayer (CPCS). As their names imply, the SSCS is designed to support specific aspects of a data application, and the CPCS supports generic functions common to more than one type of data application.

Figure 3.3: Structure of the convergence sublayer. (Type 1 and Type 2: CS over SAR; Type 3/4 and Type 5: SSCS and CPCS over SAR; all over ATM. SSCS: Service Specific Convergence Sublayer; CPCS: Common Part Convergence Sublayer.)

Since AAL-5 is currently the most commonly used adaptation layer in the industry and, for our purposes, can support VBR MPEG-2 traffic (AAL-1 can be used only with CBR traffic), we discuss it in detail as follows.

AAL-5 was originally designed for transporting data traffic with no real-time constraints over ATM. However, it is now also used for transferring multimedia data because of its simplicity and efficiency. The CPCS of AAL-5 can make use of variable length protocol data units (PDUs) from 1 to 65,535 bytes. The SSCS provides the flexibility of having a special sublayer for different services that need to use AAL-5. The SSCS may also be null, in which case it does not perform any specific task.

3.2.1 Common Part Convergence Sublayer (CPCS) of AAL-5

The CPCS-PDU format for AAL-5 is shown in Figure 3.4. The meanings of the fields in the CPCS trailer are briefly described as follows:

Field           Payload        Pad     CPCS-UU    CPI    Length    CRC-32
Size (bytes)    max 65,535     0-47    1          1      2         4

Figure 3.4: AAL-5 CPCS-PDU Header.

CPCS-PDU Payload: In the absence of an SSCS, this is just the data passed to the AAL over the service interface (the AAL service data unit, SDU). If an SSCS is present, it may perform other functions such as blocking or re-blocking, or even transmit protocol data messages of its own.

Pad: The CPCS pads out the data frame so that the total length, including the CPCS trailer, is a multiple of 48 bytes. This is so that the SAR does not have to do any padding of its own.

CPCS User to User Indication (CPCS-UU): This is used to pass information from one CPCS to its communicating partner.

Data Length: This field is very important because it tells us how much of the received data is CPCS-SDU and how much is pad. It is also a check on the loss (or gain) of cells during transit.


Cyclic Redundancy Check (CRC): This field provides a validity check on the whole CPCS-PDU.
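The field descriptions above can be illustrated with a minimal Python sketch that pads an SDU, appends the 8-byte trailer, and checks it again on reception. The function names are hypothetical, and the CRC bit ordering and initial value (generator 0x04C11DB7, most significant bit first, register preset to all ones and complemented at the end) are assumptions of this sketch rather than details given in the text; the AAL-5 specification should be consulted for the normative procedure.

```python
import struct

CELL_PAYLOAD = 48  # bytes of AAL data carried by one ATM cell
TRAILER_LEN = 8    # CPCS-UU (1) + CPI (1) + Length (2) + CRC-32 (4)


def crc32_msb_first(data: bytes) -> int:
    """Bitwise CRC-32 with generator 0x04C11DB7, MSB first (assumed variant)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc ^ 0xFFFFFFFF


def build_cpcs_pdu(sdu: bytes, uu: int = 0, cpi: int = 0) -> bytes:
    """Pad the SDU and append the trailer so the PDU is a multiple of 48 bytes."""
    if not 1 <= len(sdu) <= 0xFFFF:  # the Length field is 16 bits wide
        raise ValueError("SDU length out of range")
    pad_len = -(len(sdu) + TRAILER_LEN) % CELL_PAYLOAD  # 0..47 bytes
    body = sdu + bytes(pad_len) + struct.pack("!BBH", uu, cpi, len(sdu))
    return body + struct.pack("!I", crc32_msb_first(body))


def parse_cpcs_pdu(pdu: bytes) -> bytes:
    """Check CRC and Length, then return the SDU with the pad stripped."""
    if len(pdu) < CELL_PAYLOAD or len(pdu) % CELL_PAYLOAD:
        raise ValueError("PDU is not a whole number of cells")
    _uu, _cpi, length, crc = struct.unpack("!BBHI", pdu[-TRAILER_LEN:])
    if crc != crc32_msb_first(pdu[:-4]):
        raise ValueError("CRC mismatch")
    pad_len = len(pdu) - TRAILER_LEN - length
    if not 0 <= pad_len < CELL_PAYLOAD:
        raise ValueError("Length field inconsistent with received data")
    return pdu[:length]
```

Note that the receiver learns only that something is wrong (a CRC or length mismatch); nothing in the trailer says which cell was affected.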

3.2.2 Segmentation and Re-Assembly Sublayer

All that the SAR sublayer does is take the SAR-SDU (CPCS-PDU) and break it up into 48-byte units. In the reverse direction it receives a stream of cells and builds them into a SAR-SDU to pass to the CPCS.
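As a rough illustration of this behavior, the following Python sketch cuts a CPCS-PDU into 48-byte units and rebuilds it on the receiving side. It assumes that the last cell of each PDU is flagged with an end-of-message bit (the ATM-user-to-ATM-user bit of the cell header); the function names and the `(auu, payload)` tuple representation are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

CELL_PAYLOAD = 48  # bytes carried in each ATM cell payload


def segment(cpcs_pdu: bytes) -> List[Tuple[int, bytes]]:
    """Split a CPCS-PDU into (auu, payload) pairs; auu=1 marks the last cell."""
    if not cpcs_pdu or len(cpcs_pdu) % CELL_PAYLOAD:
        raise ValueError("CPCS-PDU must be a non-empty multiple of 48 bytes")
    chunks = [cpcs_pdu[i:i + CELL_PAYLOAD]
              for i in range(0, len(cpcs_pdu), CELL_PAYLOAD)]
    last = len(chunks) - 1
    return [(1 if i == last else 0, chunk) for i, chunk in enumerate(chunks)]


def reassemble(cells: Iterable[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Rebuild SAR-SDUs from a cell stream, emitting one per end-of-message cell."""
    buffer = bytearray()
    for auu, payload in cells:
        buffer += payload
        if auu:
            yield bytes(buffer)
            buffer.clear()
```

If a cell is lost in transit, the rebuilt unit is simply 48 bytes short; the SAR has no way to tell where the gap is, which is why the length and CRC checks are left to the CPCS.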

3.2.3 Error Detection

In AAL-5, when a data unit is corrupted or lost, an indication is sent to the SSCS (or to the service layer if the SSCS is null). However, according to the current standard, AAL-5 does not perform any error recovery. That is, AAL-5 does not provide enough protection against cell errors and cell losses. This is because it was mainly designed for loss-sensitive data transfer applications that rely on reliable transmission protocols to handle error correction, using retransmission mechanisms based on some kind of feedback scheme.

4 ISSUES IN MPEG-2 OVER ATM

The MPEG-2 standard [11] does not specify how an MPEG-2 video stream is to be transported over a communication network. In order to ensure satisfactory quality, a number of design issues have to be addressed, some of which are discussed in this chapter.


4.1 Service Class Selection

The first problem that arises in the transport of MPEG-2 over ATM is the selection of the service class. To do this, a compromise must be made between two conflicting requirements: quality-of-service guarantees and network utilization. Several approaches have been proposed so far:

• Deterministic Constant Bit-Rate (CBR) approach: In this approach, MPEG-2 is considered CBR in the network and is treated as such. The constant rate has to be either computed, in the case of a pre-existing MPEG stream, or estimated, in a real-time application. Any smoothing necessary to deliver a constant bit-rate stream must be done at the encoder via buffering.

• Variable Bit-Rate (VBR) with rate renegotiation: This approach tries to maximize the multiplexing gain by capturing the VBR nature of MPEG-2 [2, 5]. The effective bandwidth of the source during a specific interval is used to allocate resources in the network. If enough resources are not available, the quality is degraded. The rate is renegotiated over the long run, and the way the renegotiation points are selected depends on the exact algorithm. Source policing is required to ensure that the traffic source conforms to the traffic contract it negotiated with the network when the connection was established. However, this approach is not suitable for unpredictable traffic. Also, the resource allocation and source policing add more complexity.

• Feedback-based Available Bit-Rate (ABR) best-effort service with or without resource reservations: A number of schemes have been proposed for transporting video over a best-effort service where the source adjusts its rate based on available-rate information received periodically from the network. This requires varying the encoding rate at the source adaptively, based on feedback information received from the network [6, 3, 4].

• Unspecified Bit-Rate (UBR) service without any guarantees: In this case the stream is transported over the network in best-effort mode with no feedback controls. The quality at the receiver depends on the current congestion level in the network [12, 13, 14].

Of these, the last two approaches are based on ATM best-effort services, namely ABR and UBR, which are primarily designed for data traffic with bursty, unpredictable behavior. Since these best-effort services will be widely available in the future and are based on the excess bandwidth in the network at a lower usage cost, they will also support a non-negligible part of the multimedia traffic. However, in ABR, the rate-based feedback mechanism requires one or more network round-trip times before it reacts to congestion, since it has to wait until the network status information is available. This limits its usage [8]. UBR is the simplest service in the sense that users negotiate only their peak cell rates (PCR) when setting up the connection. Then, they can send bursts of video frames as desired at any time at the peak rate. If too many sources send frames at the same time, the total traffic at a switch may exceed the output capacity, causing delays, buffer overflows, and loss. The network tries to minimize the delay and loss but makes no guarantees. It is a true best-effort service and provides the least expensive service for the transport of multimedia applications. Thus, in our quality of service control framework, we propose to deliver MPEG-2 video applications over UBR service.

4.2 Choice of Adaptation Layer

Another important choice in transporting MPEG-2 traffic over ATM is the ATM Adaptation Layer (AAL). The selection of a suitable adaptation layer for MPEG-2 needs to take into account the specific requirements of the MPEG-2 transport stream, such as jitter removal, error detection and/or correction, end-to-end delay minimization for real-time applications, and support of both CBR and VBR applications. In our work, we choose AAL-5 for the following reasons:

1. AAL-5 is currently the most commonly used adaptation layer in the industry.

It is being used for encapsulating UNI 4.0 signaling messages and to carry best

effort traffic through the ATM network.


2. AAL-5 can support VBR MPEG-2 traffic. AAL-1 can be used only with CBR traffic, while AAL-2, which was proposed to support VBR traffic, is not suitable for video because it has recently been standardized for mobile voice communications with a maximum AAL-2 PDU of 64 bytes, which is too small for video packets.

3. Since signaling is being done under AAL-5, ATM network interfaces will need

to support different types of adaptation layers if other AALs are used, which

makes such a choice expensive.

However, AAL-5 as presently specified is inadequate for transmission of variable bit-rate video and requires extended features. For instance, because it lacks more sophisticated error detection functionality, AAL-5 is unable to determine the position of lost cells inside the PDU, and so no error correction can be applied (see Section 4.6). In this thesis, we propose a Service Specific Convergence Sublayer (SSCS) in AAL-5, which defines a robust Forward Error Correction (FEC) mechanism targeted at MPEG-2 encoded video transmission.

4.3 Transport Stream Encapsulation

After the AAL has been chosen, the next issue is how MPEG-2 transport stream packets are mapped into AAL-5 Service Data Units. Basically, 1 to n transport packets can be mapped into one AAL-5 SDU. For AAL-5 with a "null" service specific convergence sublayer, the ATM Forum requires that n = 2 be supported by all conformant equipment, with the following constraints:

• An AAL-5 PDU shall contain two TS packets, unless it contains the last TS packet of the Single Program Transport Stream.

• An AAL-5 PDU shall contain only one MPEG-2 transport packet if that MPEG-2 transport packet is the last transport packet of the Single Program Transport Stream.


Figure 4.1 shows the mapping of two transport stream packets into an AAL-5 PDU.

[Figure 4.1 diagram: two 188-byte MPEG-2 transport packets form the 376-byte AAL-5 CPCS-PDU payload, followed by the 8-byte CPCS trailer.]

Figure 4.1: Mapping of MPEG-2 transport packets according to the ATM Forum.

We can see that the two transport packets need 376 bytes, which are mapped together with the 8-byte CPCS trailer into the payload of exactly 8 ATM cells. n > 2 is also allowed, as long as the number of stuffing bytes used in the SAR sublayer is minimized.
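The arithmetic behind this mapping can be checked with a short Python sketch. It assumes a null SSCS, 188-byte TS packets, the 8-byte CPCS trailer and 48-byte cell payloads; the function name is hypothetical.

```python
TS_PACKET = 188     # bytes per MPEG-2 transport stream packet
CPCS_TRAILER = 8    # bytes in the AAL-5 CPCS trailer
CELL_PAYLOAD = 48   # bytes of payload per ATM cell


def cells_and_padding(n_ts_packets: int) -> tuple:
    """Return (cells, pad_bytes) for an AAL-5 PDU carrying n TS packets."""
    if n_ts_packets < 1:
        raise ValueError("at least one TS packet is required")
    unpadded = n_ts_packets * TS_PACKET + CPCS_TRAILER
    pad = -unpadded % CELL_PAYLOAD
    return (unpadded + pad) // CELL_PAYLOAD, pad


if __name__ == "__main__":
    for n in range(1, 5):
        cells, pad = cells_and_padding(n)
        print(f"n={n}: {cells} cells, {pad} padding bytes")
```

For n = 2 the result is 8 cells with no padding, which matches Figure 4.1; a single TS packet, by contrast, needs 5 cells and 44 bytes of padding.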

4.4 Factors Affecting Picture Quality

After the service class and adaptation layer have been selected, the approaches to quality of service control in such a framework have to be addressed. If QoS parameters such as cell loss ratio (CLR), cell transfer delay (CTD), and cell delay variance (CTDV) are not limited to a certain level, the end-to-end application performance will suffer quality degradation. Controlling the QoS is quite demanding for MPEG-2 video transmission in the sense that such applications are subject to both error-free and real-time transmission constraints: both CLR and CTD have to be bounded.

Let us first take a detour to analyze the various networking factors causing picture quality degradation. The approaches that could be used to address these problems will be discussed afterwards.

4.4.1 Data Losses Due to Cell Errors

Along the communication path or within the network nodes, random bit errors may occur due to electrical or physical problems, thus damaging the quality of the decoded pictures. At the cell level, when such bit errors occur in the header, the cell is either mis-delivered, when errors and address modifications go undetected, or discarded by the physical layer of the receiver, in the case of uncorrectable detected errors. In both cases the whole cell should be considered lost, and the consequences can be serious for the MPEG decoding process. If the error occurs in the payload of the cell, the damage is limited to the degradation of part of the MPEG-2 packet. If this part belongs to the MPEG-2 transport stream packet header, the entire packet may be lost and the impact on the displayed pictures can also be very serious. Fortunately, the probability of such data losses is normally extremely low. For instance, in high-speed networks based on optical fibers, it does not exceed 10^-13. Nevertheless, even for these transient error events, new mechanisms (i.e. error detection and correction schemes) are required at the AAL level to ensure a low video quality degradation.


4.4.2 Data Losses Due to Burstiness and Excessive Delays

With regard to real-time video service, variable bit-rate transmission has several advantages over the conventional constant bit-rate mode. However, the VBR transmission mode is an important cause of data losses, due to peaks in traffic and subsequent overflow of switch buffers. These heavy loads are mostly due to inadequate network resource allocation and multiplexing processes. Exceeding network capacity leads to cell discarding, either by the congested nodes or by the destination terminal if the transmission delay exceeds a threshold. In the latter case, the MPEG-2 packet arrives too late for playback on the terminal. Both cases lead to a loss of data in units of MPEG-2 transport packets. Preventive actions must then be applied to minimize the QoS degradation: inside the network through intelligent discarding and, at terminal nodes, through fast data recovery schemes at both the ATM and MPEG-2 transport levels.

4.5 Congestion Control and Switch Discarding Scheme

As we have shown, the second factor listed above is mainly due to congestion. All networks of finite capacity encounter congestion at various times, and ATM is no exception. But with video, extra effort is needed to slow down the input rate to the network in order to control congestion. For example, the EFCI has to be sent back to the source from the network to indicate congestion, and it has to be processed, which delays the reaction. Therefore, the best we can do is to throw some cells away until the network returns to normal. However, discarding a single cell may render many other cells belonging to the same packet useless; if we keep transmitting them, they contribute to the congestion and waste network resources.

In order to alleviate congestion and thus preserve the end-to-end quality of video data, an important strategy is to discard the whole packet of data rather than individual cells. Such strategies are called intelligent discarding [24, 25, 27, 28].


4.5.1 Priority Assignation Scheme

Prioritization is another important strategy when we apply a cell discarding scheme. Instead of discarding cells (or packets) blindly, we can be selective depending on their importance. This can take advantage of the CLP (Cell Loss Priority) bit in the ATM cell header. As discussed in Section 2.5.2, an encoding scheme like MPEG-2 can be structured in such a way that two kinds of cells are produced:

• Essential cells, which contain basic information to enable the continued function of the service. Their CLP bit can be set to 0, indicating that the cells are of high priority.

• Optional cells, which contain information that improves the quality of service. Their CLP bit can be set to 1 to indicate their low priority.

When the network experiences congestion, cells with the CLP bit set to 1 are considered for discard first, giving their buffer space to higher priority data.
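One way to picture this behavior is a switch buffer that, when full, pushes out a queued low-priority cell to make room for an arriving high-priority cell. The Python sketch below is only an illustration under that assumption; the class name, the push-out policy (the most recently queued CLP=1 cell is evicted) and the cell representation are all hypothetical and not taken from any switch specification.

```python
from collections import deque


class PriorityDiscardBuffer:
    """Finite cell buffer that sacrifices CLP=1 cells first when it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queue = deque()   # entries are (cell_id, clp)
        self.dropped = []      # cell ids discarded so far

    def enqueue(self, cell_id, clp: int) -> bool:
        """Try to store a cell; return True if it was accepted."""
        if len(self.queue) < self.capacity:
            self.queue.append((cell_id, clp))
            return True
        if clp == 0:
            # Buffer full: look for a queued low-priority cell to push out.
            for i in range(len(self.queue) - 1, -1, -1):
                if self.queue[i][1] == 1:
                    self.dropped.append(self.queue[i][0])
                    del self.queue[i]
                    self.queue.append((cell_id, clp))
                    return True
        # Either the arriving cell is low priority or nothing can be evicted.
        self.dropped.append(cell_id)
        return False
```

A simpler alternative would be to refuse CLP=1 cells once the queue length passes a threshold; the push-out variant is shown here because it maps directly onto "giving their buffer space to higher priority data".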

4.6 Error Correction and Concealment

The question then comes up about what to do at the receiver when an expected cell

does not arrive due to errors or congestion in the network or a cell arrives with errors

in it due to the transmission media.

One simple technique for handling errors in video involves using the information from the previous frame, and whatever has been received of the current frame, to build an approximation of the lost information. For example, we can just continue displaying the corresponding line from the previous frame or, if only a single line is lost, it is feasible to extrapolate the information from the lines on either side of the lost one.
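The two concealment options just described can be sketched in a few lines of Python. The representation is an assumption made for illustration: a frame is a list of scan lines, each line a list of integer pixel values, and a lost line is marked with `None`. When both neighbouring lines are available the lost line is estimated as their average; otherwise the co-located line of the previous frame is reused.

```python
from typing import List, Optional

Line = List[int]


def conceal(current: List[Optional[Line]], previous: List[Line]) -> List[Line]:
    """Fill lost lines (None) from neighbouring lines or from the previous frame."""
    repaired = []
    for i, line in enumerate(current):
        if line is not None:
            repaired.append(list(line))
            continue
        above = current[i - 1] if i > 0 else None
        below = current[i + 1] if i + 1 < len(current) else None
        if above is not None and below is not None:
            # Single isolated loss: average the lines on either side.
            repaired.append([(a + b) // 2 for a, b in zip(above, below)])
        else:
            # Neighbours missing too: repeat the line from the previous frame.
            repaired.append(list(previous[i]))
    return repaired
```

Averaging is only one possible estimate; a real decoder would typically work on macroblocks and motion vectors rather than raw lines.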

But if many errors and losses happen, the performance degradation will exceed the tolerable level. In this case, the necessary error recovery functions have to be implemented in order to ensure a low quality degradation.

Recovery by retransmission is ruled out as the error control scheme, since video transmission requires a guaranteed maximum end-to-end delay. Forward error detection and recovery is the best solution in this case.


4.6.1 Forward Error Correction (FEC) in AAL Layer

A forward error correction scheme can be applied at the physical or application level, as well as at the AAL level. An FEC scheme at the application level will work well, but we cannot expect high throughput from this approach. Also, since the FEC scheme operates at the application data unit level instead of the packet level, the latency of error recovery when the FEC scheme cannot recover the correct data will be significantly larger than with an AAL-level FEC scheme. An FEC scheme at the physical level can only correct bit errors on the transmission medium, and therefore cannot provide cell loss recovery.

When we apply an FEC scheme at the AAL level, data losses due to both causes explained in Section 4.4 can be recovered and we need not modify the upper layer applications. Also, since the error recovery is performed within the packet (or slice, in a video application), the error recovery latency when the FEC cannot recover the correct data will be much smaller than that of an FEC scheme at the application level. Therefore, as also discussed in [15], an AAL-level FEC scheme is necessary for applications requiring some delay bound for end-to-end data transmission, like MPEG-2.
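To make the idea of cell loss recovery at the AAL level concrete, here is a deliberately simplified Python sketch that protects a group of cell payloads with a single XOR parity cell. This is not the Reed-Solomon and parity scheme proposed later in this thesis; it is a stand-in that can rebuild at most one lost cell per group, and it assumes the receiver already knows which position is missing (marked `None`), something plain AAL-5 cannot determine by itself. All names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_cells(cells: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized cell payloads."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*cells))


def add_parity(cells: List[bytes]) -> List[bytes]:
    """Append one parity cell to a group of data cells."""
    return list(cells) + [xor_cells(cells)]


def recover(received: List[Optional[bytes]]) -> List[bytes]:
    """Return the data cells, rebuilding a single missing one if necessary."""
    missing = [i for i, cell in enumerate(received) if cell is None]
    if len(missing) > 1:
        raise ValueError("a single parity cell cannot repair more than one loss")
    restored = list(received)
    if missing:
        present = [cell for cell in received if cell is not None]
        restored[missing[0]] = xor_cells(present)
    return restored[:-1]  # drop the parity cell
```

Because the repair uses only cells of the same group, it can be done as soon as the group has arrived, without waiting for a retransmission.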

4.7 Related Work

Much work has been done to address the problems of transmitting MPEG-2 video or traditional data over ATM networks. For instance, several protection and recovery techniques have been proposed to minimize video quality degradation due to cell loss [16, 17, 18, 19, 20, 21]. Layered coding with prioritization has also been designed to take advantage of the MPEG-2 hierarchical data structure [22, 23]. Furthermore, some intelligent cell dropping strategies at the packet or slice level in ATM switches have also been proposed.

4.7.1 FEC-Service Specific Convergence Sublayers

As we mentioned in Section 4.6, forward error recovery with destination concealment is a suitable method to cope with the problem of cell loss and cell error in the video network environment. Several FEC schemes have been proposed. In [21], A. McAuley describes a modified Reed-Solomon burst error correcting code, based on solving simultaneous equations. With h redundant symbols per block, this scheme can fill in up to h missing symbols, or replace e missing symbols and detect d errored symbols, where e + d = h. In [20], the author describes a two-dimensional interleaving FEC frame scheme. It is applied to virtual paths (VPs) of ATM networks, which reduces coding/decoding delays and supports facility sharing. However, in these papers, a fixed-size FEC frame is suggested, and the re-ordering of data transmission is required to deal with bursty cell loss. Therefore, when transferring variable-sized packets with these methods, the end-to-end latency increases significantly due to the transmission of unnecessary cells and the re-ordering of transmission data. Moreover, these schemes cannot modify the length of the appended data, which determines the error correction capability. In ATM networks, the cell loss ratio and cell loss pattern are different for each service class and for each quality of service class; also, the end-to-end service quality required by the application is not always the same during a session. As a result, in many cases, an FEC scheme with a fixed length of appended data will not be effective or will require large and unnecessary transmission overhead. The FEC approach proposed in this thesis addresses these requirements.

4.7.2 Switch Discarding Schemes

Since UBR is a true 'best effort' service, with no flow control and no loss guarantees, it provides the least expensive service for the transport of packet-based applications. However, because of its simplicity, plain UBR with inadequate buffer sizes performs poorly in a congested network. Partial Packet Discard (PPD), or Packet Tail Discard (PTD), has been proposed to address this problem [24]. In this scheme, if a cell is dropped by a switch, the subsequent cells of the same higher layer protocol data unit are also discarded. Romanov et al. [25] have shown that PPD improves network performance to a certain degree, but is still not optimal. They proposed a new mechanism called Early Packet Discard (EPD): when the switch buffer queues reach a threshold level, entire higher level data units are preventively dropped. This approach achieves better throughput performance but does not guarantee fairness among the connections. Floyd and Jacobson have shown that connections using short packets can suffer unfairly under this approach [26]. To improve fairness, selective packet drops based on per-VC accounting have been introduced by Heinnen and Kilkki and are referred to as Fair Buffer Allocation (FBA) [27].
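The difference between PPD and EPD can be shown with a small Python model of one switch output queue. This is a sketch under simplifying assumptions, not the algorithm of any of the cited papers: each arriving cell is described only by its virtual channel and an end-of-message flag, the tail of a damaged packet is dropped up to and including its last cell (practical switches often forward that last cell so the receiver can find the packet boundary), and no fairness accounting is done. The class and attribute names are hypothetical.

```python
class DiscardingQueue:
    """Output queue with Partial Packet Discard and optional Early Packet Discard."""

    def __init__(self, capacity: int, epd_threshold=None):
        self.capacity = capacity
        self.epd_threshold = epd_threshold  # None disables EPD
        self.queue = []        # accepted cells as (vc, eom)
        self.discarding = {}   # vc -> True while the rest of a packet is dropped
        self.mid_packet = {}   # vc -> True while a packet is in progress

    def arrive(self, vc, eom: bool) -> bool:
        """Process one arriving cell; return True if it was queued."""
        first_cell = not self.mid_packet.get(vc, False)
        self.mid_packet[vc] = not eom
        if first_cell:
            # EPD: refuse a whole new packet when the queue is already long.
            self.discarding[vc] = (self.epd_threshold is not None
                                   and len(self.queue) >= self.epd_threshold)
        if self.discarding.get(vc, False):
            return False
        if len(self.queue) >= self.capacity:
            # PPD: buffer overflow, so drop this cell and the packet's tail.
            self.discarding[vc] = True
            return False
        self.queue.append((vc, eom))
        return True
```

With `epd_threshold=None` the queue behaves as pure PPD; setting a threshold below `capacity` leaves headroom for packets that are already partly queued, which is the intuition behind EPD.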

None of the above congestion control and QoS management schemes focuses specifically on the transmission of MPEG-2 video streams over ATM best-effort services. In [28], a variant of PPD called adaptive Partial Slice Discard (A-PSD) has been proposed to cope with this problem in a video networking environment. Similarly, an EPD-like strategy called adaptive Early Slice Discard (A-ESD) has also been presented by the same author. Both A-PSD and A-ESD run at the video slice level, and the results have shown a significant reduction in corrupted slices received at the destination as well as a decrease in the mean end-to-end cell transfer delay. These approaches consist of selecting the packet (i.e. slice) to be dropped with respect to the MPEG-2 data hierarchy and the congestion level.

However, neither A-ESD nor A-PSD takes into account the FEC function performed at the destination. Also, like PPD and EPD, they do not guarantee fairness among connections. As a result, we will propose enhancements to these mechanisms to support the FEC feature and improve their fairness.


4.7.3 Priority Assignation Scheme

As we have shown above, to avoid worsening congestion and higher transit delay, several switch discarding approaches have been proposed that drop lower priority cells rather than delay them, and give their buffer space to higher priority cells. These techniques rely on the ATM prioritization capability. Some data prioritization schemes have been proposed at two different levels: cell level and connection level.

The first method consists of discriminating between cells within a single channel. The cell loss priority (CLP) bit in ATM headers is used to provide a two-level cell priority mechanism. One such approach is proposed in [29], where an MPEG-2 video stream is partitioned using a frequency domain transform and subsequently transmitted over a single ATM virtual channel. The data partition scheme is implemented at the MPEG-2 block and macroblock layers; it sets the priority level of cells belonging to the following frames to different values [8] (I-frame cells high and B-frame cells low). These approaches use only the CLP bit and are not able to efficiently capture the complexity of the MPEG-2 data structure.

In [30], a connection-level prioritization approach is evaluated for the transmission of layered MPEG-2 video sequences. The authors propose a frequency-domain static data partitioning scheme using two virtual connections (VCs) associated with different service classes. By means of a load balancing factor (LBF), the video data are split and conveyed over two different connections. The VCs are associated with a guaranteed service class (e.g. VBR-rt) and a best-effort service class (e.g. ABR) to carry the base layer and the enhancement layer, respectively. The main drawback of these techniques is the added complexity at the encoders and the special devices required at the destination to recover and synchronize the original video stream.

5 VIDEO QUALITY OF SERVICE CONTROL FRAMEWORK

In this chapter, we propose a video-oriented quality of service control framework for the delivery of MPEG-2 traffic over UBR service. To address the issues presented in Chapter 4, this framework consists of the following components:

• A priority assignment scheme to discriminate ATM cells by their importance. (Section 5.1)

• A packet encapsulation strategy to map MPEG-2 transport stream packets into AAL-5 SDUs and then into ATM cells. (Section 5.2)

• A forward error correction mechanism, implemented at the AAL-5 service specific convergence sublayer (SSCS) to provide error detection and recovery capability. (Section 5.3)

• A cell discarding scheme termed selective and adaptive Partial Slice Discarding (SA-PSD), designed to take into account both the hierarchical MPEG-2 data structure and the SSCS error detecting and correcting capability. (Section 5.4)

5.1 Dynamic Extended Priority Assignation Scheme (Dex-PAS)

As we know, transmission of compressed video over ATM networks requires efficient data priority partition techniques. In association with intelligent cell discard schemes, these techniques aim to minimize the loss probability of critical information under congestion. Since the ATM cell header contains only one bit (CLP) to discriminate between video data, they are not able to efficiently capture the complexity of the MPEG-2 data structure. To better cope with the hierarchical MPEG-2 video transmission requirements, we propose a new video data formatting and prioritization scheme, named Dynamic Extended Priority Assignation Scheme (Dex-PAS). The mechanism is sufficiently generic to be performed at any MPEG-2 data layer (e.g. frame, slice, macroblock, or block). In this thesis, the data partition is made at the slice layer and the priority assignation is performed at the frame level. In [28], a new field, located in the ATM cell header, is defined and referred to as Extended CLP (Ex-CLP). This field comprises the classical CLP bit and the adjacent PTI ATM-user-to-ATM-user (AUU) bit. Used individually, these two single bits define only three distinct cell types: high priority cell, low priority cell, and end of message (EOM) cell. Combining them permits better utilization of the cell header, with the definition of up to four cell types within a single channel. In Dex-PAS, we use the Ex-CLP field to dynamically assign cell priorities according to the current MPEG-2 frame type (e.g. (I)ntra, (P)redictive, or (B)i-directional predictive) and the reception of backward congestion signals from the network.

Table 5.1 presents the mapping of MPEG-2 data frames into the Ex-CLP field.

Cell Type              CLP   PTI-AUU   Priority
I/P-frame              0     0         High
P/B-frame              0     1         Low
End of Control Block   1     0         Very High
End of Slice           1     1         Very High

Table 5.1: New Ex-CLP Field Mapping

Cells belonging to I-frames have a high priority and their Ex-CLP flag is set to '00'. B-frame cells have the lowest priority and an Ex-CLP value of '01'. P-frame cells are alternately assigned a high or a low priority depending on the network load. At the beginning of the transmission, P-cells are initialized with a high priority. When the buffer queue length (QL) exceeds an upper threshold, early congestion is detected and the ATM switch sends a feedback signal to the source, which, in turn, adjusts the P-cell priority level to low. When QL decreases below a lower threshold, the P-cell priority is switched back to high. In our implementation we use forward resource management (RM) cells with the congestion indication (CI) flag marked to notify the destination, and afterward the source, about the network status. The '10' value is used to allow the design of a two-level video-oriented cell discard scheme located at every switch along the connection path. The cell having its Ex-CLP field set to '10' is referred to as 'End of Control Block' (EOB) and delimits a group of video cells under FEC control (see Section 5.3). Since the PTI AUU bit is employed to indicate whether a cell is the last cell of an upper-layer message (e.g. a TCP packet), we propose to define a similar flag to distinguish between successive video slices. The cell having its Ex-CLP flag set to '11' is termed the End of Slice (EOS) cell. Both EOB and EOS cells are treated as very high priority.
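The mapping of Table 5.1 and the two-threshold switching of the P-cell priority can be summarized in a short Python sketch. The class names, method names and the encoding of Ex-CLP as a two-bit integer (CLP as the high bit, AUU as the low bit) are illustrative assumptions; the RM-cell signaling path between switch, destination and source is reduced here to a single boolean passed to `on_feedback`.

```python
class CongestionDetector:
    """Switch-side hysteresis on the buffer queue length (QL)."""

    def __init__(self, lower: int, upper: int):
        self.lower = lower
        self.upper = upper
        self.congested = False

    def update(self, queue_length: int) -> bool:
        """Return the congestion state after observing the current QL."""
        if queue_length > self.upper:
            self.congested = True
        elif queue_length < self.lower:
            self.congested = False
        return self.congested  # unchanged between the two thresholds


class DexPasSource:
    """Source-side Ex-CLP assignment following Table 5.1."""

    def __init__(self):
        self.p_cells_high = True  # P-cells start with high priority

    def on_feedback(self, congested: bool) -> None:
        self.p_cells_high = not congested

    def ex_clp(self, frame_type: str, marker: str = "") -> int:
        """Return the two-bit Ex-CLP value (CLP, AUU) for a cell."""
        if marker == "EOB":
            return 0b10  # End of Control Block
        if marker == "EOS":
            return 0b11  # End of Slice
        if frame_type == "I":
            return 0b00
        if frame_type == "B":
            return 0b01
        if frame_type == "P":
            return 0b00 if self.p_cells_high else 0b01
        raise ValueError("unknown frame type: " + frame_type)
```

The gap between the lower and upper thresholds keeps the P-cell priority from flapping when the queue length hovers around a single value.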

5.2 Slice-Based MPEG-2 TS Packets Encapsulation Strategy

As illustrated in Chapter 2, uncompressed video frames (presentation units) are individually encoded according to the MPEG standard and are referred to as access units. The stream produced by these access units is then named the elementary stream. The next step is packetization; the resulting stream is called the Packetized Elementary Stream (PES). According to the MPEG standard [11], there are no specific requirements for encapsulating encoded video data in a PES. This means that an encoded frame (i.e. access unit) may start at any point within a PES packet, and more than one encoded frame may be present in one PES packet. The way this packetization is done can significantly affect the performance of the decoding process and the quality of the service provided by the network. In this thesis, we propose that each PES packet be built from a single encoded video slice (see Figure 5.1).

[Figure 5.1 diagram: frame level (frame header, access unit, elementary stream) and slice header.]

Figure 5.1: Slice-based PES encapsulation using variable length packet.

The reasoning is that the slice is the main coding processing unit and the smallest autonomous unit in MPEG-2: coding and decoding of blocks and macroblocks are feasible only when all the data of a slice is available [8]. At the next step, we propose to segment the PES packet into a number of fixed-length Transport Stream (TS) packets. In accordance with the MPEG-2 system multiplex standard [11], every TS packet embeds data from only one PES packet. The last transport packet may not be completely full, since it is unlikely that a variable-length PES packet will fit exactly into an integer number of transport packets. Thus, stuffing bytes are placed in the adaptation field to complete the payload. Using this encapsulation strategy, the decoder can easily determine the start and end of the slice. Similarly, network transport and control policies can take advantage of this structure to offer a guaranteed packet-oriented service. For example, we can implement an intelligent discarding algorithm at the switch to decrease the frame error ratio, like PPD and EPD (see Section 4.7.2), since the start (or end) of a frame can easily be found in this case.
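The amount of stuffing implied by this segmentation is easy to estimate. The Python sketch below assumes 188-byte TS packets with a 4-byte header, so that each packet offers 184 bytes for PES data, and it assumes that no adaptation field is used for anything other than stuffing; the function name is hypothetical.

```python
TS_PACKET = 188                      # bytes per transport stream packet
TS_HEADER = 4                        # bytes of fixed TS header
TS_PAYLOAD = TS_PACKET - TS_HEADER   # bytes available per packet (184)


def ts_packets_for_pes(pes_length: int) -> tuple:
    """Return (ts_packet_count, stuffing_bytes) for one slice-sized PES packet."""
    if pes_length <= 0:
        raise ValueError("PES packet length must be positive")
    count = -(-pes_length // TS_PAYLOAD)          # ceiling division
    stuffing = count * TS_PAYLOAD - pes_length    # goes into the last packet
    return count, stuffing
```

For example, a 1000-byte PES packet needs 6 TS packets, the last of which carries 104 stuffing bytes. Small slices therefore pay a proportionally higher stuffing overhead than large ones.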

5.3 AAL-5 Service Specific Convergence Sublayer with FEC Support

5.3.1 Requirements of SSCS with FEC Support

As described in Chapter 4, classical AAL-5 only provides error detection, by means of CPCS packet length integrity and CRC-32 checks, and is not able to locate which cell was dropped or which cell includes bit errors. Therefore, the task of the proposed video service specific convergence sublayer is to implement a robust FEC mechanism targeted at hierarchical MPEG-2 encoded video transmission. The requirements of such an FEC-SSCS are described in [31, 32] and may be summarized as follows:

1. Compatibility with the specification of the existing AAL-5, e.g. compatibility with the current CPCS/SAR layers;

2. No modifications are required for the upper layer (e.g. MPEG-2 Transport Stream or Program Stream);

3. Support of variable size data (e.g. slice or frame);

4. The amount of redundant data should be minimized;

5. As in the ATM Forum's video on demand over ATM specification [33], byte padding should be avoided;

6. It would be interesting to adjust and negotiate FEC-SSCS parameters at the connection setup phase as well as during the session;

7. FEC-SSCS should be able to detect errors, localize them and finally correct them;

8. In order to avoid an increase in latency, SSCS service data units should be transferred in a pipelined fashion at the sender side. This way, no buffering is required and the processing cost is minimized;

9. At the peer destination, if no errors are detected, the packet should be forwarded to the upper layer with no extra delay. The processing speed at the receiver entity should be as fast as with classical AAL-5;

10. In order to recover a corrupted packet, buffering of previous packets should be avoided;

11. In order to avoid error propagation, slice boundaries have to be respected during cell filling;

12. A similar requirement should be applied to the frame boundaries.

The proposed FEC-SSCS protocol satisfies all the above requirements. It is based on Reed-Solomon and parity codes [34, 35]. In comparison to schemes based only on Reed-Solomon codes with byte interleaving, this approach permits the use of flexible matrix structures and a correction granularity at the level of both byte and cell (see Page 52 for further explanation). Moreover, it better takes into account the fixed structures of the MPEG-2 TS packet and ATM cell to avoid bit padding at the lower AAL-5 CPCS sublayer. It can also be used selectively to protect audio, video and syntactic data (e.g. headers) separately, and thus minimize control data overhead.

5.3.2 Behavior of Sender and Receiver Entities

Sender Side Behavior

First, the TS packets are passed to the SSCS sublayer by the MPEG-2 system layer using message mode service with a blocking/non-blocking internal function, as illustrated in Figure 5.2. The following primitive is used: AAL_UNITDATA-request(ID, M, SLP, CI). The 'Interface Data' (ID) parameter specifies the exchanged MPEG-2 TS packet. The 'More' (M) parameter indicates whether it is the last AAL SDU of the upper message (e.g. end of the current video slice). The 'Submitted Loss Priority' (SLP) parameter gives the priority level of the TS packet and is initialized with respect to the 'PICTURE-CODING-TYPE' field located in the MPEG-2 frame header [11]; this field specifies the coding mode used (e.g. Intra, Predictive or Bi-directional predictive) for each frame. This parameter also indicates how the 'SLP' parameter of the ATM-DATA-request primitive shall be set for cell header initialization. As described in Section 5.1, we propose to extend its range from two to four possible values to allow identification of MPEG-2 frame types and system information. Finally, the last parameter 'Congestion Indication' (CI) indicates how the 'CI' parameter of the ATM-DATA-request primitive shall be set to notify a congestion state to the network nodes and destination.

Figure 5.2: AAL-5 multi-level FEC-SSCS using grouping mode 1. (Layer stack from top to bottom: Application Layer with the compressed video slice; MPEG-2 PES layer with the PES packet; MPEG-2 TS layer with 188-byte TS packets carrying 4-byte headers; AAL Type 5 with the AV-SSCS PDU (2-byte fields), the CPCS PDU (8-byte trailer, 576 bytes) and SAR; ATM layer with 53-byte cells carrying 5-byte headers. Exactly 12 cells, no padding.)


Four grouping modes are defined at the SSCS sublayer, which ensure an integer number of 48-byte cell payloads at the SAR layer and thus avoid byte stuffing. These modes consist of grouping a number N of MPEG-2 TS packets to build an SSCS-SDU. The parameter N may take the following values: 3, 15, 27 and 39. After appending the CPCS trailer information, we obtain exactly 12, 59, 106 and 153 48-byte ATM cell payloads, respectively. For every connection, the grouping mode is negotiated between the source and destination at the connection establishment phase according to the requested quality of service. During communication, this mode can be dynamically adjusted with respect to the on-line measurements of the end-to-end QoS parameters.
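The cell counts quoted for the four grouping modes can be checked with a few lines of arithmetic. The sketch below assumes the byte counts stated in this section (188-byte TS packets, a 2-byte SSCS header plus a 2-byte SSCS trailer, an 8-byte CPCS trailer and 48-byte cell payloads); the function name is purely illustrative.

```python
TS_PACKET_BYTES = 188      # MPEG-2 TS packet size
SSCS_OVERHEAD_BYTES = 4    # 2-byte SSCS header + 2-byte SSCS trailer
CPCS_TRAILER_BYTES = 8     # AAL-5 CPCS trailer
CELL_PAYLOAD_BYTES = 48    # ATM cell payload


def cells_per_sdu(n_ts_packets):
    """Return (cells, leftover_bytes) for a group of n TS packets.

    leftover_bytes == 0 means the CPCS PDU fills an integer number
    of cell payloads, so no padding is needed.
    """
    total = (n_ts_packets * TS_PACKET_BYTES
             + SSCS_OVERHEAD_BYTES + CPCS_TRAILER_BYTES)
    return divmod(total, CELL_PAYLOAD_BYTES)


for n in (3, 15, 27, 39):
    print(n, cells_per_sdu(n))
```

Under these assumptions, each of the four values of N gives a zero remainder, whereas a group size such as N = 4 would leave a partially filled last cell.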

A two-byte header and a two-byte trailer are appended to every SSCS SDU. The header is composed of a 4-bit Sequence Number (SN), a 4-bit Sequence Number Protection (SNP), a 4-bit Payload Type (PT), and a 4-bit Control Block Length (CBL). The trailer is composed of a 2-byte FEC field applied only to the payload. The FEC scheme uses a Reed-Solomon (RS) code, which enables the correction of up to 4 erroneous bytes in each block of 564 bytes (i.e. 188x3, since for group mode 1, each SSCS-PDU contains 3 TS packets). It is only used for recovering from cell errors due to electrical or physical problems along the communication path. The addition of a 4-bit sequence number (SN) enables the receiver entity to detect and locate up to 15 consecutive SSCS PDU losses. When losses are detected, dummy bytes are inserted in order to preserve the bit count integrity at the receiver. The SNP contains a 3-bit CRC generated using the generator polynomial g(x) = x^3 + x + 1, and the resulting 7-bit codeword is protected by an even parity check bit. The SNP field is then capable of correcting single bit errors and detecting multiple bit errors. The PT field specifies the type of embedded information for discrimination purposes (I-frame, P-frame, B-frame, Audio, Data, Headers, FEC information, etc.).
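As an illustration of the header layout described above, the following sketch computes the SNP field and packs the four 4-bit fields into two bytes. It assumes that the CRC is taken over the 4-bit SN, that the fields are packed most significant first in the order SN, SNP, PT, CBL, and that PT and CBL are given as plain integers; the text does not fix the bit ordering or the PT code points, so these are assumptions of the example.

```python
GENERATOR = 0b1011  # g(x) = x^3 + x + 1


def crc3(sn):
    """Remainder of SN(x) * x^3 divided by g(x), as a 3-bit value."""
    reg = (sn & 0xF) << 3
    for bit in range(6, 2, -1):          # cancel bits 6 down to 3
        if reg & (1 << bit):
            reg ^= GENERATOR << (bit - 3)
    return reg & 0b111


def snp(sn):
    """3-bit CRC followed by an even parity bit over the 7-bit codeword."""
    crc = crc3(sn)
    codeword = ((sn & 0xF) << 3) | crc
    parity = bin(codeword).count("1") & 1
    return (crc << 1) | parity


def pack_header(sn, pt, cbl):
    """Pack SN | SNP | PT | CBL (4 bits each) into two bytes (assumed order)."""
    word = ((sn & 0xF) << 12) | (snp(sn) << 8) | ((pt & 0xF) << 4) | (cbl & 0xF)
    return word.to_bytes(2, "big")
```

With this construction the SN and SNP nibbles together always contain an even number of one bits, which is what allows the receiver to check the header before trusting the sequence number.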

Now, let us define a Control Block (CB) as a two-dimensional matrix of P columns (cell payloads) x M rows into which consecutive fixed-length SSCS PDUs are written row by row (see Figure 5.3). The corresponding CPCS trailer is then appended. A single redundancy row, obtained by XORing the columns on a cell basis, is appended at the tail of the matrix. A single cell loss per block, or the loss of up to an entire SSCS PDU, can thus be recovered. The number of rows M is referred to as the Control Block Length (CBL), which determines the ratio of data to redundancy. It is negotiated at call setup according to the protection level desired by the connection. The lower its value, the higher the recovery power of the FEC-SSCS mechanism. The drawback is a proportional increase of the control information overhead as M decreases. Since the FEC information is obtained using the XOR method, the data matrix is only an abstract structure and no buffering is required at the sender. The destination checking process is also pipelined and the correct SSCS PDUs are immediately transmitted to the upper layer with no latency. The virtual matrix is read row by row and the trailer is created by calculating the Reed-Solomon check bytes first. The parameters Payload Type (PT), Control Block Length (CBL) and Sequence Number (SN) are subsequently set. Finally, the Sequence Number Protection (SNP) field is calculated and appended to the block.

Figure 5.3: Control block structure used in FEC scheme (matrix of P cell payloads by M rows; legend: writing and reading order, erroneous or lost cell, correction by RS and XOR codes, FEC information, localization by parity and sequence number checks).
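The column-wise XOR redundancy can be sketched as follows. This is a minimal illustration, not the thesis implementation: a control block is modelled as a list of M rows, each row a list of P equally sized cell payloads, and the helper names are hypothetical. A real sender would accumulate the parity row incrementally as cells are emitted rather than holding the matrix.

```python
from functools import reduce


def xor_cells(cells):
    """Byte-wise XOR of a sequence of equally sized cell payloads."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*cells))


def parity_row(rows):
    """Redundancy row: one parity cell per column of the control block."""
    return [xor_cells(column) for column in zip(*rows)]


def recover_cell(rows, parity, lost_row, col):
    """Rebuild the cell at (lost_row, col) from the surviving cells of
    that column and the column's parity cell."""
    survivors = [row[col] for i, row in enumerate(rows) if i != lost_row]
    return xor_cells(survivors + [parity[col]])
```

Because each column is protected independently, one missing cell per column can be rebuilt, which also covers the case where a whole row (one SSCS PDU) is lost.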

Receiver Side Behavior

Since we are dealing with variable length encoded video slices, it is unlikely that an exact number of SSCS SDUs will fill up the last virtual matrix of every slice. Therefore, we propose to indicate in the SSCS header the length of the Control Block (CBL) that they belong to. This approach allows an easier and more reliable delimitation of the end of the block as well as a better protection of slices from error propagation.

The SSCS-PDUs are then transmitted to the common part convergence sublayer (CPCS) using the CPCS-UNITDATA-Invoke primitive. The 8-byte CPCS trailer is appended to the CPCS SDU and no byte padding is required. The resulting CPCS PDU is passed to the segmentation and reassembly (SAR) layer using the SAR-UNITDATA-Invoke primitive. The underlying SAR protocol will subsequently segment the CS-PDU into exactly twelve (12) 48-byte ATM SDUs. The ATM layer will then mark the CLP field of every cell using the 'AUU' and 'SLP' parameters of the AAL-UNITDATA-Request [36].

At the destination, three tasks have to be performed by the FEC-SSCS receiver entity: (1) detect errors or losses in the incoming stream, (2) localize the missing cells or the position of the erroneous bytes, and finally (3) recover the initial data.

The detection of erroneous SSCS PDUs is ensured by both the SSCS and CPCS protocols. The CPCS layer is able to identify corrupted AAL PDUs by CRC-32 and missing cells by length mismatch. Rather than discarding a corrupted packet, we propose to forward it to the upper SSCS layer together with an error indication (e.g. the Reception Status (RS) parameter of the CPCS-UNITDATA-signal primitive). Unfortunately, in the extreme situation where entire PDUs are missing, the previous checking mechanisms are not capable of detecting the problem. Therefore, the introduction of a sequence number (SN) at the SSCS layer permits the detection of up to 15 consecutive packet losses.
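The limit of 15 follows directly from the 4-bit sequence number, as the short sketch below shows. It assumes that SN simply increments modulo 16 with every SSCS PDU and that lost PDUs are replaced by zero-valued dummy bytes; the function names and the zero fill value are illustrative choices, not taken from the text.

```python
SN_MODULUS = 16  # 4-bit sequence number


def lost_pdus(last_sn, received_sn):
    """Number of SSCS PDUs missing between two consecutively received PDUs.

    A gap of 16 or more wraps around and cannot be distinguished from a
    smaller gap, which is why at most 15 consecutive losses are detectable.
    """
    return (received_sn - last_sn - 1) % SN_MODULUS


def dummy_fill(lost, pdu_len):
    """Dummy bytes inserted to preserve the byte count at the receiver."""
    return bytes(lost * pdu_len)
```

For example, receiving SN 2 right after SN 14 indicates that PDUs 15, 0 and 1 were lost.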

The association of the reported indication and the parity FEC XOR check sequence allows the FEC-SSCS layer to locate the erroneous bytes by determining simultaneously the line and the column numbers in the matrix (control block), as shown in Figure 5.3. This way, up to 4 erroneous bytes in each block can be detected and corrected. Moreover, taking advantage of the fixed length of both the MPEG-2 TS packet and the ATM cell, the SSCS layer can easily locate the missing cell. As a result, this approach achieves a correction granularity at both the byte and the cell level.

After localization, errors and losses can be corrected by using the Reed-Solomon and XORed FEC check codes, respectively. If no error is detected, the SSCS PDUs are immediately passed to the upper layer after the sequence number check and trailer removal. When only the last SSCS-PDU (redundancy part) is erroneous, no action is performed.


5.4 Selective and Adaptive Partial Slice Discard Scheme (SA-PSD)

5.4.1 The Algorithm Introduction

One of the simplest switch buffer scheduling algorithms is to serve cells in first-in first-out (FIFO) order; if buffer congestion occurs, the incoming cells are dropped regardless of their importance. This random discard (RD) strategy is not suitable for video transmission. A modification is to take the cell's priority into consideration when discarding, e.g. cells with low priority are dropped first; then, if the congestion persists, the high priority cells gradually begin to be dropped as well. This is called Selective Cell Discard (SCD) [28]. However, as described in Chapter 4, the useless cells (in our case, the tail of a corrupted slice) may still be transmitted and congest upstream switches. In [28], a scheme called Adaptive Partial Slice Discard (A-PSD) has been proposed to cope with this problem, which consists of selecting the packet (i.e. slice) to be dropped with respect to the MPEG-2 data hierarchy and the congestion level (e.g. switch queue length).

We propose some enhancements to A-PSD to support the forward error correction feature. The new scheme, named Selective and Adaptive Partial Slice Discard (SA-PSD), is performed at both the control block and video slice levels. Our approach is to reduce the number of corrupted slices by assuming that a number of cells per control block can be recovered by the destination SSCS sublayer using FEC techniques. Let us define this specific number as the drop tolerance (DT), which corresponds to the maximum number of cells per control block that may be discarded by SA-PSD before considering the control block as definitively corrupted. DT is usually set to the number of cells per row in the control block. Therefore, unlike simple A-PSD, SA-PSD stops discarding as soon as the congestion decreases, and only if the number of previously dropped cells in every control block is below DT, the drop tolerance. Using this approach, the proposed scheme acts at a finer data granularity and better preserves entire slices from elimination. The flexibility offered by our mechanism cannot be achieved without the use of Dex-PAS, which allows the detection of both slice and control block boundaries at the cell level. The proposed SA-PSD algorithm is outlined in the following.

5.4.2 SA-PSD Parameters

The SA-PSD scheme runs per-VC and employs four state variables and one counter variable to control each video connection. Two of them are associated with the slice level and the remaining ones with the control block level.

1. S_PRIORITY indicates the priority level of the current slice. The indicator is modified at the reception of the first cell of this slice according to its priority field (the two Ex-CLP bits). It indicates whether the switch is currently handling a high (S_PRIORITY=0) or a low (S_PRIORITY=1) priority slice.

2. S_DISCARDING indicates whether the switch is currently discarding (S_DISCARDING=1) this slice (e.g. the tail) or not (S_DISCARDING=0). Only the last cell of a slice (EOS) can change this indicator from discarding to not discarding. Other cells can only change the flag from not discarding to discarding.

3. CB_DROPPED is a counter which indicates, for the current control block, the number of cells discarded by the switch. It is initialized to zero at the reception of a new control block. This is needed so that we can check whether a control block (and thus a slice) is still recoverable.

4. CB_DISCARDING indicates whether the switch is currently discarding (CB_DISCARDING=1) this control block or not (CB_DISCARDING=0). Unlike the slice level control, the indicator changes from discarding to not discarding in two situations: the CB_DROPPED counter reaches the drop tolerance DT, or a new block is received. Other events (e.g. cell arrivals) can only change the flag from not discarding to discarding.

5. CB_EFCI_MARKING indicates whether the switch is tagging (CB_EFCI_MARKING=1) or not tagging (CB_EFCI_MARKING=0) the EFCI bit of the cells of the current control block. Only the last cell of a block (EOB) can change this indicator from marking to not marking. Moreover, only one event can change the state from not marking to marking: the arrival of a cell while the CB_DISCARDING indicator is in the 'not discarding' state and CB_DROPPED equals the tolerance DT.

The use of both the CB_DISCARDING and CB_EFCI_MARKING indicators allows us to manage more efficiently the losses that belong to one control block but occur at subsequent switches. Indeed, when a block is partially discarded by a switch node, the following switches are not able to take these cell losses into account to update the associated drop tolerance. As a consequence, the switches operate with an erroneous cell drop tolerance, with an adverse effect on algorithm performance. At the control block level, the drop tolerance DT can be seen as a loss credit shared by the crossed switches. To make implementation easy, we propose to consume the loss credit entirely as soon as a cell loss occurs. CB_DISCARDING is used to ensure that, for every control block, losses are concentrated in a single switch. If cells from a block tail arrive at a congested node, the EFCI bit allows the detection of non-recoverable blocks, since the whole drop credit has been used by a previous switch. In this situation, we propose to commit to the slice level control by entirely dropping the remaining slice.
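The control block level transitions described above can be sketched as a small per-VC state machine. This is one possible reading of the rules, not the thesis implementation: the function signature, the `congested` input (meaning that the switch has selected this connection for dropping) and the returned action strings are assumptions of the example, and the slice level fallback is left out.

```python
from dataclasses import dataclass


@dataclass
class BlockState:
    cb_dropped: int = 0            # CB_DROPPED counter
    cb_discarding: bool = False    # CB_DISCARDING flag
    cb_efci_marking: bool = False  # CB_EFCI_MARKING flag


def on_cell(st, dt, congested, first_of_block=False, eob=False):
    """Process one cell and return 'drop', 'mark' (set EFCI) or 'pass'."""
    if first_of_block:                       # a new block resets the credit
        st.cb_dropped = 0
        st.cb_discarding = False
    if not st.cb_discarding and st.cb_dropped == dt:
        st.cb_efci_marking = True            # credit fully spent: start tagging
    if congested and not st.cb_efci_marking and st.cb_dropped < dt:
        st.cb_discarding = True              # start consuming the loss credit
    if st.cb_discarding and not eob:         # EOB cells are never dropped
        st.cb_dropped += 1
        if st.cb_dropped == dt:              # credit exhausted: stop dropping
            st.cb_discarding = False
        return "drop"
    action = "mark" if st.cb_efci_marking else "pass"
    if eob:
        st.cb_efci_marking = False           # only EOB ends the marking state
    return action
```

In this sketch, once a drop starts the whole credit DT is consumed at the same switch, and the cells that follow carry the EFCI tag so that a later congested switch can recognise the block as having no credit left.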

5.4.3 SA-PSD Operation Modes

SA-PSD uses three buffer thresholds (Figure 5.4): Low_Threshold (LT), Medium_Threshold (MT) and High_Threshold (HT). The use of three thresholds instead of two reduces the speed of oscillation for the transmission of Dex-PAS RM cells and has shown better performance.

Figure 5.4: Buffer thresholds (queue axis from 0 to Qmax, marked at LT, MT and HT).

These thresholds define three operation modes:

1. Mode 1: If the buffer queue length (QL) is lower than Low_Threshold, the cells of every connection are accepted and may have EFCI marked if CB_EFCI_MARKING is activated.

2. Mode 2: If the total number of cells in the buffer exceeds Low_Threshold but is still below High_Threshold, then for every video connection currently emitting a low priority slice, SA-PSD starts to discard its incoming cells with respect to the drop tolerance associated with each connection. We propose to distribute the elimination fairly among the targeted connections using round robin service (i.e. each connection using this congested switch would, in turn, have a chance to get its cells dropped). If the light congestion persists, the algorithm switches to the slice level and starts to eliminate the incoming low priority cells until the reception of an EOS cell. Again, this switch is done in a round robin manner to guarantee fairness among connections. The last cells of a control block and slice (EOB and EOS) are always preserved from elimination since they provide an indication of the next control block or slice. The cells with higher priority are accepted in the buffer. This mode stops when the queue length falls to Low_Threshold.

3. Mode 3: This mode is activated when the queue length exceeds High_Threshold. Incoming slices are eligible for discarding regardless of their priority level. This mode behaves like Mode 2 in intelligently spreading the losses over connections with respect to their drop tolerance. It stops when the queue length falls below High_Threshold. As in Mode 2, EOB and EOS cells are preserved to avoid error propagation. This is feasible, since usually 10% of the switch buffer is set aside to accommodate the system control and management messages and other important cells.
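The mode selection and the resulting discard eligibility can be summarised in a few lines. The sketch below is a simplification under stated assumptions: a queue length exactly equal to a threshold is assigned to the lower mode, and the round robin selection among connections and the drop tolerance check are omitted. The function names are illustrative.

```python
def operation_mode(ql, lt, ht):
    """Map the queue length onto SA-PSD operation modes 1, 2 or 3."""
    if ql <= lt:
        return 1
    if ql <= ht:
        return 2
    return 3


def may_discard(mode, low_priority, is_eob_or_eos):
    """Whether a cell is a candidate for discarding in the given mode."""
    if is_eob_or_eos or mode == 1:
        return False          # boundary cells are always preserved
    if mode == 2:
        return low_priority   # only low priority slices are targeted
    return True               # mode 3: any priority level
```

For instance, with LT = 70 and HT = 90 (in arbitrary units), a queue length of 80 selects Mode 2, in which a high priority cell is still accepted.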

The Medium_Threshold is used to control P-cell priority assignment: I/PB cells are transmitted to all the video sources when MT is exceeded; upon receiving these cells, the sources start to set P-cells to low priority. When the queue length drops below Low_Threshold, IP/B RM cells are sent, and at the reception of these feedback signals, P-cells are switched back to high priority immediately. Consequently, some P-frames may transmit cells with different priorities.
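The P-cell priority control behaves as a hysteresis loop between MT and LT. The sketch below models only that loop; it assumes the feedback reaches the source instantly (the RM cell round trip is ignored), and the class name is hypothetical.

```python
class PCellPriority:
    """Tracks whether sources should currently send P-cells as low priority."""

    def __init__(self, lt, mt):
        self.lt = lt
        self.mt = mt
        self.p_low = False    # P-cells start as high priority

    def update(self, ql):
        """Update the state from the current queue length and return it."""
        if ql > self.mt:
            self.p_low = True     # MT exceeded: demote P-cells
        elif ql < self.lt:
            self.p_low = False    # below LT: restore high priority
        return self.p_low
```

Between LT and MT the previous state is kept, which is what limits the rate at which priority changes are signalled to the sources.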

Using this adaptive strategy, B-slices are dropped first to quickly reduce buffer occupancy during light congestion, while P- and I-slices are preserved from elimination. If the congestion persists, B- and P-slices are both candidates for elimination; this situation is followed by gradually including I-frame cells if congestion worsens.

6 EXPERIMENT AND RESULT

6.1 Simulation Environment

To evaluate the end-to-end behavior of MPEG-2 traffic over an ATM network using the proposed video quality of service control framework, we implemented these schemes in an ATM simulator and performed simulation experiments. In this section, we discuss the experiment environment. We start with a presentation of the NIST simulator we used as our software tool. Then, after a brief description of the network topology used in the simulations, we proceed to discuss the MPEG-2 trace files. The parameters we used for network configuration and performance measurement are discussed in Section 6.4.

6.1.1 The NIST ATM Simulator

The National Institute of Standards and Technology (NIST) developed an ATM simulator in mid-1994, and provided the simulator and source code to the public. The initial version (1.0) provided basic ATM networking capabilities. Version 2.0 was made available in 1995 and included some ABR flow control options, such as EFCI and ER.

The simulator gives the user an interactive modeling environment with a graphical user interface. It uses both the C language and the X Window System running on a UNIX platform and allows the user to create different network topologies, set the parameters of component operation and save/load the different simulated configurations.

The network to be simulated consists of several components sending messages to one another. The components available include ATM switches, Broadband Terminal Equipment (BTE), and ATM applications. Switches and BTE components are interconnected with physical links, which are also considered components. The ATM applications may be considered as traffic generators that are capable of emulating variable or constant bit rate traffic sources. They are connected to each other over a route that uses a selected list of adjacent components to form an end-to-end virtual connection.

Every component, such as a switch, BTE or link, can be configured as to what type of data should be logged during a simulation. These data can be link utilization, buffer usage, cell loss ratio, cell transfer delay, cells received/transmitted, etc. The simulations can then be executed and the data analyzed.

6.1.2 Network Model

The network topology used is shown in Figure 6.1. It consists of two ATM switches, ten MPEG-2 application sources and ten receivers, each connected to a BTE. The MPEG-2 trace file is read by the sources and then sent through the two switches to the destinations. The backbone link (LINK-B) between these two switches (SWITCH_1 and SWITCH_2) is shared by all connections, with a capacity of 155 Mbps (to simulate an OC-3 link). Our experiments were run in both LAN and WAN configurations, with backbone links ranging from 1 to 1000 km. All the other link distances (between the source/destination and the switch nodes) are constant and set to 0.2 km. The ATM switches are implemented to be non-blocking and output-buffered with a finite amount of buffering, and the switch buffer size varies from 80000 to 220000 cells for both SWITCH_1 and SWITCH_2. All BTEs have infinite buffers.

Figure 6.1: Network topology used in the simulation

6.2 MPEG-2 Trace File

We obtained the MPEG-2 file from Michael R. Izquierdo, IBM Corporation; a detailed description of this file can be found in [37]. The video sequence shows a flower garden located in the bottom half of the screen and a row of houses in the background towards the top of the scene. The camera tracks this scenery from left to right. Table 6.1 shows the cells/slice statistics for the video.

File Size (Bytes)        2,819,836
Total Pictures           150
Compression Ratio        6.741:1
Peak Cell Rate (Mbps)    105
Mean Cell Rate (Mbps)    26.608
Peak Rate (Mbps)         20.034
Mean Rate (Mbps)         5.077
Peak/Mean                3.9462

Table 6.1: Statistics data of MPEG-2 trace file used in simulation

The video sequences use SIF format and were encoded at a resolution of 352 x 240 pixels per frame, a frame rate of 30 frames/sec, and 15 slices/frame. The sequence is 150 frames (5 sec) long. In order to run the experiment for a sufficient period, we transmit it repeatedly, starting again from the beginning after it reaches the end. The distance between two I frames is 6 and that between I and P frames is 3, so, as discussed in Chapter 2, the GOP pattern is IBBPBBI... A slice consists of one macroblock row of 352x16 pixels.
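The frame pattern and the spacing between reference frames follow directly from the two distances quoted above. The sketch below is only an illustration; the function names are hypothetical, and the slice count of 15 per frame is taken from this paragraph.

```python
SLICES_PER_FRAME = 15


def gop_pattern(i_distance, p_distance):
    """Frame types of one GOP, e.g. distances (6, 3) give 'IBBPBB'."""
    return "".join(
        "I" if k == 0 else "P" if k % p_distance == 0 else "B"
        for k in range(i_distance)
    )


def slices_between_reference_frames(p_distance):
    """Number of slices from the start of one I/P frame to the next."""
    return p_distance * SLICES_PER_FRAME


print(gop_pattern(6, 3))
print(slices_between_reference_frames(3))
```

With a reference frame every 3 frames and 15 slices per frame, consecutive I or P frames start 45 slices apart, which is consistent with the forty-five-slice pulse period reported for the trace.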

Figure 6.2 shows the number of ATM cells per slice for the first 20 frames. Distinctive pulses clearly occur at deterministic time intervals. The pulse period is determined by the GOP pattern: every forty-five slices, there are alternating pulses caused by I and P frames. The spacing between pulses consists of B frames. We use the same file for all of the senders. Since each sequence has the same I/P/B frame pattern, I frames will always overlap for the duration of playback if the sources send video streams at the same time, which exaggerates the burstiness of the traffic. For this reason, we shift the send times so that I and P frames from one sequence overlap B frames from another source. Figure 6.3 shows the result of multiplexing the shifted MPEG-2 traffic.

Figure 6.2: Number of ATM cells per slice for the first 20 frames of the MPEG-2 video sequence.

6.3 Several Assumptions

Before discussing the experimental results, let us make the following assumptions. First, as indicated in Chapter 5, the level of congestion is monitored through the occupancy of the switch buffers, and we assume that a shared output FIFO buffer is used. Three congestion thresholds are defined, namely Low Threshold (LT), Middle Threshold (MT) and High Threshold (HT). As to the transfer delay, we have the following estimates:

• Propagation delay between the sender and the receiver varies from 0.005 msec to 5.0 msec, if we assume the medium speed to be 200,000 km/s (two thirds of light speed). 0.005 msec is equivalent to a propagation distance of about 1 km, while 5.0 msec corresponds to 1000 km.

Figure 6.3: Number of cells per slice after multiplexing of all sources with time shift (x-axis: slice number).

• Queuing delay varies from 0 to a maximum value of 0.6 sec, which corresponds to the maximum buffer size (220,000 cells) when transmitted over a 155 Mbps link.

• The processing delay for the sender can be assumed to be negligible, due to pipelined data transmission and encoding of the appended data. At the receiver, the following FEC processing time for error recovery in the SSCS layer is assumed:

  - FEC-SSCS without error: 0.092 ms/slice (12 cell transmission time for a 155 Mbps link).

  - FEC-SSCS with error: 0.46 ms/slice (12 x M cell transmission time for a 155 Mbps link, where M is the number of rows in a control block, set to 5 in most cases).

The additional processing delays generated at the other layers (e.g. SAR and ATM) are not explicitly modeled. We assume that their contribution to the end-to-end delay experienced by the cell is relatively constant, and thus can be omitted.
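The propagation and queuing figures above can be reproduced with simple arithmetic. The sketch below assumes whole 53-byte cells sent at a nominal 155 Mbps (physical layer overhead is ignored), and takes the per-slice FEC time of 0.092 ms as given rather than deriving it; the function names are illustrative.

```python
CELL_BITS = 53 * 8            # one ATM cell, header included
LINK_BPS = 155e6              # nominal OC-3 rate used in the text
MEDIUM_SPEED_KM_S = 200_000   # two thirds of the speed of light


def propagation_delay_ms(distance_km):
    """One-way propagation delay in milliseconds."""
    return distance_km / MEDIUM_SPEED_KM_S * 1000


def max_queuing_delay_s(buffer_cells):
    """Time to drain a full buffer onto the link, in seconds."""
    return buffer_cells * CELL_BITS / LINK_BPS


def fec_delay_with_error_ms(m_rows, per_slice_ms=0.092):
    """Assumed receiver FEC time when a block of M rows must be repaired."""
    return m_rows * per_slice_ms


print(propagation_delay_ms(1), propagation_delay_ms(1000))
print(round(max_queuing_delay_s(220_000), 3))
print(round(fec_delay_with_error_ms(5), 2))
```

Under these assumptions a full 220,000-cell buffer takes about 0.6 s to drain, so queuing dominates both the propagation delay and the FEC processing time.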

6.4 Parameters

We carried out our simulation with seven switch buffer configurations. For each of them, the same method is applied to determine the values of the three thresholds; specifically, HT, MT and LT are set to 0.9, 0.8 and 0.7 of the maximum queue size (Qmax), respectively, where Qmax is set to one of the following values: 80,000, 100,000, 120,000, 140,000, 160,000, 180,000, 200,000 and 220,000 cells.
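For reference, the threshold values implied by this rule can be tabulated as follows. This is a small sketch using integer arithmetic; the function name is hypothetical.

```python
QMAX_VALUES = range(80_000, 220_001, 20_000)


def thresholds(qmax):
    """Return (LT, MT, HT) as 0.7, 0.8 and 0.9 of Qmax, in cells."""
    return qmax * 7 // 10, qmax * 8 // 10, qmax * 9 // 10


for qmax in QMAX_VALUES:
    lt, mt, ht = thresholds(qmax)
    print(qmax, lt, mt, ht)
```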

Data unit        Definition
Lost Cell        A cell dropped by the discarding scheme.
Dead Cell        A cell received at the destination but belonging to a
                 partially discarded slice.
Late Cell        A cell arriving at the destination after a time-out has
                 expired. This time-out is triggered at the reception of
                 the first cell of every picture. Its value is set to
                 1/N sec, where N is the frame rate of the video sequence.
Correct Cell     A cell that is neither lost, dead nor late.
Correct Slice    A slice received with only correct cells, or corrupted
                 but recoverable by FEC.

Table 6.2: Data unit definitions

Table 6.2 summarizes the possible states of a cell crossing the network and Table 6.3 defines the investigated performance parameters. Notice that the Slice Loss Ratio (SLR) is measured at the application layer and takes into account decoding (e.g. lost cells) and propagation (e.g. late cells) constraints.

We compare the performance of the proposed framework (called Dex-SA-PSD hereafter) with the three following schemes:

• Random discarding with no priority assignation scheme (No-RD).

• Selective cell discarding with extended priority assignation scheme (Ex-SCD).

• Partial slice discarding with extended priority assignation scheme (Ex-PSD).


Performance Parameters                 Definition
I-frame Cell Loss Ratio (CLR-I)        Number of lost and late cells belonging to I-frames
                                       vs. the total number of cells transmitted.
P-frame Cell Loss Ratio (CLR-P)        Number of lost and late cells belonging to P-frames
                                       vs. the total number of cells transmitted.
B-frame Cell Loss Ratio (CLR-B)        Number of lost and late cells belonging to B-frames
                                       vs. the total number of cells transmitted.
I-frame Slice Loss Ratio (SLR-I)       Number of corrupted I-frame slices vs. number of
                                       slices transmitted.
P-frame Slice Loss Ratio (SLR-P)       Number of corrupted P-frame slices vs. number of
                                       slices transmitted.
B-frame Slice Loss Ratio (SLR-B)       Number of corrupted B-frame slices vs. number of
                                       slices transmitted.
Mean Cell Transfer Delay (Mean CTD)    Average time between the departure of a cell from
                                       the source node ($t_d$) and its arrival at the
                                       destination node ($t_a$):
                                       $D = \frac{1}{n}\sum_{k=1}^{n}\left(t_a^k - t_d^k\right)$,
                                       where $n$ is the number of cells.

Table 6.3: Performance Parameters Definitions
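The definitions in Tables 6.2 and 6.3 translate into simple counting rules. The sketch below is one way to express them, assuming that a cell is late when it arrives more than 1/N seconds after the first cell of its picture and that departure and arrival times are available as parallel lists; all function names are illustrative.

```python
def is_late(arrival_s, first_cell_arrival_s, frame_rate):
    """Late cell test: the time-out of 1/N s starts at the picture's first cell."""
    return arrival_s - first_cell_arrival_s > 1.0 / frame_rate


def cell_loss_ratio(lost, late, transmitted):
    """CLR for one frame type: (lost + late) cells over cells transmitted."""
    return (lost + late) / transmitted


def slice_loss_ratio(corrupted_slices, transmitted_slices):
    """SLR for one frame type."""
    return corrupted_slices / transmitted_slices


def mean_ctd(departures_s, arrivals_s):
    """Mean cell transfer delay over the cells that reached the destination."""
    delays = [a - d for d, a in zip(departures_s, arrivals_s)]
    return sum(delays) / len(delays)
```

For a 30 frames/sec sequence the time-out is about 33 ms, so a cell arriving 50 ms after the first cell of its picture would be counted as late.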

7 DISCUSSION

7.1 Results at Cell-level

Figure 7.1: Cell loss ratio (Aggregate)
Figure 7.2: Cell loss ratio (I-frame)
Figure 7.3: Cell loss ratio (P-frame)
Figure 7.4: Cell loss ratio (B-frame)

(Each plot shows the cell loss ratio against buffer size in Kcells for Dex-SA-PSD, Ex-PSD, Ex-SCD and No-RD.)

Figures 7.1, 7.2, 7.3 and 7.4 show the cell loss ratio (CLR) for the aggregate, I-, P- and B-frame cell flows, respectively. From Figure 7.1 we can see that there are slight differences between the aggregate cell loss curves of the different schemes (No-RD, Ex-SCD, Ex-PSD and Dex-SA-PSD), with No-RD having the minimum CLR. This is because, in No-RD, a switch accommodates every cell until the buffer overflows, and only then starts to discard cells without discrimination. As a result, it makes the most use of the buffer and has the lowest aggregate CLR. In contrast, the other three schemes take a preventive strategy and drop cells before the switch buffer is full in order to protect the important information (high priority cells) from discarding. Thus, their switch buffer utilization is not as good as that of No-RD. Ex-SCD has the second lowest aggregate cell loss ratio because it discards data at the cell level. It stops dropping as soon as the switch buffer occupancy decreases below Low_Threshold. By contrast, both Ex-PSD and Dex-SA-PSD may stop elimination only at the reception of the end of slice (EOS) cell, even though the congestion has already gone. They experience a higher cell loss ratio regardless of the picture type.

Dex-SA-PSD behaves better than Ex-PSD because, as discussed in Section 5.4, it has two modes of discarding cells during congestion. First, it tries to drop cells within a control block, and returns to normal as soon as the congestion is alleviated; in this mode, the dropped data can be recovered at the receiving end through FEC and the whole slice is considered to be correct. If the congestion persists, it switches to the second mode, where the whole slice is dropped. In this way, cells are preserved better than in Ex-PSD, which has a single discarding mode. In addition, we applied a round-robin strategy to distribute the loss fairly among the connections: when the switch becomes congested, it tries to discard cells from one virtual connection after the other, while in Ex-PSD, all connections discard cells in the case of congestion. The cell loss ratio decreases by 15.4%, 17.6%, 21.6% and 34.8% (Ex-SCD, Ex-PSD, Dex-SA-PSD and No-RD respectively) as the buffer size increases from 80,000 to 220,000 cells.

As illustrated in Figures 7.2, 7.3 and 7.4, all three preventive schemes, namely Ex-SCD, Ex-PSD and Dex-SA-PSD, concentrate the loss within the B-frames and protect the reference I- and P-frames. At the extreme, Ex-SCD and Ex-PSD have no CLR-I at all; Dex-SA-PSD has a non-zero CLR-I, but one much lower than that of No-RD. Again, the round-robin manner we applied to ensure fairness in Dex-SA-PSD plays a role here: since only one connection is subject to discarding at a time, the reaction of discarding B-frames to protect I-frames is not as prompt as in Ex-SCD and Ex-PSD. Indeed, when an I-frame is transmitted in Dex-SA-PSD, the buffer queue length rapidly increases to accommodate the bursty traffic. The queue length exceeds the high threshold more often. Because the switch discards cells indiscriminately when HT is exceeded, this scheme endures more I-frame cell loss than Ex-SCD and Ex-PSD. No-RD has the highest I-cell loss ratio because it drops cells blindly all the time.

As to the CLR-P, No-RD again suffers the most loss, because no protection is applied at all. Ex-SCD has the minimum value for the same reason as for the I-cell CLR, i.e. it protects high priority cells at the cell level and stops dropping as soon as congestion decreases.

As illustrated, B-frame cells are the most affected by loss and contribute largely to the overall loss ratio in Ex-SCD, Ex-PSD and Dex-SA-PSD. After them, P- and I-frame cells are, in this order, the most subject to loss. This is caused by the drop policy of these three preventive schemes, as well as by the frequency of B-frames in video sequences. Indeed, in our MPEG-2 video samples, the proportions of I, P and B data are respectively 53%, 24% and 23% of the aggregate stream. Due to the GOP pattern and the multiplexing process, B-frames occur more often and are more likely to be discarded, e.g. 16 B-frame occurrences per second, compared with only 2 and 6 for I- and P-frames.
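The figures quoted above can be combined in a short calculation. The numbers are taken from the text; the function names are hypothetical and the result is only a rough per-frame average, not a measurement.

```python
# Values quoted in the text for the MPEG-2 video samples.
frames_per_second = {"I": 2, "P": 6, "B": 16}
data_share = {"I": 0.53, "P": 0.24, "B": 0.23}


def frame_share(counts):
    """Fraction of all frames that belong to each frame type."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def share_per_frame(share, counts):
    """Average share of one second of data carried by a single frame."""
    return {k: share[k] / counts[k] for k in counts}
```

With these values, B-frames make up two thirds of the frames (16 out of 24 per second), while a single B-frame carries on average a much smaller part of the data than a single I- or P-frame.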

Figure 7.5: Mean cell transfer delay (x-axis: buffer size in Kcells; curves: Dex-SA-PSD, Ex-PSD, Ex-SCD, No-RD)

From Figure 7.5, we can see that the mean cell transfer delay (CTD) grows on the same order of magnitude as the buffer size. Unsurprisingly, No-RD has the largest mean CTD, since it accommodates every cell in its switch buffer until overflow occurs and thus endures the largest queueing delay. We may also notice that Dex-SA-PSD has a longer mean CTD than the other two schemes. This is mainly due to the overhead introduced by the FEC-SSCS sublayer. Indeed, the data flow in Dex-SA-PSD contains a certain amount of redundancy used for forward error recovery; therefore, it consumes more bandwidth and results in a larger switch occupancy. This is further shown by Figures 7.6, 7.7, 7.8 and 7.9, which show the buffer occupancy of each scheme. Ex-SCD and Ex-PSD start to drop B-frame cells when light congestion happens; they thus have the lowest buffer occupancy and minimize the transfer delay of the high priority cells. Dex-SA-PSD behaves similarly but has a larger average queue length because of the redundant data it introduces.
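The link between buffer occupancy and transfer delay can be made concrete with a back-of-the-envelope formula. The sketch below assumes a FIFO buffer drained at a constant link rate; the function name and the example link rate are assumptions, since the simulated link rate is not restated in this chapter.

```python
CELL_BITS = 53 * 8  # an ATM cell is 53 bytes long


def queueing_delay_seconds(queue_len_cells, link_rate_bps):
    """Time a newly arrived cell waits behind `queue_len_cells` cells."""
    return queue_len_cells * CELL_BITS / link_rate_bps


# Example: the delay grows linearly with the occupancy.
# 155.52e6 bit/s (an OC-3 link) is an assumed value, not one from the thesis.
short_wait = queueing_delay_seconds(80_000, 155.52e6)
long_wait = queueing_delay_seconds(220_000, 155.52e6)
```

Under this model, any scheme that keeps the average queue shorter, for instance by dropping B-frame cells early, directly lowers the mean CTD of the cells that remain.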

Figure 7.6: Buffer occupancy in No-RD

Figure 7.7: Buffer occupancy in Ex-SCD

Figure 7.8: Buffer occupancy in Ex-PSD

Figure 7.9: Buffer occupancy in Dex-SA-PSD

The difference in mean CTD among these four schemes increases as the buffer size goes up. This can be explained by the fact that the preventive B-frame cell elimination approach works better when more buffer space is available. With a limited buffer size, the space saved by dropping B-cells is also limited, and therefore the preventive schemes perform just like No-RD.


7.2 Results at Slice-Level

In this section, we compare the efficiency of Dex-SA-PSD with that of the other techniques in their ability to provide a message-based service. The video slice loss ratio (SLR) is measured at the application layer and takes into account decoding (e.g. lost cells) and propagation (e.g. late cells) constraints. In addition, the forward error correction capability is considered in the Dex-SA-PSD framework.
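As an illustration of how such a slice-level metric can be computed, the following sketch counts a slice as lost when the number of missing or late cells exceeds what FEC can restore. The record layout, the field names and the treatment of late cells as erasures are assumptions made for this example, not the definitions used in the simulator.

```python
from dataclasses import dataclass


@dataclass
class SliceRecord:
    cells_sent: int
    cells_lost: int       # cells dropped in the network
    cells_late: int       # cells arriving after the decoding deadline
    recoverable: int = 0  # cells FEC can restore (0 when no FEC is used)


def slice_is_lost(s):
    """A slice is unusable if more cells are missing than FEC can rebuild."""
    return s.cells_lost + s.cells_late > s.recoverable


def slice_loss_ratio(slices):
    """Fraction of slices that cannot be delivered intact to the decoder."""
    if not slices:
        return 0.0
    return sum(slice_is_lost(s) for s in slices) / len(slices)
```

Without FEC (`recoverable = 0`), a single lost cell is enough to lose the whole slice, which is why a scheme can have a higher cell loss ratio and still a lower slice loss ratio.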

Figure 7.10: Slice loss ratio (Aggregate) (x-axis: buffer size in Kcells)

Figure 7.11: Slice loss ratio (I-frame) (x-axis: buffer size in Kcells)

As expected, the performance of Dex-SA-PSD at the slice level is better, and this is clearly shown in Figure 7.10. The proposed framework significantly improves the percentage of non-corrupted slices arriving at the destination. Indeed, the aggregated SLR is reduced to an upper bound of 5.8% of the total number of transmitted slices. In comparison, No-RD, Ex-SCD and Ex-PSD reach 16.6%, 12.2% and 10.9% respectively.

To complete the comparison, Figures 7.11, 7.12 and 7.13 summarize the SLR per sub-flow for the four approaches. We observe that Ex-PSD and Ex-SCD outperform the other approaches in protecting I-frames. This is consistent with the results measured at the cell level. As we discussed in the previous section, this is the cost to pay for the fair distribution of cell discarding among the VCs that we applied in Dex-SA-PSD.

Figure 7.12: Slice loss ratio (P-frame) (x-axis: buffer size in Kcells)

Figure 7.13: Slice loss ratio (B-frame) (x-axis: buffer size in Kcells)

As to P-frames and B-frames, Dex-SA-PSD demonstrates the best SLR value. This indicates the capability provided by the FEC mechanism to protect data at the slice level.

7.3 Distance Effect

Figure 7.14: CLR(Aggregate) with different distance

Figure 7.15: CLR(I-frame) with different distance

Figure 7.16: CLR(P-frame) with different distance

Figure 7.17: CLR(B-frame) with different distance

(Figures 7.14 to 7.17: x-axis: buffer size; curves: Distance = 1000 km and Distance = 1 km)

As mentioned in Section 5.1, in our framework, the extended CLP Priority Assignation Scheme (Dex-PAS) uses the Ex-CLP bits and dynamically assigns cell priorities according to the current MPEG-2 frame type and network state. Therefore, it depends on the reception of a backward congestion signal from the network. The main drawback is that its efficiency depends on the round trip time. To validate this, we compare two sets of results obtained with the new scheme, one with a backbone link distance of one kilometer (simulating the LAN case) and the other with one thousand kilometers (simulating the WAN case).
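To give an order of magnitude for the two cases, the round trip propagation time can be estimated as follows. The propagation speed of 2 x 10^8 m/s (roughly two thirds of the speed of light, typical for optical fibre) and the function names are assumptions; queueing and processing delays are ignored.

```python
PROPAGATION_SPEED_M_PER_S = 2.0e8  # assumed signal speed in fibre
CELL_BITS = 53 * 8                 # an ATM cell is 53 bytes long


def round_trip_time_s(distance_km):
    """Propagation-only round trip time over a link of `distance_km`."""
    return 2 * distance_km * 1000 / PROPAGATION_SPEED_M_PER_S


def cells_sent_during_rtt(distance_km, link_rate_bps):
    """Cells a source can emit before feedback on the first one returns."""
    return round_trip_time_s(distance_km) * link_rate_bps / CELL_BITS
```

Under this assumption the round trip propagation time is about 10 microseconds for 1 km and about 10 milliseconds for 1000 km, so in the WAN case a thousand times more cells are sent with a stale priority setting before the feedback can take effect.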

Figures 7.14, 7.15, 7.16 and 7.17 show the results for the cell loss ratio. In the WAN situation, the aggregated CLR shows only a slight difference from that of the LAN. However, the I-frame cell loss becomes much worse. This is expected and can be explained as follows. The Dex-PAS scheme dynamically assigns a low or high priority to P-frames depending on the feedback from the network. In the LAN situation, this can be done promptly to reflect the current congestion state, and thus P-frames can be dropped to alleviate the congestion as soon as the queue length exceeds MT (middle threshold). In the WAN, due to the longer round trip time (RTT), the feedback is slower and thus the priority assignation level cannot be adjusted as fast. Therefore, the protection of I-frames is not as good when congestion gets worse. Since the cost of I-frame protection is to sacrifice some P-frame cells, the overall CLR remains almost the same. The same is true at the slice level, as shown in Figures 7.18, 7.19, 7.20 and 7.21.

Figure 7.18: SLR(Aggregate) with different distance

Figure 7.19: SLR(I-frame) with different distance

7.4 Redundancy Vs Data Ratio

In this section, we turn to the effect introduced by the redundant data used in FEC. As discussed in Section 5.3, the FEC capability is based on the redundancy to data ratio.

In our algorithm, the redundancy is determined by M, the number of rows in one control block. A smaller M value means that a row of FEC coding (redundant) data is applied to a smaller group of user data, and thus gives a stronger error correcting capability. We use M as an indicator of the redundancy to data ratio hereafter. The redundant data per FEC control block needed in order to obtain sufficient performance mainly depends on the cell loss pattern: a larger amount of redundant data is required for strongly correlated cell loss, as mentioned in Chapter 5. As one of its advantages, the proposed FEC scheme uses a variable-size matrix as the FEC control block, and thus can easily be optimized for both the amount of redundancy data per FEC frame and the actual FEC block size. As a result, this FEC scheme can be fine tuned to achieve sufficient throughput and latency performance with reasonable transmission overhead. As an example, the following demonstrates the behavior of our framework using different values of M. Notice that M could be adjusted even during a connection session by using a network management facility.
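The relation between M and the transmission overhead can be illustrated with a small calculation. The sketch assumes that M counts the user data rows of a control block and that a fixed number of redundant rows (one by default) is added per block; both the convention and the function names are assumptions for illustration, since the exact block layout is defined in Chapter 5.

```python
def redundancy_to_data_ratio(m_rows, redundant_rows=1):
    """Redundant rows divided by data rows for one control block."""
    if m_rows <= 0:
        raise ValueError("m_rows must be positive")
    return redundant_rows / m_rows


def bandwidth_expansion(m_rows, redundant_rows=1):
    """Factor by which the transmitted volume grows because of FEC."""
    if m_rows <= 0:
        raise ValueError("m_rows must be positive")
    return (m_rows + redundant_rows) / m_rows
```

Under these assumptions, M = 5 gives a ratio of 0.2 and M = 25 gives 0.04, so lowering M strengthens the protection but also increases the traffic offered to the network.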

Figure 7.20: SLR(P-frame) with different distance

Figure 7.21: SLR(B-frame) with different distance

(Figures 7.18 to 7.21: x-axis: buffer size; curves: Distance = 1000 km and Distance = 1 km)

Figures 7.22, 7.23, 7.24, 7.25, 7.26, 7.27, 7.28 and 7.29 show the CLR and SLR results when different values of M are used. For the CLR, it is clear that the loss ratio decreases as the redundancy decreases (M increases), since less overhead is transmitted to the network and no recovery is considered at this level. For the SLR, it is interesting to notice that there exists an optimal value of M, above which the SLR increases as M increases, and below which the SLR increases as M decreases. This can be explained by the two effects introduced by appending redundant data for error recovery. First, if we add more redundant information, we get a more powerful error recovery capability and, as a result, the loss ratio at the slice level can be reduced. This is called the recovery effect. On the other hand, the appended data consumes more bandwidth, which can worsen the congestion that already exists in the network. As a result, the cell loss ratio increases and so results in a larger slice loss ratio. This is called the overhead effect.

Therefore, when the redundancy to data ratio is too low (large M value), the error correction power provided by the redundant data is insufficient, and the SLR decreases as M decreases as a result of the increasing recovery power. That is, the recovery effect dominates in this phase. This continues until M falls below a certain value (around 25 in our experiment); then the SLR starts to increase because the overhead introduced becomes significant and causes more and more buffer aggregation. In this phase, the recovery effect is outweighed by the overhead effect.

Figure 7.22: CLR(Aggregate) with different redundancy

Figure 7.23: CLR(I-frame) with different redundancy

Figure 7.24: CLR(P-frame) with different redundancy

Figure 7.25: CLR(B-frame) with different redundancy

Figure 7.26: SLR(Aggregate) with different redundancy

Figure 7.27: SLR(I-frame) with different redundancy

Figure 7.28: SLR(P-frame) with different redundancy

Figure 7.29: SLR(B-frame) with different redundancy

(Figures 7.22 to 7.29: x-axis: M size in rows)

8 CONCLUSION

8.1 Conclusions


In this thesis, we studied the transmission of encoded MPEG-2 video data over ATM best effort services (i.e. VBR). We have surveyed a number of issues related to the coding and control of MPEG-2 video traffic in ATM networks. Based on this knowledge, we proposed and evaluated a quality of service control framework, which takes into account the specific stochastic properties of MPEG-2 video traffic. This framework consists of several components that satisfy different requirements for improving the end-to-end quality of service of video applications in ATM networks.

First, a new priority data partition technique which extends the ATM prioritization capability is proposed. To better cope with the hierarchical MPEG-2 data transmission requirements, a new field located in the ATM cell header is defined and referred to as Ex-CLP. This field comprises the classical CLP bit and the adjacent PTI ATM-user-to-ATM-user bit. By gathering these two bits together, a better utilization of the cell header is achieved. This extended priority assignment strategy minimizes the loss of critical video frames and provides better performance than classical CLP-based techniques. However, the main drawback of this strategy is that its efficiency depends on the round trip time, and thus on the network topology and link distances.
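The position of these two bits in the cell header can be illustrated as follows. In the standard 5-byte ATM cell header, the fourth octet carries the low VCI bits, the 3-bit PTI and the CLP bit, so the lowest PTI bit and the CLP bit are the two least significant bits of that octet. The function names are hypothetical, and the meaning assigned to each 2-bit value is defined in Section 5.1 and is deliberately not reproduced here.

```python
AUU_MASK = 0b10   # lowest PTI bit (ATM-user-to-ATM-user indication)
CLP_MASK = 0b01   # cell loss priority bit
EX_CLP_MASK = AUU_MASK | CLP_MASK


def get_ex_clp(header):
    """Return the 2-bit Ex-CLP value from a 5-byte ATM cell header."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is 5 bytes long")
    return header[3] & EX_CLP_MASK


def set_ex_clp(header, value):
    """Return a copy of the header with the Ex-CLP field set to `value`."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is 5 bytes long")
    if not 0 <= value <= 3:
        raise ValueError("Ex-CLP is a 2-bit field")
    out = bytearray(header)
    out[3] = (out[3] & ~EX_CLP_MASK & 0xFF) | value  # keep the other bits
    return bytes(out)
```

Two bits give four priority levels instead of the two offered by the CLP bit alone, without enlarging the header.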

To support this slice-based data partition scheme, a new MPEG-2 video stream encapsulation strategy is also presented. In this scheme, each MPEG-2 PES packet contains exactly one video slice, so the decoder can easily determine the start and end of the slice, and the network transport and control policies can take advantage of this to offer a guaranteed packet-oriented service.
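As a rough sketch of the first step of such an encapsulation, the following function cuts an MPEG-2 video elementary stream into slices by looking for start codes. In MPEG-2 video, a start code is the byte sequence 00 00 01 followed by one code byte, and the values 0x01 to 0xAF identify slices. The function name is hypothetical, and the construction of the PES packet header itself is not shown.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def split_into_slices(es):
    """Return each slice (from its start code up to the next start code)."""
    # Locate every start code: the 3-byte prefix followed by one code byte.
    positions = []
    i = es.find(START_CODE_PREFIX)
    while i != -1 and i + 3 < len(es):
        positions.append((i, es[i + 3]))
        i = es.find(START_CODE_PREFIX, i + 3)

    slices = []
    for n, (pos, code) in enumerate(positions):
        if 0x01 <= code <= 0xAF:  # slice start code range
            end = positions[n + 1][0] if n + 1 < len(positions) else len(es)
            slices.append(es[pos:end])
    return slices
```

Each returned byte string could then be carried as the payload of one PES packet, so that packet boundaries and slice boundaries coincide.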

In order to handle cell loss and signal errors in the network, we designed a slice-based service specific convergence sublayer, which enhances the classical ATM adaptation layer type 5. Among the additional features supported by the new extended AAL-5 are the ability to distinguish video frame types, as well as the detection and forward correction of losses and errors on both a byte and a cell basis. The forward error correction is based on Reed-Solomon and parity codes. In comparison with schemes based only on a Reed-Solomon code with byte interleaving, our approach permits the use of a flexible matrix structure and a correction granularity of both byte and cell. Moreover, it better takes into account the fixed structures of the MPEG-2 transport stream packet and the ATM cell in order to avoid bit padding at the lower ATM convergence sublayer. To implement this FEC mechanism, we defined a control block as a two-dimensional matrix, which contains fixed-length service specific convergence sublayer PDUs. After being constructed by the source-side FEC process, this control block is used by the destination side to detect, localize and finally correct errors.
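The idea of a control block organized as a matrix can be illustrated with the simplest possible code: a single XOR parity row. This is a deliberately reduced stand-in and not the Reed-Solomon plus parity scheme of the thesis; the function names and the one-parity-row layout are assumptions. It only shows how one lost row (for instance a lost cell payload) whose position is known can be rebuilt from the remaining rows.

```python
def xor_rows(rows):
    """Byte-wise XOR of equal-length rows."""
    out = bytearray(len(rows[0]))
    for row in rows:
        for i, b in enumerate(row):
            out[i] ^= b
    return bytes(out)


def build_control_block(data_rows):
    """Append one XOR parity row to the data rows."""
    if len({len(r) for r in data_rows}) != 1:
        raise ValueError("all rows must have the same length")
    return list(data_rows) + [xor_rows(data_rows)]


def recover_row(block, lost_index):
    """Rebuild the row at `lost_index` from all the other rows."""
    survivors = [r for i, r in enumerate(block) if i != lost_index]
    return xor_rows(survivors)
```

A single parity row can repair only one erased row per block, which is one reason why a stronger code is needed when cell losses are correlated.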

We also presented an intelligent video-oriented switch data dropping scheme (selective-adaptive partial slice discarding, SA-PSD), which is performed at both the control block and video slice levels. Unlike the traditional partial slice discarding scheme, SA-PSD stops discarding as soon as the congestion decreases, and only if the number of previously dropped cells in every control block is below a drop tolerance. By using this approach, the proposed scheme acts at a finer data granularity and better preserves entire slices from elimination.
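The stopping rule stated above can be written as a small predicate. This is a minimal sketch: the function name `keep_discarding` and its arguments are hypothetical, and the per-block drop counters are assumed to be maintained elsewhere by the switch.

```python
def keep_discarding(congested, dropped_per_block, drop_tolerance):
    """Decide whether the rest of the current slice must still be dropped.

    dropped_per_block holds, for each control block of the slice, the number
    of cells already discarded.
    """
    if congested:
        return True
    # Congestion is over: stop only if every block is still recoverable,
    # i.e. its drop count is below the tolerance.
    return any(d >= drop_tolerance for d in dropped_per_block)
```

When the function returns False, the remaining cells of the slice are forwarded, on the expectation that the cells already dropped can be rebuilt by FEC at the destination.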


The integration of these schemes provides an efficient and intelligent video delivery service with quality of picture control optimization. Results have shown that the proposed framework is able to ensure graceful picture degradation during overload periods as well as an increase in network performance.

We implemented this framework in the NIST ATM simulator and used an MPEG-2 tracing file as input to simulate the network transmission state. Results show that our framework (Dex-SA-PSD), like the other two preventive strategies, selective cell discarding combined with the extended priority assignation scheme (Ex-SCD) and partial slice discarding with the extended priority assignation scheme (Ex-PSD), can concentrate the data loss within the B-frames and thus prevent critical video data loss (e.g. I- and P-frames). Dex-SA-PSD results in more cell loss if calculated at the cell level, but demonstrates an improved result at the slice level: the number of non-corrupted slices arriving at the destination significantly increases. A slight degradation of the mean cell transfer delay for the aggregate video stream is experienced because of the overhead introduced by the FEC mechanism. In a WAN situation, the proposed framework suffers a performance degradation in the protection of critical information, since it depends on the reception of feedback from the network. As a result, the efficiency of Dex-SA-PSD depends on the round trip time. Finally, we have also shown that the proposed FEC scheme is flexible and can easily be optimized for both the amount of redundancy data per FEC block and the actual block size. As a result, it can achieve sufficient throughput and latency performance through careful selection of the redundancy to data ratio.


8.2 Future Work

In this framework, as we have seen, the major drawback of the Dex-PAS scheme is that its efficiency depends on the link distance and network topology; thus, it does not work well in a WAN situation. Further work should be done to improve this. For example, we can segment the network into smaller pieces and let the switches act as "virtual source" and/or "virtual destination", thus reducing the round trip time of the resource management cell [38].

Another problem is that our FEC-SSCS protocol introduces some redundant information, which could be significant in some situations (e.g. when the user data packet is very small). Thus, further work needs to be done to minimize this kind of overhead. As an example, we could avoid transmitting the stuffing bits (used to construct a control block at the SSCS sublayer) if both the sending and receiving entities know the stuffing pattern. Of course, this makes it necessary to use variable-length SSCS-PDUs and thus requires a field in the PDU head (or tail) to indicate the actual PDU length. We have actually implemented this in our simulator, and have observed a slight improvement in performance. Further studies need to be done on this subject to confirm this.
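The stuffing-suppression idea can be sketched as follows. This is not the implementation used in the simulator: the 2-byte length field placed at the head of the PDU, the zero stuffing pattern and the function names are all hypothetical choices made for the example.

```python
import struct


def encode_pdu(payload):
    """Send the payload with a 2-byte length prefix instead of stuffing."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 2-byte length field")
    return struct.pack("!H", len(payload)) + payload


def decode_pdu(pdu, fixed_len, stuffing=b"\x00"):
    """Rebuild the fixed-length PDU by re-adding the agreed stuffing."""
    (length,) = struct.unpack("!H", pdu[:2])
    payload = pdu[2:2 + length]
    return payload + stuffing * (fixed_len - length)
```

The receiver regenerates the stuffing locally, so the control block seen by the FEC process keeps its fixed row length while fewer bytes cross the network. The saving is only positive when the suppressed stuffing is longer than the length field.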

In this thesis, we only measured the mean Cell Transfer Delay of the MPEG-2 video stream. For many real-time applications, however, delay jitter (Cell Transfer Delay Variance) is much more important than the mean delay. In other words, many applications are not sensitive to a slight increase of the average delay, but are sensitive to an increase of the delay jitter. Thus, a performance evaluation with respect to delay jitter should be carried out in the future.
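As a small illustration of the difference between the two metrics, the following sketch computes the mean and the variance of a list of per-cell transfer delays. The function name and the sample values are hypothetical; the variance is used here as the jitter measure, as suggested by the term Cell Transfer Delay Variance.

```python
from statistics import mean, pvariance


def delay_stats(delays):
    """Return (mean delay, delay variance) for a list of cell delays."""
    return mean(delays), pvariance(delays)


# Two hypothetical delay traces (in milliseconds) with the same mean.
steady = [2.0, 2.0, 2.0]
bursty = [1.0, 2.0, 3.0]
```

Both traces have a mean of 2.0 ms, but only the second one has a non-zero variance, which is the kind of difference a mean-only evaluation cannot reveal.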

BIBLIOGRAPHY

[1] S. Dixit and P. Skelly. MPEG-2 over ATM for video dial tone networks: issues and strategies. IEEE Network, 9(5):30-40, 1995.

[2] S. Keshav, M. Grossglauser, and D. Tse. RCBR: a simple and effective service for multiple time-scale traffic. Proceedings of ACM SIGCOMM'95, 25:219-230, 1995.

[3] H. Kanakia, P. Mishra, and A. Reibman. Packet video transport in ATM networks with single-bit feedback. Proceedings of the Sixth International Workshop on Packet Video, Portland, Oregon, September 1994.

[4] T. V. Lakshman, P. P. Mishra, and K. K. Ramakrishnan. Transporting compressed video over ATM networks with explicit rate feedback control. IEEE INFOCOM 97, Kobe, Japan, March 1997.

[5] H. Zhang and E. W. Knightly. RED-VBR: A renegotiation-based approach to support delay-sensitive VBR video. To appear in ACM/Springer-Verlag Multimedia Systems Journal, 1996.

[6] H. Kanakia, P. Mishra, and A. Reibman. An adaptive congestion control scheme for real-time packet video transport. Proceedings of ACM SIGCOMM'93, September 1993.

[7] P. N. Tudor. MPEG-2 - video compression tutorial. IEE Colloquium on "MPEG-2 - what it is and what it isn't", January 1995.

[8] International Organization for Standardization. Information technology - Generic coding of moving pictures and associated audio: Video, recommendation H.262, ISO/IEC 13818-2. Draft International Standard Edition, November 1994.

[9] M. W. Garrett and W. Willinger. Analysis, modeling and generation of self-similar VBR video traffic. Computer Communication Review, 24(4), October 1994.

[10] Uyless Black. ATM: Foundation for Broadband Networks. Prentice Hall Series in Advanced Communications Technologies, 1995.

[11] International Organization for Standardization. Information technology - Generic coding of moving pictures and associated audio: Systems, recommendation H.222.0, ISO/IEC 13818-1. Draft International Standard Edition, November 1994.

[12] H. Eriksson. MBone: The multicast backbone. Communications of the ACM, 37(8), August 1994.

[13] S. Klett. MBone: Videoconferencing over internet. CRosseUTS, 5(1), April 1996.

[14] M. R. Macedonia and D. P. Brutzman. MBone provides audio and video across the internet. Computer, 27(4), April 1994.

[15] A. Guha, H. Esaki, G. Carle, and T. Dwight. Necessity of cell-level FEC scheme for ATM networks. ATM-Forum Technical Contributions/95-1011.

[16] W. Luo and M. El Zarki. Analysis of error concealment schemes for MPEG-2 video transmission over ATM based network. SPIE'95, 2501:1358-68, 1997.

[17] G. Ramamurthy and D. Raychaudhuri. Performance of packet video with combined error recovery and concealment. IEEE INFOCOM'95, Boston, pages 753-61, 1995.

[18] S. Lee. Cell loss and error recovery in variable rate video. Journal of Visual Communication and Image Representation, pages 39-45, March 1993.

[19] E. W. Biersack. Performance evaluation of Forward Error Correction in an ATM environment. IEEE Journal on Selected Areas in Communication, 11(4):631-640.

[20] H. Ohta and T. Kitami. A cell loss recovery method using FEC in ATM networks. IEEE Journal on Selected Areas in Communication, 9:1471-1483, December 1991.

[21] A. J. McAuley. Reliable broadband communication using a burst erasure correcting code. ACM, pages 297-306, 1990.

[22] P. Pancha and M. El Zarki. MPEG coding for variable bit rate video transmission. IEEE Communications Magazine, pages 54-66, May 1994.

[23] B. DeGleen, P. Pancha, and M. El Zarki. Comparison of priority partition methods for VBR MPEG. IEEE INFOCOM'94, pages 689-96, 1994.

[24] G. Armitage and K. Adams. Packet reassembly during cell loss. IEEE Network Magazine, 7(5):26-34, September 1993.

[25] A. Romanow and S. Floyd. Dynamics of TCP traffic over ATM networks. ACM SIGCOMM'94, pages 79-88, September 1994.

[26] R. Jain et al. Buffer requirements for TCP over UBR. ATM Forum 96-0518, April 1996.

[27] J. Heinanen and K. Kilkki. A fair buffer allocation scheme. Unpublished manuscript.

[28] A. Mehaoua and R. Boutaba. Performance analysis of a slice-based discard scheme for MPEG video over UBR+ service. Proceedings of ICCC'97, Cannes, France, November 1997.

[29] ATM Forum. Real-time ABR: Proposal for a new work item. AF-TM-96-1760, December 1996.

[30] D. LeGall. MPEG: a video compression standard for multimedia applications. Communications of the ACM, 34:47-58, April 1991.

[31] ATM Forum. Draft proposal for specification of FEC-SSCS for AAL type 5. AF-SAA-0926RB, October 1995.

[32] A. K. Kanai, R. Grueter, et al. Forward error correction control on AAL5: FEC-SSCS. ICC'96, Dallas, TX, pages 384-391, June 1996.

[33] ATM Forum SAA SWG. Audiovisual multimedia services: Video on demand. AF-SAA-0049.000, December 1995.

[34] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. SIAM Journal of Applied Mathematics, 8:300-304, 1960.

[35] S. B. Wicker and V. K. Bhargava. Reed-Solomon codes and their applications. IEEE Press, 1994.

[36] Perth. B-ISDN adaptation layer (AAL). November 1995.

[37] M. R. Izquierdo and D. S. Reeves. MPEG VBR slice layer model using linear predictive coding and generalized periodic Markov chains. International Performance, Computing and Communications Conference, Scottsdale, Arizona, February 1997.

[38] M. Hluchyj et al. Closed-loop rate-based traffic management. AF-TM 94-0438R2, September 1994.
