optimization of a packet video receiver under different ...vanhoudt/papers/2004/peva_nikos.pdf ·...

Performance Evaluation 55 (2004) 251–275

Optimization of a packet video receiver under differentlevels of delay jitter: an analytical approach�

Nikolaos Laoutarisa,∗, Benny Van Houdtb, Ioannis Stavrakakisaa Department of Informatics and Telecommunications, University of Athens, 15784 Athens, Greece

b Department of Mathematics and Computer Science, University of Antwerp, B-2020 Antwerp, Belgium

Received 8 October 2002; received in revised form 17 July 2003

Abstract

This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers(PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model isbuilt around theEk/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters,such as: the level of delay jitter, the performance metrics and the employed playout policy. The optimal playout policy isderived underk-Erlang interarrivals by formulating and solving an optimization problem. The (theoretical) optimal solution istransformed into an approximately optimal one that utilizes observable information and it is, thus, feasible. Numerical resultsare derived under the optimal policy and compared against those under the optimal policy that assumes a fixed level of jitteras determined by Poisson arrivals, as well as against the deterministic service that applies no playout adaptation. Based onthis work, a PVR is proposed that adapts to varying network delay jitter and tries to induce a performance that approximatesthe derived theoretical optimal one.© 2003 Elsevier B.V. All rights reserved.

Keywords: Packet video communications; Network jitter; Synchronization quality; Playout adaptation

1. Introduction

The transmission of video streams, real-time or pre-stored, over best-effort networks has been aninteresting research area for over a decade. An important objective of the research community has beento devise methods that cope with the variations of the network delay—also called delay jitter—that arean inherent characteristic of best-effort networks. Jitter destroys the temporal relationships between theperiodically transmitted video frames thus hinders the comprehension of the stream. Playout adaptation

� This work was supported in part by the Special Account for Research Grants of the National and Kapodistrian University ofAthens and the IST program of the European Union under contract IST-2001-32686 (Broadway).

∗ Corresponding author.E-mail addresses: [email protected] (N. Laoutaris), [email protected] (B. Van Houdt), [email protected](I. Stavrakakis).

0166-5316/$ – see front matter © 2003 Elsevier B.V. All rights reserved.doi:10.1016/j.peva.2003.07.004

252 N. Laoutaris et al. / Performance Evaluation 55 (2004) 251–275

algorithms undertake the labor of the temporal reconstruction of the stream to a form that resembles asmuch as possible its initial structure, as enforced by the encoding process. Theintrastream synchronizationquality of the stream quantifies the extent of this temporal reconstruction and is directly related to theperceived presentation quality at the receiving end.

A packet video receiver (PVR) consists of a playout buffer, for the temporary storage of incomingframes, and a playout scheduler, for the determination of the presentation initiation time and the pre-sentation duration of each frame. The playout scheduler is given the ability to regulate the presentationduration of a video frame (which normally is fixed and equal to the inverse of the frame productionrate) in an attempt to smooth-out the effects of network jitter. The general principle that drives theoperation of the scheduler is that large discontinuities between consecutive frames are undesirable asthey are easily detected by human users and, therefore, it is desirable to break them into discontinu-ities of smaller duration that may be unnoticed due to human perceptual limitations in the detection ofmotion.

By manipulating the duration of frames, the playout scheduler also affects the number of bufferedframes. Unpresented frames that wait in the playout buffer increase the end-to-end delay of each newlyarriving frame. The end-to-end delay measures the time between the encoding of a frame at the senderand its presentation at the receiver. Applications that have a dialogic nature (e.g., videoconferencing) callfor a small end-to-end delay so that they can offer the required interactivity to the communicating parties.

This paper is concerned with the performance evaluation and optimization ofbuffer-oriented playoutschedulers, which perform their task without the use of timing information (clocks and timestampingof frames), as opposed totime-oriented schedulers which utilize such information. The systems that areconsidered here use the current occupancy of the playout buffer as an implicit indication of jitter and baseall regulatory actions on that information. The two main contributions of this work are: the development ofan analytical performance evaluation model for PVRs, capturing key design and environmental parameterssuch as the level of delay jitter, the operation of the playout scheduler, and the considered quality metrics;and the development of an analytical optimization model that has the potential of deriving the optimalPVR design, under appropriate quality metrics, for different levels of delay jitter.

The remainder of this paper is organized as follows. InSection 2we examine some relevant work fromthe literature. InSection 3we discuss issues related to the modeling of frame arrivals at the PVR. Aqueuing model for PVRs and the associated performance metrics are developed inSection 4. Section 5formulates a Markov decision problem whose solution is the theoretical optimal PVR for a given levelof delay jitter. Some numerical results from the optimized systems along with a comparison with earliersystems are presented inSection 6. In Section 7we show how to apply the theoretical optimal solutionto a real-world PVR. InSection 8we describe the overall architecture of a potential implementation ofthe system and propose a way to adapt to fluctuating delay jitter.

2. Related work

A survey of proposed playout schedulers, both time and buffer-oriented, has been presented in[1]. Herewe selectively present some buffer-oriented schemes that are of particular interest to the current work.

The fundamental idea that the level of delay jitter can be implicitly deduced by observing the occu-pancy of the playout buffer has been demonstrated with a system that implements thequeue monitoring(QM) algorithm[2]. Under QM, a sequence of video frames that has been presented in a continuous

N. Laoutaris et al. / Performance Evaluation 55 (2004) 251–275 253

manner—meaning that the queue was never found empty following the completion of a presentation—isused as an indication of reduced delay variability and triggers a reduction of the end-to-end delay of thestream by discarding the newest frame from the buffer. The configuration of the algorithm is empirical,based on traces of real frame interarrivals. Although QM handles the basic continuity-latency tradeoffit does not try to “smooth-out” the disruptive effects of this process. It allows the natural build-up ofthe buffer with (detectable1) underflows, while it decreases the delay with frame discards (which canalso be detectable, especially in the case where the frame has a significant duration, e.g., in low framerate encoding). Newer systems try to avoid long-lasting discontinuities. Thethreshold-slowdown (TS)scheduler of Yuang et al.[3] applies a more general regulation scheme governed by the selection of theslowdown-threshold TH. Frames are presented with a normal duration when the buffer occupancy,i, isgreater than the (fixed) threshold, and with extended duration, by a factor equal to TH/i, when the bufferoccupancy is smaller than the threshold. In essence, the scheduler attempts to prevent an impendingunderflow, when the occupancy of the buffer is small, by applying gradually reduced playout rates. In[4]we examined some of the implications of network jitter in the selection of an appropriate threshold valueand provided algorithms that modify this value on the fly, in response to changing jitter.

The work that is most relevant to the current one is[5].2 It differs from [3,4] in that instead of usinga heuristic method (like the aforementioned threshold) for the regulation of the playout rate, it appliesthe best possible playout policy (for the assumed environment) which emanates from the analyticalsolution of an appropriate optimization problem. This playout policy, however, delivers the desired optimalperformance only under the assumed level of delay jitter. One contribution of the present work is that itextends the methods and the results of Laoutaris and Stavrakakis[5] (see Footnote 2) so that they may beapplied to different levels of delay jitter, allowing for the exploitation of the system in real-world PVRsthat operate under fluctuating delay jitter.

3. Modeling delay jitter

Video frames are periodically transmitted by a sender at a rateλf which is specific to the employedvideo format, usually at 25 or 30 frames/s. The spacing between consecutive frames at the sender and theduration of each frame,T , are equal, given by the inverse of the frame production rate, i.e.,T = 1/λf .

Due to the variable network transfer delay of best-effort networks, frames arrive at the PVR atnon-regular intervals that may deviate significantly from the frame periodT . The variability of theinterarrival intervals is directly related to the variability in the network transfer delays (network jitter).The ith frame interarrival,Xi, is given byXi = T + Dn,i − Dn,i−1, whereDn,i denotes the networkdelay of theith frame. IfDn,i = Dn,i−1 the interarrival spacing is equal to the interdeparture spacing. IfDn,i > Dn,i−1 the two frames drift apart (Xi > T ) otherwise they approach each other (Xi < T ). ForDn,i−1 = Dn,i + T the two frames arrive concurrently at the PVR, a phenomenon called clustering offrames. Due to the high degree of aggregation of the traffic that co-exists with the video stream in thenetwork, it becomes reasonable to assume independent and identically distributed (i.i.d.) network transferdelays, that is,Dn,i’s, thus giving rise to the following observations: the expected duration of interarrivals

1 We mean “detectable” under the motion detection capabilities of a human end-user.2 An earlier version was presented at the Second International Workshop on Quality of Future Internet Services (QofIS2001),

Coimbra, Portugal.


isE{X} = T ; the variance of interarrivals is Var{X} = 2 Var{Dn}; the distribution is symmetrical aroundits mean value; see[6] for more details on these observations.

Previous studies of buffer-oriented playout schedulers have used the Poisson[3–5,7] (see Footnote 2)or the interrupted Poisson process[4,8], for the modeling of frame arrivals. The exponentially (hyperex-ponentially) distributed interarrivals, that are implied by the Poisson (interrupted Poisson) process, havesome properties that limit their value as models of the true interarrivals of periodic streams that have beenreshaped by jitter. First, the exponential distribution is not symmetrical around its mean value, as requiredby the independence ofDn,i’s. The symmetrical nature of the interarrival times does not only stem fromthe i.i.d. assumption made for the network delays but it has also been verified experimentally on realnetworks[6,9,10]. Second, the exponential distribution is much more variable than measured interarrivaldistributions, making it appropriate only under conditions of extreme delay jitter that results in highlyvariable interarrivals at the PVR. These observations apply, to a greater extent, also for the interruptedPoisson process.

Under “normal” network conditions, frame interarrivals tend to be much more regular than what theexponential distribution provides. To capture this increased regularity we use throughout the rest of thiswork thek-Erlang distribution for the modeling of frame interarrivals. Ak-Erlang distribution, being ak-fold convolution of an exponential distribution, isk timesmore regular than the exponential distributionof the same mean. Ak-Erlang interarrivalX with mean valueT is given byX = ∑k

i=1 Yi, whereYi, for1 ≤ i ≤ k, is an exponentially distributed random variable with meanT/k. For the random variablesXandY we haveE{X} = k · E{Y} = T ; Var{X} = kVar{Y} = T 2/k; Var{Y}/E2{Y} = 1/k. The lastratio denotes the regularity of the distribution. The exponential distribution has a reference regularityequal to 1 and isk times less regular than the correspondingk-Erlang of same mean. Thek-Erlang canapproach the regularity of a deterministic distribution by increasingk so that the ratio approaches 0. Alsofor sufficiently largek thek-Erlang is almost symmetrically distributed around its mean value.

To identify the range of interarrival variability that should be modeled by thek-Erlang distribution wehave transmitted a periodic stream of “dummy” frames, at 30 frames/s, and have logged the interarrivalsat the receiving host. The stream has crossed the data path from the University of Athens (UoA), Greece,to the Arizona State University (ASU), USA. The Mgen/Drec suite of tools[11] was used for the creationof the test traffic and for the logging of the interarrival trace.Fig. 1 illustrates the measured variance ofinterarrivals at ASU, over 10 min intervals, throughout an entire working day. In they-axis instead ofmarking the actual measured variance, we mark the points that correspond to the variance ofk-Erlang for1 ≤ k ≤ 32. The results indicate that the measured variance corresponds to a range of regularities fromPoisson (high jitter at 10:00–11:00 h) down to 34-Erlang (much more regular interarrivals), with severalintermediate levels in-between.

Althoughk times more regular than Poisson, ak-Erlang input stream in the aforementioned range wouldlead to poor playout quality if not handled by an appropriate playout algorithm, such as the one developedhere. In the following we provide an example to support this claim usingk = 20. This value is a fre-quently encountered one in the measurement experiment.Fig. 2shows the steady-state buffer occupancydistribution of a receiver that is fed by a 20-Erlang input. The receiver can hold up toN = 30 frames. Twocases are plotted: (i) DS, which amounts to a deterministic service (DS), where all frames are presented attheir normal duration; (ii) EO(33), which is an example of the playout policies developed here and aims atavoiding discontinuities due to buffer underflows and overflows. The occupancy distribution is derived forframe presentation completion instances. Observe that under DS a 0.5% of presented frames are followedby an underflow, whereas the corresponding percentage for EO(33) is almost equal to zero. The 0.5%


1

2

4

8

16

32

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

Var

{inte

rarr

ival

s} (

equi

vale

nt k

-Erla

ng)

time of day (hours)

Fig. 1. The variance of interarrivals of a test stream (with 30 frames/s) from UoA to ASU. Thex-axis marks the beginning time ofeach 10 min trace. They-axis marks the points that correspond to the variance of interarrivals ofk-Erlang distributed interarrivals(with 30 frames/s mean rate). The logged interarrival correspond to jitter levels from slightly higher than Poisson, to as low as34-Erlang.

underflow percentage of DS amounts to 30× 60× 0.005 = 9 underflows per minute for a 30 frames/sstream. Such a rate of underflows is considered to be very high[2] thus resulting in poor playout quality.To this rate of discontinuities, one should add a substantial rate of buffer overflows. Indeed, observe thatthe DS playout often leads to high occupancies—where overflows occur—whereas the EO(33) policyeffectively avoids high occupancies, thus avoiding frame overflows.

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

0 5 10 15 20 25 30

π i /

(ste

ady-

stat

e pr

obab

ility

)

i (buffer occupancy, frames)

k=20, N=30

EO(33)DS

Fig. 2. Steady-state buffer occupancy distribution for a receiver that can hold up to 30 frames. The input process is 20-Erlang.DS and EO(33) playout policies are considered.


In the sequel we develop some new playout policies by using thek-Erlang distribution as a model whichmatches the first two moments of real frame interarrival distributions. It should be noted that the currentwork does not strive to model frame interarrivals as accurately as possible, as has been done by trafficmodeling works[12,13]. The final objective here is to identify the optimal playout policy for differentlevels of delay jitter, thus, only the macrodynamics (first two moments) of the input traffic are modeled,therefore, Erlang arrivals suffice. It is possible to use a more elaborate interarrival process like a PH[14]or a (B)MAP[15,16], though it remains to be seen whether the optimal playout policy can be determinedby the value-iteration algorithm in an efficient manner.

The numerical results ofSection 6are based on jitter levels that correspond tok-Erlang distributions,1 ≤ k ≤ 50, motivated by the measured variances in the UoA–ASU experiment. The selected range,however, need not be taken as a “standard” range of jitter fluctuation; other ranges of delay jitter, corre-sponding probably to even more regular arrivals (k > 50) could be examined as required by the targetednetwork environment. In fact, we have obtained the optimal playout policies for much largerk (above150) that correspond to almost deterministic interarrivals. The scalability of the proposed algorithm forobtaining the optimal playout policy should not pose a problem as it is unnecessary to optimize forvery largek, e.g.,k = 1000, since for much smallerk the optimal playout policy already reduces todeterministic playout (all frames are presented at their normal duration).

A final note on the jitter that is reported by our measurements is that this is only the network part of theoverall jitter that is present at the application layer of a PVR. The end-system jitter, introduced by the oper-ating systems of the end-systems, is missing. This jitter component in many cases can be quite large, pos-sibly dominating the network portion of jitter, especially when overloaded video-on-demand servers areused for the streaming of the content. This means that even a small network jitter does not necessarily makethe proposed playout policies obsolete, as there is still the end-system jitter that needs to be smoothed out.

4. Performance analysis of PVRs under k-Erlang arrivals

This section develops an analytical model that has the potential to capture the characteristics of aconsiderable number of buffer-oriented playout schedulers and provide for their performance evaluationacross different levels of delay jitter. Along with the introduction of the queuing model, the involvedperformance metrics are presented and justified in a subsequent section. The analysis is carried out atthe application layer of a PVR; it focuses on solid video frames that become available at the applicationlayer from the underlying layers and uses the interarrival of frames at the application layer to capture theaggregate effect of the end-to-end delay jitter (network and end-system parts). Implementation specificissues such as video encoding and network level packetization schemes are not discussed in order topreserve the generality of the proposed playout policies. It is expected that with minor modificationsthe proposed algorithms should be applicable to a variety of encoding schemes (raw, MPEG, H263) andpacketization formats. Finally, it is noted that although network packet losses are not considered in thiswork their effect on the overall stream quality is expected to be much smaller as compared to that of delayjitter. Loguinov and Radha[17] confirm this claim in their recent large-scale study of Internet dynamicsand its effect of video streaming where it is reported that 98.9% of buffer underflow events in a PVR weredue to delay jitter and only the tiny remainder due to packet loss.3 Based on these reports and accounting

3 In their study they have used retransmissions for the recovery of lost packets but report that even without them the maindisruptive cause would still be the delay jitter not the packet losses.


Fig. 3. Timing diagram:tn, the time prior to the presentation initiation of framen; an, time of last arrival prior totn; En, theelapsed time since the last arrival,En=tn − an.

that lost packets can be recovered/concealed by employing forward error correction techniques at thereceiver it is believed that the presented jitter oriented assessment of playout quality will approximatesufficiently the quality in an operational system in the presence of some packet loss.

In the following a PVR is modeled as anEk/Di/1/N queuing system, i.e., a queue with the follow-ing properties: an Erlang arrival process (Ek), appropriate for the modeling of jitter-dependent frameinterarrivals; a deterministic state-dependent playout policy (Di), modeling the operation of a generalbuffer-oriented playout scheduler which applies frame durations that depend on the current queue oc-cupancyi; a finite playout buffer forN video frames. In the next section, we obtain the steady-stateoccupancy distribution for theEk/Di/1/N queue upon service initiation times4 by using the method ofphases[18]. This method, commonly used to analyze theEk/D/1/N queue which is one of the simplestqueues in the family ofPH/G/1 queues[19], is straightforward to generalize to theEk/Di/1/N queue.

4.1. The embedded Markov chain

Frame interarrivals are assumed to follow ak-Erlang distribution with mean rateλf = 1/T , i.e., thenth interarrivalXn is composed ofk i.i.d. intervalsYj, all following the exponential distribution withparameterλ = kλf , such thatXn = ∑k

j=1 Yj. The passage of each exponential intervalYj is referred toas “completion of a phase”—with a single frame arrival occurring whenk phases have been completed.

Let{In}n>0 denote the number of frames in the buffer at timetn which is the time prior to the presentationof framen and let{In}n>0 the number of phases in the system attn. {In}n>0 counts the number of phasesin the system accounting for the buffered frames (each frame is seen as a batch ofk phases) and thenumber of phases thus far completed by the ongoing arrival. The value ofIn can be directly obtained fromthe value ofIn, which hasIn embedded in it. Formally,In = kIn + Jn, where{Jn}n>0 is a discrete timestochastic process that counts the number of phases completed by the arrival process in the interval froman, the time of the last arrival prior totn, up totn (seeFig. 3). The knowledge ofIn not only provides forthe exact number of frames in the buffer prior to the presentation of thenth frame (sinceIn = ⌊

In/k⌋),

but also adds information concerning the next arriving frame. The extra information (memory) is used forthe approximation of non-memoryless interarrival distributions, such that are typical of periodic streamsthat have been reshaped by network jitter. Finally, it is important to notice thatIn is a Markov chain, thatis, In+1 does not depend uponIm, for m < n, provided thatIn is known.

Next, we indicate how to obtain the transition probability matrixP of the Markov chainIn. The valueof In varies betweenk and(N + 1)k − 1: k is the minimum number of phases (corresponding to one

4 Hereafter also calledpresentation initiation or decision instances, depending on the focus of the discussion.


complete frame) that must be in place in buffer for the next presentation to begin;(N + 1)k − 1 phasescorrespond to the case of a full buffer (N complete frames which are represented bykN phases) andk−1phases having been completed by the ongoing arrival. The evolution of the number of phases in the systemcan be seen as a queue with a waiting room for(N + 1)k − 1 customers—where each phase representsa customer—with the following characteristics: (1) the customers arrive according to a Poisson processwith ratekλf ; (2) the customers are served in batches of sizek, where the service timeDi of a batch isdeterministic and depends on the number of phasesi found in the waiting room prior to service initiation;(3) if a customer finds the queue full upon arrival, it is dropped andk − 1 other customers immediatelyleave the queue. Notice that the server serves the customers in batches of sizek, therefore, it will notcommence until there are at leastk customers in the waiting room. This requirement, along with the factthat the system is observed upon frame presentation initiation instants, mean that the system can neverbe found empty (with less thank phases) upon an observation instant.

With the aforementioned system description, the expression of the phase transition probabilities, de-noted aspij(Di) = Pr{In = j|In−1 = i,Di}, is simplified.Di denotes the service duration for thek-sizedbatches; it depends solely on the number of phases in the waiting room prior to service initiation. Beforeproceeding, we present three examples of DS disciplines: the DS with constant durationT independentof phase occupancy; the TS algorithm of[3]; and the Poisson-optimal (PO) service discipline derived in[5] (see Footnote 2) which selects a duration∆n:5 1 ≤ n ≤ N, n being the frame occupancy (equal ton = �i/k here):

Di =

T, k ≤ i ≤ (N + 1)k − 1 (DS),

max

(TH

�i/k · T, T), k ≤ i ≤ (N + 1)k − 1 (TS),

∆�i/k , k ≤ i ≤ (N + 1)k − 1 (PO).

(1)

Also note that the QM algorithm[2] (briefly discussed inSection 2) can be studied with the aforementionedmodel by appropriately augmenting the definition of the state so that in addition to the current occupancyit also includes the number of continuous uninterrupted frame playouts.

Based on the three characteristics of the size(N+1)k−1 queue mentioned above, we get the followingtransition matrixP , where the(i, j)th element ofP is denoted aspij(Di):

pij(Di) =

2k−i∑σ=0

P{σ,Di}, k ≤ i < 2k, j = k,

P{j − i+ k,Di}, k ≤ i < 2k, k < j ≤ Nk − 1,∞∑σ=0

P{j − i+ k + σk,Di}, k ≤ i < 2k, Nk ≤ j ≤ (N + 1)k − 1,

P{j − i+ k,Di}, 2k ≤ i ≤ (N + 1)k − 1, i− k ≤ j ≤ Nk − 1,∞∑σ=0

P{j − i+ k + σk,Di}, 2k ≤ i ≤ (N + 1)k − 1, Nk ≤ j ≤ (N + 1)k − 1,

0 elsewhere,

(2)

5 The value of∆n, for 1 ≤ n ≤ N, is derived as the solution of an appropriate Markov decision problem assuming Poissonframe arrivals and studying the system upon service completions.


whereP{m,X} is the Poisson distributed probability ofm new phase arrivals occurring in an intervalof durationX. The first five lines ofEq. (2) correspond to the following cases: (i) there is only oneframe available in the buffer (at timetn) and an underflow follows its presentation attn + Di (playoutrecommences whenk phases become again available thusj = k); (ii) there is only one frame availablein the buffer (at timetn) and no overflow(s) occur during its presentation; (iii) there is only one frameavailable in the buffer (at timetn) and possible overflow(s) occur during its presentation; (iv) more thanone frame is available in the buffer (at timetn) and no overflow(s) occur during the imminent presentation;(v) more than one frame is available in the buffer (at timetn) and possible overflow(s) occur during theimminent presentation.

Let πj, k ≤ j ≤ (N + 1)k − 1, be the steady-state probabilities of{In}n>0; thenπi, 1 ≤ i ≤ N, thedistribution of the embedded process{In}n>0, is readily available since it holds thatπi = ∑(i+1)k−1

j=ik πj.The 1× Nk steady-state vectorπ is obtained as follows. Define theNk × Nk matrixQ asP − I, whereI is the unity matrix of dimensionNk. The matrixQ can be seen as an infinitesimal generator of acontinuous time Markov chain andQ can be written in a lower block-Hessenberg form by relabelingthe states appropriately. Therefore, we can calculate the unique stochastic vectorπ for which πQ = 0,i.e., πP = π, in an efficient manner using the Latouche–Jacobs–Gaver algorithm[20], which has a timecomplexity of O(k3N2) and a space complexity of O(k2N).

4.2. Intrastream synchronization (continuity) metrics

Expanding or shortening the duration of a frame presentation, as done by the playout policies ofEq. (1)or any other playout policy, introduces a discontinuity—a loss of intrastream synchronization—quantifiedby the difference between the selected frame duration and the normal frame durationT . Letdi(Di) denotethediscontinuity that is incurred when the next frame (framen) is presented with a durationDi and thecurrent phase occupancy is{In} = i:

di(Di) = |Di − T + Si(Di)|. (3)

The termDi−T quantifies the discontinuity incurred by choosing a playout duration other than the normalone,T . In addition, the metric has to cater for a possible underflow that might follow the completionof the nominal durationDi. Such an underflow would occur if the buffer is found without a completeframe,Di time units after the initiation of thenth presentation. In that occasion framen will remain onthe display for an additional intervalSi(Di) until the next frame becomes available. This brings up itsduration toDi+Si(Di) (seeFig. 4). The absolute value is used in(3)because the quantityDi−T +Si(Di)

Fig. 4. Time diagram of an underflow occurring during the presentation of framen + 1. As a consequence of the underflow,framen+ 1 remains on display for a total duration equal toDi + Si(Di).


may assume negative values. Such is the case whenDi < T and no underflow occurs. Also note thata truncated duration and an underflow may cancel each other. Thus a truncated duration,Di = T/2,followed by an underflow with durationT/2 inadvertently lead to a normal presentation duration and nodiscontinuity.

The underflow intervalSi(Di) depends on the current buffer occupancyi, the selected presentationdurationDi, and also the number of new phase arrivals overDi, y. Underk-Erlang arrivals:

E{Si(Di)|y} ={(k − (i− k + y)) · T/k, i− k + y < k,

0, i− k + y ≥ k.(4)

In the first case the system is left withi′ phases attn +Di which amount to less than a complete frame,thus an additionalk − i′ phases must arrive. The expected time for that is(k − i′) · T/k. In the secondcase the buffer is non-empty attn +Di thus no underflow occurs and the next frame(n+ 1) is displayedexactlyDi time units after the initiation of framen.

We define a more generalized continuity metric—called thedistortion of playout (DoP) metric—whichincludes the notion of discontinuity as in(3) but also accounts for any lack of continuity due to bufferoverflows over the current presentation interval:

DoPi(Di) = di(Di)+ Li(Di) · T, (5)

whereLi(Di) is a random variable that denotes the number of newly arrived frames that are dropped/lost,due to buffer overflows, over the current frame presentation durationDi, given that the number of phasesin the system wasi at tn:

P [Li(Di) = x] ={ ∑k−1

j=0P{(N + x)k − i+ j,Di}, x > 0,∑j<(N+1)k−i P{j,Di}, x = 0.

(6)

Note that the DoP metric measures time; it adds the duration of all time intervals during which the smoothplayout of frames is disrupted. A basic idea reflected in the definitions of bothdi(Di) and DoPi(Di) isthat the perceptual cost of an idle time gap between two frames (occurring when the first frame stays ondisplay for more thanT ) is equal to the perceptual cost of a loss-of-information discontinuity (an overflowor a fast-forward) of equal duration. This is based on recent perceptual studies[21] where it is shownthat jitter degrades the perceptual quality of video nearly as much as packet loss does. For an examplein support of this claim, we may think that an underflow with a durationT , degrades stream continuity,nearly as much, as does a lost frame.

It must also be noted that there is a delicate semantic difference between overflow disruptions andall other cases of disruption (underflows, modified playout durations). The latter are immediately ex-perienced during the presentation of framen whereas an overflow, although occurring during the pre-sentation of framen, is experienced in the future (when the playout skips one or more overflowedframes). With the current formulation (Eq. (5)) the disruption caused by any overflows during the pre-sentation of framen is added to the overall disruption of this frame instead of a future one. This isrequired in order to preserve the tractability of the model. Otherwise various states should be introduced,adding the required memory that allows for the association of past overflows to the currently presentedframe.


5. Derivation of the optimal playout scheduler for different levels of network jitter

In this section we develop a Markov decision model which leads to the derivation of the optimal playoutscheduler for different levels of network jitter, as captured by the variousk values of the assumedk-Erlangarrival process.

5.1. The Markov decision process (MDP)

In Section 4the Markov chain{In}n>0 was used in the context of theEk/Di/1/N queue for thederivation of the steady-state behavior for a given—occupancy dependent—playout policy (Di). In thissection,{In}n>0 is generalized into a discrete time MDP with the aim of deriving the optimal playoutpolicy for different jitter levels (captured by the Erlang parameterk). Let {I∗

n}n>0 be the MDP obtainedfrom the{In}n>0 Markov chain by adding an action following each observation instant (at the time priorto the next playout). This action explicitly defines the playout duration for the next frame and by doingso it incurs an immediate cost and also affects the probability law for the next transition. For the formaldefinition of the MDP one needs to define a tuple〈S,A, P, C〉, whereS is the set of possible states,A theset of possible actions,P : S × A × S→ [0,1] the state transition function specifying the probabilityP{j|i, a} ≡ tij(a) of observing a transition to statej ∈ S after taking actiona ∈ A in statei ∈ S and,finally, C : S×A→ R is a function specifying the costci(a) of taking actiona ∈ A at statei ∈ S.

The state-spaceS of {I∗n} comprises all possible phase occupancy levels, thus takes values in

[k, (N + 1)k − 1]. An action is defined to be the choice of an integer valuea that explicitly determinesB(a), the presentation duration for the next frame through:

B(a) = T · aα

(7)

α defines the basic adjustment quantum which is equal toT/α. The action space for the problem isA =[1,M], whereM is an integer value that results in the maximum allowable playout durationB(M). Noticethat in(7) whena > α (a < α) the resulting playout duration is larger (smaller) than the normal frameduration. When altered playout durations (larger/smaller) are applied to a series of consecutive frames,they constitute a transient alternation of the playout rate (an effect that resembles a slowdown/fast-forwardoperation in a VCR). The transition probabilities of the MDP are “decision-dependent”; they are describedbyEq. (2), but with service durations that are not a priori known, as in the case of a known state-dependentservice, but depend on the chosen action, i.e.,tij(a) ≡ pij(B(a)).

The goal of the decision model is to prescribe a playoutpolicy—a rule for choosing the duration of thenext frame based on the current state. In the general case a policyR ≡ {Dia : i ∈ S, a ∈ A} is a mapping:S × A → [0,1]; it is completely defined for a given tuple〈S,A, P, C〉 by the probabilities:Dia �P{action= a|state= i}. The derived optimal policy, under the considered minimization objective andthe selected solution method (described inAppendix A), always prescribes the same action whenever atthe same state, i.e., the optimal policy is non-randomized (see[22] for details). Under a non-randomizedpolicy the probabilitiesDia are either 0 or 1. In view of this observation, the exact definition of thenon-randomized policyR reduces to the definition of the functionAi(R) = a which for everyi ∈ Sreturns the selected actiona.

As mentioned earlier,ci(a) denotes the cost incurred when actiona is taken when the phase occupancyprocess is in statei. The optimal policyRopt is defined to be the policy that minimizesER{c}, where


ER{c} denotes the long-run average cost induced under some policyR. If πi(R) denotes the steady-stateprobability that the Markov chain{In}, with Di = (Ai(R)/α)T , is in statei, then

Ropt = arg minR

ER{c} with ER{c} =(N+1)k−1∑

i=kci(Ai(R)) · πi(R). (8)

A number of techniques are known for the derivation of the optimal policy of(8); these include exhaustiveenumeration (only for small systems), linear programming, policy-iteration, and value-iteration. Thecurrent MDP problem was solved by using a value-iteration algorithm (described inAppendix A) whichtakes as input the action-dependent transition probabilities,tij(a) = pij(B(a)), and the state-action costs,ci(a) (defined in detail inSection 5.2), and returnsRopt.

5.2. Cost assignment

An appropriate MDP cost is described in this section; it “penalizes” thelack of continuity that mayarise from a certain action. This lack of continuity may be directly experienced as in the case of a framepresentation with a duration smaller or larger thanT . In addition, the continuity cost also accounts forany lack of continuity due to overflowed (lost) frames occurring during that interval.

A candidate for the continuity cost is: DoPi(a) = E{DoPi(B(a))}, i.e., the expected value of DoPi(Di)

with respect to the number of new arrivals, withDi = B(a) (see definition of DoPi(Di) in (5)).DoPi(a) is a legitimate MDP cost as it depends only on the current statei, and the selected actiona.Such a cost assignment returns anE{DoP}-optimal policy, i.e., a policy that minimizes:ES{DoP} =∑

i∈S πi(R)DoPi(Ai(R)) over all the policiesR defined inS × A. The results presented in[5] (seeFootnote 2) show thatRT , the static deterministic policy with constant presentation durations equal tothe frame period, isE{DoP}-optimal under Poisson frame arrivals. The same applies also to the case ofk-Erlang arrivals (see the results ofSection 6).

The minimization ofES{DoP} calls for the minimization of the average amount of synchronizationloss which is due to: underflow discontinuities, slowdown discontinuities, overflow discontinuities andfast-forward discontinuities. The minimization ofES{DoP} is a rightful objective but cannot guaranteethe perceptual optimality, as it only caters to the minimization of the average loss of synchronization,without paying any attention as to how this loss of synchronization spreads in time. It has been real-ized that the human perceptual system is more sensitive to a small frequency of long-lasting disruptionsthan to a higher frequency of short-lived disruptions[3]. This is due to human perceptual inability tonotice small deviations of presentation rate. As a result, a better perceptual quality can be expectedby replacing large continuity disruptions (underflows and overflows) with shorter ones (slowdowns andfast-forwards), even when the latter lead to a higher value forES{DoP}. Thus, a playout policy should beallowed to increaseES{DoP} if this increase provides for a smoother spacing between synchronizationloss occurrences, thus help in concealing them. We pursue this idea by defining the state-action costto be

ci(a) = βDoPi(a)+ (1 − β)DoP(2)i (a), (9)

where DoP(2)i (a) = E{(DoPi(B(a)))2} is the expected square value of DoPi(B(a)) with respect to thenumber of new arrivals. The weighing factorβ is a user-defined input that controls the relative impor-tance between the two minimization objectives: the minimization ofES{DoP}, and the minimization


of ES{DoP2}.6 Settingβ = 1 leads to the minimization ofES{DoP} without any regard for the vari-ability of the duration of synchronization loss occurrences. Settingβ = 0 leads to the minimization ofES{DoP2} and the resultingES{DoP2}-optimal policy induces smoother synchronization losses than theES{DoP}-optimal policy. As it will be shown later, the reduction ofES{DoP} comes at the cost of anincreasedES{DoP2} and vice versa. Values ofβ that fall between the two extremes (0 and 1) providevarious levels of compromise between min{ES{DoP}} and min{ES{DoP2}}. The optimality of this trade-off stems from the fact that for a given value of one of the continuity components, the derived optimalsolution will provide a minimal value for the other continuity component. In essence, the designer of thePVR selects aβ that results in a desired value ofES{DoP2} (ES{DoP}) and knows that for that valueof ES{DoP2} (ES{DoP}) there cannot exist a policy that provides a smallerES{DoP} (ES{DoP2}) thantheES{DoP} (ES{DoP2}) provided by the proposed playout policy (since that policy minimizes a costexpression that involves bothES{DoP2}, ES{DoP}).

As a final comment on the employed cost we note that although losses are assigned to the ongoingpresentation instead of a future one (as discussed inSection 4.2) ES{DoP} andES{DoP2} representedmeaningful continuity metrics. It can be easily shown thatES{DoP} is immune to the mapping of lossesto frames; it only depends on the number of losses. On the other hand,ES{DoP2} is affected by theexact mapping of losses to frames due to the existence of the square power. However, due to the fact thatoverflow continuity disruptions are generally much larger (especially when batch losses occur) than allother kinds of disruptions that may occur under high occupancies, the difference between the employedapproach and the more accurate, but complicated one, is rather marginal.

6. Numerical results and discussion

In this section we apply the developed optimization model to derive the optimal playout policy asdefined in(8). In all the examples the duration of a frame will be equal to 33 ms (implying 30 frames/s).Unless stated otherwise, the playout buffer capacityN will be equal to 30. Two granularities are usedfor the adjustment of frame durations: 3.3 ms (corresponding toα = 10), and 1 ms (corresponding toα = 33).

6.1. Optimal playout policies for different levels of network jitter

The continuity weightβ was introduced inEq. (9)for the regulation of the relative importance betweenthe mean value and the variability of DoPi(a). In [5] (see Footnote 2) we assumed Poisson arrivals andshowed that by lettingβ take values in [0,1] we can achieve various tradeoffs between the minimizationof the average DoP and its variability. Forβ = 1 the derived optimal playout policy mandated that allframes be played at their normal duration, that is, the DS was shown to beE{DoP}-optimal. Forβ = 0 weobtained theE{DoP2}-optimal policy which applied a considerable amount of playout regulation towardsthe two extremes of the occupancy of the playout buffer.

The aforementioned behavior generally applies also to the policies derived fork-Erlang arrivals,nevertheless, the amount of playout regulation required by a givenβ is affected by the Erlang parame-ter k. For increased network jitter (smallk), the derived policies apply an increased amount of playout

6 We are referring to the minimization of∑

i∈S πi(R)DoP(2)i (Ai(R)) over all the policiesR defined inS×A.


β=0 β

200 400 600 800 1000 1200 1400 1600i (phase) 5

1015

2025

3035

4045

50

k (Erlang parameter)

15

20

25

30

35

40

Ai (Ropt) (prescribed action)

=1

200 400 600 800 1000 1200 1400 1600i (phase) 5

1015

2025

3035

4045

50


15

20

25

30

35

40

Ai (Ropt) (prescribed action)

Fig. 5. The prescribed optimal actionAi(Ropt) (on they-axis) for each phase occupancy (on thex-axis), for the cases ofβ = 0 andβ = 1 and differentk (on thez-axis). The normal frame duration is obtained withAi(Ropt) = 33, i.e., the adjustment granularityis α = 33. The playout buffer has a capacity forN = 30 frames.

regulation, eventually reaching the behavior under Poisson fork = 1. As frame arrivals become more reg-ular (with largerk), the amount of playout regulation required to minimize the average cost, for a selectedβ, decreases. This is on a par with the intuitive guess that playout manipulation ought to “smooth-out”as jitter drops.

Fig. 5shows the structure of the derived playout policies forβ = 0 andβ = 1 and for different levels ofdelay jitter. Due to the fact thatk affects the state-space of the Markov decision problem, the number ofphases (on thex-axis), grows withk (on thez-axis) despite that all systems have the same playout buffer7

(N = 30). The left graph illustrates the structure of playout policies forβ = 0. This produces policiesthat minimizeES{DoP2} by applying a substantial amount of playout regulation at buffer extremes. Asit may be seen, the amount of playout regulation reduces withk (note how the prescribed actions, at theedges of the buffer, tend to be less severe with increasingk). The right graph corresponds toβ = 1 whichleads to policies that minimizeES{DoP}. Such policies apply almost no playout regulation, except inthe last few phases (which, however, have a very small probability of being visited under steady state),thus are approximated very closely by the DS that presents all frames at their normal playout duration.Intermediate values, 0< β < 1, lead to policies that fall between the two extreme cases. In the sequelwe will be focusing inβ confined in the initial range 0< β < 0.1. Notice that the two components thatcontribute to theci(a) cost ofEq. (9)have different units thus only small values ofβ lead to a balancedtradeoff between the two cost components. Values ofβ larger than 0.1 turn almost entirely in favor ofminimizingES{DoP} thus amount almost to the presented behavior forβ = 1.

The two plots ofFig. 6 illustrate the performance results of the derived Erlang-optimal (EO) policies.For a particular policy,R(β, k),8 bothES{DoP} andES{DoP2} improve (reduce) with the regularity ofarrivals (that is, withk). Given a jitter levelk, a smallβ favors the reduction ofES{DoP2}, while a largeβfavors the reduction ofES{DoP}. Fig. 7is identical toFig. 6but corresponds to a smaller buffer,N = 10;notice that both continuity metrics deteriorate as a result of the smaller offered buffer capacity.

7 For k = 5 a total ofNk = 150 phases are required to fill the playout buffer with 30 frames, while fork = 10 the requirednumber of phases is 300.

8 For example,R(0.2,10) denotes the optimal policy if we setβ = 0.2 andk = 10.


5 10 15 20 25 30 35 40 45 50 0

0.02

0.04

0.06

0.08

0.1

0.0156250.031250.0625

0.1250.250.5

124


5 10 15 20 25 30 35 40 45 50k (Erlang parameter) 0

0.02

0.04

0.06

0.08

0.1

0.015625

0.03125

0.0625

0.125

0.25

0.5

ββ

ES{DoP2} (msec2)

ES{DoP} (msec)

Fig. 6. TheES{DoP2}, ES{DoP} performance of EO policies for different values of the continuity weightβ and for differentk(α = 33, N = 30).

The selected range fork has been limited to values up tok = 50 stimulated mainly by the measurementexperiments. An important question is whether the optimization model is “solvable” for largerk valuesin a “practical” time. We have solved the model fork up to 150 in a reasonable time (for an off-linecomputation). Largerk’s are probably not necessary to consider for the following two reasons: (1) themeasured interarrival traces were not that regular; (2) most significantly, for such a small delay jitterit is not necessary to derive the EO policy—its sufficient to use the DS. The latter observation stemsfrom the fact that the amount of playout regulation reduces with delay jitter (seeFig. 5, as well asFig. 10 in the sequel) and thus eventually the best playout policy is to play all frames at their normalduration.

6.2. Controlling the buffering delay

The proposed EO playout policies can be configured to suit the delay requirements of different streamingapplications. The simplest way to do so is by limiting the size of the playout buffer. The maximum


0.02

0.04

0.06

0.08

0.1

0.06250.125

0.250.5

1248

16


0.02

0.04

0.06

0.08

0.1

0.0625

0.125

0.25

0.5

1

2

ββ

ES{DoP2} (msec2)ES{DoP} (msec)

Fig. 7. TheES{DoP2}, ES{DoP} performance of EO policies for different values of the continuity weightβ and for differentk(α = 33, N = 10).



15

20

25

30

0.0156250.031250.0625

0.1250.25

0.51248

ES{DoP2} (msec2)


15

20

25

30

N (buffer capacity)

0.015625

0.03125

0.0625

0.125

0.25

0.5

1

2

N (buffer capacity

ES{DoP} (msec)

Fig. 8. TheES{DoP2}, ES{DoP} performance of EO policies for different values of the buffer capacityN and for differentk(α = 33, β = 0).

buffering delay of a presented frame is approximately equal toN · T , since the average playout durationapplied by EO policies is approximately equal toT . It is known [1] that a maximum network jitter,9

jmax, can be completely removed by accumulating a playout buffer of equal duration. With the playoutbuffer for 30 frames and a frame rate of 30 frames/s, the system can completely eliminate the jitter, ifthis jitter is confined belowN · T = 1 s. The jitter, however, can be larger, especially if we also includethe server-induced jitter (e.g., in an overloaded video-on-demand server); also the delay requirementsof the application could require a smaller maximum buffering delay (e.g., up to 330 ms, achieved withN = 10). In such cases underflows and overflows might occur (sincejmax > N · T ) and the EOplayout policy will try to conceal them by manipulating the duration of frames at the extremes of thebuffer.

The inherent tradeoff between continuity-quality and delay also applies to the proposed EO policies.When the (mean/maximum) buffering delay decreases, e.g., because a smaller playout buffer is available,the overall stream continuity (both components,ES{DoP} andES{DoP2}) deteriorates. The two plots ofFig. 8show that for a givenk, a smaller buffer capacityN always provides a worseES{DoP},ES{DoP2}.

A more sophisticated delay control method has been presented in[5] (see Footnote 2) by associatingan appropriate delay cost to the state-action cost ofEq. (9). Under this model, the playout policy in somecases applies an increased playout rate with the aim to reduce the occupancy of the buffer and, hence, thebuffering delay as well. This method is readily applicable to the EO policy as well.

6.3. Comparison of playout policies: EO vs. PO and DS

In this section a comparative study of the performance of the EO, the PO and the DS policies ispresented. It is assumed that the true interarrival process is i.i.d. andk-Erlang distributed. As it is clear forearlier discussions, such an arrival process may induce a wide range of delay jitter, from no jitter (k = ∞,deterministic arrivals) to the (high) level corresponding to Poisson arrivals (k = 1). Thus, a wide rangeof real network delay jitter (second moment of interarrivals) can be matched with that under ak-Erlangdistribution for somek.

9 The maximum network jitter is defined as the difference between the largest and the smallest network delay, i.e.,jmax =maxi Dn,i − mini Dn,i.


1

2

3

4

5

6

7

8

9

5 10 15 20 25 30 35 40 45 50


PO(33)PO(10)EO(33)EO(10)

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

0.55

0.6

5 10 15 20 25 30 35 40 45 50


PO(10)PO(33)EO(10)EO(33)

norm

aliz

ed E

S{D

oP2 }

norm

aliz

ed E

S{D

oP}

Fig. 9. The normalizedES{DoP2} andES{DoP} performance of different playout policies with respect to DS, for different valuesof k: PO(10), PO(33), EO(10), EO(33). The Erlang and Poisson policies areES{DoP2}-optimal, i.e., they have been optimizedfor the minimization of variability of DoP by selectingβ = 0. A playout buffer for 30 frames is used in all the examples.

Since the interarrival process is i.i.d. andk-Erlang distributed, the EO policy will lead to the bestpossible performance, as this policy is the one optimized for such an interarrival distribution. When thePO policy is employed when the interarrivals are i.i.d. andk-Erlang distributed the resulting performancecannot outperform that under the EO policy. The same holds true if the DS policy is employed when theinterarrivals are i.i.d. andk-Erlang distributed.

The general observation is that, when the frame arrivals are more regular than Poisson, PO policies (fordifferent continuity weightsβ and/or adjustment granularitiesα) become suboptimal since they applyan excessive amount of playout regulation. In such cases, the corresponding EO policies provide anoverall improved performance by applying an appropriate amount of playout regulation. The DS serviceprovides for the minimalES{DoP}, but has a largeES{DoP2}, especially with smallk. For the purpose ofcomparison, normalized, rather than absolute performance metrics will be presented. The normalization iscarried out by dividing theES{DoP2} (ES{DoP}) performance of an EO or PO policy with the performanceof DS for the samek. A value smaller (larger) than 1 means that the corresponding policy performs better(worse) than DS for the involved metric. This normalization schemes leads to a subjective evaluation ofpolicies which quantifies how much better or worse they would perform as compared to DS, under thesame level of delay jitter.

Fig. 9 illustrates the normalized performance, in terms ofES{DoP2} andES{DoP}, of the followingplayout policies: theE{DoP2}-optimal10 (β = 0) PO(α), theE{DoP2}-optimal (β = 0) EO(α). The plotsshows that DS is constantly inferior in terms of the variability of discontinuity occurrences to both EOand PO. EO guarantees a lowerES{DoP2} compared to the corresponding PO of the sameα, for all kin the presented range (compare the couples EO(10)–PO(10) and EO(33)–PO(33)). By increasing thegranularity of playout manipulation both EO and PO improve significantly; PO(33) is better than PO(10)and EO(33) is better than EO(10). Notice thatα = 33 leads to an adjustment quantum of 1 ms, which isequal to the best timing granularity that most general operating systems support. A largerα would leadto a smaller granularity and an improved performance but this performance would only be a theoretical

10 This is a policy that minimizes the variability of DoP under Poisson arrivals. It has been introduced in[5] (see Footnote 2).


one, since the corresponding policy would not be implementable in practice. Thus the presented resultsfor α = 33 should be viewed as the optimal attainable ones.

DS incurs long-lasting discontinuities by not applying any playout regulation at the buffer extremes,but achieves the bestES{DoP} (see the right graph ofFig. 9). EO(10) and EO(33) follow closely, whilePO(10) and P(33) are much worse.

In conclusion, the following should be noted: (1) there is aconsistent performance tradeoff between DSand EO, with the former (latter) providing a superiorES{DoP} (ES{DoP2}), across allk; (2) EO clearlyoutperforms PO since it provides a better performance with respect to both components of continuity,ES{DoP} andES{DoP2}. Referring back toFig. 9it may be seen that EO(33) provides for a large reductionof variability of discontinuities. For mostk, anES{DoP2} that is only 6% of the corresponding value of DSis achieved (the corresponding PO is limited to around 40–50% of DS). The cost for providing this largereduction of variability is a small increase by a factor of 1.02 (2% worse) ofES{DoP} as compared to DS(here PO performs poorly, as much as 8 times worse than DS). These results indicate that EO(33) couldprovide a significantly improved perceptual quality as compared to all other policies. Finally, althoughomitted for brevity, it can been shown that EO is much better than the TS policy (in[5] (see Footnote 2)we have compared PO and TS).

7. Applying the phase-aware optimal policy to a real-world video receiver

The obtained EO playout policies indicate that the amount of playout regulation required to obtainthe optimal performance varies with the regularity, that is, variation of the frame arrival process. Thisobservation suggests that an implemented receiver should also vary the extent of playout regulation withvarying network jitter.

To apply the analytically derived optimal playout policy ofSection 5to a real-world PVR, the imple-mented version of the derived optimal playout scheduler must be aware of the total number of phases inthe system upon a presentation completion instant (say of thenth frame). In the aforementioned analyti-cal model the phase occupancy process{In} provided that exact information by accounting for both thephases corresponding to the number of frames already in the buffer (kIn phases), as well as the phasescompleted by the ongoing arrival (Jn phases). In an implemented system{In} is onlypartially observable,since only one of its components is directly measurable—kIn, the number of phases that correspond tothe buffered frames.Jn, the second component that contributes to{In}, is unknown.

An estimator could be designed to carry-out the estimation ofJn and use the obtained result to selectthe action that corresponds tokIn phases plus the estimation ofJn. An estimateJn(En) can be derivedby using thelast arrival elapsed time En, which is the interval from the last frame arrival (an) up to thepresent time of completion of thenth playout (tn) (seeFig. 3). The scheduler logs the time of the lastframe arrival and uses it to measureEn. KnowingEn, the expected number of completed phases overthat interval can be computed, thus providing the estimateJn(En). This method has been evaluated bysimulation and has presented moderate success. Its performance, however, deviates significantly from thetheoretical performance whenk is large (k > 35). Additionally, the incorporation of the estimator addsto the complexity at the receiver.

We have derived a simple method that can overcome the problems associated with the estimation ofthe phase. This methods provides a performance that is very close to the theoretical optimal and is alwaysbetter than the simulation results obtained for the estimator. We approximate the theoretical phase-aware


N=30

510

1520

2530

i (buffer occupancy) 510

1520

2530

3540

4550


22242628303234363840

Ai (R') (prescribed action)

N=10

1 2 3 4 5 6 7 8 9 10i (buffer occupancy) 5

1015

2025

3035

4045

50


22242628303234363840

Ai (R') (prescribed action)

Fig. 10. The structure of “collapsed” EO policies for different jitter levels, 1≤ k ≤ 50. The two plots correspond to differentbuffer capacities:N = 30 andN = 10 (α = 33, β = 0).

optimal policy,Ropt, by an appropriate phase-unaware11 policy,R′opt, that introduces anequivalent amount

of playout regulation (at buffer extremes) resulting in asimilar playout quality. Our results show that aperformance very close to the theoretical optimal can be achieved this way.

The transformation of the phase-aware policyRopt, characterized by the functionAi(Ropt), for k ≤ i ≤(N+1)k−1, to the phase-unaware policyR′

opt, characterized by the functionAui (R

′opt), for 1 ≤ i ≤ N, is

carried out by averaging over the suggested actions for eachk-tuple of phases, corresponding to a singleframe occupancy, and use the averaged action for that frame occupancy after rounding it to the closestinteger:

Aui (R

′opt) = round

1

k

(i+1)·k−1∑j=i·k

Aj(Ropt)

, 1 ≤ i ≤ N. (10)

Fig. 10 depicts the structure ofR′opt for β = 0 and differentk, for two buffer capacities. These plots

correspond to phase-unaware versions of the phase-aware policy illustrated inFig. 5. They indicate thatfor the minimization ofES{DoP2}, the playout manipulation ought to “smooth” withk.

Fig. 11illustrates that a real-world video receiver, equipped with an estimator to determine the amountof network jitter, could significantly reduce the variation of the DoP metric by applying the phase-unawareversion of the optimal policies, all of which were determined off-line. Note that the lines that correspondto EO(33) and CEO(33) (collapsed EO) policies almost coincide over allk; the theoretical EO(33) andthe applied CEO(33) policies essentially provide for the same performance, which is always much betterthan that under the PO and DS policies.

An important question is whether a reducedES{DoP},ES{DoP2} is really equivalent to a significantlybetter visual perception. This is a matter that ought to be determined by future experiments. Even if itturns out that this is not always the case, it is possible to apply the techniques presented in this paper toobtain optimal policies with respect to other, perhaps more suitable (dis)continuity metrics.

11 We refer to a policyR as phase-unaware, if the playout duration depends solely on the number of frames in the buffer,In, thatis, it is independent ofJn. Otherwise, it is referred to as a phase-aware policy. Therefore, we can characterize a phase-unawarepolicyR by a functionAu

i (R), with 1 ≤ i ≤ N.


0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

5 10 15 20 25 30 35 40 45 50

norm

aliz

ed E

S{D

oP2 }


PO(33)EO(33)

CEO(33)

Fig. 11. The normalizedES{DoP2} performance with respect to DS for PO(33), EO(33) and CEO(33) (collapsed EO) for differentlevels of delay jitter, 1≤ k ≤ 50 (α = 33, β = 0).

8. Overall system architecture

In this section we describe some scenarios for the exploitation of the developed playout policies inimplemented PVRs.

The obtained theoretical EO policies, and their corresponding implementable CEO policies, are opti-mized for a given level of network jitter, captured by the Erlang parameterk. It is known, however, thatthe jitter in a best-effort network fluctuates over various time scales and is highly affected by a plethora ofparameters such as: the co-existing traffic mix, the underlying network technology, the distance betweenthe communicating end-points, etc. Two time scales of particular interest are: (1) the transient increasesof jitter in small time periods; (2) the somewhat permanent jitter, that relates to increased load in thenetwork, e.g., during the “peak hour” of operation.

Most playout schedulers in the past literature[1] try to monitor the current level of network jitter andadjust accordingly. When the network jitter is stable, most systems are able to conceal it by applyingan appropriate playout algorithm. A phenomenon that complicates the operation of a scheduler is theexistence of sharp “delay spikes” (sudden increases of the network delay of some frames) which havebeen reported by various studies[23,24]; they are attributed to causes such as: surges of peak traffic,administrative functions in routers that result in the distraction of the CPU from packet forwarding, etc.These delay spikes seriously degrade the presentation quality, in a PVR that otherwise can be thought asbeing in a “steady state” (having buffered enough frames to cope with the average jitter). Some systemsfor packet voice communications[25,26](which make use of timing information) operate in a dual mode,with the second mode of operation being devoted to the detection and the concealment of delay spikes.The normal mode of operation uses an estimator, which appreciates the current level of jitter, withouthaving to be very “reactive”, since jitter is assumed to drift rather slowly.

Our EO policies can be thought as an analogous way of handling the delay variability, with a buffer-oriented scheduler, destined for packet video communications. Under a given average jitter (reflected in


bufferplayout playout

scheduler

jitterestimator

policyselector

CEO policies(off-line computed)

arrivals

Fig. 12. Block diagram of an implemented system.

k), a delay spike is handled by the playout regulation towards the buffer extremes (avoiding long-lastingdiscontinuities: underflows and overflows). These delay spikes are modeled by the occasional peaks ofthek-Erlang, moreover, the produced behavior with differentk maps well to the real environment: smallk produce more peaks, as is the case when the network is congested, while largerk result in infrequentpeaks. On a larger time scale, e.g., oriented toward thetime-of-day load behavior, a PVR can switch toan appropriate EO policy for that jitter level.

Fig. 12depicts the envisaged implementation; a PVR estimates the current level of network jitter byobserving the frame interarrivals and “loads” the appropriate, off-line computed CEO playout policy. Theestimation of the average network jitter can be performed by a standard linear recursive estimator[27]that estimates the variance of interarrivals,Vi, at theith playout by using the corresponding interarrivaltimes,Xi:

Xi = g · Xi−1 + (1 − g) ·Xi, Vi = h · Vi−1 + (1 − h) · (Xi−1 −Xi)2. (11)

The estimatesXi, Vi may be used for the selection of the appropriatek that corresponds to the currentlyobserved jitter. This can be done by using the relationship Var{Y}/E2{Y} = 1/k which holds true for ak-Erlang distributed random variableY . Thus an estimatek is obtained by usingk = round((Xi)

2/Vi).The CEO that corresponds tok is loaded from the repository of precomputed playout policies. By lettingg, h, in (11)assume large values (close to 1) the estimatek remains stable, in the sense that transient delayspikes are filtered-out, not causing unnecessary changes of playout policy. Only permanent changes ofjitter are allowed to pass the filter and trigger the exchange of playout policy. Estimators such as(11)havebeen effectively employed in a wide range of applications, e.g., in the estimation of the mean and varianceof round trip delay for TCP[27], and in the estimation of jitter for audio playout applications[25,26].In all situations the filter weights regulate a tradeoff between the accuracy of the derived estimation(values close to 1) and the ability to quickly detect changes in the input. In the current application thefocus is on estimation quality, rather than on responsiveness, so high values should be preferred. The exactidentification of filter weights would require some fine tuning which would jointly cater to implementationand operation environment details and would typically remain fixed as has been the case in most similarapplications.

If more precise information, regarding the level of network jitter, can be gathered in a central authori-tative entity, then it might be meaningful to assign the selection of the appropriate playout policy to that


entity, instead of having each receiver decide. Apolicy server could generate/store the playout policiesand transmit them to each receiver prior to every video communication.

9. Conclusions

This paper has considered the problem of modeling and optimizing a PVR for different levels of de-lay jitter. To the best of our knowledge, this is the first analytical model capable of capturing a widearea of parameters, including: the level of delay jitter, elaborate intrastream synchronization metrics,and diverse playout policies. The performance evaluation model has been built around theEk/Di/1/Nqueue, which is then generalized into a corresponding MDP that is capable of deriving the optimalplayout policy for different levels of delay jitter. TheEk/Di/1/N queue allows for a sufficient mod-eling of the actual input traffic (two moment matching of the input process) and can be solved effi-ciently by specialized algorithms. The requirement for a traffic model that does not lead to state-spaceexposition has been a strict one. Notice that the current work is not limited to the performance eval-uation of the aforementioned system, but rather proceeds to identify its optimal solution which in-volves a systematic search in the entire solution space defined by all the possible playout policiesand all the possible states. The resulting optimization model has been successfully used for the deriva-tion of optimal playout policies for a realistic range of delay variability, as identified by actualmeasurements.

The gain from the optimization under ak-Erlang distribution for the modeling of frame interarrivalshas been demonstrated by a numerical comparison against previous models that make a Poissonianassumption for the input traffic. The presented numerical results have been used to show that: (1) theamount of playout manipulation ought to smooth-out with the regularity of the input traffic; (2) a temporalspreading of discontinuities can be enforced by the playout scheduler. This can potentially lead to theirconcealment by exploiting human perceptual limitations in the detection of motion. The EO(33) policyderived here utilizes this observation to derive a potentially improved playout quality.

The theoretical optimal playout policy has been transformed into an approximately optimal one thatutilizes observable information and it is, thus, feasible to implement. The resulting playout policies maybe exploited in implemented systems where the delay jitter is known to vary across different time scales.A dual model of operation, similar in conception to a previous system for spoken voice, can be constructedas follows. A repository of off-line computed playout policies is constructed for different levels of delayjitter, targeting a large-scale characterization of traffic conditions (congestion, high load, low load). Areceiver uses an estimator to select the policy that best suits the observed conditions and monitors the inputtraffic, issuing changes of playout policy only under permanent changes in traffic conditions. Transientirregularities in the input (large underflows, clustered arrivals) are handled by the regulation of playoutrate as enforced by the derived playout policies.

Acknowledgements

The authors would like to thank Mr. Nikolaos Labris for helping in conducting the UoA–ASU delayexperiment, as well as the anonymous reviewers for their valuable suggestions that helped in improvingthe overall quality of the paper.


Appendix A. Value-iteration algorithm

The value-iteration algorithm is particularly suitable for MDPs with a large state-space. Contrary topolicy-iteration and linear-programming solutions—which in each iteration require the solution of a linearsystem of equations of size equal to the state-space of the problem—the value-iteration algorithm avoidslarge systems of equations by using a recursive solution from dynamic programming.

The algorithm is based on the computation of a sequence ofvalue-functions, Vn(i): ∀i ∈ S andn =1,2, . . . , which approximate the minimal average cost per unit time. In the following we outline theoperation of the algorithm. A good reference for more details is[28]:

Step 0: Select the initial value functionV0(i) such that: 0≤ V0(i) ≤ mina ci(a)∀i ∈ S. Setn = 1.Step 1: Update the value functionVn(i)∀i ∈ S by using

Vn(i) = mina∈A

ci(a)+

∑j∈S

pij(a)Vn−1(j)

and identify the policyRn, which at thenth iteration, minimizes the right-hand side of this equationfor all i ∈ S.

Step 2: Compute the upper,Mn, and lower,mn, bound by using

Mn = maxj∈S

{Vn(j)− Vn−1(j)}, mn = minj∈S

{Vn(j)− Vn−1(j)}.

The algorithm completes, returning the desired policyRopt = Rn when 0≤ Mn − mn ≤ ε · mn,whereε is a (small) tolerance number. Otherwise, proceed to Step 3.

Step 3: Setn = n+ 1 and go to Step 1.

References

[1] N. Laoutaris, I. Stavrakakis, Intrastream synchronization for continuous media streams: a survey of playout schedulers,IEEE Network Mag. 16 (3) (2002).

[2] D.L. Stone, K. Jeffay, An empirical study of a jitter management scheme for video teleconferencing, Multimedia Syst. 2(2) (1995).

[3] M.C. Yuang, S.T. Liang, Yu.G. Chen, C.L. Shen, Dynamic video playout smoothing method for multimedia applications,in: Proceedings of the IEEE International Conference on Communications (IEEE ICC), Dallas, TX, June 1996, p. S44.

[4] N. Laoutaris, I. Stavrakakis, Adaptive playout strategies for packet video receivers with finite buffer capacity, in: Proceedingsof the IEEE International Conference on Communications (IEEE ICC), Helsinki, Finland, June 2001.

[5] N. Laoutaris, I. Stavrakakis, An analytical design of optimal playout schedulers for packet video receivers, Comput.Commun. 26 (4) (2003) 294–303.

[6] D. Frankowski, J. Riedl, Hiding jitter in an audio stream, Technical Report No. TR-93-50, Department of Computer Science,University of Minnesota, 1993.

[7] E. Altman, C. Barakat, V.M.R. Ramos, On the utility of FEC mechanisms for audio applications, in: Proceedings of theSecond International Workshop on Quality of Future Internet Services (QofIS2001), Coimbra, Portugal, September 2001.

[8] M.C. Yuang, S.T. Liang, Yu.G. Chen, Dynamic video playout smoothing method for multimedia applications, MultimediaTools Appl. 6 (1) (1998) 47–60.

[9] M. Luoma, M. Ilvesmaki, M. Peuhkuri, Source characteristics for traffic classification in differentiated services type ofnetworks, in: Proceedings of the Internet III: Quality of Service and Future Directions (VV01), SPIE, Boston, USA,September 1999.


[10] C. Jin, S. Jamin, Design, implementation, and end-to-end evaluation of a measurement-based admission control algorithmfor controlled-load service, in: Proceedings of the International Workshop on Network and Operating System Support forDigital Audio and Video (NOSSDAV), Cambridge, UK, July 1998.

[11] B. Adamson, The mgen toolset.http://manimac.itd.nrl.navy.mil/MGEN/.[12] M. Conti, Modeling mpeg scalable sources, Multimedia Tools Appl. 13 (2) (2001) 127–145.[13] A. Chimienti, M. Conti, E. Gregory, M. Lucenteforte, R. Picco, Mpeg-2 sources: exploiting source scalability for an efficient

bandwidth allocation, Multimedia Syst. 8 (3) (2000) 240–255.[14] G. Latouche, V. Ramaswami, Introduction to Matrix Analytic Methods in Stochastic Modeling, ASA–SIAM Series on

Statistics and Applied Probability, 1999.[15] D.M. Lucantoni, New results on the single server queue with a batch Markovian arrival process, Stochast. Models 7 (1)

(1991) 1–46.[16] D.M. Lucantoni, K.S. Meier-Hellstern, M.F. Neuts, A single server queue with server vacations and a class of non-renewal

arrival processes, Adv. Appl. Probab. 22 (1990) 676–705.[17] D. Loguinov, H. Radha, Large-scale experimental study of Internet performance using video traffic, ACM SIGCOMM

Comput. Commun. Rev. 32 (1) (2002).[18] A.L. Truslove, Queue length for theEk/G/1 queue with finite waiting room, Adv. Appl. Probab. 7 (1975) 215–226.[19] M.F. Neuts, Matrix-geometric Solutions in Stochastic Models—An Algorithmic Approach, Dover, New York, 1995.[20] G. Latouche, P.A. Jacobs, D.P. Gaver, Finite Markov chain models skip-free in one direction, Naval Res. Logistics Quart.

31 (1984) 571–588.[21] M. Claypool, J. Tanner, The effects of jitter on the perceptual quality of video, in: Proceedings of the ACM Multimedia’99,

Orlando, FL, 1999.[22] S.M. Ross, Applied Probability Models with Optimization Applications, Dover, New York, 1992.[23] J.-C. Bolot, End-to-end packet delay and loss behavior in the Internet, in: D.P. Sidhu (Ed.), Proceedings of the SIGCOMM

Symposium on Communications Architectures and Protocols, San Francisco, CA, September 1993, ACM, New York,pp. 289–298;J.-C. Bolot, Comput. Commun. Rev. 23 (4) (October 1992).

[24] V. Paxson, End-to-end Internet packet dynamics, in: Proceedings of the SIGCOMM Symposium on CommunicationsArchitectures and Protocols, Cannes, France, September 1997.

[25] R. Ramjee, J. Kurose, D. Towsley, H. Schulzrinne, Adaptive playout mechanisms for packetized audio applications inwide-area networks, in: Proceedings of the Conference on Computer Communications (IEEE Infocom), Toronto, Canada,IEEE Computer Society Press, Los Alamitos, CA, June 1994, pp. 680–688.

[26] S.B. Moon, J. Kurose, D. Towsley, Packet audio playout delay adjustment: performance bounds and algorithms, ACM/Springer Multimedia Syst. 5 (1) (1998) 17–28.

[27] V. Jacobson, M.J. Karels, Congestion avoidance and control, in: Proceedings of the SIGCOMM Symposium onCommunications Architectures and Protocols, ACM, New York, November 1998.

[28] H.C. Tijms, Stochastic Modelling and Analysis: A Computational Approach, Wiley, New York, 1986.

Nikolaos Laoutaris received his B.Sc. degree in computer science in 1998 and his M.Sc. degree in telecom-munications and computer networks in 2001, both from the Department of Informatics and Telecommu-nications, University of Athens, Greece. He is currently a Ph.D. student at the same department workingin the area of multimedia networking and content distribution. His main interests include real-time net-work communications, content distribution and web caching, and performance evaluation of networksand systems.

Benny Van Houdt graduated in mathematics and computer science, and obtained his Ph.D. degree in science from the Universityof Antwerp (Belgium) in 1997, and 2001, respectively. From 1997 until 2001 he held an assistant position at the Universityof Antwerp. Currently, he is a postdoctoral fellow of the FWO-Flanders. His main research interest goes to the performanceevaluation and stochastic modeling of wired and wireless telecommunication networks, mainly using matrix analytic methods

http://manimac.itd.nrl.navy.mil/MGEN/


(MAM). He has published a variety of papers, containing both theoretical and practical contributions, in a broad range ofinternational journals and conferences.

Ioannis Stavrakakis received his Diploma in electrical engineering, Aristotelian University of Thessa-loniki (Greece), 1983; Ph.D. in EE, University of Virginia (USA), 1988; Assistant Professor in CSEE,University of Vermont (USA), 1988–1994; Associate Professor of ECE, Northeastern University, Boston(USA), 1994–1999; Associate Professor of Informatics and Telecommunications, University of Athens(Greece), 1999–2002 and Professor since 2002. Teaching and research interests are focused on resourceallocation protocols and traffic management for communication networks, with recent emphasis on con-tinuous media applications and ad hoc networking. His past research has been published in over 100scientific journals and conference proceedings and was funded by NSF, DARPA, GTE, BBN and Mo-torola (USA) as well as Greek and European Union (IST) Funding agencies. He has served repeatedly in

NSF and IST research proposal review panels and involved in the organization of numerous conferences sponsored by IEEE,ACM, ITC and IFIP societies. He is a senior member of IEEE, a member of (and has served as an elected officer for) the IEEETechnical Committee on Computer Communications (TCCC) and the chairman of IFIP WG6.3. He has served as a co-organizerof the 1996 International Teletraffic Congress (ITC) Mini-Seminar, the organizer of the 1999 IFIP WG6.3 workshop, a technicalprogram Co-chair for the IFIP Networking 2000 conference, the Vice-General Chair for Networking 2002 conference and the or-ganizer of the COST-IST(EU)/NSF(USA)-sponsored Networking’03. He is an associate editor for the IEEE/ACM Transactionson Networking, the ACM/Baltzer Wireless Networks Journal and the Computer Networks Journal.

optimization of a packet video receiver under different ...vanhoudt/papers/2004/peva_nikos.pdf ·...

Documents