[ieee 2009 picture coding symposium (pcs) - chicago, il, usa (2009.05.6-2009.05.8)] 2009 picture...

PRACTICAL NETWORK CODING FOR SCALABLE VIDEO IN ERROR PRONE NETWORKS

Mohammed Halloush, Member, IEEE, and Hayder Radha, Senior Member, IEEE
Department of Electrical and Computer Engineering, Michigan State University

{halloush, radha}@msu.edu

ABSTRACT

In this paper we apply a generalized approach to practical Network Coding (NC), called Multi-Generation Mixing (MGM), in networks communicating scalable video content. NC is a viable approach to communication in packet loss networks. However, NC losses are expensive: they reduce the ability of the receiver to decode packets, and hence may severely degrade the quality of recovered video. MGM exploits the layering of scalable video streams to enhance the reliability of communication. MGM provides unequal protection for the different video layers such that the overall reliability of video communication is improved. With MGM, instead of enhancement layers merely depending on lower layers, enhancement layers support the recovery of lower layers. This is done by network encoding packets of lower layers in higher layers. Through extensive simulations, we show that MGM substantially improves the quality of recovered video.

Keywords: Scalable Video Coding (SVC), Network Coding (NC), Multi-Generation Mixing (MGM).

1. INTRODUCTION

Improving the reliability of video communication in packet loss networks is a QoS requirement for many applications. Owing to its benefits [1], network coding is an appealing approach for enhancing the quality of video communicated in packet loss networks. Scalable Video Coding (SVC) supports the encoding of video in layers: a base layer and one or more enhancement layers. SVC supports the decoding of high quality video from a bit stream that consists of multiple sub-streams.

This work was supported in part by NSF Award CNS-0721550, NSF Award CCF 0728996, and NSF Award CCF-0515253.

H.264 Scalable Video Coding (SVC) is the latest amendment [2] to the single-layer H.264 Advanced Video Coding (AVC) standard [3]. This scalable extension has been standardized by the Joint Video Team (JVT). In this paper we evaluate the performance of practical network coding when applied in networks communicating scalable video content.

Network Coding (NC) is becoming an increasingly popular communication approach [4-7]. With traditional generation-based (G-based) network coding, packets are grouped in generations, and encoding and decoding are performed among packets of the same generation. A generation is decodable once a sufficient number of independent packets has been received; otherwise it is lost [5, 6]. Applying generation-based NC [6] to video may further aggravate the well-known dependency among coded video frames. In particular, with generation-based network coding, the generation is the minimum unit of data loss. The loss of one or more generations causes major losses in dependent video frames, which may have a severe effect on the quality of recovered video.

Multi-generation mixing aims to improve the performance of generation-based network coding by improving the reliability of delivering video generations and hence improving decodable rates. MGM supports layered transmission of scalable video by providing different levels of reliable communication to the different layers of scalable video. In this paper we apply multi-generation mixing in networks communicating scalable video and show the improvements achieved over traditional generation-based network coding [6].

The rest of the paper is organized as follows. In Section 2, we give a brief overview of MGM encoding and decoding. In Section 3, we describe how MGM is applied to scalable video. In Section 4, we evaluate the performance of MGM and compare it with generation-based NC of video. Finally, we conclude in Section 5.

2. OVERVIEW OF MULTI-GENERATION MIXING

In this section we give a brief overview of MGM encoding and decoding; a detailed discussion is provided in [8]. With generation-based (G-based) network coding, packets are grouped in generations, where encoding is performed among packets of the same generation. For a receiver to be able to decode a generation of size k, at least k independent encoded packets are needed. If fewer than k independent packets of that generation are received, the generation is lost.

With multi-generation mixing, in addition to grouping packets in generations, generations are grouped in mixing sets, where the size of a mixing set is m generations. Each generation in the mixing set has a position index that indicates its relative position within the mixing set. The first generation in the mixing set has position index zero and the last has position index m-1. MGM encoding is performed on packets of multiple generations depending on the position indices. For a node to generate an encoded packet associated with a generation of position index l, that node encodes all packets it has that are associated with generations of position indices less than or equal to l in the same mixing set. With generation-based network coding, the number of packets encoded to generate an encoded packet is fixed; it is equal to the generation size. With multi-generation mixing, on the other hand, the number of packets encoded to generate an encoded packet depends on the generation position index: as the position index increases, the number of packets encoded increases. Figures 1 and 2 illustrate generation-based and multi-generation mixing encoding, respectively.

MGM decoding is done generation by generation as generations are received (incremental decoding). If a generation is unrecoverable because too few encoded packets were received, it is still possible to decode that generation collectively as part of a subset of mixing set generations [9].

It is worth mentioning that traditional generation-based network coding [6] is a special case of MGM in which the mixing set size is one (m=1). In this case encoding is performed within the generation and no inter-generation mixing is performed.
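The encoding rule of this section can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a sender that holds all source packets of a mixing set, and it uses random coefficients over GF(2) (plain XOR) for brevity, whereas the finite field is not specified here and larger fields are common in practice. The function name `mgm_encode` and the values of k and m are hypothetical.

```python
import random


def mgm_encode(mixing_set, position_index, rng):
    """Encode one packet for the generation at `position_index`.

    mixing_set: list of generations; each generation is a list of
    equal-length `bytes` source packets held by the encoding node.
    Returns (coeffs, payload): one 0/1 coefficient per mixed source
    packet and the XOR of the selected packets (GF(2) is an assumption).
    """
    # MGM rule: mix every packet of generations 0..position_index.
    sources = [pkt for gen in mixing_set[: position_index + 1] for pkt in gen]
    coeffs = [rng.randint(0, 1) for _ in sources]
    if not any(coeffs):
        # Avoid the all-zero combination, which carries no information.
        coeffs[rng.randrange(len(coeffs))] = 1
    payload = bytearray(len(sources[0]))
    for c, pkt in zip(coeffs, sources):
        if c:
            for i, byte in enumerate(pkt):
                payload[i] ^= byte
    return coeffs, bytes(payload)


if __name__ == "__main__":
    rng = random.Random(1)
    k, m = 2, 3  # generation size and mixing set size (illustrative values)
    mixing_set = [[bytes([16 * g + p] * 4) for p in range(k)] for g in range(m)]
    for l in range(m):
        coeffs, _ = mgm_encode(mixing_set, l, rng)
        print(f"position index {l}: {len(coeffs)} packets mixed")
```

With a mixing set of size one the loop body runs only for position index zero and mixes exactly k packets, which matches the generation-based special case.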

3. NETWORK VIDEO CODING WITH MGM

For the practical communication of video, video frames need to be packetized into packets of limited size. To apply NC in networks communicating video content, video packets are grouped in generations. With generation-based NC, encoding is performed on video packets of the same generation. To apply MGM, generations of video packets are grouped in mixing sets. MGM supports the scalability of video by providing different levels of reliable communication to the different video layers. This can be done by mapping each video layer to generations of a particular position index in consecutive mixing sets. For example, the video base layer is mapped to the first generations of consecutive mixing sets. These are generations with position index zero, which are encoded in all succeeding generations in the same mixing set. The first enhancement layer is mapped to generations with position index one, which are encoded in all succeeding generations in the mixing set. Figure 3 illustrates the mapping of a video stream of two layers to generations of mixing sets of size two. In Figure 3, base (enhancement) video packets are grouped in base (enhancement) layer generations. Video packets of the base layer are mapped to generations with position index zero. Video packets of the enhancement layer are mapped to generations with position index one.

Figure 1: Generation based network coding, m generations are encoded separately.

Figure 2: Multi-generation mixing, each generation is encoded with previous generations in a mixing set of size m.

With generation-based network coding, when a generation is lost all packets of the generation are lost, which means the loss of the corresponding video frames as well as all dependent video frames (from the same video layer or other layers). With MGM, on the other hand, when a generation is unrecoverable there is a chance that it will be recovered collectively with upcoming generation(s) in the same mixing set. This is possible because the information of each generation is carried in all succeeding generations in the mixing set.
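The collective recovery described in this section can be illustrated with a rough counting sketch. It is not the authors' decoder: it only counts packets, and it assumes that every received packet is as linearly independent as its mixing pattern allows (an idealization). Position index 0 stands for the base layer and position index 1 for the enhancement layer; the function names are hypothetical.

```python
def gbased_decodable(received, k):
    """G-based NC: each generation stands alone and needs k packets."""
    return [r >= k for r in received]


def mgm_decodable_prefix(received, k):
    """Number of leading generations of a mixing set that can be decoded.

    received[i] counts packets associated with position index i. Such a
    packet mixes generations 0..i, so it can help any of them but none
    of the later ones. Under the independence assumption, generations
    0..n-1 are jointly decodable when every trailing group c..n-1 has
    received at least (n - c) * k packets.
    """
    best = 0
    for n in range(1, len(received) + 1):
        surplus = 0
        feasible = True
        for c in range(n - 1, -1, -1):
            surplus += received[c] - k
            if surplus < 0:
                feasible = False
                break
        if feasible:
            best = n
    return best


if __name__ == "__main__":
    k = 4
    received = [3, 5]  # base generation one packet short, enhancement one extra
    print(gbased_decodable(received, k))      # [False, True]
    print(mgm_decodable_prefix(received, k))  # 2
```

In this example the base layer generation is lost under generation-based coding, while under MGM the extra enhancement layer packet makes up for the missing base layer packet and both generations are decoded together. The reverse case, `[5, 3]`, recovers only the base layer, which reflects the unequal protection of the layers.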

4. EVALUATION AND RESULTS

The performance of network coding on scalable video was evaluated with extensive simulations. The Foreman video sequence of 300 frames was encoded, decoded, packetized and evaluated under generation-based (G-based) and Multi-Generation Mixing (MGM) network coding. JSVM 9.4 was used to encode the video in two and three coarse-grain scalability (CGS) layers such that a fixed bit rate was targeted for each layer. Each layer is encoded with five temporal levels at 1.875, 3.75, 7.5, 15 and 30 fps. Table 1 shows the video stream layers, bit rates, and overall PSNR.

Table 1: Bit rates and Y-PSNR for 300 video frames of SVC Foreman sequence.

Number of Video Layers    Bit Rate (Kbps)                  Y-PSNR (dB)
                          Layer 1    Layer 2    Layer 3
Two Layers                145.96     301.07     -          32.9
Three Layers              95.59      198.28     295.91     30.7

Figure 3: Video of two layers, base (enhancement) layer packets are grouped in base (enhancement) layer generations in mixing sets of size two (m=2). Each mixing set consists of two consecutive generations, one from each video layer.

The packetized video streams were transmitted using a network simulator that was developed for the purpose of evaluating practical network coding (G-based and MGM). For the two-layer video, G-based network coding and MGM with m=2 were applied and the results were compared. For the three-layer video, G-based network coding and MGM with m=3 were applied and the results were compared. The network topology consists of 400 nodes distributed randomly in an area of 20×20: in each 1×1 unit area there is a randomly positioned node that can communicate directly with all nodes within a radius of 1.5. The network is of broadcast nature; the sender is at one corner of the topology and the receiver is selected randomly to be close to the opposite corner. A large number of simulations were run in which performance was evaluated using different generation sizes. The quality of video at the selected receiver and the decodable rates achieved by all nodes were evaluated.

As shown in Figures 4 and 5, MGM achieves a major improvement over G-based NC in the quality of recovered video at the receiver. This improvement is achieved across the different generation sizes. We note that MGM with m=3 (Figure 5) achieves almost full video quality at the receiver. For MGM with m=2 (Figure 4), on the other hand, the quality of recovered video is degraded in some scenarios (generation size 30). This indicates that the larger the mixing set, the more reliable the video communication.

Figure 4: PSNR with multi-generation mixing and generation based for Foreman SVC sequence of two layers. [Plot: PSNR versus generation size (10 to 50); curves: G-based and MGM, m=2.]

Figure 5: PSNR with multi-generation mixing and generation based for Foreman SVC sequence of three layers. [Plot: PSNR versus generation size (10 to 50); curves: G-based and MGM, m=3.]

Figures 6 and 7 show the percent of unrecovered packets at the receiver. With generation-based NC, as the generation size increases there is an increase in the percent of unrecovered packets and hence a degradation in the quality of recovered video (as shown in Figures 4 and 5). The reason is that the cost of losses increases as the generation size increases. With MGM, on the other hand, the percent of unrecovered packets is decreased due to the supported inter-generation mixing.

Figure 8 shows the percent of decoded packets of G-based NC and of MGM with m=2 relative to MGM with m=3. The percent of decoded packets is evaluated over all topology nodes for different generation sizes. As shown in the figure, MGM with m=2 achieves over 89% of the decodable rate of MGM with m=3. With generation-based NC, on the other hand, the decodable rate is between 32% and 75% of that of MGM with m=3.

The improvement of MGM comes at the cost of computational overhead. This is the overhead of checking the usefulness (independence) of received encoded packets and/or decoding. For more about the computational overhead of MGM, the reader can refer to [9].
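The usefulness check mentioned in this section amounts to testing whether the coefficient vector of a received packet is linearly independent of those already stored. A minimal sketch follows, assuming GF(2) coefficients packed into Python integers; the field and data layout used in the simulations are not stated here, and the function name `add_if_innovative` is hypothetical.

```python
def add_if_innovative(basis, coeffs):
    """Insert `coeffs` into `basis` if it is linearly independent over GF(2).

    coeffs: coefficient vector packed into an int (bit j is the
    coefficient of source packet j). basis: dict mapping a leading-bit
    position to the stored vector with that leading bit. Returns True
    when the packet is useful, False when it is a combination of the
    vectors already stored.
    """
    v = coeffs
    while v:
        pivot = v.bit_length() - 1
        if pivot not in basis:
            basis[pivot] = v
            return True
        v ^= basis[pivot]  # eliminate the leading bit and keep reducing
    return False


if __name__ == "__main__":
    basis = {}
    for vec in (0b011, 0b101, 0b110, 0b100):
        print(bin(vec), add_if_innovative(basis, vec))
    print("rank:", len(basis))
```

In this run the third vector is rejected because it is the XOR of the first two. Each check may reduce the incoming vector against every stored vector, and with MGM a vector associated with position index l spans up to (l+1)k source packets rather than k, which gives a sense of where the additional overhead comes from.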

5. CONCLUSION

MGM network coding improves the reliability of communicating scalable video content. MGM allows encoding across network coding generations such that the expensive network coding losses are decreased. The results presented in this paper show major improvements achieved by MGM when applied in networks communicating scalable video.

REFERENCES

[1] R. Ahlswede, N. Cai, S. Li, and R. Yeung, "Network Information Flow," IEEE Transactions on Information Theory, July 2000.

[2] ITU-T and ISO/IEC JTC1, JVT-W201, "Joint Draft 10 of SVC Amendment," Apr. 2007.

[3] ISO/IEC 14496-10:2003, "Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding."

[4] A. Kamra, V. Misra, J. Feldman, and D. Rubenstein, "Growth codes: maximizing sensor network data persistence," SIGCOMM Comput. Commun. Rev., vol. 36, pp. 255-266, 2006.

[5] C. Gkantsidis and P. Rodriguez, "Network Coding for Large Scale Content Distribution," INFOCOM 2005.

[6] P. Chou, Y. Wu, and K. Jain, "Practical network coding," Allerton Conference on Communication, Control, and Computing, Monticello, IL, October 20, 2003.

[7] S. Katti, D. Katabi, W. Hu, H. Rahul, and M. Medard, "The Importance of Being Opportunistic: Practical Network Coding for Wireless Environments," Allerton, 2005.

[8] M. Halloush and H. Radha, "Network Coding with Multi-generation Mixing," CISS, 2008.

[9] M. Halloush and H. Radha, "A Case Study of: Sender Transmission Reliability and Complexity Using Network Coding with Multi-generation Mixing," CISS 2009, MD, USA, March 2009.

Figure 6: Percent of unrecovered packets at the selected receiver. With MGM m=2 an improved decodable rate is achieved. [Plot: percent of unrecovered packets versus generation size (10 to 50); curves: G-based and MGM, m=2.]

Figure 7: Percent of unrecovered packets at the selected receiver. With MGM m=3 almost full decodable rate is achieved. [Plot: percent of unrecovered packets versus generation size (10 to 50); curves: G-based and MGM, m=3.]

Figure 8: Ratio of network decodable packets received with generation based NC and with MGM m=2 to that with MGM m=3, averaged over all nodes. [Plot: percent of recovered packets versus generation size (10 to 50); curves: G-based and MGM, m=2.]
