A final project report on

Residual DPCM for improving Inter Prediction in HEVC for Lossless Screen Content Coding

Under the guidance of Dr. K. R. Rao

For the fulfillment of the course Multimedia Processing (EE5359)Spring 2015

Submitted by

Siddu Basawaraj Pratapur

UTA ID: 1001053422

Email id: [email protected]

Department of Electrical Engineering

The University of Texas at Arlington


Table of Contents

1. Objective of the project

2. Basic Concepts of Video Coding

2.1 Color Spaces

3. H.265 / High Efficiency Video Coding

3.1 Introduction

3.2 Encoder and Decoder in HEVC

3.3 Features of HEVC

3.3.1 Coding tree units and coding tree block (CTB) structure

3.3.2 Coding units (CUs) and coding blocks (CBs)

3.3.3 Prediction units and prediction blocks (PBs)

3.3.4 TUs and transform blocks

3.3.5 Motion vector signalling

3.3.6 Motion compensation

3.3.7 Intra-picture prediction

3.3.8 Quantization control

3.3.9 Entropy coding

3.3.10 In-loop de-blocking filtering

3.3.11 Sample adaptive offset (SAO)

4. Introduction to Screen Content Coding

4.1 Introduction

4.2 Analysis of HEVC on Screen Content Coding

4.3 Angular Intra Prediction

4.4 Inter-Prediction

4.5 Loop Filters

5. Residual DPCM in HEVC inter-prediction

5.1 General considerations and the HEVC coding structure

5.2 General method for inter RDPCM

5.3 Additional tools for inter RDPCM

6. Test Configurations

6.1 Intra-only configuration

6.2 Low-delay configuration

6.3 Random-access configuration

7. Comparison Metrics

7.1 Peak Signal to Noise Ratio

7.2 Bjontegaard Delta Bit-rate (BD-BR) and Bjontegaard Delta PSNR (BD-PSNR)

7.3 Implementation Complexity

8. Test Sequences

9. Implementation

9.1 Configuration profiles used for comparison

9.2 Parameters modified

9.3 Sample command line parameters for HM-16.4+SCM-4.0RC1

9.3.1 Encoding

9.3.2 Decoding

9.4 Testing Platform

9.5 Tabular columns for test sequence parameters

9.6 Graphs for test sequence parameters

9.6.1 Bit-rate

9.6.2 Size of the binary file

9.6.3 %BD Bit-rate

9.6.4 BD-PSNR

9.6.5 Encoding time

9.6.6 Decoding time

References

Acknowledgement

I would like to acknowledge Dr. Rao for his continuous support and guidance during the course of the project. I thank him for providing necessary feedback and dedicating his precious time in reviewing the reports and presentation slides at each step.

I would also like to extend my gratitude to Ms. S.C. Kodpadi and Ms. N.N. Mundgemane for helping me with the screen content test sequences and with other related issues that I faced during the course of the project.

This project wouldn’t have been successful without your continuous guidance and efforts.

List of Acronyms and Abbreviations


AVC : Advanced Video Coding.

B-frame : Bi-predictive frame.

BD-BR: Bjontegaard Delta Bit-rate.

BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.

CABAC: Context Adaptive Binary Arithmetic Coding

CTB: Coding Tree Block.

CTU: Coding Tree Unit.

CU: Coding Unit.

CB : Coding Block

DCT : Discrete Cosine Transform.

DBF: De-blocking Filter.

GPB : Generalized P and B picture Blocks.

HEVC: High Efficiency Video Coding.

HM: HEVC Test Model.

HP : Hierarchical Prediction.

I-frame : Intra-coded frame.

JCT: Joint Collaborative Team.

JCT-VC: Joint Collaborative Team on Video Coding.

JM: Joint Model (H.264 reference software).

JPEG: Joint Photographic Experts Group.

MV : Motion Vector.

MC: Motion Compensation.

ME: Motion Estimation.

MPEG: Moving Picture Experts Group.

P-frame : Predicted frame.

PC : Prediction Chunking.

PU : Prediction Unit.

PB: Prediction Block.

PSNR : Peak Signal to Noise Ratio.

QP: Quantization Parameter.

RDPCM: Residual Differential Pulse Code Modulation.

SAO: Sample Adaptive Offset.

TB: Transform Block.

TU: Transform Unit.

VCEG: Video Coding Experts Group.


Abstract

In this project, RDPCM is applied to inter-predicted residuals and tested in the context of the HEVC range extension development [8]. Video content containing computer-generated objects is usually referred to as screen content and is becoming popular in applications such as desktop sharing and wireless displays. Screen content images and videos are characterized by high-frequency details such as sharp edges and high-contrast areas. In these areas, classical lossy coding tools (spatial transform plus quantization) may significantly compromise quality and intelligibility. Lossless coding is therefore used instead, and improved coding tools should be devised specifically for screen content. The proposed method exploits the spatial correlation present in blocks containing edges or text, which are poorly predicted by motion compensation. Compared with HEVC lossless coding as specified in version 1 of the standard, the proposed algorithm is expected to achieve up to 8% average bit-rate reduction without increasing the overall decoding complexity [1].


1. Objective of the project

The objective of this project is to introduce inter Residual Differential Pulse Code Modulation (inter RDPCM) applied to motion compensated residuals in lossless screen content coding (SCC) scenarios. The novelty brought by this paper is twofold: first, the proposed inter RDPCM is applied to the HEVC standard at three different levels of granularity, namely the coding unit (CU), prediction unit (PU) or transform unit (TU) level. In particular, three DPCM prediction modes (vertical, horizontal or no DPCM) are considered independently. Second, two additional tools are proposed for inter RDPCM: Prediction Chunking (PC) and Hierarchical Prediction (HP). PC can be used to improve the overall throughput, thus decreasing complexity. On the other hand, HP can be used to improve the compression efficiency of the proposed inter RDPCM method.

In summary, the project uses the inter RDPCM coding tool to improve inter prediction in lossless screen content coding, together with two further tools: PC [1], which reduces complexity, and HP [1], which increases compression efficiency.
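The three DPCM modes named above (vertical, horizontal or no DPCM) can be illustrated with a minimal sketch. It is not the HM implementation: the function names are hypothetical, and the sum of absolute values used to pick a mode is only an assumed stand-in for the rate estimate a real encoder would use.

```python
# Minimal sketch of residual DPCM on a motion-compensated residual block.
# All names are illustrative; the cost measure is an assumption.
MODES = ("none", "horizontal", "vertical")


def rdpcm_forward(residual, mode):
    """Replace each residual sample by its difference from the left or upper neighbor."""
    height, width = len(residual), len(residual[0])
    out = [row[:] for row in residual]
    if mode == "horizontal":
        for i in range(height):
            for j in range(1, width):
                out[i][j] = residual[i][j] - residual[i][j - 1]
    elif mode == "vertical":
        for i in range(1, height):
            for j in range(width):
                out[i][j] = residual[i][j] - residual[i - 1][j]
    elif mode != "none":
        raise ValueError("unknown RDPCM mode: " + mode)
    return out


def rdpcm_inverse(coded, mode):
    """Undo rdpcm_forward by accumulating along the prediction direction."""
    height, width = len(coded), len(coded[0])
    rec = [row[:] for row in coded]
    if mode == "horizontal":
        for i in range(height):
            for j in range(1, width):
                rec[i][j] += rec[i][j - 1]
    elif mode == "vertical":
        for i in range(1, height):
            for j in range(width):
                rec[i][j] += rec[i - 1][j]
    elif mode != "none":
        raise ValueError("unknown RDPCM mode: " + mode)
    return rec


def abs_cost(block):
    """Assumed cost proxy: sum of absolute values of the samples to be coded."""
    return sum(abs(v) for row in block for v in row)


def best_mode(residual):
    """Pick the mode whose DPCM output has the smallest cost proxy."""
    return min(MODES, key=lambda m: abs_cost(rdpcm_forward(residual, m)))
```

Because the differences are taken on the residual itself and no quantization is involved, the inverse recovers the residual exactly, which is what makes the scheme usable in lossless coding.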

The simulation will be conducted using HM 16.4 software [18], with different video sequences [3], search range, block sizes and number of frames using GPU multi-core computing.


2. Basic Concepts of Video Coding

2.1 Color Spaces

The common color spaces for digital image and video representation are:

RGB color space – Each pixel is represented by three numbers indicating the relative proportions of red, green and blue.

YCrCb color space – Y is the luminance component, a monochrome version of the color image. Y is a weighted average of R, G and B:

$Y = k_r R + k_g G + k_b B$, where $k_r$, $k_g$ and $k_b$ are the weighting factors.

The color information is represented as color differences or chrominance components, where each chrominance component is the difference between R, G or B and the luminance Y.

As the human visual system is less sensitive to color than to luminance, YCrCb has an advantage over the RGB space: the chrominance components can be represented at a lower resolution, which reduces the amount of data without impairing the visual quality [4].
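The weighted-average definition of Y can be made concrete with a short sketch. The weights below are the ITU-R BT.601 values, chosen here only as an assumption (the report does not state which weights it uses), and the function name is hypothetical.

```python
# Sketch of an RGB -> YCbCr conversion following Y = kr*R + kg*G + kb*B.
# The BT.601 weights are an assumption; other standards use other values.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Return (Y, Cb, Cr) with chroma expressed as scaled color differences."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))  # scaled difference between blue and luma
    cr = (r - y) / (2.0 * (1.0 - KR))  # scaled difference between red and luma
    return y, cb, cr
```

For a gray pixel (R = G = B) both chrominance values are zero, which reflects that all of the information is carried by the luminance component.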

The popular patterns of sampling [4] are:

4:4:4 – The three components Y, Cr and Cb have the same resolution; that is, for every 4 luminance samples there are 4 Cr and 4 Cb samples.

4:2:2 – For every 4 luminance samples in the horizontal direction, there are 2 Cr and 2 Cb samples. This representation is used for high quality video color reproduction.

4:2:0 – The Cr and Cb each have half the horizontal and vertical resolution of Y. This is popularly used in applications such as video conferencing, digital television and DVD storage.
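The effect of the three sampling patterns on the amount of data can be checked with a small calculation. The function and table names are hypothetical, and the frame dimensions are assumed to be even.

```python
# Samples per frame for the sampling patterns described above.
# Each entry gives the (horizontal, vertical) chroma subsampling divisors.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # chroma at full resolution
    "4:2:2": (2, 1),  # chroma halved horizontally
    "4:2:0": (2, 2),  # chroma halved horizontally and vertically
}


def samples_per_frame(width, height, fmt):
    """Total number of Y, Cb and Cr samples in one frame."""
    hdiv, vdiv = CHROMA_DIVISORS[fmt]
    luma = width * height
    chroma = (width // hdiv) * (height // vdiv)
    return luma + 2 * chroma
```

Relative to 4:4:4, a 4:2:0 frame carries half as many samples in total, and a 4:2:2 frame two thirds as many.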


Figure 1: 4:2:0 sub-sampling pattern [4]

Figure 2: 4:2:2 sub-sampling and 4:4:4 sampling patterns [4]

3. H.265 / High Efficiency Video Coding

3.1 Introduction

High Efficiency Video Coding (HEVC) [5] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [6]) in the range of 50% bit rate reduction at similar visual quality [7].

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [7]. It primarily targets consumer applications, as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The next revision of the standard will enable new use cases by supporting additional pixel formats such as 4:2:2 and 4:4:4, bit depths higher than 10 bits [8], embedded bit-stream scalability and 3D video [9].

3.2 Encoder and Decoder in HEVC

Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames [10].

The video encoder performs the following steps:

- Partitioning each picture into multiple units
- Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit
- Transforming and quantizing the residual (the difference between the original picture unit and the prediction)
- Entropy encoding the transform output, prediction information, mode information and headers

The video decoder performs the following steps:

- Entropy decoding and extracting the elements of the coded sequence
- Rescaling and inverting the transform stage
- Predicting each unit and adding the prediction to the output of the inverse transform
- Reconstructing a decoded video image
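The predict, subtract, quantize and reconstruct steps listed above can be sketched on a one-dimensional block of samples. This is a deliberately reduced illustration: the transform and entropy coding stages are left out, the function names are hypothetical, and the uniform quantizer is an assumption rather than the HEVC design.

```python
# Toy prediction-plus-residual loop on a 1-D block of samples.
# Transform and entropy coding are omitted; qstep is an assumed uniform step size.
def encode_block(original, prediction, qstep=1):
    """Subtract the prediction and quantize the residual to integer levels."""
    residual = [o - p for o, p in zip(original, prediction)]
    return [round(r / qstep) for r in residual]


def decode_block(levels, prediction, qstep=1):
    """Rescale the levels and add the prediction back."""
    return [p + level * qstep for p, level in zip(prediction, levels)]
```

With a step size of 1 the reconstruction equals the original, which corresponds to the lossless case; a larger step size trades reconstruction error for smaller levels.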

Figure 3 shows the block diagram of the HEVC codec [10]:

Figure 3: Block Diagram of HEVC CODEC [10]


Figure 4 [6] and Figure 5 [11] show the detailed block diagrams of the HEVC encoder and decoder, respectively:

Figure 4: Block Diagram of HEVC Encoder [6]

Figure 5: Block Diagram of HEVC Decoder [11]


3.3 Features of HEVC

The video coding layer of HEVC employs the same hybrid approach (inter-/intra-picture prediction and 2-D transform coding) used in all video compression standards since H.261. Figure 4 depicts the block diagram of a hybrid video encoder, which can create a bit-stream conforming to the HEVC standard. Figure 5 shows the HEVC decoder block diagram. An encoding algorithm producing an HEVC-compliant bit-stream would typically proceed as follows. Each picture is split into block-shaped regions, with the exact block partitioning being conveyed to the decoder. The first picture of a video sequence (and the first picture at each clean random access point in a video sequence) is coded using only intra-picture prediction (which predicts data spatially from region to region within the same picture, but has no dependence on other pictures). For all remaining pictures of a sequence or between random access points, inter-picture temporally predictive coding modes are typically used for most blocks.

The encoding process for inter-picture prediction consists of choosing motion data comprising the selected reference picture and MV to be applied for predicting the samples of each block. The encoder and decoder generate identical inter-picture prediction signals by applying MC using the MV and mode decision data, which are transmitted as side information. The residual signal of the intra- or inter-picture prediction, which is the difference between the original block and its prediction, is transformed by a linear spatial transform. The transform coefficients are then scaled, quantized, entropy coded, and transmitted together with the prediction information. The encoder duplicates the decoder processing loop (see gray-shaded boxes in Figure.4) such that both will generate identical predictions for subsequent data. Therefore, the quantized transform coefficients are constructed by inverse scaling and are then inverse transformed to duplicate the decoded approximation of the residual signal. The residual is then added to the prediction, and the result of that addition may then be fed into one or two loop filters to smooth out artifacts induced by block-wise processing and quantization.

The final picture representation (that is, a duplicate of the output of the decoder) is stored in a decoded picture buffer to be used for the prediction of subsequent pictures. In general, the order in which pictures are encoded or decoded often differs from the order in which they arrive from the source, necessitating a distinction between the decoding order (i.e., bit-stream order) and the output order (i.e., display order) for a decoder. Video material to be encoded by HEVC is generally expected to be input as progressive scan imagery (either because the source video originates in that format or because it results from de-interlacing prior to encoding). No explicit coding features are present in the HEVC design to support the use of interlaced scanning, as interlaced scanning is no longer used for displays and is becoming substantially less common for distribution. However, a metadata syntax has been provided in HEVC to allow an encoder to indicate that interlace-scanned video has been sent by coding each field (i.e., the even or odd numbered lines of each video frame) of interlaced video as a separate picture, or that it has been sent by coding each interlaced frame as an HEVC coded picture. This provides an efficient method of coding interlaced video without burdening decoders with a need to support a special decoding process for it. The following subsections highlight the main features of hybrid video coding in HEVC.

3.3.1 Coding tree units and coding tree block (CTB) structure: The core of the coding layer in previous standards was the macroblock, containing a 16×16 block of luma samples and, in the usual case of 4:2:0 color sampling, two corresponding 8×8 blocks of chroma samples. The analogous structure in HEVC is the coding tree unit (CTU), which has a size selected by the encoder and can be larger than a traditional macroblock. The CTU consists of a luma CTB and the corresponding chroma CTBs and syntax elements. The size L×L of a luma CTB can be chosen as L = 16, 32, or 64 samples, with the larger sizes typically enabling better compression. HEVC then supports a partitioning of the CTBs into smaller blocks using a tree structure and quadtree-like signalling [19]. The partitioning of a CTB into CBs ranging from 64×64 down to 8×8 is shown in Figure 6.

Figure 6: 64×64 CTB split into CBs [13]
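The quadtree partitioning of a CTB into CBs can be sketched as a short recursion. The split decision is supplied by the caller as a hypothetical callback; a real encoder would base it on rate-distortion cost, which is not modeled here.

```python
# Recursive quadtree split of a luma CTB into coding blocks (CBs).
# should_split(x, y, size) is a hypothetical caller-supplied decision function.
def split_ctb(x, y, size, should_split, min_size=8):
    """Return the CBs as (x, y, size) tuples in Z-scan order."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += split_ctb(x + dx, y + dy, half, should_split, min_size)
        return blocks
    return [(x, y, size)]
```

For example, splitting only the block that contains the top-left sample at every level yields three 32×32 CBs, three 16×16 CBs and four 8×8 CBs for a 64×64 CTB.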

3.3.2 Coding units (CUs) and coding blocks (CBs): The quadtree syntax of the CTU specifies the size and positions of its luma and chroma CBs. The root of the quadtree is associated with the CTU. Hence, the size of the luma CTB is the largest supported size for a luma CB. The splitting of a CTU into luma and chroma CBs is signaled jointly. One luma CB and ordinarily two chroma CBs, together with associated syntax, form a coding unit (CU) as shown in Figure 7. A CTB may contain only one CU or may be split to form multiple CUs, and each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs).

Figure 7: CUs split into CBs [13]

3.3.3 Prediction units and prediction blocks (PBs): The decision whether to code a picture area using inter-picture or intra-picture prediction is made at the CU level. A PU partitioning structure has its root at the CU level. Depending on the basic prediction-type decision, the luma and chroma CBs can then be further split in size and predicted from luma and chroma prediction blocks (PBs) as shown in Figure 8. HEVC supports variable PB sizes from 64×64 down to 4×4 samples.

Figure 8: Partitioning of Prediction Blocks from Coding Blocks [13]


3.3.4 TUs and transform blocks: The prediction residual is coded using block transforms. A TU tree structure has its root at the CU level. The luma CB residual may be identical to the luma transform block (TB) or may be further split into smaller luma TBs as shown in Figure 9. The same applies to the chroma TBs. Integer basis functions similar to those of a discrete cosine transform (DCT) are defined for the square TB sizes 4×4, 8×8, 16×16, and 32×32. For the 4×4 transform of luma intra-picture prediction residuals, an integer transform derived from a form of discrete sine transform (DST) is alternatively specified.

Figure 9: Partitioning of Transform Blocks from Coding Blocks [13]
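The phrase "integer basis functions similar to those of a DCT" can be checked numerically for the 4×4 case. The matrix below is the 4×4 core transform matrix of the HEVC specification; the helper functions and the comparison against an orthonormal DCT-II scaled by 128 are illustrative additions, not part of the report.

```python
import math

# 4x4 integer core transform matrix as specified in HEVC (rows are basis functions).
T4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def forward_1d(samples):
    """Apply the integer transform to four samples (no scaling or rounding)."""
    return [sum(c * v for c, v in zip(row, samples)) for row in T4]


def gram(matrix):
    """Inner products between all pairs of rows; off-diagonal zeros mean orthogonal rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


def scaled_dct_matrix(n=4, scale=128):
    """Orthonormal DCT-II basis multiplied by an assumed scale factor for comparison."""
    rows = []
    for k in range(n):
        a = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * a * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows
```

The rows of the integer matrix are mutually orthogonal and each entry lies close to the corresponding scaled DCT value, although the odd rows are not simple roundings of it.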

3.3.5 Motion vector signalling: Advanced motion vector prediction (AMVP) is used, including derivation of several most probable candidates based on data from adjacent PBs and the reference picture. A merge mode for MV coding can also be used, allowing the inheritance of MVs from temporally or spatially neighboring PBs. Moreover, compared to H.264/MPEG-4 AVC [6], improved skipped and direct motion inference is also specified.

3.3.6 Motion compensation: Quarter-sample precision is used for the MVs, and 7-tap or 8-tap filters are used for interpolation of fractional-sample positions (compared to six-tap filtering of half-sample positions followed by linear interpolation for quarter-sample positions in H.264/MPEG-4 AVC). Similar to H.264/MPEG-4 AVC, multiple reference pictures are used as shown in Figure 11. For each PB, either one or two motion vectors can be transmitted, resulting in either unipredictive or bipredictive coding, respectively. As in H.264/MPEG-4 AVC, a scaling and offset operation may be applied to the prediction signal(s) in a manner known as weighted prediction.

Figure 10: Quadtree structure used for MVs [13]

Figure 11: Concept of multi-frame motion-compensated prediction [13]
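Quarter-sample motion vectors and the 7-tap and 8-tap interpolation filters can be illustrated in one dimension. The filter coefficients are the HEVC luma interpolation coefficients; the function names, the border clamping and the single-stage rounding are simplifying assumptions and do not reproduce the exact intermediate precision of the standard.

```python
# 1-D sketch of quarter-sample luma interpolation.
# Keys are the fractional MV part in quarter-sample units; each filter sums to 64.
LUMA_FILTERS = {
    0: [0, 0, 0, 64, 0, 0, 0, 0],          # integer position (copy)
    1: [-1, 4, -10, 58, 17, -5, 1, 0],     # quarter-sample position (7 taps)
    2: [-1, 4, -11, 40, 40, -11, 4, -1],   # half-sample position (8 taps)
    3: [0, 1, -5, 17, 58, -10, 4, -1],     # three-quarter-sample position (7 taps)
}


def split_mv(mv_quarter):
    """Split an MV component in quarter-sample units into integer and fractional parts."""
    return mv_quarter >> 2, mv_quarter & 3


def interp_sample(row, x, frac, max_value=255):
    """Interpolate the sample at position x + frac/4 in a row of integer samples."""
    acc = 0
    for k, coeff in enumerate(LUMA_FILTERS[frac]):
        idx = min(max(x + k - 3, 0), len(row) - 1)  # clamp at the row borders
        acc += coeff * row[idx]
    value = (acc + 32) >> 6                          # divide by 64 with rounding
    return min(max(value, 0), max_value)
```

A motion vector component of -5 quarter samples, for instance, splits into an integer offset of -2 and a fractional part of 3, that is, -2 + 3/4 = -1.25 samples.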

3.3.7 Intra-picture prediction: The decoded boundary samples of adjacent blocks are used as reference data for spatial prediction in regions where inter-picture prediction is not performed. Intra-picture prediction supports 33 directional modes (compared to eight such modes in H.264/MPEG-4 AVC [6]), plus planar (surface fitting) and DC (flat) prediction modes. The selected intra-picture prediction modes are encoded by deriving most probable modes (e.g., prediction directions) based on those of previously decoded neighboring PBs.


Figure 12: Directional Prediction Modes in H.264 [13]

Figure 13: Modes and directional orientations for intra picture prediction in HEVC [13]

3.3.8 Quantization control: As in H.264/MPEG-4 AVC, uniform reconstruction quantization (URQ) is used in HEVC, with quantization scaling matrices supported for the various transform block sizes.

3.3.9 Entropy coding: Context adaptive binary arithmetic coding (CABAC) is used for entropy coding. This is similar to the CABAC scheme in H.264/MPEG-4 AVC, but it has been refined in several ways to increase its throughput (especially for parallel-processing architectures), to improve its compression performance, and to reduce its context memory requirements.

3.3.10 In-loop de-blocking filtering: A de-blocking filter similar to the one used in H.264/MPEG-4 AVC [6] is operated within the inter-picture prediction loop. However, the design is simplified in regard to its decision-making and filtering processes, and is made more friendly to parallel processing.

3.3.11 Sample adaptive offset (SAO): A nonlinear amplitude mapping is introduced within the inter-picture prediction loop after the de-blocking filter. Its goal is to better reconstruct the original signal amplitudes by using a look-up table that is described by a few additional parameters that can be determined by histogram analysis at the encoder side.

4. Introduction to Screen Content Coding

4.1 Introduction

Screen content refers to images and videos which contain computer-generated objects or screenshots from computer applications. This kind of content requires efficient compression solutions, as its use is becoming more popular in emerging technologies such as desktop sharing, video walls in control rooms, wireless display and digital remote operating rooms for surgeries [15], [16]. Screen content differs significantly from camera-captured content due to the presence of high-frequency features such as sharp edges and high-contrast areas. The presence of these features reduces the coding efficiency of classical hybrid block-based image and video codecs, which use spatial transforms to compact the energy of signals into a few lower-frequency coefficients. Moreover, the quantization applied by these lossy codecs may severely blur details in text areas, compromising the intelligibility of the whole content. For these reasons, lossless coding techniques may be preferable for screen content applications, and novel coding tools should be devised to improve the compression efficiency of image and video coding standards.

This need is further evidenced by the current standardization activities inside the joint collaborative team on video coding (JCT-VC), which is a joint partnership between ISO/IEC MPEG and ITU-T VCEG. The JCT-VC defined the high efficiency video coding (HEVC) standard [7], which achieves about 50% bit-rate reduction at the same subjective quality as its predecessor, the H.264/AVC standard [12]. Version 1 of HEVC was finalized in January 2013 [7]. Since then the JCT-VC has concentrated its efforts on the development of scalable, 3D and range extensions of HEVC. More specifically, the HEVC range extension (HEVC-RExt) [8] addresses compression of content represented with more than 10 bits per sample, chroma sampling other than the 4:2:0 supported in Version 1, support for an alpha channel and improved lossless coding tools for screen content. On this latter aspect, it is worth mentioning that HEVC Version 1 already supports a lossless coding mode (shown in Figure 14) by simply signaling the so-called transform-bypass flag for each coding unit (CU). When this flag is set to 1, the spatial transform, quantization and in-loop filter processes are skipped, and only intra or inter prediction is performed, followed by entropy encoding. While using the transform-bypass flag allows lossless coding without introducing any additional modules, the associated compression efficiency may not be satisfactory. It is well known that the pixels inside each coded block can be used to improve prediction by exploiting the spatial redundancy [17], especially in applications that require a high quality of decoded content. Such approaches are therefore suitable for high-fidelity coding.

Advancements in mobile and cloud technologies have increased demand for efficient coding of screen content. Video materials containing camera-view video content with text overlays and running banners, computer graphics like cartoons, alpha bending, mobile device display contents, slide editing are examples of screen content. This screen content has different characteristics when compared with natural video captured by cameras. Therefore, besides traditional video coding standards, efficient compression of screen content is required. Figure 14 shows images of screen content [31].


Figure 14: Images of screen content: slide editing, alpha blending, video with text overlay, mobile display content [31].

4.2 Analysis of HEVC on Screen Content Coding

Images with screen content are compound images dominated by several major colors that may differ significantly from each other, and their image structures are more complex than those in photographic images [23]. In the frequency domain, the energy of natural video is concentrated at low frequencies, while the energy of screen video is scattered over coefficients at all frequencies. The inter frame correlation of screen video is also different, and non-translational changes such as foreground changes occur. The techniques introduced in HEVC provide efficient coding tools for natural video, but they cannot guarantee the best coding efficiency for screen content.

4.3 Angular Intra Prediction

In HEVC, 33 directional prediction modes, a DC mode and a planar mode are introduced to improve prediction accuracy, and the resulting reduction in energy of the intra predicted residue is significant. However, the directional correlation within a screen content block varies pixel by pixel, which makes angular intra prediction poor at removing the redundancy of a screen content block [28].

4.4 Inter-Prediction

The motion estimation model in HEVC improves inter prediction accuracy for video content with inter frame redundancy. However, non-translational changes in screen content cannot be compensated by the traditional motion estimation model [28]. Removal of inter frame redundancy is therefore poor, and accurate prediction of inter frame changes has to be addressed in screen content coding. Figure 15 shows an example of an inter frame change, namely a foreground change in a video conference call.


Figure 15: Example of foreground change in screen content [31].

4.5 Loop Filters

The loop filter is introduced in HEVC to improve the visual quality of reconstructed blocks and to remove blocking artifacts. In screen content, neighboring blocks are inconsistent with each other at boundaries, so loop filtering may degrade the quality of screen content [28].

Figure 16 : HEVC Encoder with lossless coding mode [4].


5. Residual DPCM in HEVC inter-prediction

This section presents inter RDPCM and its application to the codec considered for HEVC-RExt [8]. First some general considerations are discussed, followed by the proposed inter RDPCM together with the Hierarchical Prediction and Prediction Chunking tools.

5.1 General considerations and the HEVC coding structure

The HEVC standard performs inter-prediction by means of block-based motion compensation, which assumes that all the pixels inside a block move with approximately the same motion. This assumption leads to poor prediction performance along sharp edges. In screen content it is reasonable to expect that inter-prediction residuals still present some correlation along image edges, which can be exploited by performing a spatial DPCM along the edge direction. This intuition is the basis for the proposed inter RDPCM. Several directions may be considered; however, to limit the computational complexity, only the horizontal and vertical directions are included, since they are predominant in screen content. As stated in the Introduction, this work investigates the application of inter RDPCM at different levels of granularity corresponding to the HEVC coding structure, which is briefly reviewed in the following. The HEVC standard uses a flexible partitioning in which image areas are recursively split in a quad-tree fashion [7]. More precisely, the encoder divides each frame into a grid of non-overlapping coding tree units (CTUs), each covering a square region of N×N luma samples. Each CTU can then be split using a quad-tree partitioning, and each level of this partitioning leads to a set of CUs where the coding process takes place. Each CU can be further split for inter-prediction into up to four prediction units (PUs), according to a set of possible coding modes. When the prediction process is finished, each CU is split again for entropy coding following another quad-tree partitioning. This partitioning leads to a set of transform units (TUs) where transformation, quantization and entropy coding are carried out.
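The partitioning described above can be illustrated with a small sketch. It only counts the CTUs needed to cover a frame and lists the CU sizes reachable by quad-tree splitting. The CTU size of 64 and the minimum CU size of 8 are assumptions chosen for illustration, not values stated in this report, and the function names are hypothetical.

```python
import math

def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows, total) of CTUs covering a width x height luma frame.

    ctu_size=64 is an assumed value; partial CTUs at the right and bottom
    borders still count as one CTU each.
    """
    cols = math.ceil(width / ctu_size)
    rows = math.ceil(height / ctu_size)
    return cols, rows, cols * rows

def quadtree_cu_sizes(ctu_size=64, min_cu_size=8):
    """List the square CU sizes obtained by halving the CTU size at each quad-tree level."""
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2
    return sizes

# A CIF frame (352 x 288) needs 6 x 5 = 30 CTUs of 64 x 64 luma samples.
print(ctu_grid(352, 288))      # (6, 5, 30)
print(quadtree_cu_sizes())     # [64, 32, 16, 8]
```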

5.2 General method for inter RDPCM

Let r(i, j) be the elements of an M×N residual block of inter-predicted luma or chroma samples, where M and N are the block height and width respectively. The vertical inter RDPCM mode is defined as follows. The samples in the first row of the block are left unchanged. All other samples are predicted from the sample immediately above in the same column. Formally, the modified residuals resulting from this RDPCM mode are:

$$\tilde{r}(i,j) = \begin{cases} r(i,j), & i = 0,\ 0 \le j \le N-1 \\ r(i,j) - r(i-1,j), & 1 \le i \le M-1,\ 0 \le j \le N-1 \end{cases} \qquad (1)$$

Where
i – Row count.
j – Column count.
r(i,j) – Elements of an (M×N) residual block.
M – Residual block height.
N – Residual block width.
$\tilde{r}(i,j)$ – Modified residual after vertical RDPCM.

The horizontal inter RDPCM mode is defined in a similar way: samples in the first column in the block are left unchanged, while all other samples are predicted from the sample immediately on the left in the same row.
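The vertical and horizontal modes described above can be sketched as follows. This is a minimal illustration on a residual block stored as a list of rows, not the HM implementation; the function name and the mode strings are hypothetical.

```python
def rdpcm_forward(block, mode):
    """Apply forward RDPCM to a residual block given as a list of rows.

    mode is 'vertical', 'horizontal' or 'none' (hypothetical labels).
    The first row (vertical) or first column (horizontal) is left unchanged.
    """
    out = [row[:] for row in block]
    for i in range(len(block)):
        for j in range(len(block[0])):
            if mode == "vertical" and i > 0:
                # Predict from the sample immediately above.
                out[i][j] = block[i][j] - block[i - 1][j]
            elif mode == "horizontal" and j > 0:
                # Predict from the sample immediately to the left.
                out[i][j] = block[i][j] - block[i][j - 1]
    return out

residual = [[5, 5, 5],
            [5, 5, 5],
            [7, 7, 7]]
print(rdpcm_forward(residual, "vertical"))    # [[5, 5, 5], [0, 0, 0], [2, 2, 2]]
print(rdpcm_forward(residual, "horizontal"))  # [[5, 0, 0], [5, 0, 0], [7, 0, 0]]
```

In this toy block each row is constant, so horizontal RDPCM leaves only the first column non-zero.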

Figure 17: Hierarchical prediction (green line) - an additional step is applied on the top row after vertical RDPCM (red lines) [1]

Along with the horizontal and vertical RDPCM options, the no-RDPCM case is also tested at the encoder. This is typically the coding choice for residuals that are highly uncorrelated, i.e. where no further spatial prediction is needed. To find the best mode for the current residual block, the sum of absolute differences (SAD) distortion metric is computed for each mode (i.e. horizontal, vertical or no RDPCM), and the mode with minimum SAD is selected as the best. Notice that the solution with minimum distortion is used instead of the solution with minimum coding rate, because computing this rate would require a CABAC encoding of the residuals for each mode and level of granularity, which is too computationally complex. Finally, in this implementation the inter RDPCM mode is signaled to the decoder using CABAC with the following binary representation: no RDPCM (0), horizontal RDPCM (10) and vertical RDPCM (11). One context for each block size and color component is used.
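The mode decision above can be sketched as below. The sketch assumes that the SAD of a mode is the sum of absolute values of the modified residuals produced by that mode; this is one reading of the text, not a confirmed detail of the implementation. The bin strings are the ones listed above, before CABAC coding, and all function names are hypothetical.

```python
def rdpcm_forward(block, mode):
    """Forward RDPCM on a list-of-rows block; mode: 'none', 'horizontal', 'vertical'."""
    out = [row[:] for row in block]
    for i in range(len(block)):
        for j in range(len(block[0])):
            if mode == "vertical" and i > 0:
                out[i][j] = block[i][j] - block[i - 1][j]
            elif mode == "horizontal" and j > 0:
                out[i][j] = block[i][j] - block[i][j - 1]
    return out

def sad(block):
    """Sum of absolute values of all samples (assumed cost measure)."""
    return sum(abs(v) for row in block for v in row)

# Binary representation of each mode as given in the text (input to CABAC).
MODE_BINS = {"none": "0", "horizontal": "10", "vertical": "11"}

def select_mode(block):
    """Return (best_mode, bin_string, costs); ties fall back to the earlier mode."""
    costs = {m: sad(rdpcm_forward(block, m)) for m in ("none", "horizontal", "vertical")}
    best = min(costs, key=costs.get)
    return best, MODE_BINS[best], costs

residual = [[5, 5, 5],
            [5, 5, 5],
            [7, 7, 7]]
print(select_mode(residual))
# ('horizontal', '10', {'none': 51, 'horizontal': 17, 'vertical': 21})
```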

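At the decoder side, when vertical RDPCM is selected, the residuals r(i, j) to be added to the motion compensated prediction are obtained as follows:

$$r(i,j) = \sum_{k=0}^{i} \tilde{r}(k,j), \qquad 0 \le i \le M-1,\ 0 \le j \le N-1 \qquad (2)$$

Where
i – Row count.
j – Column count.
k – Row index of the summation, running from the first row to the present row i.
r(i,j) – Recovered elements of an (M×N) residual block.
M – Residual block height.
N – Residual block width.
$\tilde{r}(i,j)$ – Modified residual after vertical RDPCM.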

For horizontal RDPCM, the summation is performed across the current row.
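The decoder-side reconstruction can be sketched as a running sum, which is equivalent to the summation over the preceding modified residuals. This is an illustrative sketch with hypothetical names, not the HM decoder code.

```python
def rdpcm_inverse(modified, mode):
    """Recover the residual block from its RDPCM-modified version.

    Each recovered sample is the previous recovered sample in the same
    column (vertical) or row (horizontal) plus the current modified residual.
    """
    out = [row[:] for row in modified]
    for i in range(len(modified)):
        for j in range(len(modified[0])):
            if mode == "vertical" and i > 0:
                out[i][j] = out[i - 1][j] + modified[i][j]
            elif mode == "horizontal" and j > 0:
                out[i][j] = out[i][j - 1] + modified[i][j]
    return out

print(rdpcm_inverse([[5, 5, 5], [0, 0, 0], [2, 2, 2]], "vertical"))
# [[5, 5, 5], [5, 5, 5], [7, 7, 7]]
```

Because only integer additions and subtractions are involved, the reconstruction is exact, which is what lossless coding requires.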

5.3 Additional tools for inter RDPCM

As can be seen from Eq. (2), the reconstruction of a residual in the i-th row depends on all the preceding samples in the same column. For large TU sizes (e.g. 32×32), samples located in the bottom rows (rightmost columns for horizontal RDPCM) require a high number of additions before becoming available. This increases the computational complexity and the dependency between samples, which may not be acceptable in some applications. Moreover, from the description of the horizontal and vertical RDPCM given in Section 5.2, it may be noted that the samples in the first column (respectively the first row) are not RDPCM predicted. These two observations motivated the design of the two proposed tools, prediction chunking (PC) and hierarchical prediction (HP).

The prediction chunking tool limits the residual DPCM prediction to groups of samples with a specified length L, denoted as chunking length. In this way the RDPCM process is reset every L samples so that the number of operations per sample at the decoder side is reduced.


Figure 18 : Prediction Chunking – Partitioning of the residual pixels into n chunks of size L during vertical RDPCM evaluation (red lines) [1]

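The vertical RDPCM prediction when the PC tool is used is defined as follows:

$$\tilde{r}(i,j) = \begin{cases} r(i,j), & i \bmod L = 0 \\ r(i,j) - r(i-1,j), & \text{otherwise} \end{cases} \qquad 0 \le i \le M-1,\ 0 \le j \le N-1 \qquad (3)$$

Where
i – Row count.
j – Column count.
L – Chunking length.
r(i,j) – Elements of an (M×N) residual block.
M – Residual block height.
N – Residual block width.
$\tilde{r}(i,j)$ – Modified residual after vertical RDPCM with Prediction Chunking.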

At the decoder, the residuals can be reconstructed as follows:

$$r(i,j) = \sum_{k = L \lfloor i/L \rfloor}^{i} \tilde{r}(k,j), \qquad 0 \le i \le M-1,\ 0 \le j \le N-1 \qquad (4)$$

Where
i – Row count.
j – Column count.
L – Chunking length.
r(i,j) – Recovered elements of an (M×N) residual block.
M – Residual block height.
N – Residual block width.
$\lfloor \cdot \rfloor$ – Floor operator.
$\tilde{r}(i,j)$ – Modified residual after vertical RDPCM with Prediction Chunking.

Equivalent expressions for forward and inverse inter RDPCM can be easily derived for the horizontal mode when using PC.
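As an illustration of prediction chunking for the vertical mode, the sketch below resets the prediction every L rows, so that no recovered sample depends on more than L - 1 preceding additions. It is a sketch under the description given above, with hypothetical function names, and not the reference implementation.

```python
def pc_forward_vertical(block, chunk_len):
    """Vertical RDPCM with prediction chunking: rows at multiples of chunk_len are kept as-is."""
    out = [row[:] for row in block]
    for i in range(len(block)):
        if i % chunk_len != 0:
            out[i] = [cur - above for cur, above in zip(block[i], block[i - 1])]
    return out

def pc_inverse_vertical(modified, chunk_len):
    """Inverse of pc_forward_vertical: accumulate only within the current chunk."""
    out = [row[:] for row in modified]
    for i in range(len(modified)):
        if i % chunk_len != 0:
            out[i] = [cur + above for cur, above in zip(modified[i], out[i - 1])]
    return out

column = [[1], [2], [4], [7]]                  # a 4 x 1 residual block
coded = pc_forward_vertical(column, 2)
print(coded)                                   # [[1], [1], [4], [3]]
print(pc_inverse_vertical(coded, 2) == column) # True
```

With a chunking length of 2, the third row restarts the prediction and is transmitted unchanged.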

Once RDPCM is performed on a block, samples in the first column and the first row for horizontal and vertical RDPCM, respectively, are not predicted. Therefore it is beneficial to exploit the remaining redundancy by performing prediction on these samples in the direction orthogonal to the main RDPCM direction. The HP tool performs an RDPCM along the first column of samples when horizontal RDPCM is selected as the best mode, or along the first row for vertical RDPCM. For the case of vertical RDPCM, the HP is defined as:

$$\tilde{r}(0,j) = \begin{cases} r(0,j), & j = 0 \\ r(0,j) - r(0,j-1), & 1 \le j \le N-1 \end{cases} \qquad (5)$$

Where
(i = 0) – First row.
j – Column count.
r(i,j) – Elements of an (M×N) residual block.
M – Residual block height.
N – Residual block width.
$\tilde{r}(0,j)$ – Modified first-row residual after vertical RDPCM with Hierarchical Prediction.

A similar formalization can be defined for horizontal RDPCM.
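The hierarchical prediction step for the vertical case can be sketched as an extra horizontal DPCM applied only to the first row of a block that has already gone through vertical RDPCM. The function names are hypothetical and the sketch is illustrative only.

```python
def hp_forward_vertical(modified):
    """Apply horizontal DPCM along the first row of a vertically RDPCM-coded block."""
    out = [row[:] for row in modified]
    first = modified[0]
    out[0] = [first[0]] + [first[j] - first[j - 1] for j in range(1, len(first))]
    return out

def hp_inverse_vertical(coded):
    """Undo hp_forward_vertical by accumulating along the first row."""
    out = [row[:] for row in coded]
    for j in range(1, len(out[0])):
        out[0][j] = out[0][j - 1] + coded[0][j]
    return out

after_vertical = [[5, 5, 5],
                  [0, 0, 0],
                  [2, 2, 2]]
with_hp = hp_forward_vertical(after_vertical)
print(with_hp)                                         # [[5, 0, 0], [0, 0, 0], [2, 2, 2]]
print(hp_inverse_vertical(with_hp) == after_vertical)  # True
```

At the decoder the first row would be recovered first, followed by the usual column-wise accumulation for the remaining rows.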


6. Test Configurations

6.1 Intra only configuration

In the test case for intra only coding, each picture in a video sequence shall be encoded as an IDR picture. No temporal reference pictures shall be used. Changing the QP during a sequence or within a picture is not allowed. Figure 19 gives a graphical presentation of the intra only configuration. The number associated with each picture represents the encoding order.

Figure 19: Graphical presentation of Intra-only configuration [34].

6.2 Low delay configuration

Two kinds of low-delay coding configurations have been defined for testing coding performance in low-delay mode. For these low-delay coding conditions, only the first picture in a video sequence shall be encoded as an IDR picture. In the mandatory low-delay test condition, the successive pictures shall be encoded as Generalized P and B pictures (GPB). A GPB shall use only reference pictures whose POC is smaller than that of the current picture (i.e., all reference pictures in RefPicList0 and RefPicList1 shall be temporally previous in display order relative to the current picture). The contents of RefPicList0 and RefPicList1 shall be identical, and they shall be updated with the sliding-window management process. Reference picture list combination is used for management and entropy coding of the reference picture index. Figure 20 shows a graphical presentation of the low-delay configuration that is mandatory for performance evaluation in any CEs. The number associated with each picture represents the encoding order. The QP of each inter coded picture shall be derived by adding an offset to the QP of the intra coded picture, depending on the temporal layer. In the additional non-normative low-delay condition, all inter pictures shall be coded as P-pictures, where only the content of RefPicList0 is used for inter prediction.

Figure 20 : Graphical presentation of Low-delay configuration [34].

6.3 Random access configuration

For the random-access test condition, a hierarchical B structure shall be used for coding. Figure 21 shows a graphical presentation of the random-access configuration. The number associated with each picture represents the encoding order. An intra picture shall be inserted cyclically about once per second. The first intra picture of a video sequence shall be encoded as an IDR picture and the other intra pictures shall be encoded as non-IDR intra pictures (“Open GOP”). The pictures located between successive intra pictures in display order shall be encoded as B-pictures. The GPB picture shall be used as the lowest temporal layer, which can refer to an I or GPB picture for inter prediction. The second and third temporal layers consist of referenced B pictures, and the highest temporal layer contains non-referenced B pictures only. The QP of each inter coded picture shall be derived by adding an offset to the QP of the intra coded picture, depending on the temporal layer. Reference picture list combination is used for management and entropy coding of the reference picture index.

Figure 21 : Graphical presentation of Random-access configuration [34].


7. Comparison Metrics

7.1 Peak Signal to Noise Ratio

Peak signal-to-noise ratio (PSNR) [35] [36] is an expression for the ratio between the maximum possible value (power) of a signal and the power of distorting noise that affects the quality of its representation. Because many signals have a very wide dynamic range (ratio between the largest and smallest possible values of a changeable quantity), the PSNR is usually expressed in terms of the logarithmic decibel scale.

PSNR is most commonly used to measure the quality of reconstruction of lossy compression codecs. The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR is an approximation to human perception of reconstruction quality. Although a higher PSNR generally indicates that the reconstruction is of higher quality, in some cases it may not. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when it is used to compare results from the same codec (or codec type) and same content.

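PSNR is defined via the mean squared error (MSE). Given a noise-free m×n monochrome image I and its noisy approximation K, MSE is defined as:

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^2$$

The PSNR is defined as:

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{MAX_I^2}{\mathrm{MSE}} \right)$$

Here, $MAX_I$ is the maximum possible pixel value of the image. For test sequences in 4:2:0 color format, PSNR is computed as a weighted average of the luminance (Y) and chrominance (U, V) components [14] as given below:

$$\mathrm{PSNR} = \frac{6 \cdot \mathrm{PSNR}_Y + \mathrm{PSNR}_U + \mathrm{PSNR}_V}{8}$$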
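The PSNR definition can be illustrated with a short sketch on images stored as lists of rows. An 8-bit range (maximum value 255) is assumed, and the combined 4:2:0 value assumes the commonly used 6:1:1 weighting of the Y, U and V components. The function names are hypothetical.

```python
import math

def mse(ref, rec):
    """Mean squared error between two equally sized images (lists of rows)."""
    count = len(ref) * len(ref[0])
    total = sum((a - b) ** 2 for r1, r2 in zip(ref, rec) for a, b in zip(r1, r2))
    return total / count

def psnr(ref, rec, max_value=255):
    """PSNR in dB; returns infinity when the images are identical (lossless case)."""
    err = mse(ref, rec)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)

def psnr_yuv420(psnr_y, psnr_u, psnr_v):
    """Weighted average of component PSNRs, assuming 6:1:1 weights for Y, U and V."""
    return (6 * psnr_y + psnr_u + psnr_v) / 8

original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]
print(mse(original, decoded))         # 0.5
print(psnr(original, original))       # inf
print(psnr_yuv420(40.0, 44.0, 44.0))  # 41.0
```

For a lossless codec the reconstruction equals the original, so the MSE is zero and the PSNR is unbounded; in that case bit-rate is the quantity left to compare.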

7.2 Bjontegaard Delta Bit-rate (BD-BR) and Bjontegaard Delta PSNR (BD-PSNR)

To objectively evaluate the coding efficiency of video codecs, Bjontegaard Delta PSNR (BD-PSNR) was proposed. Based on rate-distortion (R-D) curve fitting, BD-PSNR provides a good evaluation of the R-D performance [37] [38].

BD metrics make it possible to compute the average gain in PSNR or the average percentage saving in bit-rate between two rate-distortion curves. However, BD-PSNR has a critical drawback: it does not take the coding complexity into account [37].
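To make the BD bit-rate idea concrete, the sketch below follows the usual formulation: the logarithm of the bit-rate is fitted as a cubic polynomial of PSNR for each curve, both fits are integrated over the common PSNR range, and the average difference is converted to a percentage. It assumes exactly four rate-distortion points per curve, so the cubic passes through the points. The sample values are hypothetical and are not results from this report; the function names are also hypothetical.

```python
import math

def fit_cubic(xs, ys):
    """Coefficients [c0, c1, c2, c3] of the cubic through four points (Gauss-Jordan)."""
    if len(xs) != 4 or len(ys) != 4:
        raise ValueError("exactly four points are required")
    a = [[x ** k for k in range(4)] + [y] for x, y in zip(xs, ys)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(4):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[k][4] / a[k][k] for k in range(4)]

def integrate(coeffs, lo, hi):
    """Definite integral of the polynomial with the given coefficients."""
    def antiderivative(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bit-rate difference in percent of the test curve relative to the reference."""
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    # PSNR values are shifted by lo to keep the linear system well conditioned.
    p_ref = fit_cubic([p - lo for p in psnr_ref], [math.log10(r) for r in rates_ref])
    p_test = fit_cubic([p - lo for p in psnr_test], [math.log10(r) for r in rates_test])
    avg_diff = (integrate(p_test, 0.0, hi - lo) - integrate(p_ref, 0.0, hi - lo)) / (hi - lo)
    return (10 ** avg_diff - 1) * 100

# Hypothetical curves: the test codec needs 10% less rate at every PSNR.
psnr_points = [30.0, 33.0, 36.0, 39.0]
ref_rates = [100.0, 200.0, 400.0, 800.0]
test_rates = [90.0, 180.0, 360.0, 720.0]
print(round(bd_rate(ref_rates, psnr_points, test_rates, psnr_points), 3))  # -10.0
```

A negative value indicates a bit-rate saving of the test curve over the reference at equal quality.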

7.3 Implementation Complexity

The computational time for the various configuration profiles in the HM-16.4+SCM-4.0rc1 software is compared, and this serves as an indication of implementation complexity.


8. Test Sequences

Figure 22 : Different video resolutions ranging from mobile devices, tablets to advanced Televisions [39].

The following test sequences [23] of various resolutions are used for study of different configuration profiles of HEVC codecs:


Table 1 : List of test sequences for different resolutions

| Test Sequence | Resolution | Frame rate (fps) |
|---|---|---|
| Carphone30_qcif_90.yuv | 176 x 144 | 30 |
| BigBuckBunny_CIF_24fps.yuv | 352 x 288 | 24 |
| ElephantsDream_CIF_24fps.yuv | 352 x 288 | 24 |
| Mobile_cif.y4m | 352 x 288 | 30 |
| sintel_trailer_2k_720p24_4.y4m | 480 x 720 | 24 |
| sintel_trailer_2k_1080p24.y4m | 1920 x 1080 | 24 |

Figure 24 : ElephantsDream_CIF_24fps.yuv [3]

Figure 25 : Mobile_cif.y4m [40]


Figure 26 : sintel_trailer_2k_720p24_4.y4m [3]

Figure 27 : sintel_trailer_2k_1080p24.y4m [3]


9. Implementation

9.1 Configuration profiles used for comparison

Test sequences in YUV format at CIF and QCIF resolutions are encoded and decoded using the Random access and Low Delay profiles of HM-16.4+SCM-4.0RC1 [18], and the results are compared with the Intra main profile. These configurations are described in Section 6.

9.2 Parameters modified

F - Number of frames to be encoded
Fr - Frame rate
Wdt - Width of the video sequence
Hgt - Height of the video sequence
Profile - encoder_intra_main / encoder_randomaccess_main / encoder_lowdelay_main

Command line parameters for using the HM-16.4+SCM-4.0RC1 [18] encoder:

TAppEncoder [-h] [-c config.cfg] [--parameter=value]

Options:
-h Prints parameter usage.
-c Defines configuration file to use. Multiple configuration files may be used.
--parameter=value Assigns value to a given parameter.

Command line parameters for using the HM-16.4+SCM-4.0RC1 [18] decoder:

TAppDecoder [-b] str.bin [-o] dec.yuv [options]

9.3 Sample command line parameters for HM-16.4+SCM-4.0RC1

9.3.1 Encoding

C:\Users\Siddu_Pratapur\Desktop\HEVC16.4\bin\vc9\Win32\Debug>TAppEncoder.exe -c C:\Users\Siddu_Pratapur\Desktop\HEVC16.4\cfg\encoder_intra_main.cfg -wdt 352 -hgt 288 -fr 24 -f 90 -i C:\Users\Siddu_Pratapur\Desktop\Test_sequences_n_results\Without_modifications\CIF\Elephant\ElephantsDream_CIF_24fps.yuv >> C:\Users\Siddu_Pratapur\Desktop\Test_sequences_n_results\Without_modifications\CIF\Elephant\Encoded_data\Log_Encoded_IM.txt


9.3.2 Decoding

C:\Users\Siddu_Pratapur\Desktop\HEVC16.4\bin\vc9\Win32\Debug>TAppDecoder.exe -b C:\Users\Siddu_Pratapur\Desktop\Test_sequences_n_results\Without_modifications\CIF\Elephant\Encoded_data\str_IM.bin -o C:\Users\Siddu_Pratapur\Desktop\Test_sequences_n_results\Without_modifications\CIF\Elephant\Decoded_data\Elephant_Decoded.yuv >> C:\Users\Siddu_Pratapur\Desktop\Test_sequences_n_results\Without_modifications\CIF\Elephant\Decoded_data\Elephant_Decoded_IM.txt
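When the same commands have to be repeated for several sequences and profiles, the argument lists can be assembled programmatically. The sketch below only builds the lists, using the options that appear in the sample commands above (-c, -wdt, -hgt, -fr, -f, -i, -b, -o). The helper names and the short relative paths are hypothetical. A list built this way could be passed to Python's subprocess.run to launch the tool.

```python
def encoder_command(cfg, yuv, width, height, fps, frames, exe="TAppEncoder.exe"):
    """Build the encoder argument list from the options used in the sample command."""
    return [exe, "-c", cfg,
            "-wdt", str(width), "-hgt", str(height),
            "-fr", str(fps), "-f", str(frames),
            "-i", yuv]

def decoder_command(bitstream, out_yuv, exe="TAppDecoder.exe"):
    """Build the decoder argument list: input bitstream and reconstructed YUV output."""
    return [exe, "-b", bitstream, "-o", out_yuv]

# Hypothetical relative paths, used only for illustration.
enc = encoder_command("cfg/encoder_intra_main.cfg", "ElephantsDream_CIF_24fps.yuv",
                      352, 288, 24, 90)
dec = decoder_command("str_IM.bin", "Elephant_Decoded.yuv")
print(" ".join(enc))
print(" ".join(dec))
```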

9.4 Testing Platform

Processor: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
Number of cores: 4
Memory: 4.00 GB
Operating System: 64 bit Windows(TM) 7 Ultimate OS


9.5 Tabular columns for test sequence parameters

Table 2 : Carphone30_qcif_90.yuv ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 33.5733 | 994.373 | 50.098 | 1.645 | 365 | | |
| randomaccess_main | 31.7186 | 104.525 | 72.7 | 0.635 | 39 | 1.3524 | -22.1425 |
| lowdelay_main | 31.0696 | 112.197 | 240.448 | 0.797 | 42 | 1.0324 | -20.2457 |

Table 3 : BigBuckBunny_CIF_24fps.yuv ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 38.3705 | 711.213 | 92.883 | 2.443 | 326 | | |
| randomaccess_main | 35.8152 | 88.1664 | 245.903 | 1.399 | 41 | 1.6246 | -28.2465 |
| lowdelay_main | 35.4546 | 94.1589 | 501.895 | 1.372 | 44 | 1.1240 | -20.2645 |

Table 4 : ElephantsDream_CIF_24fps.yuv ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 36.4784 | 651.008 | 116.525 | 2.343 | 299 | | |
| randomaccess_main | 35.6135 | 63.5691 | 192.453 | 1.245 | 30 | 1.7249 | -30.1247 |
| lowdelay_main | 34.826 | 68.4843 | 520.273 | 1.413 | 32 | 1.2867 | -24.2149 |

Table 5 : Mobile_cif.y4m ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 34.6566 | 2639.26 | 135.934 | 3.765 | 967 | | |
| randomaccess_main | 29.3384 | 994.339 | 542.505 | 2.687 | 365 | 1.8344 | -33.1324 |
| lowdelay_main | 30.5268 | 1229.08 | 1108.59 | 2.965 | 451 | 1.4201 | -26.2596 |


Table 6 : sintel_trailer_2k_720p24_4.y4m ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 43.8022 | 1169.2800 | 1609.094 | 8.481 | 536 | | |
| randomaccess_main | 42.2203 | 164.3008 | 2107.159 | 4.979 | 76 | 2.4210 | -34.2157 |
| lowdelay_main | 41.6062 | 180.6912 | 3399.998 | 6.985 | 84 | 1.833 | -25.2647 |

Table 7 : sintel_trailer_2k_1080p24.y4m ( Number of frames encoded : 90 )

| Configuration profile | PSNR (dB) | Bit-rate (kbps) | Encoding time (s) | Decoding time (s) | Size of the binary file (kB) | BD-PSNR (dB) | % BD Bit-rate |
|---|---|---|---|---|---|---|---|
| intra_main | 44.5630 | 2157.8709 | 2124.922 | 25.968 | 989 | | |
| randomaccess_main | 42.8912 | 324.8149 | 6037.715 | 11.138 | 150 | 3.0140 | -37.1125 |
| lowdelay_main | 40.2542 | 3011.1542 | 8213.121 | 17.214 | 341 | 2.1324 | -26.7165 |


9.6 Graphs for test sequence parameters

9.6.1 Bit-rate

Figure 28 : Bit-rate graph for various configurations for the test sequence Carphone30_qcif_90.yuv

Figure 29 : Bit-rate graph for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv


Figure 30 : Bit-rate graph for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv

Figure 31 : Bit-rate graph for various configurations for the test sequence Mobile_cif.y4m


Figure 32 : Bit-rate graph for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m

Figure 33 : Bit-rate graph for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m


9.6.2 Size of the binary file

Figure 34 : Binary file size graph for various configurations for the test sequence Carphone30_qcif_90.yuv

Figure 35 : Binary file size graph for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv


Figure 36 : Binary file size graph for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv

Figure 37 : Binary file size graph for various configurations for the test sequence Mobile_cif.y4m

Figure 38 : Binary file size graph for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m


Figure 39 : Binary file size graph for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m

9.6.3 %BD Bit-rate

Figure 40 : % BD Bit-rate graph for various configurations for the test sequence Carphone30_qcif_90.yuv


Figure 41 : % BD Bit-rate graph for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv

Figure 42 : % BD Bit-rate graph for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv


Figure 43 : % BD Bit-rate graph for various configurations for the test sequence Mobile_cif.y4m

Figure 44 : % BD Bit-rate graph for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m

Figure 45 : % BD Bit-rate graph for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m

9.6.4 BD-PSNR

Figure 46 : BD-PSNR comparison for various configurations for the test sequence Carphone30_qcif_90.yuv

Figure 47 : BD-PSNR comparison for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv

Figure 48 : BD-PSNR comparison for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv

Figure 49 : BD-PSNR comparison for various configurations for the test sequence Mobile_cif.y4m


Figure 50 : BD-PSNR comparison for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m

Figure 51 : BD-PSNR comparison for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m

[Bar charts for Figures 50 and 51. Title: BD-PSNR; y-axis: BD-PSNR (dB); x-axis: Comparison of Configuration profiles.]

9.6.5 Encoding time

Figure 52 : Encoding time comparison for various configurations for the test sequence Carphone30_qcif_90.yuv

Figure 53 : Encoding time comparison for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv

[Bar charts for Figures 52 and 53. y-axis: Encoding time (s); x-axis: Configuration profile (intra_main, randomaccess_main, lowdelay_main).]

Figure 54 : Encoding time comparison for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv

Figure 55 : Encoding time comparison for various configurations for the test sequence Mobile_cif.y4m

[Bar charts for Figures 54 and 55. y-axis: Encoding time (s); x-axis: Configuration profile (intra_main, randomaccess_main, lowdelay_main).]

Figure 56 : Encoding time comparison for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m

Figure 57 : Encoding time comparison for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m

[Bar charts for Figures 56 and 57. y-axis: Encoding time (s); bars: intra_main, randomaccess_main, lowdelay_main.]

Figure 58 : Decoding time comparison for various configurations for the test sequence Carphone30_qcif_90.yuv

Figure 59 : Decoding time comparison for various configurations for the test sequence BigBuckBunny_CIF_24fps.yuv

[Bar charts for Figures 58 and 59. y-axis: Decoding time (s); x-axis: Configuration profile (intra_main, randomaccess_main, lowdelay_main).]

Figure 60 : Decoding time comparison for various configurations for the test sequence ElephantsDream_CIF_24fps.yuv

Figure 61 : Decoding time comparison for various configurations for the test sequence Mobile_cif.y4m

[Bar charts for Figures 60 and 61. y-axis: Decoding time (s); x-axis: Configuration profile (intra_main, randomaccess_main, lowdelay_main).]

Figure 62 : Decoding time comparison for various configurations for the test sequence sintel_trailer_2k_720p24_4.y4m

Figure 63 : Decoding time comparison for various configurations for the test sequence sintel_trailer_2k_1080p24.y4m

[Bar charts for Figures 62 and 63. y-axis: Decoding time (s); bars: intra_main, randomaccess_main, lowdelay_main.]

10. Conclusions and Further work

10.1 Conclusions

From this project, one can conclude that the random access configuration in the HM-16.4+SCM-4.0RC1 [18] software gives the best encoding and decoding results for both natural and screen content across video sequences of various resolutions, when compared with the intra main and low delay configurations.

10.2 Future work

1. Other comparison metrics such as SSIM can be calculated and tabulated for comparison for each of the test sequences.

2. Test sequences with higher resolutions, such as 2048 x 872 (2K) and 4096 x 1744 (4K), can be used for comparison.

3. Even better encoding and decoding speeds can be obtained by using machines with higher processing speeds or FPGA-implemented General Processing Units dedicated to encoding and decoding only.
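
As a starting point for item 1, the sketch below computes a simplified, single-window SSIM between an original and a decoded frame. It is an illustration under stated assumptions, not the metric used in this project: the function name `global_ssim` is hypothetical, frames are assumed to be flat lists of 8-bit luma samples, and the statistics are taken over the whole frame instead of the sliding local windows of the usual SSIM definition. The constants 0.01 and 0.03 are the commonly used defaults.

```python
def global_ssim(ref, dec, peak=255.0):
    """Simplified SSIM computed over one window covering the whole frame.

    ref, dec: equal-length sequences of luma samples (assumed 8-bit).
    """
    if len(ref) != len(dec) or len(ref) == 0:
        raise ValueError("frames must be non-empty and the same size")

    n = len(ref)
    c1 = (0.01 * peak) ** 2  # stabilises the luminance term
    c2 = (0.03 * peak) ** 2  # stabilises the contrast/structure term

    # Means, variances and covariance of the two frames
    mu_r = sum(ref) / n
    mu_d = sum(dec) / n
    var_r = sum((r - mu_r) * (r - mu_r) for r in ref) / n
    var_d = sum((d - mu_d) * (d - mu_d) for d in dec) / n
    cov = sum((r - mu_r) * (d - mu_d) for r, d in zip(ref, dec)) / n

    num = (2 * mu_r * mu_d + c1) * (2 * cov + c2)
    den = (mu_r ** 2 + mu_d ** 2 + c1) * (var_r + var_d + c2)
    return num / den


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only
    original = [10, 50, 90, 130, 170, 210]
    decoded = [12, 49, 91, 128, 171, 209]
    print(f"SSIM = {global_ssim(original, decoded):.4f}")
```

Identical frames give a value of 1, so the metric is only informative for lossy operating points. A full evaluation would average the windowed SSIM over all frames of each test sequence and tabulate it alongside PSNR.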

REFERENCES


[1] M. Naccari et al, “Improving Inter Prediction in HEVC with Residual DPCM for Lossless Screen Content Coding”, Picture Coding Symposium (PCS), San Jose, CA, pp. 361-364, 8-11 Dec. 2014.

[2] Special issue on Screen Content Video Coding and Applications, IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), Final manuscripts due on 22nd July 2016.

[3] Video test sequences - https://media.xiph.org/video/derf/

[4] I.E.G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002.

[5] B. Bross et al, “High Efficiency Video Coding (HEVC) Text Specification Draft 10”, Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Mar. 2013.

[6] D. Marpe et al, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, Vol. 44, pp. 134-143, Aug. 2006.

[7] G.J. Sullivan et al, “Overview of the High Efficiency Video Coding (HEVC) Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, pp. 1649-1668, Dec. 2012.

[8] HEVC white paper - Ateme: http://www.ateme.com/an-introduction-to-uhdtv-and-hevc

[9] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.

[10] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[11] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.

[12] T. Wiegand et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 560-576, Jul. 2003.

[13] U.S.M. Dayananda, “Study and Performance comparison of HEVC and H.264 video codecs”, Final project report, EE Dept., UTA, Arlington, TX, Dec. 2011, available on http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[14] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer, 2014.

[15] T. Vermeir, “Use cases and requirements for lossless and screen content coding”, JCTVC-M0172, 13th JCT-VC meeting, Incheon, Korea, Apr. 2013.


[16] J. Sole et al, “Requirements for wireless display applications”, JCTVC-M0315, 13th JCT-VC meeting, Incheon, Korea, Apr. 2013.

[17] A. Gabriellini et al, “Combined Intra-Prediction for High-Efficiency Video Coding”, IEEE Journal of Selected Topics in Signal Processing, Vol. 5, No. 7, pp. 1282-1289, Nov. 2011.

[18] HM-16.4+SCM-4.0rc1/ software - http://hevc.kw.bbc.co.uk/svn/jctvc-a124/tags/

[19] HM Software Manual - https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/

[20] Visual studio: http://www.dreamspark.com

[21] Tortoise SVN: http://tortoisesvn.net/downloads.html

[22] Multimedia processing course website: http://www.uta.edu/faculty/krrao/dip/

[23] K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[24] V. Sze, M. Budagavi and G.J. Sullivan, “High Efficiency Video Coding (HEVC): Algorithms and Architectures”, Springer, 2014.

[25] G. Braeckman et al, "Lossy-to-lossless screen content coding using an HEVC base-layer," in Proceedings of IEEE International Conference on Signal Processing (DSP), Santorini, Greece, 1-3 July 2013.

[26] M. Mrak and J.Z. Xu, "Improving screen content coding in HEVC by transform skipping," in Proceedings of 20th European Signal Processing Conference (EUSIPCO), pp. 1209-1213, 27-31 Aug. 2012.

[27] M. Wien, H. Schwarz, and T. Oelbaum, “Performance analysis of SVC,” Special issue on Scalable Video Coding (SVC), IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1194-1203, Sep. 2007.

[28] W. Zhu et al, "Screen Content Coding Based on HEVC Framework," IEEE Transactions on Multimedia, vol.16, no.5, pp.1316-1326, Aug. 2014.

[29] I.E.G. Richardson, “Coding Video: A practical guide to HEVC and beyond”, Wiley, 11 May 2015.

[30] A. K. Katsaggelos, An online course on “Fundamentals of Image and Video Coding”, Northwestern University - https://www.coursera.org/course/digital


[31] N.N. Mundgemane, A thesis proposal on “Multi-stage prediction scheme for Screen Content based on HEVC”, M.S. Thesis, EE Dept., UTA, Arlington, TX, Sep. 2014, available on http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[32] S. Kodpadi, A thesis proposal on “Fast algorithms for Screen Content Coding in HEVC”, M.S. Thesis, EE Dept., UTA, Arlington, TX, Sep. 2014, available on http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[33] W. Zhu et al, "Compound image compression by multi-stage prediction," IEEE Visual Communications and Image Processing (VCIP), pp. 1-6, 27-30 Nov. 2012.

[34] I.K. Kim et al, “Coding of moving pictures and audio”, ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Document: JCTVC-K1002-v1, 11th Meeting: Shanghai, CN, 10-19 October 2012.

[35] White paper on PSNR-NI: http://www.ni.com/white-paper/13306/en/

[36] Website on PSNR: http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio

[37] X. Li et al, “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp. 685-690, Jul. 2010.

[38] G. Bjontegaard, “Calculation of Average PSNR Differences between RD Curves”, document VCEG-M33, ITU-T SG 16/Q 6, Austin, TX, Apr. 2001.

[39] Different video resolutions : http://www.mediamerge.com/what-your-tech-team-needs-to-know-about-hd-video-projection/

[40] Video test sequences with screen content : http://trace.eas.asu.edu/yuv/

[41] G. Correa et al, “Fast HEVC encoding decisions using data mining”, IEEE Trans. CSVT, vol. 25, pp. 660-673, April 2015.

[42] JCT-VC documents are publicly available at http://ftp3.itu.ch/av-arch/jctvc-site and http://phenix.it-sudparis.eu/jct/
