INTERIM REPORT ON

PERFORMANCE COMPARISON OF HEVC, H.264 and VP9

A PROJECT UNDER THE GUIDANCE OF

DR. K. R. RAO

COURSE: EE5359 - MULTIMEDIA PROCESSING, SPRING 2015

SUBMITTED BY:

DEEPIKA SREENIVASULU PAGALA

[email protected]

1001112646

DEPARTMENT OF ELECTRICAL ENGINEERING

UNIVERSITY OF TEXAS AT ARLINGTON

Table of Contents:

1. Objective
2. Evolution of Video Coding Standards
3. Need for Video Compression
4. Fundamental Concepts in Video Coding
5. H.264/AVC
    5.1 Introduction
    5.2 Encoder and Decoder in H.264
    5.3 Features of H.264/AVC
        5.3.1 Prediction
        5.3.2 Transform and Quantization
        5.3.3 Entropy Coding
6. HEVC
    6.1 Introduction
    6.2 HEVC Extensions and Emerging Applications
    6.3 Encoder and Decoder in HEVC
    6.4 Features of HEVC
        6.4.1 Partitioning
        6.4.2 Prediction
        6.4.3 Transform and Quantization
        6.4.4 Entropy Coding
7. VP9
    7.1 Introduction
    7.2 Encoder and Decoder in VP9
    7.3 Features of VP9
        7.3.1 Prediction Block Sizes
        7.3.2 Prediction Modes
            7.3.2.1 Intra-prediction Modes
            7.3.2.2 Inter Prediction Modes
        7.3.3 Transform and Quantization
        7.3.4 Entropy Coding
8. Comparison Metrics
    8.1 Peak Signal to Noise Ratio
    8.2 Structural Similarity Index
    8.3 BD-BR and BD-PSNR
    8.4 Implementation Complexity
9. Profiles Used for Comparison
10. Test Sequences
11. Comparison Methodology
    11.1 Testing Platform
12. Results
13. Conclusion
Appendix [A]
    A.1 Sample output file of HM 16.2
Appendix [B]
    B.1 Sample output file of H.264
Appendix [C]
    C.1 Sample output file of VP9
Acknowledgement
14. References

List of Acronyms and Abbreviations:

ADST: Asymmetric Discrete Sine Transform.
AHG: Ad Hoc Group.
AVC: Advanced Video Coding.
BD-BR: Bjontegaard Delta Bitrate.
BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.
CABAC: Context Adaptive Binary Arithmetic Coding.
CAVLC: Context Adaptive Variable Length Coding.
CTB: Coding Tree Block.
CTU: Coding Tree Unit.
CU: Coding Unit.
DBF: De-blocking Filter.
DCT: Discrete Cosine Transform.
DST: Discrete Sine Transform.
DPB: Decoded Picture Buffer.
DVD: Digital Video Disk.
HD: High Definition.
HDR: High Dynamic Range.
HEVC: High Efficiency Video Coding.
HM: HEVC Test Model.
ICME: International Conference on Multimedia and Expo.
IEC: International Electrotechnical Commission.
ISCAS: International Symposium on Circuits and Systems.
ISO: International Organization for Standardization.
ITU-T: International Telecommunication Union - Telecommunication Standardization Sector.
JCT: Joint Collaborative Team.
JCT-VC: Joint Collaborative Team on Video Coding.
JM: H.264 Test Model.
JPEG: Joint Photographic Experts Group.
KTA: Key Technical Areas (H.264-based exploration software of VCEG).
MC: Motion Compensation.
ME: Motion Estimation.
MPEG: Moving Picture Experts Group.
MSE: Mean Square Error.
MVC: Multiview Video Coding.
NGOV: Next Generation Open Video.
PB: Prediction Block.
PCS: Picture Coding Symposium.
PSNR: Peak Signal to Noise Ratio.
PU: Prediction Unit.
QP: Quantization Parameter.
RD: Rate Distortion.
SAO: Sample Adaptive Offset.
SCC: Screen Content Coding.
SSIM: Structural Similarity Index.
TB: Transform Block.
TU: Transform Unit.
VCEG: Video Coding Experts Group.
WCG: Wide Color Gamut.

1. Objective:

The objective of this project is to study the video coding standards HEVC [1] [34] [35] [36], H.264 [2] [34] [35] and VP9 [3] [4], and to understand the techniques used in video coding, such as prediction, transform, quantization and entropy coding. A performance comparison of these codecs will be carried out based on metrics such as computational time, PSNR [25], SSIM [5] [20] [31], BD-bitrate [6] and BD-PSNR [7]. The software used for this purpose is HM 16.2 [26] [33] for HEVC, JM 18.6 [27] [32] for H.264, and the VPX encoder [28] from The WebM Project for VP9.

2. Evolution of Video Coding standards [8]:

Fig 1: Evolution of video coding standards [8]

Major video coding standards have been developed by the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC) and the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) [8]. Figure 1 shows a historical perspective of video coding standards development since the very first standard, ITU-T H.120. The emergence of H.264/AVC doubled the coding efficiency relative to the MPEG-4 Simple Profile, and the standard has therefore gained wide industrial acceptance [8]. Further extensions of H.264/AVC include the High profiles, the scalable video coding (SVC) extension, and the multiview video coding (MVC) extension [8].

Back in 2005, the ITU-T Video Coding Experts Group (VCEG) considered future work beyond H.264/AVC [8]. Possible targets and the scope of a new standard were brainstormed, and software known as Key Technical Areas (KTA) was developed and released in 2008 [8]. In 2009, the ISO/IEC Moving Picture Experts Group (MPEG) issued a similar call for High-Performance Video Coding (HVC) [8].

3. Need for Video Compression - Growing Demand for Video [30]:

- Video exceeds half of internet traffic and will grow to 86 percent by 2016 [30]. Increase in applications, content, fidelity, etc. Need higher coding efficiency! [30]
- Ultra-HD 4K broadcast expected for Japan in 2014. London Olympics opening and closing ceremonies shot in Ultra-HD 8K. Need higher throughput! [30]
- 25x increase in mobile data traffic over the next five years. Video is a "must have" on portable devices. Need lower power! [30]

4. Fundamental Concepts in Video Coding:

Color Spaces

The common color spaces for digital image and video representation are:

RGB color space - Each pixel is represented by three numbers indicating the relative proportions of red, green and blue.

YCrCb color space - Y is the luminance component, a monochrome version of the color image. Y is a weighted average of R, G and B:

$$Y = k_r R + k_g G + k_b B$$

where k_r, k_g and k_b are the weighting factors. The color information is represented as color differences or chrominance components,

$$C_r = R - Y, \qquad C_b = B - Y, \qquad C_g = G - Y$$

where each chrominance component is the difference between R, G or B and the luminance Y.

As the human visual system is less sensitive to color than to luminance, YCrCb has an advantage over the RGB space: the amount of data required to represent the chrominance components can be reduced without impairing the visual quality [10].
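The luminance and color-difference relations can be illustrated with a minimal sketch. The weights below are an assumption (the ITU-R BT.601 values); the report only refers to them as weighting factors k. The function names are hypothetical, and the differences are left unscaled for clarity.

```python
# Luma weights: an assumption (ITU-R BT.601); the report only calls them "k".
K_R, K_G, K_B = 0.299, 0.587, 0.114


def rgb_to_ycrcb(r, g, b):
    """Return (Y, Cr, Cb): Y is a weighted average, Cr and Cb are differences."""
    y = K_R * r + K_G * g + K_B * b
    return y, r - y, b - y


def ycrcb_to_rgb(y, cr, cb):
    """Invert the conversion; G is recovered from Y once R and B are known."""
    r = y + cr
    b = y + cb
    g = (y - K_R * r - K_B * b) / K_G
    return r, g, b
```

Because G can be recovered from Y, Cr and Cb, only two color-difference components need to be stored alongside the luminance.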

The popular sampling pattern [10] is:

4:4:4 - The three components Y, Cr and Cb have the same resolution; that is, for every 4 luminance samples there are 4 Cr and 4 Cb samples.

The popular sub-sampling patterns [10] are:

4:2:2 - For every 4 luminance samples in the horizontal direction, there are 2 Cr and 2 Cb samples. This representation is used for high-quality video color reproduction.

4:2:0 - Cr and Cb each have half the horizontal and half the vertical resolution of Y. This is popularly used in applications such as video conferencing, digital television and DVD storage.

Fig 2: 4:2:0 sub-sampling pattern [10].

Fig 3: 4:2:2 sub-sampling pattern and 4:4:4 sampling pattern [10].
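The effect of each pattern on the amount of raw data can be checked with a short calculation. This is a sketch only; the function name and the planar layout are assumptions made for illustration.

```python
def frame_bytes(width, height, pattern="4:2:0", bit_depth=8):
    """Size in bytes of one uncompressed frame with planar Y, Cr and Cb."""
    # (horizontal divisor, vertical divisor) applied to each chroma plane
    divisors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    h_div, v_div = divisors[pattern]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # Cr plane + Cb plane
    bytes_per_sample = (bit_depth + 7) // 8
    return (luma + chroma) * bytes_per_sample
```

For a 176x144 (QCIF) frame at 8 bits, `frame_bytes(176, 144, "4:4:4")` gives 76032 bytes, `"4:2:2"` gives 50688 bytes and `"4:2:0"` gives 38016 bytes, so 4:2:0 needs half the data of 4:4:4.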

5. H.264/AVC [2] :

5.1: INTRODUCTION [9]:

H.264/Advanced Video Coding (AVC) is a video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group [2].

- Version 1 was completed in May 2003 [9].
- H.264/AVC is the most popular video standard in the market: 80% of video on the internet is encoded with H.264 [9].
- It offers ~50% higher coding efficiency than MPEG-2 [9].

Applications include HDTV broadcast over satellite, cable and terrestrial channels; video content acquisition and editing; camcorders; security applications; Internet and mobile network video; Blu-ray Discs; and real-time video chat, video conferencing and telepresence.

5.2: Encoder and Decoder in H.264 [11]:

An H.264 video encoder carries out prediction, transform and encoding processes (Figure 4) to produce a compressed H.264 bit stream. An H.264 video decoder carries out complementary processes of decoding, inverse transform and reconstruction (Figure 5) to produce a decoded video sequence.

Fig 4: Encoding Process in H.264 [11]

Fig 5: Decoding Process in H.264 [11]

5.3 Features of H.264/AVC:

5.3.1 Prediction [12]:

The encoder processes a frame of video in units of a macro-block (16x16 displayed pixels) [12]. It forms a prediction of the macro-block based on previously coded data, either from the current frame (intra prediction) or from other frames that have already been coded and transmitted (inter prediction). The encoder subtracts the prediction from the current macro-block to form a residual. Intra prediction uses 16x16 and 4x4 block sizes to predict the macro-block from surrounding, previously coded pixels within the same frame.

Fig 6: Intra prediction in H.264 [12]

Inter prediction uses a range of block sizes (from 16x16 down to 4x4) to predict pixels in the current frame from similar regions in previously coded frames.

Fig 7: Inter prediction in H.264 [12]

Finding a suitable inter prediction is often described as motion estimation. Subtracting an inter prediction from the current macro-block is motion compensation.
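The search for an inter prediction can be sketched as exhaustive block matching. This is an illustrative simplification, not the JM encoder's search: it assumes integer-pixel displacements, a sum of absolute differences (SAD) cost and a small search window, and all function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref
    displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, n=4, search=2):
    """Full search over +/-search pixels; returns (best_sad, (dx, dy))."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


def residual(cur, ref, bx, by, mv, n=4):
    """Subtract the displaced reference block from the current block."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]
```

When the current block is an exact displaced copy of a region in the reference frame, the search returns a SAD of zero and the residual is all zeros, which is the case that costs the fewest bits to code.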

5.3.2 Transform and Quantization [13]:

A block of residual samples is transformed using a 4x4 or 8x8 integer transform, an approximate form of the Discrete Cosine Transform (DCT) [13]. The transform outputs a set of coefficients, each of which is a weighting value for a standard basis pattern. When combined, the weighted basis patterns re-create the block of residual samples. The output of the transform, a block of transform coefficients, is quantized, i.e. each coefficient is divided by an integer value. Quantization reduces the precision of the transform coefficients according to a quantization parameter (QP).
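The two steps can be sketched for a 4x4 block as follows. The matrix is the forward core matrix of the H.264 4x4 integer transform, and the step-size table follows the commonly published rule that the step doubles for every increase of 6 in QP. The quantizer is a deliberate simplification: it ignores the per-position scaling factors that a real encoder folds into this stage, and the function names are hypothetical.

```python
# Forward core matrix of the H.264 4x4 integer transform.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def core_transform(block):
    """W = CF * X * CF^T, the unscaled coefficients of a 4x4 residual block."""
    return matmul(matmul(CF, block), transpose(CF))


def qstep(qp):
    """Quantizer step size: doubles for every increase of 6 in QP."""
    base = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]
    return base[qp % 6] * (2 ** (qp // 6))


def quantize(coeffs, qp):
    """Simplified uniform quantizer (assumption: no per-position scaling)."""
    step = qstep(qp)
    return [[int(round(c / step)) for c in row] for row in coeffs]
```

A flat residual block concentrates all of its energy in the top-left (DC) coefficient, and a larger QP maps that coefficient to a smaller integer level.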

5.3.3 Entropy Coding [15] :

H.264/AVC specifies two alternative methods of entropy coding: a low-complexity technique based on the usage of context-adaptively switched sets of variable length codes, so-called CAVLC, and the computationally more demanding algorithm of context adaptive binary arithmetic coding (CABAC) [14].

6. HEVC [1] [34]:

6.1 Introduction:

High Efficiency Video Coding (HEVC) [1] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of the HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [2]), in the range of a 50% bit rate reduction at similar visual quality [1].

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [1]. The first version primarily targets consumer applications, as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The next revision of the standard, finalized in 2014, enables new use cases with support for additional pixel formats such as 4:2:2 and 4:4:4, bit depths higher than 10 bits [18], embedded bit-stream scalability, 3D video [17] and multiview video [41].

6.2 HEVC Extensions and Emerging Applications [43]:

- Range Extensions (finalized in April 2014): support for 4:2:2 and 4:4:4 color-sampled video and 12-bit video
- Scalable Video Coding (SHVC) (finalized in July 2014): supports layered coding with spatial, quality and color gamut scalability
- Multiview Video Coding (MVC) (finalized in July 2014): supports coding of multiple views and 3D stereoscopic video
- Screen Content Coding (SCC) (expected to be finalized in Feb. 2016): coding of mixed content consisting of natural video, text, graphics, etc.
- High dynamic range (HDR) / wide color gamut (WCG) (MPEG exploration)
- Post-HEVC activity (VCEG and MPEG AHG work)

6.3 Encoder and Decoder in HEVC [19]:

Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames.

The video encoder performs the following steps:

- Partitioning each picture into multiple units
- Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit
- Transforming and quantizing the residual (the difference between the original picture unit and the prediction)
- Entropy encoding the transform output, prediction information, mode information and headers

The video decoder performs the following steps:

- Entropy decoding and extracting the elements of the coded sequence
- Rescaling and inverting the transform stage
- Predicting each unit and adding the prediction to the output of the inverse transform
- Reconstructing a decoded video image
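The way the encoder and decoder steps mirror each other can be shown with a toy one-dimensional loop. This is a sketch under strong simplifying assumptions: each "unit" is a single sample, the prediction is simply the previously reconstructed sample, and the transform and entropy coding stages are omitted. It is not how HM is structured, and the function names are hypothetical.

```python
def encode(samples, step):
    """Closed-loop predictive coding of a 1-D signal; returns quantized levels."""
    levels, recon_prev = [], 0
    for s in samples:
        prediction = recon_prev                 # predict from reconstructed data
        residual = s - prediction               # subtract the prediction
        level = round(residual / step)          # quantize the residual
        levels.append(level)                    # (entropy coding omitted)
        recon_prev = prediction + level * step  # track what the decoder will see
    return levels


def decode(levels, step):
    """Rescale each level and add it to the prediction."""
    recon, recon_prev = [], 0
    for level in levels:
        value = recon_prev + level * step
        recon.append(value)
        recon_prev = value
    return recon
```

The encoder predicts from reconstructed rather than original samples, so the two sides stay in step and the reconstruction error of each sample is bounded by half the quantizer step.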

Figures 8 [17] and 9 [21] show the detailed block diagrams of the HEVC encoder and decoder, respectively:

Fig 8: Block Diagram of HEVC Encoder [17]

Fig 9: Block diagram of HEVC Decoder [21]

6.4 Features of HEVC:

6.4.1 Partitioning [19]:

HEVC supports highly flexible partitioning of a video sequence. Each frame of the sequence is split up into rectangular or square regions (units or blocks) [19], each of which is predicted from previously coded data. After prediction, any residual information is transformed and entropy encoded. Each coded video frame, or picture, is partitioned into tiles and/or slices, which are further partitioned into coding tree units (CTUs). The CTU is the basic unit of coding, analogous to the macro-block in earlier standards, and can be up to 64x64 pixels in size. A coding tree unit can be subdivided into square regions known as coding units (CUs) using a quad-tree structure. Each CU is predicted using inter or intra prediction and transformed using one or more transform units.

Fig 10: Picture, Slice, Coding Tree Unit (CTU), Coding Unit (CU) [19]
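The quad-tree subdivision of a CTU into CUs can be sketched recursively. The split decision used here (pixel variance above a threshold) is a hypothetical stand-in chosen for illustration; an actual encoder such as HM decides splits by comparing rate-distortion costs. The function name and parameters are assumptions.

```python
def split_ctu(frame, x, y, size, min_size=8, threshold=100.0):
    """Return a list of (x, y, size) coding units covering a square region."""
    pixels = [frame[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    # Stop at the minimum CU size or when the region is smooth enough.
    if size <= min_size or variance <= threshold:
        return [(x, y, size)]
    half = size // 2
    units = []
    for oy in (0, half):
        for ox in (0, half):
            units += split_ctu(frame, x + ox, y + oy, half, min_size, threshold)
    return units
```

Smooth regions stay as one large CU, while detailed regions are split further, down to the minimum size.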

6.4.2 Prediction [1]:

Frames of video are coded using intra or inter prediction:

Intra prediction: Each PU is predicted from neighboring image data in the same picture, using DC prediction (an average value for the PU), planar prediction (fitting a plane surface to the PU) or directional prediction (extrapolating from neighboring data).

Inter prediction: Each PU is predicted from image data in one or two reference pictures (before or after the current picture in display order), using motion compensated prediction.
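Two of the intra modes can be sketched in a few lines. This is a simplified illustration: it assumes the neighboring samples above and to the left are available, it omits the boundary smoothing that HEVC applies to some block sizes, and the function names are hypothetical.

```python
def dc_prediction(above, left):
    """Fill an N x N block with the rounded mean of the neighboring samples."""
    n = len(above)
    dc = (sum(above) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def vertical_prediction(above):
    """Directional example: extrapolate the row above straight down."""
    return [list(above) for _ in above]
```

For example, with `above = [10, 10, 10, 10]` and `left = [20, 20, 20, 20]`, every sample of the DC-predicted block is 15.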

Fig 11: Modes and angular intra prediction directions in HEVC [1]

6.4.3 Transform and Quantization [19]:

Any residual data remaining after prediction is transformed using a block transform based on the Discrete Cosine Transform (DCT) [13] or Discrete Sine Transform (DST). One or more block transforms of size 32x32, 16x16, 8x8 and 4x4 are applied to the residual data in each CU.

Fig 12: CTU showing range of transform (TU) sizes [19]

The transformed data is then quantized.

6.4.4 Entropy coding:

A coded HEVC bit stream consists of quantized transform coefficients, prediction information such as prediction modes and motion vectors, partitioning information and other header data. All of these elements are encoded using Context Adaptive Binary Arithmetic Coding (CABAC) [14] similar to H.264/AVC.

7. VP9 [3] [4] :

7.1 Introduction:

VP9 is an open and royalty-free video compression standard being developed by Google. VP9 had the earlier development names Next Generation Open Video (NGOV) and VP-Next. VP9 is the successor to VP8, and its development started in Q3 2011. One of the goals of VP9 is to reduce the bit rate by 50% compared to VP8 at the same video quality [22]. VP9 also aims to improve compression efficiency to the point where it would be better than that of High Efficiency Video Coding. VP9 expands on techniques used in H.264/AVC and VP8 and is very likely to replace AVC, at least in the YouTube video service [23].

7.2 Encoder and Decoder in VP9 [24]:

A large part of the advances made by VP9 over its predecessors is natural progression from current generation video codecs to the next. Figures 13 and 14 represent block diagrams of encoder and decoder of VP9 respectively.

Fig 13: Encoder block diagram for VP9 [24]

Fig 14: Decoder block diagram for VP9 [24]

7.3 Features of VP9 :

7.3.1 Prediction Block Sizes:

A large part of the coding efficiency improvements achieved in VP9 can be attributed to the incorporation of larger prediction block sizes [4] [23]. VP9 introduces super-blocks (SB) of size up to 64x64 and allows them to be broken down by recursive decomposition all the way down to 4x4.

Each sub-block may be further split into prediction blocks and transform blocks. Intra prediction in VP9 is still performed on square regions, so a rectangular prediction block represents two square prediction blocks with the same prediction mode.

By analogy with HEVC [1], prediction splitting into 2Nx2N, NxN, 2NxN or Nx2N is available, where 2Nx2N is the size of the block being split. It is worth mentioning that 4x4 prediction blocks are determined within the corresponding 8x8 block as a group, unlike other prediction sizes, for which prediction data is stored per prediction block. As in HEVC, a sub-block can be split into transform blocks in a quad-tree structure down to the smallest 4x4 block. The allowed sizes are 32x32, 32x16, 16x16, 8x16, 8x8 and 4x4.

Fig 15: Example partitioning of a 64x64 Super-block [4] [23]

7.3.2 Prediction Modes

7.3.2.1 Intra-prediction Modes [4]:

VP9 supports a set of 10 intra prediction modes for block sizes ranging from 4x4 up to 32x32: DC_PRED (DC prediction), TM_PRED (True-motion prediction), H_PRED (Horizontal prediction), V_PRED (Vertical prediction), and 6 oblique directional prediction modes: D27, D153, D135, D117, D63 and D45, corresponding approximately to angles of 27, 153, 135, 117, 63 and 45 degrees (measured counter-clockwise against the horizontal axis). The horizontal, vertical and oblique directional prediction modes involve copying (or estimating) pixel values from surrounding blocks into the current block along the angle specified by the prediction mode. Figure 16 shows the angular intra-prediction modes in VP9.

Fig 16: VP9 angular intra-prediction modes [4]
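Of the non-directional modes, TM_PRED is the least self-explanatory, so a minimal sketch is given here. It assumes the usual formulation in which each predicted sample is the left neighbor plus the above neighbor minus the above-left corner, clipped to the valid sample range. The report does not spell this formula out, and the function name is hypothetical.

```python
def tm_prediction(above, left, above_left, bit_depth=8):
    """True-motion prediction: left[r] + above[c] - above_left, clipped."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(left[r] + above[c] - above_left, 0), max_val)
             for c in range(len(above))]
            for r in range(len(left))]
```

The mode propagates the horizontal gradient of the row above and the vertical gradient of the left column into the block, which suits smooth ramps better than a flat DC value.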

7.3.2.2 Inter Prediction Modes [4]:

VP9 supports a set of 4 inter prediction modes for block sizes ranging from 4x4 up to 64x64 pixels: NEARESTMV, NEARMV, ZEROMV, and NEWMV.

7.3.3 Transform and Quantization [4]:

The residuals remaining after subtraction of the predicted pixel values are subjected to transformation and quantization. Transform blocks can be 32x32, 16x16, 8x8 or 4x4 pixels. As in most other coding standards, these transforms are integer approximations of the DCT [13].

For intra-coded blocks, either or both of the vertical and horizontal transform passes can instead be a DST (discrete sine transform). This choice reflects the specific characteristics of the residual signal of intra blocks. In addition, VP9 introduces support for a new transform type, the Asymmetric Discrete Sine Transform (ADST), which can be used in combination with specific intra-prediction modes. Intra-prediction modes that predict from the left edge can use the 1-D ADST in the horizontal direction, combined with a 1-D DCT in the vertical direction.

Similarly, the residual signal resulting from intra-prediction modes that predict from the top edge can employ a vertical 1-D ADST transform combined with a horizontal 1-D DCT transform. Intra-prediction modes that predict from both edges such as the True Motion mode and some diagonal intra-prediction modes use the 1-D ADST in both horizontal and vertical directions.

7.3.4 Entropy coding:

VP9 uses the 8-bit arithmetic coding engine from VP8, known as the bool-coder [4]. Unlike in AVC or HEVC, the probabilities of the VP9 bool-coder do not change adaptively within a frame. Instead, VP9 makes use of forward context updates, using flags in the frame header that signal modifications of the coding contexts at the start of each frame. These probabilities are stored in what is known as a frame context. The decoder maintains four of these contexts, and each frame specifies in the bitstream which one to use.

8. Comparison Metrics :

8.1 Peak Signal to Noise Ratio [25]:

Peak signal-to-noise ratio (PSNR) [25] is the ratio between the maximum possible value (power) of a signal and the power of the distorting noise that affects the quality of its representation. Because many signals have a very wide dynamic range (the ratio between the largest and smallest possible values of a changeable quantity), PSNR is usually expressed on the logarithmic decibel scale. PSNR is most commonly used to measure the quality of reconstruction of lossy compression codecs. The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR is an approximation to human perception of reconstruction quality. Although a higher PSNR generally indicates a higher-quality reconstruction, in some cases it may not. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when it is used to compare results from the same codec (or codec type) and the same content.

PSNR is defined via the mean squared error (MSE). Given a noise-free m x n monochrome image f and its noisy approximation g, MSE is defined as:

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ f(i,j) - g(i,j) \right]^2$$

where

f represents the matrix data of the original image,

g represents the matrix data of the degraded image,

m represents the number of rows of pixels of the images and i represents the index of a row,

n represents the number of columns of pixels of the images and j represents the index of a column.

The PSNR is defined as:

$$\mathrm{PSNR} = 20 \log_{10}\left( \frac{MAX_f}{\sqrt{\mathrm{MSE}}} \right) \ \mathrm{dB}$$

where MAX_f is the maximum signal value that exists in the original image.

For 4:2:0 video, the combined PSNR can be given as:

YUV-PSNR = ((6 * Y-PSNR) + (U-PSNR) + (V-PSNR)) / 8 dB
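These definitions translate directly into code. The sketch below assumes images are given as lists of rows and uses a default peak value of 255 (8-bit samples). Both choices are assumptions made for illustration, as are the function names.

```python
import math


def mse(f, g):
    """Mean squared error between two m x n images given as lists of rows."""
    m, n = len(f), len(f[0])
    return sum((f[i][j] - g[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)


def psnr(f, g, max_value=255):
    """PSNR in dB; max_value=255 assumes 8-bit samples."""
    error = mse(f, g)
    if error == 0:
        return math.inf  # identical images
    return 20 * math.log10(max_value / math.sqrt(error))


def yuv_psnr(y_psnr, u_psnr, v_psnr):
    """Weighted combination for 4:2:0 video, weights 6:1:1."""
    return (6 * y_psnr + u_psnr + v_psnr) / 8
```

For instance, `yuv_psnr(40, 44, 44)` returns 41.0, showing how strongly the luminance term dominates the combined figure.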

8.2 Structural Similarity Index [5] [20] [31]:The structural similarity index is a method for measuring the similarity between two images. The SSIM index is a full reference metric; in other words, the measuring of image quality based on an initial uncompressed or distortion-free image as reference. SSIM is designed to improve on traditional methods like peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proven to be inconsistent with human eye perception. The difference with respect to other techniques such as MSE or PSNR is that these approaches estimate perceived errors; on the other hand, SSIM considers image degradation as perceived change in structural information. Structural information is the idea that the pixels have strong inter-dependencies especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene. The SSIM metric is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:

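$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

Where $\mu_x$ and $\mu_y$ are the means of x and y, $\sigma_x^2$ and $\sigma_y^2$ are their variances, $\sigma_{xy}$ is the covariance of x and y, and $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are two constants that stabilize the division when the denominator is small, with L the dynamic range of the pixel values and $k_1 = 0.01$, $k_2 = 0.03$ by default [5].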
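As an illustration of the window-based measure, the sketch below evaluates SSIM for a single pair of windows. It is a simplified example and not the reference implementation of [5]: it uses plain (unweighted) sample statistics over one window, the function name and sample windows are hypothetical, and the constants k1 = 0.01, k2 = 0.03 and a dynamic range of 255 are the commonly used defaults for 8-bit content.

```python
def ssim_window(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows given as flat lists of samples."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and covariance of the two windows.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    # Stabilizing constants derived from the dynamic range.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical 2x2 windows flattened to lists.
reference = [52, 55, 61, 59]
distorted = [50, 58, 60, 63]
print(ssim_window(reference, reference))
print(ssim_window(reference, distorted))
```

A full-image score would typically be obtained by sliding the window over the image and averaging the per-window values.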

8.3 Bjontegaard Delta Bitrate (BD-BR) and Bjontegaard Delta PSNR (BD-PSNR) [6]:

To objectively evaluate the coding efficiency of video codecs, Bjontegaard Delta PSNR (BD-PSNR) was proposed. Based on rate-distortion (R-D) curve fitting, BD-PSNR provides a good evaluation of R-D performance. The BD metrics give the average gain in PSNR (BD-PSNR) or the average percent saving in bitrate (BD-BR) between two rate-distortion curves. However, BD-PSNR has a critical drawback: it does not take the coding complexity into account.
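A minimal sketch of the BD-BR calculation is given below, assuming four rate/PSNR points per codec. Each curve is modeled as the cubic polynomial through the four (PSNR, log10 bitrate) points, and the two curves are integrated over their common PSNR interval. The helper names are hypothetical, Simpson's rule is used only because it is exact for cubics, and the sample data are the akiyo_qcif.yuv values of Tables 1 and 6 taken at face value.

```python
import math

def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial that passes through all (xs, ys) points at x."""
    total = 0.0
    for i in range(len(xs)):
        term = ys[i]
        for j in range(len(xs)):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

def simpson(func, a, b):
    """Simpson's rule on [a, b]; exact for polynomials of degree <= 3."""
    return (b - a) / 6.0 * (func(a) + 4.0 * func((a + b) / 2.0) + func(b))

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (percent) of test vs. reference at equal PSNR."""
    log_ref = [math.log10(r) for r in rate_ref]
    log_test = [math.log10(r) for r in rate_test]
    lo = max(min(psnr_ref), min(psnr_test))   # overlapping PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    area_ref = simpson(lambda p: lagrange_eval(psnr_ref, log_ref, p), lo, hi)
    area_test = simpson(lambda p: lagrange_eval(psnr_test, log_test, p), lo, hi)
    avg_log_diff = (area_test - area_ref) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0

# akiyo_qcif.yuv at QP 22/27/32/37: bit rate (kbps) and YUV-PSNR (dB).
h264_rate = [474.04, 299.65, 188.64, 119.56]          # Table 6
h264_psnr = [44.4821, 40.6602, 36.7820, 34.0513]
hevc_rate = [369.4440, 243.0660, 154.0740, 93.1920]   # Table 1
hevc_psnr = [44.6421, 41.0097, 37.5213, 34.1119]
print(bd_rate(h264_rate, h264_psnr, hevc_rate, hevc_psnr))
```

A negative result means the test codec (here HEVC) needs less bitrate than the reference (H.264) for the same PSNR.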

8.4 Implementation Complexity:

The computational time of the HEVC, AVC and VP9 encoders will be compared; this serves as an indication of implementation complexity.

9. Profiles used for comparison:

The test models used for comparison in this project are HM 16.2 [26] [33] for HEVC, JM 18.6 [27] [32] for H.264, and the VPX encoder from The WebM Project [28] for VP9.

10. Test Sequences :

The following test sequences will be used for study and comparison of the codecs.


Fig.17 akiyo_qcif.yuv [38]

Fig.18 waterfall_cif.yuv [38]


Fig.19 BasketballDrill_832x480.yuv [39]

Fig.20 Jockey_1920x1080.yuv [29]


Fig.21 PeopleOnStreet_2560_1600_30_crop.yuv [29]

11. Comparison Methodology :

Test sequences in .yuv format are encoded using HM 16.2 [26] [33] and JM 18.6 [27] [32]. Test sequences of different resolutions are used. The main profile in HM 16.2 and the high profile in JM 18.6 are used for comparison. Appendices A and B give a brief description of the configuration settings used in the encoding tools.

11.1 Test Platform :

Processor : Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz

Installed Memory (RAM) : 8.00 GB

System Type : 64-bit operating system, x64-based processor


12. Results :

All-Intra Configuration:

Simulation results of the all-intra configuration for the test sequences are tabulated in Tables 1-10:

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 369.4440 | 44.1475 | 45.5187 | 46.1953 | 44.6421 | 5.836 |
| 20 | 27 | 243.0660 | 40.4033 | 42.0082 | 43.1946 | 41.0097 | 5.195 |
| 20 | 32 | 154.0740 | 36.6338 | 39.4612 | 40.9767 | 37.5213 | 4.732 |
| 20 | 37 | 93.1920 | 33.0135 | 37.0586 | 38.7985 | 34.1119 | 4.257 |

Table 1 : Implementation results for akiyo_qcif.yuv sequence in HEVC

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 5916.4800 | 41.6564 | 41.4438 | 42.3547 | 41.7281 | 31.461 |
| 20 | 27 | 3577.2120 | 37.5031 | 37.5308 | 39.1391 | 37.7411 | 25.583 |
| 20 | 32 | 1946.4480 | 33.5487 | 34.8837 | 37.1640 | 34.1946 | 21.392 |
| 20 | 37 | 934.8480 | 30.0778 | 33.2125 | 35.9060 | 31.0944 | 19.569 |

Table 2 : Implementation results for waterfall_cif.yuv sequence in HEVC

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 20407.3400 | 41.7518 | 43.8621 | 44.5998 | 42.4283 | 108.617 |
| 20 | 27 | 11013.2600 | 38.3628 | 41.0689 | 41.4844 | 39.1335 | 94.157 |
| 20 | 32 | 5847.1600 | 35.4131 | 38.8356 | 39.0766 | 36.3051 | 75.877 |
| 20 | 37 | 3200.3200 | 32.7989 | 37.1853 | 37.3361 | 33.8437 | 67.456 |

Table 3 : Implementation results for BasketballDrill_832x480_50.yuv sequence in HEVC

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 19657.3440 | 44.3542 | 45.3285 | 45.3272 | 44.6549 | 456.055 |
| 20 | 27 | 10902.2400 | 42.5709 | 43.4684 | 43.8861 | 42.9072 | 350.005 |
| 20 | 32 | 6208.9800 | 40.4281 | 41.9067 | 42.5447 | 40.9445 | 326.497 |
| 20 | 37 | 3532.1520 | 37.9666 | 40.6736 | 41.4394 | 38.7692 | 330.297 |

Table 4 : Implementation results for Jockey_1920x1080.yuv sequence in HEVC

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 104202.7080 | 43.2513 | 45.6042 | 45.3551 | 43.8747 | 996.289 |
| 20 | 27 | 60437.8080 | 39.8094 | 43.2090 | 43.5335 | 40.7055 | 887.023 |
| 20 | 32 | 34337.0160 | 36.6802 | 41.2146 | 41.8615 | 37.7822 | 778.793 |
| 20 | 37 | 19982.2920 | 33.8337 | 39.7221 | 40.5919 | 35.1134 | 687.384 |

Table 5 : Implementation results for PeopleOnStreet_2560_1600_30_crop.yuv sequence in HEVC

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 474.04 | 44.069 | 45.238 | 46.204 | 44.4821 | 1.394 |
| 20 | 27 | 299.65 | 40.151 | 41.742 | 42.634 | 40.6602 | 1.111 |
| 20 | 32 | 188.64 | 36.241 | 39.026 | 37.7822 | 36.7820 | 0.998 |
| 20 | 37 | 119.56 | 32.884 | 36.841 | 38.264 | 34.0513 | 0.994 |

Table 6 : Implementation results for akiyo_qcif.yuv sequence in H.264


| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 6757.96 | 41.507 | 41.016 | 41.931 | 41.4990 | 6.508 |
| 20 | 27 | 4105.03 | 37.185 | 37.087 | 38.768 | 37.3712 | 5.424 |
| 20 | 32 | 2187.35 | 33.053 | 34.584 | 36.811 | 33.7145 | 4.467 |
| 20 | 37 | 1115.40 | 29.982 | 32.846 | 35.440 | 31.0221 | 3.911 |

Table 7 : Implementation results for waterfall_cif.yuv sequence in H.264

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 27444.20 | 41.543 | 43.774 | 44.481 | 42.1891 | 22.822 |
| 20 | 27 | 15208.06 | 37.967 | 40.742 | 41.053 | 38.6996 | 17.026 |
| 20 | 32 | 8149.28 | 35.051 | 38.278 | 38.435 | 35.8773 | 14.691 |
| 20 | 37 | 4607.00 | 32.563 | 36.362 | 36.392 | 33.5165 | 14.375 |

Table 8 : Implementation results for BasketballDrill_832x480_50.yuv sequence in H.264

| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 49494.50 | 44.081 | 44.978 | 45.154 | 44.3273 | 71.600 |
| 20 | 27 | 28679.62 | 42.186 | 43.058 | 43.495 | 42.4586 | 63.995 |
| 20 | 32 | 16782.50 | 39.852 | 41.644 | 42.289 | 40.3806 | 59.007 |
| 20 | 37 | 10544.34 | 37.614 | 40.281 | 40.733 | 38.3373 | 56.011 |

Table 9 : Implementation results for Jockey_1920x1080.yuv sequence in H.264


| NUMBER OF FRAMES | QP | BIT RATE (kbps) | Y-PSNR (dB) | U-PSNR (dB) | V-PSNR (dB) | YUV-PSNR (dB) | ENCODING TIME (secs) |
|---|---|---|---|---|---|---|---|
| 20 | 22 | 227090.16 | 42.903 | 45.352 | 45.246 | 43.5020 | 178.802 |
| 20 | 27 | 138330.46 | 39.501 | 42.913 | 43.347 | 40.4083 | 153.843 |
| 20 | 32 | 81363.74 | 36.311 | 41.379 | 41.975 | 37.6525 | 136.863 |
| 20 | 37 | 50664.26 | 33.643 | 40.018 | 40.803 | 35.3349 | 164.348 |

Table 10 : Implementation results for PeopleOnStreet_2560_1600_30_crop.yuv sequence in H.264
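As a rough reading aid for the tables, the sketch below computes the bitrate difference and the encoding-time ratio between the two encoders at one QP. The function names are hypothetical, and the calculation takes the tabulated values at face value: it assumes both encoders report bitrate at the same frame rate, and a same-QP comparison is not quality-matched, which is why BD-BR (Section 8.3) is the preferred summary metric.

```python
def bitrate_saving_percent(rate_ref, rate_test):
    """Bitrate reduction of the test encoder relative to the reference, in percent."""
    return 100.0 * (rate_ref - rate_test) / rate_ref

def time_ratio(time_test, time_ref):
    """How many times longer the test encoder ran than the reference encoder."""
    return time_test / time_ref

# akiyo_qcif.yuv at QP 22: (bit rate in kbps, encoding time in seconds)
hevc = (369.4440, 5.836)   # Table 1
h264 = (474.04, 1.394)     # Table 6

print(bitrate_saving_percent(h264[0], hevc[0]))
print(time_ratio(hevc[1], h264[1]))
```

For this row HM produces roughly 22% fewer bits than JM while taking about 4.2 times as long to encode.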

Appendix A:

This section describes the configuration settings and command line parameters required for encoding a video test sequence with HM 16.2 in intra mode.

Main all-intra profile settings :

IntraPeriod : 1 # Period of I-Frame ( -1 = only first)
GOPSize : 1 # GOP Size (number of B slice = GOPSize-1)
QP : 22 # Quantization parameter(0-51) (22, 27, 32 or 37 is used at a time)

Command line parameters for using HM 16.2 encoder:

TAppEncoder [-h] [-c config.cfg] [--parameter=value]

Options:
-h Prints parameter usage
-c Defines configuration file to use. Multiple configuration files may be used with repeated -c options.
--parameter=value Assigns value to a given parameter.

Sample command line parameters for HM 16.2 encoder:

C:\HEVC\bin\vc10\Win32\Release>TAppEncoder.exe -c encoder_intra_main.cfg -wdt 2560 -hgt 1600 -fr 30 -f 20 -i C:\HEVC\bin\vc10\Win32\testsequences\PeopleOnStreet_2560x1600_30_crop.yuv >> C:\HEVC\bin\vc10\Win32\final_results\PeopleOnStreet_2560x1600_30_crop_qp37.txt
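Since each sequence is encoded at four QP values, the runs can be scripted. The sketch below only builds the argument lists and does not launch the encoder. It reuses the options shown in the sample command (-c, -i, -wdt, -hgt, -fr, -f) and, as an assumption, overrides the QP entry of the configuration file through the --parameter=value form described above instead of editing the file for each run. The function name and the relative file names are hypothetical.

```python
def hm_commands(sequence, width, height, fps, frames, qps=(22, 27, 32, 37)):
    """Return one TAppEncoder argument list per QP value."""
    commands = []
    for qp in qps:
        commands.append([
            "TAppEncoder.exe",
            "-c", "encoder_intra_main.cfg",   # all-intra main configuration
            "-i", sequence,                   # input .yuv file
            "-wdt", str(width),
            "-hgt", str(height),
            "-fr", str(fps),
            "-f", str(frames),
            f"--QP={qp}",                     # assumed override of the QP setting
        ])
    return commands

for cmd in hm_commands("PeopleOnStreet_2560x1600_30_crop.yuv", 2560, 1600, 30, 20):
    print(" ".join(cmd))
```

Each list could be passed to a process launcher such as subprocess.run, with the output redirected to a per-QP results file.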

A.1 : Sample output file of HM 16.2:

HM software: Encoder Version [16.2] (including RExt)[Windows][VS 1600][32 bit]


Input File : C:\HEVC\bin\vc10\Win32\testsequences\PeopleOnStreet_2560x1600_30_crop.yuv
Bitstream File : str.bin
Reconstruction File : rec.yuv
Real Format : 2560x1600 30Hz
Internal Format : 2560x1600 30Hz
Sequence PSNR output : Linear average only
Sequence MSE output : Disabled
Frame MSE output : Disabled
Cabac-zero-word-padding : Disabled
Frame/Field : Frame based coding
Frame index : 0 - 19 (20 frames)
Profile : main
CU size / depth : 64 / 4
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 3
Max RQT depth intra : 3
Min PCM size : 8
Motion search range : 64
Intra period : 1
Decoding refresh type : 0
QP : 37.00
Max dQP signaling depth : 0
Cb QP Offset : 0
Cr QP Offset : 0
Max CU chroma QP adjustment depth : -1
QP adaptation : 0 (range=0)
GOP size : 1
Input bit depth : (Y:8, C:8)
MSB-extended bit depth : (Y:8, C:8)
Internal bit depth : (Y:8, C:8)
PCM sample bit depth : (Y:8, C:8)
Extended precision processing : Disabled
Intra reference smoothing : Enabled
Implicit residual DPCM : Disabled
Explicit residual DPCM : Disabled
Residual rotation : Disabled
Single significance map context : Disabled
Cross-component prediction : Disabled
High-precision prediction weight : Disabled
Golomb-Rice parameter adaptation : Disabled
CABAC bypass bit alignment : Disabled
Sao Luma Offset bit shifts : 0
Sao Chroma Offset bit shifts : 0
Cost function: : Lossy coding (default)
RateControl : 0
Max Num Merge Candidates : 5

TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 SQP:0 ASR:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2 WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0

Non-environment-variable-controlled macros set as follows:

RExt__DECODER_DEBUG_BIT_STATISTICS = 0
RExt__HIGH_BIT_DEPTH_SUPPORT = 0
RExt__HIGH_PRECISION_FORWARD_TRANSFORM = 0
O0043_BEST_EFFORT_DECODING = 0
RD_TEST_SAO_DISABLE_AT_PICTURE_LEVEL = 0

Input ChromaFormatIDC = 4:2:0
Output (internal) ChromaFormatIDC = 4:2:0

POC 0 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665136 bits [Y 33.8541 dB U 39.7260 dB V 40.5735 dB] [ET 34 ] [L0 ] [L1 ]
POC 1 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665256 bits [Y 33.8549 dB U 39.7137 dB V 40.5744 dB] [ET 35 ] [L0 ] [L1 ]
POC 2 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 662384 bits [Y 33.8615 dB U 39.7464 dB V 40.6264 dB] [ET 34 ] [L0 ] [L1 ]
POC 3 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 664384 bits [Y 33.8483 dB U 39.6710 dB V 40.5561 dB] [ET 35 ] [L0 ] [L1 ]
POC 4 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 663688 bits [Y 33.8635 dB U 39.7420 dB V 40.5813 dB] [ET 34 ] [L0 ] [L1 ]
POC 5 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 664408 bits [Y 33.8356 dB U 39.6683 dB V 40.5395 dB] [ET 34 ] [L0 ] [L1 ]
POC 6 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 666480 bits [Y 33.8424 dB U 39.7187 dB V 40.5803 dB] [ET 34 ] [L0 ] [L1 ]
POC 7 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 663968 bits [Y 33.8295 dB U 39.6969 dB V 40.5360 dB] [ET 35 ] [L0 ] [L1 ]
POC 8 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 667352 bits [Y 33.8295 dB U 39.7262 dB V 40.6232 dB] [ET 34 ] [L0 ] [L1 ]
POC 9 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 670600 bits [Y 33.8243 dB U 39.7014 dB V 40.5947 dB] [ET 35 ] [L0 ] [L1 ]
POC 10 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 669728 bits [Y 33.8094 dB U 39.6825 dB V 40.5722 dB] [ET 34 ] [L0 ] [L1 ]
POC 11 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 666568 bits [Y 33.8387 dB U 39.7818 dB V 40.6047 dB] [ET 35 ] [L0 ] [L1 ]
POC 12 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665920 bits [Y 33.8335 dB U 39.7523 dB V 40.5711 dB] [ET 34 ] [L0 ] [L1 ]
POC 13 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 668088 bits [Y 33.8158 dB U 39.7389 dB V 40.6237 dB] [ET 35 ] [L0 ] [L1 ]
POC 14 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 667288 bits [Y 33.8197 dB U 39.7215 dB V 40.6026 dB] [ET 34 ] [L0 ] [L1 ]
POC 15 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 668376 bits [Y 33.8222 dB U 39.7685 dB V 40.6008 dB] [ET 34 ] [L0 ] [L1 ]
POC 16 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665200 bits [Y 33.8177 dB U 39.7120 dB V 40.6102 dB] [ET 35 ] [L0 ] [L1 ]
POC 17 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 668048 bits [Y 33.8286 dB U 39.6775 dB V 40.6382 dB] [ET 34 ] [L0 ] [L1 ]
POC 18 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 664488 bits [Y 33.8247 dB U 39.7216 dB V 40.6059 dB] [ET 34 ] [L0 ] [L1 ]
POC 19 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 664168 bits [Y 33.8206 dB U 39.7748 dB V 40.6229 dB] [ET 34 ] [L0 ] [L1 ]

SUMMARY --------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR
20 a 19982.2920 33.8337 39.7221 40.5919 35.1134

I Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR
20 i 19982.2920 33.8337 39.7221 40.5919 35.1134

P Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR
0 p -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND

B Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR
0 b -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND

RVM: 0.000
Bytes written to file: 1665282 (19983.384 kbps)

Total Time: 687.384 sec.
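The per-frame POC lines of the HM log can be collected automatically when filling in the result tables. The following is a minimal sketch, assuming the line layout shown in the sample output above; the regular expression and the function names are hypothetical and not part of HM. Only the first two frames of the log are used as sample input.

```python
import re

# Matches: POC <n> ... ) <bits> bits [Y <y> dB U <u> dB V <v> dB]
POC_LINE = re.compile(
    r"POC\s+(\d+).*?\)\s+(\d+)\s+bits\s+"
    r"\[Y\s+([\d.]+)\s+dB\s+U\s+([\d.]+)\s+dB\s+V\s+([\d.]+)\s+dB\]"
)

def parse_poc_lines(log_text):
    """Return a list of (poc, bits, y_psnr, u_psnr, v_psnr) tuples."""
    frames = []
    for match in POC_LINE.finditer(log_text):
        poc, bits, y, u, v = match.groups()
        frames.append((int(poc), int(bits), float(y), float(u), float(v)))
    return frames

def average_bitrate_kbps(frames, fps):
    """Average bitrate: total bits divided by the duration of the parsed frames."""
    total_bits = sum(frame[1] for frame in frames)
    return total_bits * fps / len(frames) / 1000.0

sample = """POC 0 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665136 bits [Y 33.8541 dB U 39.7260 dB V 40.5735 dB] [ET 34 ] [L0 ] [L1 ]
POC 1 TId: 0 ( I-SLICE, nQP 37 QP 37 ) 665256 bits [Y 33.8549 dB U 39.7137 dB V 40.5744 dB] [ET 35 ] [L0 ] [L1 ]"""

frames = parse_poc_lines(sample)
print(frames[0])
print(average_bitrate_kbps(frames, fps=30))
```

Run over all 20 frames of the log at 30 Hz, the same calculation should reproduce the bitrate reported in the SUMMARY section.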

Appendix B :

This section describes the configuration settings and command line parameters required for encoding a video test sequence with the H.264 reference software (JM 18.6) in intra mode.

High all-intra profile settings

FramesToBeEncoded : 20 # Number of frames to be coded
FrameRate : 50.0 # Frame Rate per second (0.1-100.0)
ProfileIDC : 100 # Profile IDC (66=baseline, 77=main, 88=extended; FREXT Profiles: 100=High, 110=High 10, 122=High 4:2:2, 244=High 4:4:4, 44=CAVLC 4:4:4 Intra, 118=Multiview High Profile, 128=Stereo High Profile)
IntraProfile : 1 # Activate Intra Profile for FRExt (0: false, 1: true)
LevelIDC : 40 # Level IDC (e.g. 20 = level 2.0)
IntraPeriod : 1 # Period of I-pictures (0=only first)
IDRPeriod : 1 # Period of IDR pictures (0=only first)
QPISlice : 22 # Quant. param for I Slices (0-51) (22, 27, 32 or 37 is used at a time)

Command line parameters for using JM 18.6 encoder:

lencod [-h] [-d defenc.cfg] {[-f curenc1.cfg]...[-f curencN.cfg]} {[-p EncParam1=EncValue1]...[-p EncParamM=EncValueM]}

Options:
-h Prints parameter usage.
-d Use <defenc.cfg> as default file for parameter initializations. If not used then file defaults to “encoder.cfg” in local directory.
-f Read <curencM.cfg> for resetting selected encoder parameters. Multiple files could be used that set different parameters.
-p Set parameter <EncParamM> to <EncValueM>. The entry for <EncParamM> is case insensitive.

Sample command line parameters for JM 18.6 encoder:

C:\jm 18.6\JM\bin>lencod.exe -f encoder.cfg -p InputFile="C:\jm 18.6\JM\bin\Jockey_1920x1080.yuv" -p SourceWidth=1920 -p SourceHeight=1080 >> C:\jm 18.6\JM\bin\final_results\jockey_37.txt

B.1 Sample output file of H.264:

Setting Default Parameters...
Parsing Configfile encoder.cfg........
Parsing Configfile encoder.cfg........
Parsing command line string 'InputFile = Jockey_1920x1080.yuv'.
Parsing command line string 'SourceWidth = 1920'.
Parsing command line string 'SourceHeight = 1080'.

Warning: PocMemoryManagement not supported for Intra Profiles. Process Disabled.
Warning: ReferenceReorder not supported for Intra Profiles. Process Disabled.


------------------------------- JM 18.6 (FRExt) -------------------------------
Input YUV file : Jockey_1920x1080.yuv
Output H.264 bitstream : test.264
Output YUV file : test_rec.yuv
YUV Format : YUV 4:2:0
Frames to be encoded : 20
Freq. for encoded bitstream : 50.00
PicInterlace / MbInterlace : 0/0
Transform8x8Mode : 1
ME Metric for Refinement Level 0 : SAD
ME Metric for Refinement Level 1 : Hadamard SAD
ME Metric for Refinement Level 2 : Hadamard SAD
Mode Decision Metric : Hadamard SAD
Motion Estimation for components : Y
Image format : 1920x1080 (1920x1088)
Error robustness : Off
Search range : 32
Total number of references : 5
References for P slices : 5
References for B slices (L0, L1) : 5, 1
Sequence type : IIII (QP: I 37)
Entropy coding method : CABAC
Profile/Level IDC : (100,50)
Motion Estimation Scheme : EPZS
EPZS Pattern : Extended Diamond
EPZS Dual Pattern : Extended Diamond
EPZS Fixed Predictors : Aggressive
EPZS Aggressive Predictors : Disabled
EPZS Temporal Predictors : Enabled
EPZS Spatial Predictors : Enabled
EPZS Threshold Multipliers : (1 0 2)
EPZS Subpel ME : Basic
EPZS Subpel ME BiPred : Basic
Search range restrictions : none
RD-optimized mode decision : used
Data Partitioning Mode : 1 partition
Output File Format : H.264/AVC Annex B Byte Stream Format
-------------------------------------------------------------------------------
Frame Bit/pic QP SnrY SnrU SnrV Time(ms) MET(ms) Frm/Fld Ref
-------------------------------------------------------------------------------
00000(NVB) 312
00000(IDR) 210176 37 37.507 40.232 40.711 2769 0 FRM 3
00001(NVB) 312
00001(IDR) 211984 37 37.560 40.247 40.705 2799 0 FRM 3
00002(NVB) 312
00002(IDR) 213352 37 37.579 40.207 40.706 2795 0 FRM 3
00003(NVB) 312
00003(IDR) 215024 37 37.565 40.229 40.703 2793 0 FRM 3
00004(NVB) 312
00004(IDR) 213552 37 37.544 40.281 40.648 2832 0 FRM 3
00005(NVB) 312
00005(IDR) 214432 37 37.553 40.219 40.670 2785 0 FRM 3
00006(NVB) 312
00006(IDR) 214520 37 37.551 40.204 40.637 2792 0 FRM 3
00007(NVB) 312
00007(IDR) 215112 37 37.565 40.222 40.714 2807 0 FRM 3
00008(NVB) 312
00008(IDR) 214944 37 37.552 40.213 40.711 2801 0 FRM 3
00009(NVB) 312
00009(IDR) 213736 37 37.530 40.209 40.677 2797 0 FRM 3
00010(NVB) 312
00010(IDR) 212032 37 37.543 40.243 40.660 2807 0 FRM 3
00011(NVB) 312
00011(IDR) 210352 37 37.564 40.285 40.666 2803 0 FRM 3
00012(NVB) 312
00012(IDR) 210272 37 37.586 40.272 40.736 2813 0 FRM 3
00013(NVB) 312
00013(IDR) 210432 37 37.623 40.312 40.755 2806 0 FRM 3
00014(NVB) 312
00014(IDR) 208024 37 37.668 40.316 40.743 2801 0 FRM 3
00015(NVB) 312
00015(IDR) 207944 37 37.714 40.348 40.845 2822 0 FRM 3
00016(NVB) 312
00016(IDR) 205008 37 37.741 40.326 40.821 2795 0 FRM 3
00017(NVB) 312
00017(IDR) 203904 37 37.728 40.374 40.824 2801 0 FRM 3
00018(NVB) 312
00018(IDR) 203312 37 37.769 40.426 40.842 2796 0 FRM 3
00019(NVB) 312
00019(IDR) 203384 37 37.842 40.456 40.888 2790 0 FRM 3
-------------------------------------------------------------------------------
Total Frames: 20
Leaky BucketRateFile does not have valid entries.
Using rate calculated from avg. rate
Number Leaky Buckets: 8
Rmin Bmin Fmin
10528700 243522 243124
13160850 215112 210176
15793000 215112 210176
18425150 215112 210176
21057300 215112 210176
23689450 215112 210176
26321600 215112 210176
28953750 215112 210176
------------------ Average data all frames -----------------------------------

Total encoding time for the seq. : 56.011 sec (0.36 fps)
Total ME time for sequence : 0.000 sec

Y { PSNR (dB), cSNR (dB), MSE } : { 37.614, 37.613, 11.26575 }
U { PSNR (dB), cSNR (dB), MSE } : { 40.281, 40.281, 6.09576 }
V { PSNR (dB), cSNR (dB), MSE } : { 40.733, 40.733, 5.49312 }

Total bits : 4217736 (I 4211496, P 0, NVB 6240)
Bit rate (kbit/s) @ 50.00 Hz : 10544.34
Bits to avoid Startcode Emulation : 23162
Bits for parameter sets : 6240
Bits for filler data : 0

-------------------------------------------------------------------------------
Exit JM 18 (FRExt) encoder ver 18.6
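The summary figures in the JM output can be cross-checked by hand. The short sketch below recomputes the bit rate from the total bit count and the luma cSNR from the reported MSE. The function names are hypothetical, and a peak value of 255 (8-bit video) is assumed.

```python
import math

def bitrate_kbps(total_bits, frames, fps):
    """Bit rate in kbit/s for a sequence of `frames` pictures shown at `fps` Hz."""
    return total_bits * fps / frames / 1000.0

def psnr_from_mse(mse, max_value=255):
    """PSNR in dB computed from a mean squared error; assumes 8-bit samples."""
    return 10 * math.log10(max_value ** 2 / mse)

# Values taken from the sample output above (Jockey_1920x1080.yuv, QP 37).
print(round(bitrate_kbps(4217736, 20, 50.0), 2))   # 10544.34
print(round(psnr_from_mse(11.26575), 3))           # 37.613
```

Both results agree with the log: 10544.34 kbit/s at 50.00 Hz and a luma cSNR of 37.613 dB.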

13. References:

[1] G. J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649 – 1668, Dec. 2012.

[2] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050 available on http://ip.hhi.de/imagecom_G1/assets/pdfs/JVT-G050.pdf .

[3] D. Grois et al, “Performance Comparison of H.265/ MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders”, IEEE PCS 2013, pp 394-397, San José, CA, USA, Dec 8-11, 2013 .

[4] D. Mukherjee et al, “The latest open-source video codec VP9–An overview and preliminary results”, Google Inc., United States .

[5] Z. Wang et al, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004 .

[6] G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001 .

[7] X. Li et al, “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp. 685-690, July 2010 .

[8] N. Ling, “High efficiency video coding and its 3D extension: A research perspective,” Keynote Speech, ICIEA, pp. 2150-2155, Singapore, July 2012. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6361087


[9] V. Sze and M. Budagavi, "Design and Implementation of Next Generation Video Coding Systems (H.265/HEVC Tutorial)", IEEE ISCAS Tutorial 2014, Melbourne, Australia, June 2014. http://www.rle.mit.edu/eems/wp-content/uploads/2014/06/H.265-HEVC-Tutorial-2014-ISCAS.pdf

[10] I. E. G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002 .

[11] A. Puri et al, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.

[12] H.264 tutorial by I.E.G. Richardson: https://www.vcodex.com/h264.html

[13] N. Ahmed , T. Natarajan and K. R. Rao, “Discrete Cosine Transform”, IEEE Transactions on Computers, Vol. C-23, pp. 90-93, Jan. 1974.

[14] D. Marpe, H. Schwarz, and T. Wiegand, “Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 620–636, July 2003.

[15] J. Ostermann et al, "Video coding with H.264/AVC tools, performance, and complexity", IEEE Circuits and Systems Magazine, Vol. 4, pp. 7-28, Aug. 2004.

[16] J. Ohm et al, "Comparison of the Coding Efficiency of Video Coding Standards - including High Efficiency Video Coding (HEVC) ", IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, Issue: 12 , pp. 1669 -1684 , Dec. 2012.

[17] G. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of selected topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.

[18] HEVC white paper - http://www.ateme.com/an-introduction-to-uhdtv-and-hevc

[19] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[20] W. Malpica and A. Bovik , "Range image quality assessment by structural similarity", IEEE ICASSP 2009, pp. 1149 - 1152, Apr. 2009.

[21] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.

[22] "VP-Next Overview and Progress Update" (PDF). WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngov-project-update.pdf

[23] M. P. Sharabayko et al, "Intra Compression Efficiency in VP9 and HEVC" Applied Mathematical Sciences, Vol. 7, no. 137, pp.6803 – 6824, Hikari Ltd, 2013


[24] J. Padia, “Complexity reduction for VP6 to H.264 transcoder using motion vector reuse,” M.S. Thesis, EE Dept., UTA, Arlington, TX, 2010. Available on: http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[25] White paper on PSNR-NI - http://www.ni.com/white-paper/13306/en/

[26] Access to HM Reference Software: http://hevc.hhi.fraunhofer.de/

[27] Access to JM 18.6 Reference Software: http://iphome.hhi.de/suehring/tml/

[28] Chromium® open-source browser project, VP9 source code, Online: http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=tree;f=vp9;hb=aaf61dfbcab414bfacc3171501be17d191ff8506

[29] http://ultravideo.cs.tut.fi/#testsequences - Video test sequences

[30] Cisco Visual Networking Index - http://www.cisco.com/c/en/us/solutions/service-provider/visual-networking-index-vni/index.html

[31] J. Wang et al, "Fractal image coding using SSIM", IEEE 18th International Conference on Image Processing, pp.241-244, Brussels, Belgium, 11-14 Sept. 2011.

[32] H.264/AVC Software Reference Manual: http://iphome.hhi.de/suehring/tml/JM%20Reference%20Software%20Manual%20(JVT-AE010).pdf

[33] HEVC Software Reference Manual: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.2-dev/doc/software-manual.pdf

[34] K. R. Rao, D. N. Kim and J. J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[35] V. Sze, M. Budagavi, and G. J. Sullivan, "High Efficiency Video Coding (HEVC): Algorithms and Architectures", Springer, 2014.

[36] M. Wien, "High Efficiency Video Coding: Coding Tools and Specification", Springer, 2014.

[37] I. E. Richardson, "Coding Video: A practical guide to HEVC and beyond", Wiley, 11 May 2015.


[38] https://media.xiph.org/video/derf/ - test sequences

[39] ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/ - test sequences

[40] G. Correa et al, "Fast HEVC Encoding Decisions Using Data Mining", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 25, No. 4, pp. 660 - 673, Apr. 2015.

[41] D. K. Kwon and M. Budagavi, "Combined scalable and multiview extension of High Efficiency Video Coding (HEVC)", IEEE Picture Coding Symposium, pp. 414 - 417, Dec. 2013.

[42] Encoding Time Evaluation: Intel VTune Amplifier XE Software profiler. Available: http://software.intel.com

[43] M. Budagavi, "HEVC/H.265 and recent developments in video coding standards", EE Department Seminar, UT Arlington, Arlington, 21 Nov. 2014.
