4C8 Dr. David Corrigan Jpeg and the DCT

Upload: pravat

Post on 24-Feb-2016


TRANSCRIPT

Page 1: 4C8 Dr.  David Corrigan

4C8

Dr. David Corrigan

Jpeg and the DCT

Page 2: 4C8 Dr.  David Corrigan
Page 3: 4C8 Dr.  David Corrigan
Page 4: 4C8 Dr.  David Corrigan
Page 5: 4C8 Dr.  David Corrigan
Page 6: 4C8 Dr.  David Corrigan

2D DCT

DCT Basis Functions

Page 7: 4C8 Dr.  David Corrigan

2D DCT

Page 8: 4C8 Dr.  David Corrigan

Qstep = 15

Page 9: 4C8 Dr.  David Corrigan

Each band is the same size and there are 64 bands in total, so the entropy is the mean of the band entropies:

$H = \frac{1}{64} \sum \text{(band entropies)} = 1.36$ bits/pixel
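The averaging step can be sketched in Python as below. This is only an illustration: `entropy` and `mean_band_entropy` are hypothetical helper names, and each band is assumed to be supplied as a flat list of quantised coefficient values.

```python
import math
from collections import Counter

def entropy(symbols):
    """First-order entropy (bits per symbol) of a list of quantised values."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mean_band_entropy(bands):
    """Bits/pixel when every band has the same size: the mean band entropy."""
    return sum(entropy(b) for b in bands) / len(bands)

# Toy data (not from the slides): one band with two equiprobable values,
# one band that is entirely zero.
toy_bands = [[0, 0, 1, 1], [0, 0, 0, 0]]
print(mean_band_entropy(toy_bands))
```

For an 8x8 DCT, `bands` would hold 64 lists, one per coefficient position.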

Page 10: 4C8 Dr.  David Corrigan
Page 11: 4C8 Dr.  David Corrigan

Optimum Block Size is 8!

Page 12: 4C8 Dr.  David Corrigan

Slow DCT

• Sledgehammer implementation for 8 point DCT

• Each row multiply requires 8 MADDs (approx)

• So all 8 rows require 64 MADDs (approx)
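A minimal Python sketch of the sledgehammer approach follows. The slide does not reproduce the matrix, so the orthonormal DCT-II scaling used here is an assumption, and `dct_matrix` and `slow_dct` are illustrative names. The counter confirms one multiply-add per matrix entry.

```python
import math

def dct_matrix(n=8):
    """Assumed orthonormal DCT-II matrix T; row k holds the k-th basis vector."""
    T = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        T.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n)])
    return T

def slow_dct(y, T):
    """Direct matrix-vector product, counting multiply-adds as it goes."""
    madds = 0
    coeffs = []
    for row in T:
        acc = 0.0
        for t, v in zip(row, y):
            acc += t * v
            madds += 1
        coeffs.append(acc)
    return coeffs, madds

T = dct_matrix()
coeffs, madds = slow_dct([1.0] * 8, T)
print(madds)  # 8 rows x 8 multiply-adds
```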

Page 13: 4C8 Dr.  David Corrigan

Fast DCT

• Exploit Symmetry

Page 14: 4C8 Dr.  David Corrigan

Fast DCT

• So split Matrix T into two parts...

y = [y(1), y(2), y(3), y(4), y(5), y(6), y(7), y(8)]^T

Page 15: 4C8 Dr.  David Corrigan

Fast DCT

• split Matrix T into two parts, change y...

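y = [y(1), y(2), y(3), y(4), y(5), y(6), y(7), y(8)]^T

[y(1)+y(8), y(2)+y(7), y(3)+y(6), y(4)+y(5)]^T (sums, used with the symmetric rows of T)

[y(1)-y(8), y(2)-y(7), y(3)-y(6), y(4)-y(5)]^T (differences, used with the antisymmetric rows of T)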

Page 16: 4C8 Dr.  David Corrigan

Fast DCT

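y = [y(1), y(2), y(3), y(4), y(5), y(6), y(7), y(8)]^T

[y(1)+y(8), y(2)+y(7), y(3)+y(6), y(4)+y(5)]^T (sums, used with the symmetric rows of T)

[y(1)-y(8), y(2)-y(7), y(3)-y(6), y(4)-y(5)]^T (differences, used with the antisymmetric rows of T)

4 “adds” and 16 MADDS for each operation = 8 adds and 32 MADDS = 40 ops.

Compare with 64 MADDS from before.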

Page 17: 4C8 Dr.  David Corrigan

Fast DCT

[y(1)+y(8), y(2)+y(7), y(3)+y(6), y(4)+y(5)]^T

This sub-matrix can be simplified with symmetry again!

4 “adds”, 8 MADDS in total = 12 ops (down from 20)

So now we are at 20 (for the first sub matrix) + 12 (for these two) = 32 ops

So we have saved about x2!
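The first symmetry split can be checked numerically. The sketch below assumes the orthonormal DCT-II matrix for T (the slides do not list its entries) and implements only the first split, into a sum half and a difference half; `fast_dct_stage1` is an illustrative name. The second split of the symmetric half is not shown.

```python
import math

def dct_matrix(n=8):
    """Assumed orthonormal DCT-II matrix T; row k holds the k-th basis vector."""
    T = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        T.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n)])
    return T

def fast_dct_stage1(y, T):
    """Symmetric rows act on sums, antisymmetric rows on differences."""
    n = len(y)
    h = n // 2
    sums = [y[i] + y[n - 1 - i] for i in range(h)]    # 4 adds
    diffs = [y[i] - y[n - 1 - i] for i in range(h)]   # 4 adds (subtractions)
    ops = 2 * h
    coeffs = []
    for k in range(n):
        half = sums if k % 2 == 0 else diffs
        # Only the left half of row k is needed: 4 MADDs per output.
        coeffs.append(sum(T[k][i] * half[i] for i in range(h)))
        ops += h
    return coeffs, ops

T = dct_matrix()
coeffs, ops = fast_dct_stage1([3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0], T)
print(ops)  # 8 adds + 32 MADDs
```

The result matches the direct matrix product because row k of this matrix satisfies T[k][7-i] = (-1)^k T[k][i].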

Page 18: 4C8 Dr.  David Corrigan

JPEG and Colour Images

• JPEG uses the YCbCr colourspace.

• The chrominance channels are usually downsampled.

• There are 3 commonly used modes
– 4:4:4 – no chrominance subsampling
– 4:2:2 – Every 2nd column in the chrominance channels is dropped.
– 4:2:0 – Every 2nd column and row is dropped.
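The three modes can be sketched as plain decimation of a chrominance channel stored as a list of rows. `subsample` is a hypothetical helper, and the sketch drops samples exactly as the bullets describe; many practical encoders filter or average neighbouring samples before decimating, which is not done here.

```python
def subsample(channel, mode):
    """Decimate one chrominance channel (list of rows) for the given mode."""
    if mode == "4:4:4":
        return [row[:] for row in channel]            # keep everything
    if mode == "4:2:2":
        return [row[::2] for row in channel]          # drop every 2nd column
    if mode == "4:2:0":
        return [row[::2] for row in channel[::2]]     # drop every 2nd column and row
    raise ValueError("unknown mode: " + mode)

# Toy 4x4 channel (values are arbitrary).
cb = [[r * 4 + c for c in range(4)] for r in range(4)]
for mode in ("4:4:4", "4:2:2", "4:2:0"):
    out = subsample(cb, mode)
    print(mode, len(out), "x", len(out[0]))
```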

Page 19: 4C8 Dr.  David Corrigan

Subjectively Weighted Quantisation

• In JPEG it is standard to apply different thresholds to different bands

Qlum =

 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99

Qchr =

 17  18  24  47  99  99  99  99
 18  21  26  66  99  99  99  99
 24  26  56  99  99  99  99  99
 47  66  99  99  99  99  99  99
 99  99  99  99  99  99  99  99
 99  99  99  99  99  99  99  99
 99  99  99  99  99  99  99  99
 99  99  99  99  99  99  99  99
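A short sketch of how the luminance matrix on this slide would be applied: each DCT coefficient is divided by the step size of its band and rounded. `quantise` and `dequantise` are illustrative names, the test block of constant 50.0 is made up, and Python's `round` (round-half-to-even) stands in for whatever rounding rule a real encoder uses.

```python
Q_LUM = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantise(coeffs, q):
    """Divide every coefficient by its band's step size and round."""
    return [[int(round(c / s)) for c, s in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]

def dequantise(levels, q):
    """Decoder side: multiply each level by the same step size."""
    return [[l * s for l, s in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, q)]

# Made-up block in which every coefficient equals 50.0.
levels = quantise([[50.0] * 8 for _ in range(8)], Q_LUM)
print(levels[0][0], levels[4][5])  # low band keeps detail, high band goes to 0
```

With equal coefficient magnitudes, the larger step sizes push more of the high-frequency bands to zero.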

Page 20: 4C8 Dr.  David Corrigan

Subjectively Weighted Quantisation

• These values are obtained by perceptual tests.

• A user is asked to view an image of a particular size at a specified distance from the screen.
– Usually a multiple of the screen height.

• User is presented with an image and is asked to increase the gain of a given band until he/she just notices a difference in the image.

– Note typically a flat grey image is used to avoid masking effects caused by edges and texture

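• The set of gains g_kl forms the quantisation matrix.

I_vis(x,y) = I_orig(x,y) + g_kl φ_kl(x,y)

(g_kl, the just-noticeable gain for band (k,l), and φ_kl, that band's basis function, are stand-in symbols; the original symbols did not survive extraction.)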

Page 21: 4C8 Dr.  David Corrigan

Subjectively Weighted Quantisation

• Lower Frequency Bands are assigned lower step sizes.

• There is a slight drop in step size from the DC coefficient to the low frequency coefficients.

• The step sizes for the chrominance channels increase faster than for luminance.

(Qlum and Qchr as given on Page 19.)

Page 22: 4C8 Dr.  David Corrigan

We have seen this before

Page 23: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = Qlum

Uncompressed JPEG

Page 24: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = Qlum, PSNR = 32.9 dB

Page 25: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = 2 * Qlum, PSNR = 30.6 dB

Uncompressed JPEG

Page 26: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = 15, PSNR = 37.6 dB

Panels: Qstep = Qlum; Qstep = 15

Page 27: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = 30, PSNR = 33.4 dB

Panels: Qstep = Qlum; Qstep = 30

Page 28: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Qstep = 30, PSNR = 33.4 dB

Panels: Qstep = Qlum; Qstep = 30

PSNR indicates better quality for Qstep = 30 over Qstep = Qlum, but this clearly is not true from a subjective analysis.

Page 29: 4C8 Dr.  David Corrigan

Comparing Different Quantisations

Quantisation   PSNR (dB)   Subjective Ranking   Entropy (bits/pel)
15             37.6        2                    1.36
30             33.4        4                    0.82
0.5 * Qlum     35.6        1                    1.28
Qlum           32.9        3                    0.86
2 * Qlum       30.6        5                    0.55

Using the subjectively weighted quantisation achieves much higher levels of compression for equivalent levels of quality.
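For reference, the PSNR figures in the table follow the usual definition, sketched below. The peak value of 255 assumes 8-bit images, which the slides do not state explicitly, and `psnr` is an illustrative helper operating on images stored as lists of rows.

```python
import math

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images."""
    squared_error = 0.0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)

# Toy images (not from the slides): every pixel is off by exactly 1.
print(round(psnr([[10, 20], [30, 40]], [[11, 19], [31, 39]]), 2))
```

Because the measure weights every pixel error equally, it cannot capture the band-dependent visibility that the subjective ranking reflects.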

Page 30: 4C8 Dr.  David Corrigan

JPEG Coding

• The most obvious way might seem to be to code each band separately
– i.e. Huffman with RLC, as we suggested with the Haar Transform.
– We could get close to the entropy.

• This is not the way it is coded because
– It would require 64 different codes: a high cost in computation and storage of codebooks.
– It ignores the fact that the zero coefficients occur at the same positions in multiple bands.

Page 31: 4C8 Dr.  David Corrigan

JPEG Coding

• Instead we code each block separately

– A block contains 64 coefficients, one from each band.

• Each block contains 1 DC coefficient (from the top left band) and 63 AC coefficients

• Two codebooks are used in total for all the blocks, one for the DC coefficients and the other for the AC coefficients.

• At the end of each Block we insert an End Of Block (EOB) symbol in the datastream

Page 32: 4C8 Dr.  David Corrigan

Data Ordering

• Each block is an 8x8 grid of coeffs
– A Zig-Zag scan converts them into a 1D stream.
– As most non-zero values occur in the top left corner, using a Zig-Zag scan maximises the lengths of the zero runs and so improves the efficiency of RLC.

Page 33: 4C8 Dr.  David Corrigan

Zig-Zag Scan Example

-13  -3   2   0   0   0   1   0
  6   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
 -1   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

-13, -3, 6, 0, 0, 2, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 36 more zeros, the end

Typical DCT Block Coefficients

Non-zero values are at the top left corner of the block.

The Zig-Zag scan concentrates the non-zero coefficients at the start of the stream.
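The scan order can be generated by walking the anti-diagonals of the block and alternating direction, as in the sketch below. `zigzag_order` is an illustrative name, and `BLOCK` is the example block written out row by row, with signs taken from the stream listed on this slide.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col on each diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)              # even diagonals run bottom-left to top-right
        order.extend((i, s - i) for i in rows)
    return order

BLOCK = [
    [-13, -3, 2, 0, 0, 0, 1, 0],
    [6, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [-1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

stream = [BLOCK[i][j] for i, j in zigzag_order()]
print(stream[:10])
```

The final non-zero value lands at position 27 of the stream (counting from 0), leaving 36 trailing zeros.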

Page 34: 4C8 Dr.  David Corrigan

Coding the DC Coefficients

Differential Coding

Page 35: 4C8 Dr.  David Corrigan

Coding the DC Coefficients

The DC value of the block (-13) is actually the difference between the dc coefficients of the current and previous blocks.

Typical DCT Block Coefficients (the block from the Zig-Zag Scan Example on Page 33)
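Differential coding of the DC terms can be sketched as below. The DC values are invented for illustration, the helper names are hypothetical, and the predictor for the first block is assumed to start at 0.

```python
def dc_differences(dc_values):
    """Replace each block's DC value by its difference from the previous block."""
    diffs = []
    previous = 0          # assumed starting predictor
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def dc_reconstruct(diffs):
    """Decoder side: a running sum restores the original DC values."""
    values = []
    previous = 0
    for d in diffs:
        previous += d
        values.append(previous)
    return values

dc = [52, 39, 41]         # invented DC values for three consecutive blocks
print(dc_differences(dc))
```

In this toy run the second block would carry the value -13, the same kind of difference value that appears in the example block.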

Page 36: 4C8 Dr.  David Corrigan

Coding DC Coefficients

• There is potentially a large number of levels to encode.

– Up to 4096 depending on the quantisation step size.

• We break down the symbol value into a size/index pair

Page 37: 4C8 Dr.  David Corrigan

Coding DC Coefficients

• So if the DC value is -13
– The size is 4
– The index is 0010

• In JPEG only the size is encoded using Huffman
– The index is uncoded; efficiency is not dramatically affected.
– Only 12 codes are required in the Huffman table
– Table size is 16 + 12 = 28 bytes

Page 38: 4C8 Dr.  David Corrigan

More examples of Coefficient to size/index pair conversions

Value   Size   Index
-7      3      000
-6      3      001
-5      3      010
-4      3      011
-3      2      00
-2      2      01
-1      1      0
 0      0      -
 1      1      1
 2      2      10
 3      2      11
 4      3      100
 5      3      101
 6      3      110
 7      3      111
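The table follows a simple rule, sketched below: the size is the number of bits needed for the magnitude, positive values use their own binary digits as the index, and negative values are offset by 2^size - 1. `size_index` is an illustrative name.

```python
def size_index(value):
    """Return (size, index bit string) for a quantised coefficient."""
    size = abs(value).bit_length()
    if size == 0:
        return 0, ""                      # zero has no index bits
    index = value if value > 0 else value + (1 << size) - 1
    return size, format(index, "0{}b".format(size))

for v in (-13, -7, -3, -1, 0, 1, 6, 9):
    print(v, size_index(v))
```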

Page 39: 4C8 Dr.  David Corrigan

Coding the AC Coefficients

(DCT block as in the Zig-Zag Scan Example on Page 33.)

4 0010, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

Typical DCT Block Coefficients

The block usually ends with a long run of zeros

The length of the run and the value of the coeff after it are strongly correlated

Size/Index Pair for DC coefficient

Page 40: 4C8 Dr.  David Corrigan

Coding the AC Coefficients

• Code/Size Correlations
– High coeffs follow short runs and low coeffs follow long runs

• Final run of zeros
– These don’t need to be coded
– Just tell the decoder that there are no more non-zero coefficients and move on to the next block.

Page 41: 4C8 Dr.  David Corrigan

Symbols

Run/Coefficient Symbols

eg. 0, 0, 9 is a run of 2 zeros followed by a 9

However we represent 9 using the size/index format from the dc coeffs

9 has a size of 4 and an index 1001

So we code the run/size pair (2,4) and the index 1001 is appended to the stream

Page 42: 4C8 Dr.  David Corrigan

Symbols

• Run/Size Symbols
– All possible combinations of runs from 0->15 and size from 1->10
– 160 total symbols
– Huffman Codes are used for each symbol
– Index values are not coded further

Page 43: 4C8 Dr.  David Corrigan

Special Symbols

• ZRL
– Used to represent a run of 16 zeros
– Used when the run of zeros is greater than 15
– E.g. 17 zeros followed by 14 is coded as (ZRL) (1,4) 1110

• EOB
– Inserted when a block ends with a run of zeros

In total there are 160 run/size symbols and 2 special symbols, giving 162 symbols to encode. The codetable is 16 + 162 = 178 bytes.

Page 44: 4C8 Dr.  David Corrigan

Coding Example

(DCT block as in the Zig-Zag Scan Example on Page 33.)

Typical DCT Block Coefficients

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

DC Coefficient is -13. The size is 4 and the index is 0010

Current Stream State: 4 0010

Page 45: 4C8 Dr.  David Corrigan

Coding Example

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

The first ac value is -3. That is a run of 0 zeros followed by -3.

-3 has size 2 and index 00

Therefore the run/size pair is (0,2)

Current Stream State: 4 0010 (0,2) 00

Page 46: 4C8 Dr.  David Corrigan

Coding Example

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

The next ac value is 6. That is a run of 0 zeros followed by 6.

6 has size 3 and index 110

Therefore the run/size pair is (0,3)

Current Stream State: 4 0010 (0,2) 00 (0,3) 110

Page 47: 4C8 Dr.  David Corrigan

Coding Example

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

The next ac value to encode is a run of 2 zeros followed by an ac coefficient 2.

2 has size 2 and index 10

Therefore the run/size pair is (2,2)

Current Stream State: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10

Page 48: 4C8 Dr.  David Corrigan

Coding Example

The next ac value to encode is a run of 3 zeros followed by an ac coefficient -1.

-1 has size 1 and index 0

Therefore the run/size pair is (3,1)

Current Stream State: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10 (3,1) 0

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

Page 49: 4C8 Dr.  David Corrigan

Coding Example

The next ac value to encode is a run of 17 zeros followed by an ac coefficient 1.

As the run is > 15 zeros we have to use the ZRL symbol to code the first 16 zeros. The remaining run length consists of (17 - 16) = 1 zero.

An ac coefficient of 1 has size 1 and index 1

Therefore we insert the run/size pair (1,1) after the ZRL marker

Current Stream State: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10 (3,1) 0 ZRL (1,1) 1

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end

Page 50: 4C8 Dr.  David Corrigan

Coding Example

The remaining coeffs are all 0. Therefore the EOB marker is used.

If the last ac coeff is non-zero, then the EOB marker is not used.

Current Stream State: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10 (3,1) 0 ZRL (1,1) 1 EOB

-13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end
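The whole walk-through can be reproduced with a short sketch that turns the zig-zag stream into the DC size/index pair, the run/size symbols with their indices, and the ZRL and EOB symbols. `size_index` and `encode_block` are illustrative names; the Huffman stage is left out.

```python
def size_index(value):
    """Return (size, index bit string) for a quantised coefficient."""
    size = abs(value).bit_length()
    if size == 0:
        return 0, ""
    index = value if value > 0 else value + (1 << size) - 1
    return size, format(index, "0{}b".format(size))

def encode_block(stream):
    """Convert one 64-value zig-zag stream into a list of symbols."""
    symbols = [("DC",) + size_index(stream[0])]
    run = 0
    for value in stream[1:]:
        if value == 0:
            run += 1
            continue
        while run > 15:                    # runs longer than 15 need ZRL
            symbols.append("ZRL")
            run -= 16
        symbols.append((run,) + size_index(value))
        run = 0
    if run > 0:                            # block ends with zeros
        symbols.append("EOB")
    return symbols

stream = [-13, -3, 6, 0, 0, 2, 0, 0, 0, -1] + [0] * 17 + [1] + [0] * 36
for symbol in encode_block(stream):
    print(symbol)
```

Each tuple is (run, size, index), except the first, which is the DC size/index pair.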

Page 51: 4C8 Dr.  David Corrigan

Huffman Coding

• Best Solution is to define the 2 Huffman codes for each image during compression

• However a default Huffman codetable is defined in the JPEG standard.

Final Stream: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10 (3,1) 0 ZRL (1,1) 1 EOB

Encoded using dc codetable

Encoded using ac codetable

No further encoding

Page 52: 4C8 Dr.  David Corrigan

Default Codetables

AC table

DC table

Final Stream: 4 0010 (0,2) 00 (0,3) 110 (2,2) 10 (3,1) 0 ZRL (1,1) 1 EOB

Fully Encoded Stream:

101 0010 01 00 100 110 11111001 10 111010 0 11111111001 1100 1 1010

54 bits to encode 64 coefficients = 0.84 bits/coefficient
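The bit count can be checked with the sketch below. It uses only the code words that appear in the encoded stream on this slide (not the full default tables), together with the indices 0010 for -13 and 00 for -3 derived in the worked example. With those indices the stream comes to 54 bits, about 0.84 bits/coefficient. `to_bits` is an illustrative name.

```python
# Code words taken from the encoded stream on this slide.
DC_CODES = {4: "101"}
AC_CODES = {
    (0, 2): "01",
    (0, 3): "100",
    (2, 2): "11111001",
    (3, 1): "111010",
    (1, 1): "1100",
    "ZRL": "11111111001",
    "EOB": "1010",
}

SYMBOLS = [("DC", 4, "0010"), (0, 2, "00"), (0, 3, "110"), (2, 2, "10"),
           (3, 1, "0"), "ZRL", (1, 1, "1"), "EOB"]

def to_bits(symbols):
    """Huffman-code each size or run/size symbol and append its raw index."""
    parts = []
    for s in symbols:
        if isinstance(s, str):                     # ZRL or EOB
            parts.append(AC_CODES[s])
        elif s[0] == "DC":
            parts.append(DC_CODES[s[1]] + s[2])
        else:
            parts.append(AC_CODES[(s[0], s[1])] + s[2])
    return "".join(parts)

bits = to_bits(SYMBOLS)
print(len(bits), len(bits) / 64)
```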

Page 53: 4C8 Dr.  David Corrigan

How good is this scheme?

Page 54: 4C8 Dr.  David Corrigan

Should we use default codetables?

Even though doubling the quantisation step sizes reduces the number of events, the distribution of those events doesn’t change much. Only the EOB probability changes significantly.

Therefore using the same codetable for both cases is reasonable

Page 55: 4C8 Dr.  David Corrigan

How good is this scheme?

In fact using the same codetable for multiple images doesn’t reduce the efficiency of the code much.

Efficiency when the default codetable is used

97.35%

95.74%

Page 56: 4C8 Dr.  David Corrigan

Special Markers

Page 57: 4C8 Dr.  David Corrigan

Synchronisation Markers

• There are 8 synch markers: FFD0 -> FFD7

They can be placed at intervals which can be specified by using the DRI (FFDD) marker

Each marker is sent sequentially so if any marker is corrupted its absence can be easily detected.
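The sequential numbering can be sketched as a modulo-8 counter. The helper names are hypothetical, and the sketch assumes the sequence starts at FFD0 and that the marker values have already been extracted from the byte stream.

```python
RST_BASE = 0xFFD0

def restart_marker(k):
    """Marker expected at the k-th synchronisation point (k = 0, 1, 2, ...)."""
    return RST_BASE + (k % 8)

def first_break(markers):
    """Index of the first marker that breaks the FFD0..FFD7 cycle, or None."""
    for k, marker in enumerate(markers):
        if marker != restart_marker(k):
            return k
    return None

print([hex(restart_marker(k)) for k in range(10)])
print(first_break([0xFFD0, 0xFFD1, 0xFFD3]))  # FFD2 is missing
```

A decoder that sees FFD3 where it expected FFD2 knows that one interval has been lost and can resynchronise at the next marker.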

Page 58: 4C8 Dr.  David Corrigan

Summary

• We have covered the basics of the JPEG standard

• The standard specifies a syntax rather than specifying exactly how it is implemented

• Most implementations use the recommended settings provided by the JPEG community.