
Page 1: Coding and Compression 1


AGENDA: Coding and Compression

• Introduction - Sampling, Nyquist, Transform
• Lossless Data Compression
  – Run-length, Huffman, Dictionary compression
• Audio
  – PCM, DPCM
• Image
  – Hierarchical coding, subband coding
  – MPEG, JPEG, DCT
  – Wavelet, Haar Transform

Page 2: Coding and Compression 1


Introduction

• A key problem with multimedia is the huge quantity of data that results from raw digitized audio, image, or video sources.

• The main goal of coding and compression is to reduce the storage, processing, and transmission costs of these data.

• A variety of coding and compression techniques are commonly used in the Internet and other systems.

Page 3: Coding and Compression 1


Introduction

• The components of such a system are capturing, transforming, coding, and transmitting:

Sample → Transform → Coding

Page 4: Coding and Compression 1


Introduction

• Sampling --- Analog to Digital Conversion.
– An input signal is converted from some continuously varying physical value (e.g. pressure in air, or frequency or wavelength of light) into a continuous electrical signal by some electro-mechanical device.

– This continuously varying electrical signal can then be converted to a sequence of digital values, called samples, by an analog-to-digital conversion circuit.

• Two factors determine how accurately the samples represent the original continuous signal:

Page 5: Coding and Compression 1


Introduction

Sampling and the Nyquist theorem
– The maximum rate at which we sample.

• By Nyquist's theorem, the digital sampling rate must be at least twice the highest frequency in the continuous signal.

– The number of bits used in each sample (known as the quantization level).

– However, it is often not necessary to capture all frequencies in the original signal.

• For example, voice is comprehensible with a much smaller range of frequencies than we can actually hear.
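To make the Nyquist limit concrete, here is a minimal Python sketch (the helper name and the example frequencies are illustrative, not from the slides) of where a pure tone ends up after sampling:

def alias_frequency(f_signal, f_sample):
    # A tone above f_sample/2 "folds back" into the band [0, f_sample/2].
    f = f_signal % f_sample
    return f if f <= f_sample / 2 else f_sample - f

print(alias_frequency(3000, 8000))   # 3000 -- 8 kHz >= 2 x 3 kHz, tone preserved
print(alias_frequency(3000, 4000))   # 1000 -- below the Nyquist rate, tone aliases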

Page 6: Coding and Compression 1


Introduction - Transform

• The goal of a transform is to decorrelate the original signal; this decorrelation results in the signal energy being redistributed among only a small set of transform coefficients.

• The original data can be transformed in a number of ways to make it easier to apply certain compression techniques.

• The most common transforms in current techniques are the Discrete Cosine Transform and the wavelet transform.

Page 7: Coding and Compression 1


• Compression techniques were developed early in the life of computers, to cope with the problems of limited memory and storage capacity

• Hardware advances have limited the requirement for such techniques in desktop applications

• Network and communication capacity restrictions have resulted in continuing work on compression

• The advent of distributed multimedia has resulted in considerable developments in compression

• Problem : real-time, or timely, transmission of audio and video over communications networks

Page 8: Coding and Compression 1


Image compression - the technique of reducing the number of bits required to store an image

• Image compression is necessitated by:
– A need to store data efficiently in available memory
– A need to efficiently transmit data over available communication channels

Page 9: Coding and Compression 1


Media              Sample Rate                Data Size & Rate
Speech             8,000 samples/sec          7.8 KB/s
CD Audio           44,100 samples/sec,        172 KB/s
                   2 bytes/sample
Satellite Images   180x180 km2 image,         1030 MB/image
                   30 m resolution
VGA Video          25 frames/sec,             22 MB/s
                   640x480 pixels,
                   3 bytes/pixel

Page 10: Coding and Compression 1


Another View

Data Rate    Size/Hour
128 Kbps     60 MB
384 Kbps     170 MB
1.5 Mbps     680 MB
3.0 Mbps     1.4 GB
6.0 Mbps     2.7 GB
25 Mbps      11.0 GB

Page 11: Coding and Compression 1


Video Data Size

Size of uncompressed video in gigabytes, by image size of the video (aspect ratios: 1280x720 is 1.77, 640x480 is 1.33):

             1920x1080    1280x720     640x480    320x240    160x120
1 sec             0.19        0.08        0.03       0.01       0.00
1 min            11.20        4.98        1.66       0.41       0.10
1 hour          671.85      298.60       99.53      24.88       6.22
1000 hours  671,846.40  298,598.40   99,532.80  24,883.20   6,220.80

Page 12: Coding and Compression 1


Bandwidth requirements of images in some applications

• Fax - 250 KB/image
• Digital Cameras - 18-150 MB/image
• Digital Television - 166 MB/second

Page 13: Coding and Compression 1


Image compression standards are necessitated for ease of exchange in software and hardware.

Standards are developed by different standards bodies - ISO, ITU, ANSI, etc.
Some popular image compression standards - JPEG, MPEG-1, MPEG-2, MPEG-4, etc.
It is important to note that there are also many proprietary compression codecs!

Page 14: Coding and Compression 1


How (and why) can images be compressed?

Images can be compressed by exploiting two characteristics of digital images:
– Redundancy: looks at "properties" of an image and reduces redundant data
– Irrelevancy: much of the data in an image may be irrelevant to a human observer

Page 15: Coding and Compression 1


Video Bit Rate Calculation

width: pixels (160, 320, 640, 720, 1280, 1920, ...)
height: pixels (120, 240, 480, 485, 720, 1080, ...)
depth: bits (1, 4, 8, 15, 16, 24, ...)
fps: frames per second (5, 15, 20, 24, 30, ...)
compression factor (1, 6, 24, ...)

(width * height * depth * fps) / compression factor = bits/sec
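As a worked illustration of this formula (the function name and example parameters are ours):

def video_bit_rate(width, height, depth, fps, compression_factor=1):
    # Bits per second for video, per the formula above.
    return width * height * depth * fps / compression_factor

raw = video_bit_rate(640, 480, 24, 30)             # ~221 Mbit/s uncompressed
compressed = video_bit_rate(640, 480, 24, 30, 24)  # ~9.2 Mbit/s at 24:1
print(f"{raw/1e6:.1f} Mbit/s raw, {compressed/1e6:.1f} Mbit/s at 24:1")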

Page 16: Coding and Compression 1


Effects of Compression

Storage for 1 hour of compressed video in megabytes (3 bytes/pixel, 30 frames/sec):

Ratio    1920x1080   1280x720   640x480   320x240   160x120
1:1        671,846    298,598    99,533    24,883     6,221
3:1        223,949     99,533    33,178     8,294     2,074
6:1        111,974     49,766    16,589     4,147     1,037
25:1        26,874     11,944     3,981       995       249
100:1        6,718      2,986       995       249        62

Page 17: Coding and Compression 1


Categories of Compression Techniques

Entropy Encoding
– Run-Length Encoding
– Huffman Coding
– LZW
– Arithmetic Coding

Source Encoding
– Prediction
– Transformation (e.g. DCT)
– Layered Coding
– Vector Quantization

Hybrid Coding
– JPEG
– MPEG
– H.261
– DVI RTV, DVI PLV

Page 18: Coding and Compression 1


• Digital Video and Image Coding, Compression

[Block diagram: Video Input → Colour Components → Video Compression Algorithm (with Motion Compensation) → Bit Assignment → Compressed Bit-Stream. The algorithm families shown are: Simple (Truncation, CLUT, Run-length), Interpolative (Subsample), Predictive (DPCM, ADPCM), Transform (DCT), and Statistical (Huffman), each either Fixed or Adaptive.]

Page 19: Coding and Compression 1


• As can be seen from the diagram, the majority of video compression algorithms use a combination of compression techniques to produce the bit-stream. We will consider each of the individual techniques identified in the diagram.

• We assume that all input to the system is in the form of a PCM (Pulse Code Modulation - we will discuss this later when considering Sound sampling) digitised signal in colour component (RGB, YUV) form.

• Selection of colour component form can be important, where there are differences in colour processing between compression and decompression.

• Techniques can be made adaptive to the image content.

Page 20: Coding and Compression 1


– Simple Compression (Encoding) Techniques
• Truncation
– throw away the least significant bits of each pixel
– too much truncation causes contouring; the image becomes cartoon-like
– for real images, truncation from 24bpp to 16bpp gives good results (RGB = 5:5:5 + keying bit; YUV = 6:5:5)

• CLUT - Colour Lookup Table
– pixel values in the bitmap represent an index into a table of colours
– usually 8bpp, so the image is limited to 256 colours
– a unique CLUT can be created for each image, but this results in non-trivial preprocessing
– bpp can be increased for better quality, but once you reach 16bpp truncation is better and simpler

Page 21: Coding and Compression 1


• Run-length Encoding
– blocks of repeated pixels are replaced with a single value plus a count

– works well on images with large repeated blocks of solid colours; can achieve compression rates below 1bpp

– good for computer-generated images, cartoons, etc.

– poor for real images, video, etc.

– Interpolative Techniques
• Interpolative encoding works at the pixel level by transmitting a subset of the pixels and using interpolation to reconstruct the intervening pixels

– not really compression, as we are reducing the number of pixels rather than the size of their representation

– it is validly used in colour subsampling: working with luminance-chrominance component images (YUV), it can reduce 24bpp to 9bpp

– also used in motion video compression (i.e. MPEG)

Page 22: Coding and Compression 1


– Predictive Techniques
• Based on the fact that we can store the previous item (frame, line, pixel, etc.) and use it to help build the next item, allowing us to transmit only the part of the item that has changed.

• DPCM
– Compare adjacent pixels and only transmit the difference between them. Because adjacent pixels are likely to be similar, the difference values have a high probability of being small and can safely be transmitted with fewer bits. Hence we can use 4-bit difference values for 8-bit pixels.

– In decompression the difference value is used to modify the previous pixel to get the new one, which works well as long as the amplitude change is small. A full-amplitude change, say from black to white, would overload the DPCM system, requiring a number of pixel times to complete and causing smearing of the edges in high-contrast images (slope overload).
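A minimal Python sketch of this behaviour (our own illustration, not the slides' code): 8-bit pixels are coded as 4-bit differences clamped to [-8, 7], with the first pixel sent verbatim as a "start over". Note how the full-amplitude step smears over several pixel times:

def dpcm_encode(pixels, dmin=-8, dmax=7):
    # First pixel verbatim, then clamped 4-bit differences from the prediction.
    first, prediction, codes = pixels[0], pixels[0], []
    for p in pixels[1:]:
        d = max(dmin, min(dmax, p - prediction))
        codes.append(d)
        prediction += d          # the decoder tracks the identical prediction
    return first, codes

def dpcm_decode(first, codes):
    out, prediction = [first], first
    for d in codes:
        prediction += d
        out.append(prediction)
    return out

smooth = [10, 12, 15, 14, 13]
step = [0, 255, 255, 255, 255]            # full-amplitude black-to-white edge
print(dpcm_decode(*dpcm_encode(smooth)))  # [10, 12, 15, 14, 13] -- lossless here
print(dpcm_decode(*dpcm_encode(step)))    # [0, 7, 14, 21, 28]   -- slope overload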

Page 23: Coding and Compression 1


• ADPCM - Adaptive DPCM
– Can adapt the step size of the difference values to cope with full-amplitude changes, at the cost of some extra overhead in data and processing to achieve the adaptation.

– Replaces slope overload with quantisation noise at high-contrast edges.

• Since predictive encoding derives future pixels from previous ones, any errors are likely to be exacerbated. To avoid this, predictive schemes typically make differential start-overs, often at the beginning of each scan line or each frame.

– Transform Coding Techniques
• A transform is a process that converts a bundle of data into an alternate form which is more convenient for some purpose.
• Transforms are usually reversible, using an inverse transform.

Page 24: Coding and Compression 1


Lossless Data Compression

• Lossless means the reconstructed data loses no information relative to the original.

• There is a huge range of lossless data compression techniques.

• The common techniques used are:
– run-length encoding
– Huffman coding
– dictionary techniques

Page 25: Coding and Compression 1


Lossless Data Compression

• Runlength compression
– Removes repetitions of values and replaces them with a counter and a single value.
– Fairly simple to implement.
– Its performance depends heavily on the statistics of the input data: the more runs of repeated values it contains, the more space we can save.

Page 26: Coding and Compression 1


Lossless Data Compression

• Huffman compression
– Uses fewer bits to represent the most frequently occurring characters/codeword values, and more bits for the less commonly occurring ones.

– It is the most widespread way of replacing a set of fixed-size code words with an optimal set of different-sized code words, based on the statistics of the input data.

– Sender and receiver must share the same codebook, which lists the codes and their compressed representations.

Page 27: Coding and Compression 1


Lossless Data Compression

• Dictionary compression
– Looks at the data as it arrives and forms a dictionary. When new input comes, it looks it up in the dictionary. If the input is already present, the dictionary position can be transmitted; if not found, it is added to the dictionary in a new position, and the new position and string are sent out.

– Meanwhile, the dictionary is constructed at the receiver dynamically, so there is no need to gather statistics or share a table separately.

Page 28: Coding and Compression 1


• In image and video compression, the bundle of data is usually a two-dimensional array of pixels, e.g. 8x8.

2x2 Array of Pixels:

    A  B
    C  D

Transform:            Inverse Transform:

X0 = A                An = X0
X1 = B - A            Bn = X1 + X0
X2 = C - A            Cn = X2 + X0
X3 = D - A            Dn = X3 + X0

Page 29: Coding and Compression 1


• In the simple example shown, if the pixels were 8 bits each then the block would use 32 bits:

– Using the transform we could assign 4 bits each for the difference values and 8 bits for the base pixel, A. This would reduce the data to 8 + (3x4) = 20 bits for the 2x2 block - compressing from 8bpp to 5bpp (sketched below).

– This example is too small to be useful; typically transforms are enacted on 8x8 blocks, and the trick is to develop good transforms with calculations that are easy to implement in hardware or software.
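A minimal sketch of the 2x2 transform above (our own illustration; for the 4-bit assignment to work losslessly, each difference must fit in [-8, 7]):

def forward_2x2(a, b, c, d):
    # Base pixel plus three differences, as in the slide.
    return a, b - a, c - a, d - a

def inverse_2x2(x0, x1, x2, x3):
    return x0, x1 + x0, x2 + x0, x3 + x0

block = (100, 102, 99, 101)           # slowly varying pixels
coeffs = forward_2x2(*block)          # (100, 2, -1, 1): small differences
assert inverse_2x2(*coeffs) == block  # perfectly reversible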

• The Discrete Cosine Transform
– especially important for video and image compression

– typically used on 8x8 pixel blocks: 64 pixel values are processed and 64 new values are output, representing the amplitudes of the two-dimensional spatial frequency components of the 64-pixel block - these are referred to as DCT coefficients.

– the coefficient for zero spatial frequency is called the DC coefficient; the remaining 63 are the AC coefficients, and they represent amplitudes of progressively higher spatial frequencies in the block

Page 30: Coding and Compression 1


– As adjacent pixel values tend to be similar or to vary slowly from one to another, DCT processing provides the opportunity for compression by forcing most of the signal energy into the lower spatial frequency components. In most cases, many of the higher-frequency coefficients will have zero or near-zero values and can be ignored.

– Statistical Coding
• Uses the statistical distribution of the pixel values in an image, or of the data created by one of the techniques already described.

• Also known as entropy encoding.
• Can be used in bit assignment as well as part of the compression algorithm itself.
• Due to the non-uniform distribution of pixel values, we can set up a coding technique where the more frequently occurring values are encoded using fewer bits.

Page 31: Coding and Compression 1


• A codebook is created which sets out the encodings for the pixel values; this is transmitted separately from the image data and can apply to part of an image, a single image, or a sequence of images.

• Because the most frequently occurring values are transmitted using fewer bits, high compression ratios can be achieved.

• One of the most widely used forms of statistical coding is called Huffman encoding.

– Motion Compensation
• If we are transmitting video frames on the basis of describing the difference between one frame and the next, how do we describe motion?

• Compare frames for differences

• Set a threshold value for motion

Page 32: Coding and Compression 1


• Use a DPCM approach to encode the data
• Use a block structure to determine motion in parts of the image (similar to the transform approach)
• In sophisticated compression systems, motion vectors can be developed to ensure fidelity of reproduction

– Classification of Compression Algorithms
• Lossless compression
– image is mathematically equivalent to the original
– only achieves modest levels of compression (5:1)

• Lossy compression
– image shows degradation from the original
– high rates of compression (up to 200:1)
– Objective: achieve the highest possible rate of compression while maintaining image quality that is "virtually lossless"

Page 33: Coding and Compression 1


What is…
• JPEG - Joint Photographic Experts Group
– Still image compression; intraframe picture technology

– MJPEG is a sequence of images coded with JPEG
• MPEG - Moving Picture Experts Group
– Many standards: MPEG-1, MPEG-2, and MPEG-4
– Very sophisticated technology involving intra- and interframe picture coding and many other optimizations => high quality, at a cost in time/computation

• H.261/H.263/H.263+ - Video Conferencing
– Low to medium bit rate, quality, and computational cost. Used in the H.320 and H.323 video conferencing standards

Page 34: Coding and Compression 1


• Image, Video and Audio Standards
– JPEG
• good compression
• widespread applicability
• lossless compression
– predictor for each pixel
– comparison with surrounding pixels
– difference computed
– value of difference replaced with a code from a code table, developed using Huffman encoding
– code table forms part of the encoded image

• lossy compression
– image broken down into 8x8 blocks
– apply DCT and then quantize the coefficients
– encode using the same system as for lossless

Page 35: Coding and Compression 1


• JPEG Compression Ratios

Quality Bits per pixel

Moderate to good

Good to Very good

Excellent

Near original quality

0.25 - 0.5

0.5 - 0.75

0.75 - 1.5

1.5 - 2.0

Page 36: Coding and Compression 1


• JPEG can be used for video information (Motion JPEG), but it makes no concession to the nature of video, maintaining the same structure and, more importantly, the same bit rate for each frame of the video.

– JBIG
• lossless compression
• one bit/pixel, binary or bi-level images
• based on a template structure to model redundancy within the image
• uses arithmetic encoding
• intended primarily for use with fax

Page 37: Coding and Compression 1


– MPEG
• MPEG-1: data rate 1-1.5 Mbps
• Features:
– random access to frames
» to allow starting the video sequence at any point

– fast forward and reverse searches
» to view video in either direction at more than the original speed

– reverse playback
» to permit a reverse play mode - not appropriate in situations such as video telephony

– audio-video synchronisation
» manage lip-sync

– robustness to errors
» should be able to recover from errors, and not propagate errors through frames; particularly important when dealing with non-error-free communication channels

Page 38: Coding and Compression 1


– adjustable delay time or real-time operation
» not a factor in normal video playback, but of particular importance in video telephony
– editability
» permit inclusion of other video in encoded sections
– flexible format
» permit different window sizes and frame rates
– implementable in hardware
» a dedicated chipset for decoding is desirable

• Algorithm
– Based on 3 types of frame:
– I-frame: similar to a JPEG still image; the basis of the encoding, as it contains the maximum amount of information.
– P-frame: contains less information than an I-frame and is obtained by using motion-compensated prediction from past I-frames.

Page 39: Coding and Compression 1


– B-frame: has the greatest level of compression, i.e. the least information. Obtained by interpolation between an I-frame and a P-frame.

• Audio
– various elements of audio capture are defined in the MPEG standard; see the handout.

– MPEG-2
– MPEG-4
– MPEG-7
– MPEG-21

Page 40: Coding and Compression 1


Audio

• The input audio signal from a microphone is passed through several stages:
– firstly, a band-pass filter is applied, eliminating frequencies in the signal that we are not interested in.
– then the signal is sampled, converting the analog signal into a sequence of values.
– these are then quantised, i.e. mapped onto one of a set of fixed values (sketched below).
– these values are then coded for storage or transmission.
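A toy sketch of the quantisation stage (ours; it assumes a uniform quantiser with an illustrative step size, which a real codec would choose per application):

def quantize(sample, step=0.25):
    # Map a continuous sample to the index of the nearest fixed level.
    return round(sample / step)

def dequantize(level, step=0.25):
    return level * step

x = 0.6317
level = quantize(x)       # 3 -- this integer is what gets coded
print(dequantize(level))  # 0.75, i.e. a quantisation error of ~0.118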

Page 41: Coding and Compression 1


Audio

• Some techniques for audio compression:– ADPCM– LPC– CELP

Page 42: Coding and Compression 1


Audio

• ADPCM -- Adaptive Differential Pulse Code Modulation
– ADPCM allows for the compression of PCM-encoded input whose power varies with time.
– A reconstructed version of the input signal is fed back and subtracted from the actual input signal, and the difference is quantised to give a 4-bit output value.
– This compression gives a 32 kbit/s output rate (8000 samples/s x 4 bits).

Page 43: Coding and Compression 1


Audio

[Block diagram - Transmitter: the original signal Xm has the prediction Xm* subtracted from it to give the error Em; the quantizer produces Em*, which the coder sends to the channel; the predictor forms Xm* from the reconstructed signal Xm' = Xm* + Em*. Receiver: the decoder recovers Em* from the channel and adds the predictor output Xm* to it to reconstruct Xm'.]

Page 44: Coding and Compression 1


Audio

• LPC -- Linear Predictive Coding
– The encoder fits speech to a simple, analytic model of the vocal tract. Only the parameters describing the best-fit model are transmitted to the decoder.

– An LPC decoder uses those parameters to generate synthetic speech that is usually very similar to the original.

– LPC is used to compress audio at 16 kbit/s and below.

Page 45: Coding and Compression 1


Audio -- CELP

• CELP -- Code Excited Linear Predictor
– CELP does the same LPC modelling, but then computes the errors between the original speech and the synthetic model, and transmits both the model parameters and a very compressed representation of the errors.

– The result is much higher quality speech at a low data rate.

Page 46: Coding and Compression 1


CODING

Page 47: Coding and Compression 1


Huffman

• Uncompressed images, audio, and video data require considerable storage capacity.

• Data transfer of uncompressed video data over digital networks requires very high bandwidth to be provided for a single point-to-point communication.

• Without compression, a CD with a storage capacity of approximately 600 million bytes would only be able to store about 260 pictures (1024x768, true colour), or, at the 25 frames per second rate of a motion picture, about 10 seconds of a movie.

Page 48: Coding and Compression 1


Compression Terminology

• Compression Ratio
• The ratio of raw data to compressed data
• It is computed by dividing the original number of bits or bytes by the number of bits or bytes remaining after compression, or expressed as a percentage of compressed/original.

• For lossless compression, compression ratios of 2:1 (50%) or 3:1 (33%) are typical.

• For lossy compression of video, compression ratios of more than 100:1 may be achievable, depending on the effectiveness of the compression algorithm and the acceptable information loss.

Page 49: Coding and Compression 1


Image Compression

• 2-stage coding technique
1. A linear predictor, such as DPCM or some other linear predicting function, decorrelates the raw image data
2. A standard coding technique, such as Huffman coding, arithmetic coding, ...

Lossless JPEG:

- version 1: DPCM with arithmetic coding

- version 2: DPCM with Huffman coding

Page 50: Coding and Compression 1


Entropy Encoding

• Used regardless of media’s specific characteristics.

• The data stream to be compressed is treated as a simple digital sequence, and the semantics of the data are ignored.

• It is a lossless process.

Page 51: Coding and Compression 1


Source Encoding

• Takes into account the semantics of the data.

• The degree of compression that can be reached by source encoding depends on the data contents.

• It is usually a lossy process.

Page 52: Coding and Compression 1


Run-Length Encoding (RLE)

• RLE is mostly useful when we have to deal with palette-based images that contain long sequences of equal colours.

• The idea in RLE is to encode a string of repeated characters as a count of the number of times it is repeated followed by one copy of the character.

• For example, the string AAAABBBAABBBBBCCCCCCCCDABCDBAAABBBBCCCD can be compressed as "4A3BAA5B8CDABCDB3A4B3CD", where "4A" means "four A's", and so forth.

• This example represents 39 bytes of data with 23 bytes, achieving a compression ratio of 39/23 = 1.70.
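A minimal encoder along these lines (our own sketch; like the example above, it emits runs of one or two characters verbatim):

def rle_encode(s):
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                   # scan to the end of the current run
        run = j - i
        out.append(s[i] * run if run <= 2 else f"{run}{s[i]}")
        i = j
    return "".join(out)

data = "AAAABBBAABBBBBCCCCCCCCDABCDBAAABBBBCCCD"
enc = rle_encode(data)
print(enc, len(data), "->", len(enc))  # 4A3BAA5B8CDABCDB3A4B3CD 39 -> 23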

Page 53: Coding and Compression 1


Run-Length Encoding (RLE)

• For binary files, simply store the run lengths, taking advantage of the fact that runs alternate between 0 and 1 and assuming a run of 0s comes first.

• For example, "0000011000" can be compressed as "5#2#3", while "11000" becomes "0#2#3".

• Run-length encoding works very well for images with solid backgrounds, like cartoons. For natural images, it doesn't work that well.

Page 54: Coding and Compression 1


The Huffman Coding algorithm- History

• In 1951, David Huffman and his MIT information theory classmates were given the choice of a term paper or a final exam.

• Huffman hit upon the idea of using a frequency-sorted binary tree, and quickly proved this method the most efficient.

• In doing so, the student outdid his professor, who had worked with information theory inventor Claude Shannon to develop a similar code.

• Huffman built the tree from the bottom up instead of from the top down.

Page 55: Coding and Compression 1


Huffman Coding Algorithm

1. Take the two least probable symbols in the alphabet (these will get the longest codewords, of equal length, differing only in the last digit).

2. Combine these two symbols into a single symbol, and repeat.

Page 56: Coding and Compression 1


A simple example

• Suppose we have a message consisting of 5 symbols, e.g. [►♣♣♠☻►♣☼►☻]

• How can we code this message using 0/1 so that the coded message has minimum length (for transmission or saving)?

• 5 symbols need at least 3 bits each
• For a simple fixed-length encoding, the length of the code is 10*3 = 30 bits

Page 57: Coding and Compression 1


A simple example – cont.

• Intuition: symbols that are more frequent should have shorter codes; but since the lengths are no longer all the same, there must be a way of distinguishing where each code ends

• With a Huffman code, the length of the encoded message ►♣♣♠☻►♣☼►☻ will be 3*2 + 3*2 + 2*2 + 3 + 3 = 22 bits

Page 58: Coding and Compression 1


Definitions

• An ensemble X is a triple (x, Ax, Px)

– x: value of a random variable

– Ax: set of possible values for x, Ax = {a1, a2, ..., aI}

– Px: probability for each value, Px = {p1, p2, ..., pI}, where P(x) = P(x=ai) = pi, pi > 0, Σ_i pi = 1

• Shannon information content of x

– h(x) = log2(1/P(x))

• Entropy of x

– H(X) = Σ_{x∈Ax} P(x) log2(1/P(x))

i    ai   pi      h(pi)
1    a    .0575   4.1
2    b    .0128   6.3
3    c    .0263   5.2
..   ..   ..      ..
26   z    .0007   10.4
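Both definitions translate directly into code. This sketch (ours, standard library only) reproduces the "a" row of the table and the entropy of the four-symbol ensemble used on the following slides:

import math

def shannon_information(p):
    # h(x) = log2(1/P(x)), in bits
    return math.log2(1 / p)

def entropy(probabilities):
    # H(X) = sum over x of P(x) * log2(1/P(x))
    return sum(p * math.log2(1 / p) for p in probabilities)

print(round(shannon_information(0.0575), 1))  # 4.1, as in the table row for 'a'
print(entropy([1/2, 1/4, 1/8, 1/8]))          # 1.75 bits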

Page 59: Coding and Compression 1


Source Coding Theorem

• There exists a variable-length encoding C of an ensemble X such that the average length of an encoded symbol, L(C,X), satisfies
– L(C,X) ∈ [H(X), H(X)+1)

Page 60: Coding and Compression 1


Symbol Codes

• Notations:
– A^N: all strings of length N
– A^+: all strings of finite length
– {0,1}^3 = {000, 001, 010, ..., 111}
– {0,1}^+ = {0, 1, 00, 01, 10, 11, 000, 001, ...}

• A symbol code C for an ensemble X is a mapping from Ax (range of x values) to {0,1}+

• c(x): codeword for x, l(x): length of codeword

Page 61: Coding and Compression 1


Example

• Ensemble X:
– Ax = { a , b , c , d }
– Px = {1/2 , 1/4 , 1/8 , 1/8}

Code C0:

ai   c(ai)   li
a    1000    4
b    0100    4
c    0010    4
d    0001    4

• c(a) = 1000
• c+(acd) = 100000100001 (called the extended code)

Page 62: Coding and Compression 1


The code should achieve as much compression as possible

• The expected length L(C,X) of symbol code C for X is

L(C,X) = Σ_{x∈Ax} P(x) l(x) = Σ_{i=1..|Ax|} pi li

Page 63: Coding and Compression 1


Example

• Ensemble X:
– Ax = { a , b , c , d }
– Px = {1/2 , 1/4 , 1/8 , 1/8}

Code C1:

ai   c(ai)   li
a    0       1
b    10      2
c    110     3
d    111     3

• c+(acd) = 0110111 (7 bits, compared with 12 under C0)

• Is this a prefix code?

Page 64: Coding and Compression 1


Example

• Ax = { a , b , c , d , e }

• Px = {0.25, 0.25, 0.2, 0.15, 0.15}

[Huffman tree: d (0.15) and e (0.15) combine into 0.3; b (0.25) and c (0.2) combine into 0.45; a (0.25) and the 0.3 node combine into 0.55; the 0.55 and 0.45 nodes combine into 1.0. Labelling left branches 0 and right branches 1 gives:]

a: 00   b: 10   c: 11   d: 010   e: 011

Page 65: Coding and Compression 1


Huffman Coding Algorithm for Image Compression

• Step 1. Build a Huffman tree by sorting the histogram and successively combining the two bins of lowest count until only one bin remains.

• Step 2. Encode the Huffman tree and save it with the coded values.

• Step 3. Encode the residual image.

Page 66: Coding and Compression 1


Example Huffman encoding

• A = 0
  B = 100
  C = 1010
  D = 1011
  R = 11

• ABRACADABRA = 01001101010010110100110

• This is eleven letters in 23 bits
• A fixed-width encoding would require 3 bits for five different letters, or 33 bits for 11 letters
• Notice that the encoded bit string can be decoded!

Page 67: Coding and Compression 1


• In this example, A was the most common letter

• In ABRACADABRA:
– 5 A's: the code for A is 1 bit long
– 2 R's: the code for R is 2 bits long
– 2 B's: the code for B is 3 bits long
– 1 C: the code for C is 4 bits long
– 1 D: the code for D is 4 bits long

Page 68: Coding and Compression 1


Creating a Huffman encoding

• For each encoding unit (letter, in this example), associate a frequency (number of times it occurs)– You can also use a percentage or a probability

• Create a binary tree whose children are the encoding units with the smallest frequencies– The frequency of the root is the sum of the

frequencies of the leaves

• Repeat this procedure until all the encoding units are in the binary tree
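A compact Python sketch of this bottom-up construction (our own illustration; ties between equal frequencies are broken arbitrarily, so the exact codes may differ from the worked example, but the total encoded length is the same):

import heapq
from collections import Counter

def huffman_codes(text):
    # Heap entries: (frequency, tiebreak id, {symbol: code-so-far}).
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two smallest frequencies...
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + c for ch, c in left.items()}
        merged.update({ch: "1" + c for ch, c in right.items()})
        count += 1
        heapq.heappush(heap, (f1 + f2, count, merged))  # ...combined into one node
    return heap[0][2]

codes = huffman_codes("ABRACADABRA")
encoded = "".join(codes[ch] for ch in "ABRACADABRA")
print(codes, len(encoded))   # 23 bits in total, matching the earlier slide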

Page 69: Coding and Compression 1


Example, step I
• Assume that the relative frequencies are:
– A: 40
– B: 20
– C: 10
– D: 10
– R: 20

• The smallest numbers are 10 and 10 (C and D), so connect those

Page 70: Coding and Compression 1


Example, step II

• C and D have already been used, and the new node above them (call it C+D) has value 20

• The smallest values are now B, C+D, and R, all of which have value 20
– Connect any two of these

Page 71: Coding and Compression 1


Example, step III

• The smallest value is R's (20), while A and B+C+D both have value 40

• Connect R to either of the others

Page 72: Coding and Compression 1


Example, step IV

• Connect the final two nodes

Page 73: Coding and Compression 1


Example, step V
• Assign 0 to left branches, 1 to right branches
• Each encoding is a path from the root

• A = 0
  B = 100
  C = 1010
  D = 1011
  R = 11

• Each path terminates at a leaf

Page 74: Coding and Compression 1


Example 2

• Suppose we want to compress the following message: ABEACADABEA.

• The count table is:

character   A   B   C   D   E
frequency   5   2   1   1   2

• The following binary tree can then be constructed:

[Huffman tree with leaves A, B, C, D, E: C and D combine (2); that node combines with E (4); that node combines with B (6); finally with A (11), giving the codes on the next slide.]

Page 75: Coding and Compression 1


Example 2
• The following table is used to encode the characters:

Character        A   B    C      D      E
representation   1   01   0010   0011   000

• The message ABEACADABEA can then be encoded as the string 10100010010100111010001.

Page 76: Coding and Compression 1


Example 2
• Total number of bits to code the string
= 5x1 + 2x2 + 1x4 + 1x4 + 2x3 = 23

• If the original message uses 8 bits per character, its length is 11x8 = 88.

• The compression ratio is 88/23 = 3.83 (26.14%).

• If the original message uses 3 bits per character, its length is 11x3 = 33.

• The compression ratio is 33/23 = 1.43 (69.70%).

Page 77: Coding and Compression 1


Example 3

a) Using the Huffman Coding Algorithm, compress a string with the following probabilities of occurrence:

Character     A      B      C      D      E      F      G
Probability   0.325  0.250  0.150  0.125  0.050  0.075  0.025

b) Assuming that the above string was originally represented by a 3-bit code, calculate the compression ratio achieved.

Page 78: Coding and Compression 1


Example 3

Character     A      B      C      D      E      F      G
Probability   0.325  0.250  0.150  0.125  0.050  0.075  0.025

[Huffman tree: E (0.050) and G (0.025) combine into 0.075; that node and F (0.075) combine into 0.15; that node and D (0.125) combine into 0.275; B (0.250) and C (0.150) combine into 0.4; the 0.275 node and A (0.325) combine into 0.6; the 0.6 and 0.4 nodes combine into 1.0.]

Encode the characters:
A(00) B(10) C(11) D(011) E(01010) F(0100) G(01011)

Page 79: Coding and Compression 1


Example 3

• A(00) B(10) C(11) D(011) E(01010) F(0100) G(01011)

• Original expected length: 3 bits
• Compressed expected length:
0.325(2) + 0.250(2) + 0.150(2) + 0.125(3) + 0.050(5) + 0.075(4) + 0.025(5) = 2.5 bits

• Compression ratio = 3/2.5 = 1.2 (83.33%)

Character     A      B      C      D      E      F      G
Probability   0.325  0.250  0.150  0.125  0.050  0.075  0.025

Page 80: Coding and Compression 1


LZW
• LZW (Lempel-Ziv-Welch) is a dictionary-based compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.

• It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978.

• The algorithm is designed to be fast to implement, but it is not necessarily optimal, since it does not perform any analysis of the data.

Page 81: Coding and Compression 1


LZW

• The principle of encoding
– The algorithm is surprisingly simple. It replaces strings of characters with single codes.
– It does not do any analysis of the incoming text.
– Instead, it just adds every new string of characters it sees to a table of strings.
– Compression occurs when a single code is output instead of a string of characters.
– It became very widely used after it became part of the GIF image format in 1987.

Page 82: Coding and Compression 1


LZW

• The principle of encoding
– Most implementations of LZW use 12-bit codewords to represent 8-bit input characters.
– The string table has 4096 locations.
– The first 256 locations are initialized to the single characters (location 0 stores 0, location 1 stores 1, and so on).
– As new combinations of characters are parsed from the input stream, these strings are added to the string table, stored in locations 256 to 4095.

Page 83: Coding and Compression 1


Algorithm
• The compression algorithm is as follows (a Python transcription follows the pseudocode):

Initialize table with single character strings
STRING = first input character
WHILE not end of input stream
    CHARACTER = next input character
    IF STRING + CHARACTER is not in the string table
        add STRING + CHARACTER to the string table
        output the code for STRING
        STRING = CHARACTER
    ELSE
        STRING = STRING + CHARACTER   // keep extending until a new string is seen
END WHILE
output the code for STRING
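A direct Python transcription of this pseudocode (our sketch; it assumes string input and an unbounded table rather than the 12-bit, 4096-entry table discussed earlier):

def lzw_encode(data):
    table = {chr(i): i for i in range(256)}   # single-character strings
    next_code = 256
    string, out = data[0], []
    for character in data[1:]:
        if string + character in table:
            string += character               # keep extending the match
        else:
            table[string + character] = next_code
            next_code += 1
            out.append(table[string])
            string = character
    out.append(table[string])                 # flush the final string
    return out

print(lzw_encode("BABAABAAA"))   # [66, 65, 256, 257, 65, 260]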

Page 84: Coding and Compression 1


LZW Encoding Example

• Compression of the string BABAABAAA yields the following trace:

string   character   index   entry   output code
B        A           256     BA      66
A        B           257     AB      65
B        A           -       -       -
BA       A           258     BAA     256
A        B           -       -       -
AB       A           259     ABA     257
A        A           260     AA      65
A        A           -       -       -
AA       (end)       -       -       260

• The compressed data is <66><65><256><257><65><260>

Page 85: Coding and Compression 1


LZW Decoding Algorithm

Initialize table with single character strings
OLD = first input code
output translation of OLD
CHARACTER = translation of OLD
WHILE not end of input stream
    NEW = next input code
    IF NEW is in the string table
        STRING = translation of NEW
    ELSE
        STRING = translation of OLD + CHARACTER
    output STRING
    CHARACTER = first character of STRING
    add translation of OLD + CHARACTER to the string table
    OLD = NEW
END WHILE

Page 86: Coding and Compression 1


LZW Decoding Algorithm

• The LZW algorithm does not need to pass the string table to the decompression code.

• The table can be built exactly as it was during compression, using the input stream as data.

• It starts with the first 256 table entries initialized to single characters.

• The decompression algorithm adds a new string to the string table each time it reads in a new code.

• When a code it reads is not yet defined, it translates the value of OLD and appends the CHARACTER value to obtain the string (a Python sketch of the decoder follows).
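A matching Python sketch of the decoder (again ours, with an unbounded table), including the branch for a code that is not yet in the table:

def lzw_decode(codes):
    table = {i: chr(i) for i in range(256)}   # single-character strings
    next_code = 256
    old = codes[0]
    output = [table[old]]
    character = table[old]
    for new in codes[1:]:
        if new in table:
            string = table[new]
        else:
            string = table[old] + character   # code not yet defined
        output.append(string)
        character = string[0]
        table[next_code] = table[old] + character
        next_code += 1
        old = new
    return "".join(output)

print(lzw_decode([66, 65, 256, 257, 65, 260]))   # BABAABAAA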

Page 87: Coding and Compression 1


LZW Decoding Example
• The decompression of our compressed data <66><65><256><257><65><260> gives the following trace (the table starts with entries 0..255 holding the single characters ..ABC..):

OLD = 66 (B), character = B, output = B

NEW        string   output   character   index   entry   OLD
65 (A)     A        A        A           256     BA      65 (A)
256 (BA)   BA       BA       B           257     AB      256 (BA)
257 (AB)   AB       AB       A           258     BAA     257 (AB)
65 (A)     A        A        A           259     ABA     65 (A)
260 (AA)   AA       AA       A           260     AA      260 (AA)

• The output is BABAABAAA

Page 88: Coding and Compression 1


DPCM - Differential Pulse Code Modulation

• DPCM is an efficient way to encode highly correlated analog signals into binary form suitable for digital transmission, storage, or input to a digital computer.

• Patented by Cutler (1952).

Page 89: Coding and Compression 1


DPCM

Page 90: Coding and Compression 1


DCT: Discrete Cosine Transform
The DCT converts the information contained in a block (8x8) of pixels from the spatial domain to the frequency domain.
– A simple analogy: consider an unsorted list of 12 numbers between 0 and 3 -> (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a transformation of the list involving two steps: (1) sort the list; (2) count the frequency of occurrence of each of the numbers -> (4, 4, 3, 1). Through this transformation we lost the spatial information but captured the frequency information.

– There are other transformations which retain the spatial information, e.g. the Fourier transform, DCT, etc., therefore allowing us to move back and forth between the spatial and frequency domains.

1-D DCT:

F(u) = (a(u)/2) Σ_{n=0..N-1} f(n) cos[(2n+1)uπ/16]

1-D Inverse DCT:

f'(n) = Σ_{u=0..N-1} (a(u)/2) F(u) cos[(2n+1)uπ/16]

where a(0) = 1/√2, a(p) = 1 for p ≠ 0, and N = 8.

Page 91: Coding and Compression 1


Discrete Cosine Transform

• JPEG is a lossy compression scheme based on colour space conversion and discrete cosine transform (DCT).

• The Discrete Cosine Transform (DCT) is a method of decomposing a block of data into a weighted sum of spatial frequencies.

• Each of these spatial frequency patterns has a corresponding coefficient, the amplitude needed to represent the contribution of that spatial frequency pattern in the block of data being analyzed.

Page 92: Coding and Compression 1


Discrete Cosine Transform

• If only the low-frequency DCT coefficients are nonzero, the data in the block vary slowly with position. If high frequencies are present, the block intensity changes rapidly from pixel to pixel.

Page 93: Coding and Compression 1


Discrete Cosine Transform

[Figure: row 256 of the "lenna" image (pixel intensity, 0-200, against column position, 0-600) alongside the absolute DCT values of that row (0-2500), showing the energy concentrated in the low-frequency coefficients.]

Page 94: Coding and Compression 1


Discrete Cosine Transform

• MPEG uses a 2-D 8x8 form of the DCT.
• The coefficients of the DCT, U(i, j), for input data V(x, y) are determined by the following formula:

U(i,j) = (1/4) C(i) C(j) Σ_{x=0..7} Σ_{y=0..7} V(x,y) cos[(2x+1)iπ/16] cos[(2y+1)jπ/16]

where C(i), C(j) = 1/√2 for i, j = 0, and otherwise C(i), C(j) = 1.
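The formula can be computed directly; the following brute-force Python sketch (ours, for clarity rather than speed) shows that a constant block puts all of its energy into the DC coefficient:

import math

def dct_2d(block):
    # 8x8 2-D DCT evaluated straight from the formula above.
    def C(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    U = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * i * math.pi / 16)
                    * math.cos((2 * y + 1) * j * math.pi / 16)
                    for x in range(8) for y in range(8))
            U[i][j] = 0.25 * C(i) * C(j) * s
    return U

flat = [[128] * 8 for _ in range(8)]   # a perfectly flat block
U = dct_2d(flat)
print(round(U[0][0], 1))               # 1024.0 -- all energy in the DC term
print(round(U[3][5], 6))               # ~0.0   -- every AC coefficient vanishes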

Page 95: Coding and Compression 1


DCT vs. Wavelet: Which is Better?

• 3 dB improvement?
– Wavelet compression was claimed to give a 3 dB improvement over DCT-based compression
– The comparison was done against baseline JPEG

• The improvement is not all due to the transforms
– The main contribution comes from better rate allocation, advanced entropy coding, and smarter redundancy reduction via zero-trees

– A DCT coder can be improved to decrease the gap