ERLZ: Compression Coding for Palettized Colored Images
Mohamed Hamdy. El-Sheikh, Shehab Gamalel-Din
Hassanein Al-Barhmtoshy, and Ahmed Kolkila
Al-Azhar University, Systems & Computers Engineering Dept.
Cairo, Egypt
Abstract. In this work, we deal with palettized colored images. These image types are widely used by most applications on personal computers and small workstations and, more importantly, in satellite imagery. In the era of the Internet and e-learning, efficient transmission of such images is critical. Unfortunately, palettized images are usually large and hence require a large transmission bandwidth. Compression is one way of managing the size of transmitted images; therefore, a key criterion of successful compression for such applications is the compaction ratio. Accuracy is just as important; hence, we focus on lossless compression techniques. Two commonly used techniques in lossless image compression are RLE and LZW. The main advantages of the RLE technique are its small memory requirement, which permits a high-speed compression process, and its easy-to-implement algorithm, while the main advantage of LZ techniques (especially LZW) is their high compression ratio. In this research we propose a new lossless compression algorithm for color palettized images, which we call ERLZ. It merges the two known algorithms, RLE and LZW, to gain the benefits of both. The design and implementation of ERLZ are also presented in this paper. The algorithm was tested using two image sets: the first contains small changes in colors, the second many color changes. Comparisons of the results of ERLZ to those of both RLE and LZW are also presented. The analysis of these results and comparisons allows us to identify the characteristics of the images that are best compressed by ERLZ.
1. INTRODUCTION.
Image compression is important for many emerging applications in the fields of
multimedia, visual communication systems and telecommunication networks [1]. The need
for image compression is increasing with an aim to minimize the costs of both the storing
memory and the transmitting time and bandwidth. Image compression is concerned with
minimization of the number of information carrying units used to represent an image. The
efficiency of a compression algorithm is measured by its data compressing ability, the
resulting distortion, and storage and time required in compressing and decompressing
processes as well as its implementation complexity [2,3].
In image compression, there is a common assumption that “small numerical errors in
decoded values lead to small visual errors that are unrecognizable”. This is not true for
some image types and applications, such as palettized images that require critical
performance criteria [4]. Examples are atmospheric observatory images that usually
contain repeated data and that require high transmission and translation speeds. Other
examples are cartoon and animated multimedia applications. For these reasons, we are
more interested in lossless compression techniques.
In this research we have comparatively studied several of the commonly used lossless
compression techniques to identify the strengths and weaknesses of each. Then we studied
the viability of these techniques in compressing the special type of palettized images. We
have been able to conclude that none of the studied techniques is powerful enough to meet
the unique characteristics of palettized images. Some are too complex, while others are
time expensive in both compression and decompression, and still others require large
storage to manipulate either of the two processes. However, palettized images require high
compression ratio and high-speed compression/decompression processors that makes
neither of these techniques ideal when used separately for the compression of such type of
images.
Therefore, in this article we propose a new image compression approach for lossless
compression of color palettized images, which we call ERLZ (Enhanced Run Length and
Lempel Ziv). This new technique is derived from two commonly known techniques,
namely Run Length Encoding (RLE) [8] and Lempel Ziv Welch (LZW) [5,6,7]. Hence,
ERLZ gains some of the benefits of them both and avoids some of their limitations and
weaknesses. For instance, it gains the high compression ratio of LZW and the high
compression speed of RLE.
In the following section we review some of the compression techniques upon which we
have built ERLZ, namely RLE, LZW and Huffman. A comparison between them all is
then given for analyzing their strengths and weaknesses trying to assess their capabilities in
supporting the criteria required by the palettized images.
2. COMMONLY USED LOSSLESS COMPRESSION TECHNIQUES – A REVIEW OF RELATED WORK.
The most commonly used techniques in lossless compression are Huffman, RLE and LZW.
The Huffman coding technique is a statistical-based technique, while RLE is simple and
effective for string data, and LZW is a substitutional (dictionary-based) encoding
technique. Each of them has its advantages and disadvantages. A brief introduction to
each of them is given in the next sections followed by a comparative analysis.
2.1 The Run Length Encoding (RLE) Compression Technique.
Run Length is a simple method for lossless compression of sequential data. The
assumption upon which this algorithm is based is that “in many data streams, the
consecutive single tokens (symbols) are identical”; an assumption that is commonly valid
for palettized images. Therefore, RLE checks the stream for these identical tokens and
whenever more than four equal input tokens (symbols) are found, it replaces them with a
two-field structured special token carrying data about the number of occurrences of this
repeated token and the symbol itself [9]. For more details on how the RLE algorithm
works, please refer to Appendix A.
The decoding (decompression) process is simple. When one of the special tokens is found
in the received compressed input stream, it is replaced in the decompressed output
stream by the original symbol repeated the appropriate number of times. All other
tokens are placed as-is in the output stream, since they represent original message
symbols.
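As an illustration, the scheme just described can be sketched in Python (the authors' implementation was written in C; the flag character `!`, the list-based token stream, and the assumption that the flag never occurs in the data are simplifications of ours):

```python
def rle_encode(data, flag="!"):
    """Replace every run longer than four symbols with (flag, count, symbol)."""
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                      # extend the current run
        run = j - i
        if run > 4:                     # "more than four equal input tokens"
            out.extend([flag, run, data[i]])
        else:                           # short runs pass through unchanged
            out.extend(data[i:j])
        i = j
    return out

def rle_decode(tokens, flag="!"):
    """Expand every (flag, count, symbol) triple back into its run."""
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == flag:
            out.append(tokens[i + 2] * tokens[i + 1])  # symbol repeated count times
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return "".join(out)

encoded = rle_encode("aaaaaabbbcc")
print(encoded)               # → ['!', 6, 'a', 'b', 'b', 'b', 'c', 'c']
print(rle_decode(encoded))   # → aaaaaabbbcc
```

Note that the run of six a's collapses into one triple, while the shorter runs of b's and c's are cheaper to leave untouched than to encode.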
2.2 Lempel Ziv Welch (LZW) Compression Technique.
The two computer scientists Abraham Lempel and Jacob Ziv introduced the substitutional
compression technique in 1977 [10], and it has since been known as LZ77. This technique
depends on capturing the high-order relationships between words and phrases. Lempel and
Ziv came up with an improvement of the LZ77 scheme in 1978 [11], known as LZ78, which
was refined by Terry Welch in 1984 and hence became known as LZW [12]. LZW works
by creating a “dictionary” of phrases that occur in the input data. When an encountered
phrase is already present in the dictionary, the index number of that phrase in the dictionary
is placed in the output stream. This means that the dictionary entries are of variable length.
The first 256 entries are used to contain the values for individual bytes, so the actual first-
string index is 256. As the string is compressed, the dictionary is built up to contain every
possible string combination that can be obtained from the message/image, starting with two
characters, then increasing to three characters, and so on. The details of LZW are shown in
Appendix A.
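The dictionary build-up described above can be sketched as follows, an illustrative Python version of the classic LZW encoder (the authors' implementation was written in C):

```python
def lzw_encode(data):
    """Classic LZW: the dictionary is seeded with all single bytes (codes 0-255),
    so the first multi-character string receives code 256."""
    table = {chr(i): i for i in range(256)}
    next_code = 256
    prefix = ""
    out = []
    for ch in data:
        if prefix + ch in table:        # keep extending the matched phrase
            prefix += ch
        else:                           # emit the phrase's code, learn a new one
            out.append(table[prefix])
            table[prefix + ch] = next_code
            next_code += 1
            prefix = ch
    if prefix:                          # flush the final phrase
        out.append(table[prefix])
    return out

print(lzw_encode("abababab"))   # → [97, 98, 256, 258, 98]: 8 symbols become 5 codes
```

The dictionary grows to hold progressively longer phrases ("ab" as 256, "ba" as 257, "aba" as 258, and so on), which is exactly the variable-length-entry behavior described above.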
2.3 Huffman Encoding Technique.
Huffman encoding uses a strategy to analyze the file and assign codes on a character
frequency basis. Once the file has been analyzed, a binary tree is constructed to represent
the coded alphabet. The individual characters are leaves in the tree, and their codes are
indicated by the path of each leaf from the root of the tree. Tracing the path from the root
to a specific leaf yields the code for the character found at that leaf [13].
The Huffman coding technique achieves its optimum when all input symbol probabilities
are integral powers of 0.5; otherwise, the binary tree and the code lengths grow large.
The Huffman algorithm is slow because it requires scanning all input symbols to
calculate their probabilities [14]. Although it requires a medium-sized memory, its
overall average compression ratio is usually good.
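To make the tree construction concrete, here is a minimal illustrative sketch in Python (the function name `huffman_codes` and the heap-of-partial-codes representation are our own choices, not from the paper):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(data)
    if len(freq) == 1:                   # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie-breaker, partial symbol->code map).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # the two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# Probabilities 1/2, 1/4, 1/8, 1/8: integral powers of 0.5, Huffman's optimal case.
codes = huffman_codes("aaaabbcd")
print({s: len(c) for s, c in codes.items()})   # → {'a': 1, 'b': 2, 'c': 3, 'd': 3}
```

Each code length equals -log2 of the symbol's probability, which is why power-of-0.5 distributions are the optimal case mentioned above.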
2.4 Analytical Comparison.
Table 1 compares the three lossless compression techniques, namely Huffman, LZW and
RLE.
Studying this comparative data, we can conclude the following characteristics. RLE is a
simple lossless compression technique. It is an easy and fast way to compress data but the
compression ratio isn’t nearly as good as that obtained with Huffman or LZW.
Table (1): Comparison among Huffman, LZW and RLE.

Compression technique:
  Huffman: Statistical.
  LZW: Substitutional (dictionary-based).
  RLE: Simple.

Basic idea:
  Huffman: Reduces the average code length used to represent the symbols of the input alphabet.
  LZW: Enters phrases into a dictionary and, when a repeat occurrence of a particular phrase is found, outputs the dictionary index instead of the phrase.
  RLE: Replaces redundant data with a token.

Optimal case:
  Huffman: When all symbol probabilities are integral powers of 0.5.
  LZW: When there are repeated sequences of string symbols (identical or different).
  RLE: When there is high redundancy of identical (repeated) sequence data.

Examples (for the optimal case), input alphabet {a, b}:
  Huffman: abababab → 10101010; aaaabbbb → 11110000; aabbaabb → 11001100 (all optimal).
  LZW: abababab → 12352 (a:1, b:2, ab:3, ba:4, aba:5); aaaabbbb → 13262 (a:1, b:2, aa:3, aaa:4, ab:5, bb:6, bbb:7).
  RLE: abababab → not compressed; aaaabbbb → !4a!4b (! is the special token); aabbaabb → not compressed.

Applied on:
  Huffman: Sequential data.
  LZW: Sequential data.
  RLE: Sequential and identical (repeated) data.

At work (application):
  Huffman: Doesn't work alone, but with other techniques.
  LZW: Works alone effectively.
  RLE: Works well alone.

Usually used in:
  Huffman: JPEG formats.
  LZW: Graphics (GIF format).
  RLE: Graphics (BMP & PCX formats); also used in facsimile applying the CCITT (ITU-T) T.4 & T.6 recommendations.

Speed of compression:
  Huffman: Medium (statistical process; building a binary tree).
  LZW: Medium (compare & add; building a symbol table).
  RLE: High (count repeated symbols and encode them).

Compressed output (compression ratio):
  Huffman: Good (middle C.R.).
  LZW: Very good (high C.R.).
  RLE: Good (low C.R.).

Restrictions:
  Huffman: Number of symbols in the input alphabet (binary tree size).
  LZW: Dictionary length.
  RLE: Redundant sequential data only.

Example (input = abbababaac):
  Huffman: p(a)=0.5, p(b)=0.4, p(c)=0.1; output code: 100001001001101.
  LZW: dictionary built: a=1, b=2, c=3, ab=4, bb=5, ba=6, aba=7, abaa=8, ac=9; output codes: 1 2 2 4 7 1 3.
  RLE: not compressed, i.e., abbababaac.
LZW is dictionary-based and has a high compression ratio, but it consumes a long time
in both the compression and decompression processes. It also requires a relatively
large memory size (depending on the size of the created symbol table).
The Huffman technique doesn't work separately; rather, it is used to improve the
results of other techniques.
This analysis reveals that none of these techniques suits the requirements of palettized
images, which have a high repetition of identical data elements and which require fast
compression/decompression processes with a high compression ratio and minimum memory
requirements. This led us to investigate the possibility of merging these three
techniques for better and improved results. In the following section we introduce ERLZ,
the proposed compression technique inspired by an integration of RLE and LZW. In
Section 4, an improvement of ERLZ that utilizes the Huffman technique is introduced.
3. ERLZ: THE PROPOSED TECHNIQUE.
The theoretical possibility of image compression rests primarily on the redundancy in
the data representing the image. Such redundancy occurs because of the correlation
between neighboring pixels with identical physical values. In palettized images,
redundancy in identical pixels is common not only among neighboring pixels but at many
different places and areas of a single image. One can safely conclude that palettized
images are composed of spots of identical pixels in most of the image space, with
exceptions occurring only at object edges. For typical examples of palettized images,
see Appendix B.
Based on this assumption, both the RLE and LZW techniques are suitable for image
compression, with the limitations and disadvantages discussed in the last section. RLE
utilizes this assumption very well and has relatively good output, though it only
considers neighboring identical pixels in the data stream and ignores other similarly
identical pixels when scanning the area of an image. On the other hand, LZW considers
this unique feature of palettized images by using a string table, symbol table and
dictionary. However, it requires relatively large temporary storage, consumes more
time, and sends the table as part of the message.
In this paper we propose a new algorithm, which we call ERLZ, "Enhanced Run Length
and Lempel Ziv". ERLZ gains the advantages of both RLE and LZW together. Let us
explain. RLE encodes identical consecutively repeated symbols with a single token having
the symbol and its occurrence count; so at decompression, the symbol is repeatedly inserted
in the output stream with the exact number as was in the uncompressed image. On the
other hand, LZW builds a dictionary of all possible strings of pixels found in the image
including strings of identical symbols with different repetition count. Strings of identical
symbols with different lengths are considered different entries with new encoding. ERLZ
utilizes these two ideas in order to use the same encoding symbol of repeated strings in the
dictionary table. Each entry is considered a single symbol and hence, ERLZ reduces the
sizes of both the symbol table and the compressed output. This improves the execution
speed of the compression process.
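One way to realize this idea is to run the LZW scan over RLE run tokens instead of raw characters, so that a run of any length occupies a single dictionary atom. The following Python sketch is our own illustrative approximation, not the authors' exact C implementation; like ERLZ's Symbol Table, the dictionary it builds is assumed to travel with the message:

```python
def runs(data):
    """Collapse the input into (symbol, run-length) tokens, as RLE does."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        out.append((data[i], j - i))
        i = j
    return out

def erlz_encode(data):
    """LZW-style scan over run tokens: a run of any length is one atom, so
    runs of the same symbol with different lengths need no extra entries."""
    table, next_code = {}, 0
    prefix, out = (), []
    for tok in runs(data):
        if (tok,) not in table:          # seed a new atom on first sight
            table[(tok,)] = next_code
            next_code += 1
        if prefix + (tok,) in table:     # extend the matched phrase
            prefix = prefix + (tok,)
        else:                            # emit code, learn the longer phrase
            out.append(table[prefix])
            table[prefix + (tok,)] = next_code
            next_code += 1
            prefix = (tok,)
    if prefix:
        out.append(table[prefix])
    return out, table                    # the table travels with the message

codes, table = erlz_encode("aaaabbaaaabb")
print(codes)   # → [0, 1, 2]: twelve characters become three codes
```

Plain LZW over the raw characters would need several more codes for the same input, since each run length would form a distinct dictionary string.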
In the following sections, we present the compression and decompression algorithms of
ERLZ, then we review some of our experimental work to evaluate and compare ERLZ with
the other algorithms. Section 4 presents an improved version of ERLZ with the utilization
of Huffman algorithm.
3.1 The ERLZ Compression Algorithm.
The ERLZ Compression algorithm is shown in Table 2.a. The detailed algorithm and
source code written in C language can be found in [15].
This algorithm begins the compression process with an empty dictionary table that is
incrementally built along the compression process. Each time a new input symbol is read,
ERLZ compares it with previous symbols. All consecutive repeated symbols are
considered a single entry in the Symbol Table with an adjusted frequency. Other Symbol
Table entries are created while scanning the input stream: a new entry is created when
the look-ahead symbol concatenated to the scanned sub-stream is not already an entry in
the table. At that point, a new code is generated for that sub-stream and its value is
also sent to the output stream. In this process, all duplicate consecutive input
symbols are treated as a single symbol with a repetition frequency.
It should be noted that the symbol table entries are of variable length. Each symbol entry is
in itself a string like that generated by the RLE algorithm, i.e., consists of symbols and their
repetition frequencies. Those symbols are codes of other entries in the same Symbol
Table, which might be primitive symbols (of the alphabet used) or composites of
consecutive primitive or other composite symbols. So, ERLZ extends the concept of a symbol to mean
more complex streams and strings. This is a unique characteristic of ERLZ that
contributes to the efficiency of this algorithm in terms of both memory and time as well as
compression ratio.
Table 2.a. The ERLZ Compression Algorithm.
Create and initialize the Symbol Table (Frequency, Symbol, Code).
While not EOF in input stream do
    Repeat until (Sin ≠ Sin+1)
        Get next input symbol (Sin);
        Increment counter;
    End Repeat
    IF (the number of repeated Sin together with the Sin value is not an entry in the Symbol Table)
        Then create a new entry in the Symbol Table;
    End IF
    Continue until (S = String + Sin) is not in the Symbol Table
        S = S + Get next input symbol (Sin);
        IF S is not in the Symbol Table
            Then add a new entry to the Symbol Table;
        End IF
    End Continue
    Output its code to the compressed output data stream.
End While
Output the Symbol Table to the table output stream.
Concatenate the two streams (table stream and output stream) into a single compressed output file.
3.2 The ERLZ Decompression Algorithm.
The ERLZ decompression algorithm is illustrated in Table 2.b. The details of this
algorithm and the source code written in the C language can be found in [15]. As shown,
it has a simple and straightforward implementation. This is due to the aid of the
Symbol Table, each of whose entries can be viewed as an output of the simple RLE algorithm.
The decompression algorithm begins by initializing the Symbol Table with the one read
from the input compressed file. This table is the same dictionary table (Symbol Table)
generated by ERLZ at compression time of that image file, but inverted to simplify the
search for entry codes. For each scanned code in the input compressed file, the table
is searched for the appropriate entry. All input codes should be found in the Symbol
Table; otherwise an error has taken place, e.g., transmission corruption. Each scanned
code in the input stream is replaced in the output stream by the appropriate string of
symbols as per the corresponding table entry, with each symbol repeated according to
the indicated frequency count. It should be noted that this replacement process is
recursive: each table-entry code that appears inside an entry's string is recursively
replaced by the string of the entry corresponding to that sub-string code.
Table 2.b. The ERLZ Decompression Algorithm.
Initialize the dictionary.
(The Symbol Table consists of the following structure: frequency of symbol, the symbol value, and the code.)
While not EOF in input stream do
    Get next code from the compressed input stream.
    IF the code is in the Symbol Table
        Then place in the output stream the symbol value repeated frequency times
             (a recursive process when substituting codes from table entries).
    Else Error (this code was either not compressed by ERLZ or the compressed input is distorted).
    End IF
End While.
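The table-driven expansion can be sketched as follows (illustrative Python; in this flattened sketch each table entry stores its (symbol, run-length) tokens directly, whereas the paper's Symbol Table stores entry codes that are expanded recursively):

```python
def erlz_decode(codes, table):
    """Invert the received table (entry -> code becomes code -> entry) and
    expand each code: every (symbol, count) token becomes symbol * count."""
    inv = {code: entry for entry, code in table.items()}
    out = []
    for code in codes:
        for symbol, count in inv[code]:
            out.append(symbol * count)
    return "".join(out)

# A hand-built table in the sketch's format (hypothetical values, for illustration).
table = {(("a", 4),): 0, (("b", 2),): 1, (("a", 4), ("b", 2)): 2}
print(erlz_decode([0, 1, 2], table))   # → aaaabbaaaabb
```

A code absent from the table would raise a lookup error here, mirroring the error branch in Table 2.b.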
3.3 Experimental Analysis.
For evaluating and comparing ERLZ to its ancestor algorithms, each of them was
separately applied to different images of different types, e.g., engineering drawings,
photos, satellite images, cartoon pictures, and logos (see Appendix B for the tested
images). These images were selected to cover different applications of palettized
images. They can be divided into two groups. The first group contains images with a
small number of colors and a small change in color depth from one area to another in
the same image; hence, they have few details with respect to 16- and 256-color systems.
The second group contains images with sharp differences in color depth within the same
area; hence, they contain more details. Both image sets are in the BMP image file format.
The comparison criteria that we evaluated in this experimental work are: compression
ratio, temporary memory size and compression time. See Table 3 for the results of these
tested compressed images. The analysis of the resulted data reveals the following
conclusions.
In most cases, the compression ratio of ERLZ was better than that of RLE and close to,
though improved over, that of LZW. This was expected because it merges the two
approaches into a single one. For time cost, the RLE technique was the fastest, as it
is the simplest and doesn't have to build time-consuming tables. However, ERLZ had a
better compression time than LZW, especially for large images, because the String
Table of LZW contains larger strings and takes longer to search and compare the input
string against all strings in the table. For the memory required by the compression
process, RLE was again the best of all, as it uses a constant memory size to check for
repeated identical characters and doesn't require the additional storage needed by the
other two algorithms for storing and manipulating their Symbol Tables. On the other
hand, ERLZ consumed less memory than LZW, as most entries of the ERLZ table are shorter
than those of LZW, especially when there are high frequencies of repeated symbols or
patterns; also, the overall number of entries in the ERLZ table is smaller because of
the recursive nature of its entries.
Table 3. Comparison among LZW, RLE and ERLZ (First Level of Compression).
(Ratio = compression ratio %; Time in seconds.)

Original   LZW                            RLE               ERLZ
File       Ratio %   Memory    Time      Ratio %   Time     Ratio %  Memory  Time
Bub        77.68     15261     0.50      48.86     0.001    64.68    264     0.002
Bub5       87.85     127428    1.650     73.123    0.01     90.32    660     0.5
Mark       81.80     267595    13.20     75.98     0.60     88.05    2392    0.280
Map        90.314    280572    7.470     77.697    0.01     92.82    664     0.11
Setup16    83.89     360171    13.648    67.291    0.50     88.25    1304    1.648
Eclipse    91.63     429597    12.654    89.277    0.11     93.81    1588    0.22
Bal        92.253    551431    37.80     90.703    0.11     93.74    4628    0.380
Ger        97.81     541291    14.60     95.845    0.50     96.80    2248    1.648
Tree       95.645    1019911   77.654    94.270    0.11     96.03    3200    0.280
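The ratio figures above are consistent with the compression ratio being defined as the percentage reduction in file size (an interpretation on our part); e.g., for the Bub file, whose original and RLE-compressed sizes (2118 and 1083) are listed with the Figure 2 data:

```python
def compression_ratio(original_size, compressed_size):
    """Percentage reduction in size: 0% means no compression at all."""
    return (1 - compressed_size / original_size) * 100

# Bub under RLE: 2118 bytes in, 1083 bytes out (sizes from the Figure 2 data).
print(round(compression_ratio(2118, 1083), 2))   # → 48.87 (Table 3 lists 48.86)
```

The small difference from the tabulated 48.86 is a rounding artifact; the value matches Table 3's RLE entry for Bub to within 0.01.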
4. IMPROVED ERLZ.
Although the compression ratio of ERLZ was acceptable, improvement is still possible.
Therefore, we investigated whether integrating ERLZ with Huffman coding, as a
statistical compression method, would yield better results: we applied the Huffman
coding technique to the output of ERLZ as a second level of compression. This is
depicted in Figure 1.
Figure 1: Improved ERLZ. (The input is compressed at the first level by ERLZ, which combines RLE and LZW; Huffman coding is then applied to the ERLZ output as the second level of compression, producing the final output.)
4.1 Experimental Analysis.
Table 4 shows a comparative study of applying the improved ERLZ technique to the same
set of test images used in the previous experimental analysis. It shows results with
and without applying Huffman coding, and indicates slight improvements, in most cases,
when Huffman coding is applied to the results of ERLZ as a second compression level.
Table 4. Improved ERLZ (Second Level Compression).

Original   ERLZ                   Improved ERLZ
File       Compression Ratio %    Compression Ratio %
Bub 64.68 62.88
Bub5 90.32 91.38
Mark 88.05 89.00
Map 92.82 93.50
Setup16 88.25 89.19
Eclipse 93.81 95.00
Bal 93.74 94.36
Ger 96.80 97.56
These data are depicted graphically in Figures 2 and 3. Figure 2 compares the outputs
of LZW, RLE and ERLZ for the first compression level, while Figure 3 shows the
comparison for the second compression level.
4.2 Verification
In testing the correctness of our implementations of both compression and
decompression, we designed a program, called Checkit, that matches the outputs of the
decompressed image files against the original uncompressed input image files.
Figure 2. Chart of output sizes for the first order of compression. The underlying
data are the output file sizes per input file (original sizes shown in parentheses):

File:   Bub (2118), Bub5 (53206), Mark (112624), Map (126582), Setup (153718), Eclipse (197770), Bal (251078), Ger (259278), Tree (481078)
LZW:    748    5150   13458   9088   18052  12226   15698   8286   19072
RLE:    1083   14300  27046   28231  50279  21206   23343   10773  27567
ERLZ:   607    6796   20498   12261  24753  16550   19452   6804   20951

Figure 3. Chart of output sizes for the second order of compression. The underlying
data are the output file sizes for the same files, with and without Huffman coding as
a second level:

LZW:       748    5150   13458   9088   18052  12226   15698   8286   19072
LZ+HUF:    786    4584   12379   8218   16602  9872    14158   6323   15475
RLE:       1083   14300  27046   28231  50279  21206   23343   10773  27567
RL+HUF:    564    6151   14741   9688   21853  10691   13950   5516   14707
ERLZ:      607    6796   20498   12261  24753  16550   19452   6804   20951
ERLZ+HUF:  570    4257   13589   6250   15460  9741    14594   4998   14316
5. CONCLUSION.
In this paper we have introduced and discussed a newly proposed
compression/decompression technique, namely ERLZ, that suits the special features of
palettized images and assumes that "images are usually composed of spots of identical
pixels in most of the image space, with exceptions occurring only at object edges".
ERLZ is inspired by two commonly used lossless compression algorithms, namely RLE and
LZW. It integrates them both in such a way as to gain their benefits and avoid their
limitations.
ERLZ was tested on several palettized images of different types, e.g., engineering
drawings, photos, satellite images, cartoon pictures, and logos. These images were
selected to cover different applications of palettized images and are classified into
two groups: images with a small number of colors and a small change in color depth from
one area to another, and images with sharp differences in color depth within the same
area and containing more details. The implementations of both the compression and
decompression algorithms of ERLZ were verified by comparing the decompressed images to
the original uncompressed ones.
Experimental and comparative tests revealed that ERLZ improves on the algorithms that
inspired it in terms of compression ratio, time and memory requirements. Further
improvements are obtained when a statistical coding technique (Huffman encoding) is
applied to the output of ERLZ as a second level of compression.
We believe that the proposed ERLZ technique is suitable in the compression of special
types of images such as atmospheric observatory images and engineering drawings.
REFERENCES.
[1] Jerry D. Gibson et al., Digital Compression for Multimedia, Morgan Kaufmann Publishers, California,
1998.
[2] M. Rabbani, Image & Video Compression Fundamentals and the International Standards NTSC,
IEEE press, 1998.
[3] Belur V. Dasarathy, Image Data Compression – Block Truncation Coding, IEEE press, 1995.
[4] Noura A. Saleh, “A New Compression Algorithm for Color Palettized Images”, an M. Sc. Thesis,
Electronics and Communications Dep., Cairo University, 1998.
[5] Leszek Gasieniec and Wojciech Rytter, “Almost Optimal Fully LZW-Compressed Pattern Matching”,
in the Proceedings of the Data Compression Conference, IEEE, 1998.
[6] Sergei Hludov and Christoph Meinel, “DICOM - Image Compression”, in the Proceedings of the 12th
IEEE Symposium on Computer-Based Medical Systems, 1998.
[7] Kun-Jin Lin, Cheng-Wen Wu, “A Low-Power CAM Design for LZ Data Compression “, IEEE
Transactions on Computers, Vol. 49, No. 10, October 2000.
[8] Fikret Ercal, Mark Allen, Hao Feng, “A Systolic Image Difference Algorithm for RLE-Compressed
Images”, IEEE Transactions on Parallel and Distributed Systems, Vol. 11, No. 5, May 2000.
[9] Khalid Sayood, Introduction to Data Compression, Morgan Kaufmann Publishers, San Francisco,
USA, 1996.
[10] J. Ziv and A. Lempel, “A Universal Algorithm for Sequential Data Compression,” IEEE Transactions
on Information Theory, Vol. IT-23, No. 3, May 1977.
[11] J. Ziv and A. Lempel, “Compression of Individual Sequences via Variable Rate Coding, “ IEEE
Transactions on Information Theory, Vol. IT-24, No. 5, Sept.1978.
[12] T. A. Welch, “A Technique for High Performance Data Compression”, IEEE Computer, Vol. 17,
No. 6, June 1984.
[13] Knuth, D., The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.
[14] P. Surti; L. F. Chao, and A. Tyagi, “Low Power FSM Design Using Huffman-Style Encoding”, in the
Proceedings of the 1997 European Design and Test Conference (ED&TC '97), IEEE, 1997.
[15] A. A. Kolkila, “ Multimedia Compression”, M. Sc. Thesis, Systems & Computer Engineering Dept.,
Al-Azhar University, Cairo, Egypt, 2000.
Appendix A
Compression Algorithms
A.1. RLE Compression Algorithm:
The steps of the RLE compression algorithm are as follows:
Step 1: Read the current stream from the input buffer.
Step 2: For the whole input data stream,
    check the stream for sequences of identical tokens (t).
Step 3: IF there are more than four identical tokens (t):
    Count the number of repeated tokens.
    Replace these identical tokens with:
        (A) A special token (or flag, such as !).
        (B) The number of repeated tokens.
        (C) The token that was counted.
    ELSE
        Output the token(s) as-is to the compressed data (file).
Step 4: End.
A.2. Lempel Ziv Welch (LZW) Encoding Algorithm:
The steps of the LZW compression algorithm are as follows:
Step 1: Initialize the dictionary (the string table, which consists of all possible
    roots of the selected language and a unique code for each of them).
Step 2: Reserve buffer space called prefix (P) and current (C).
    (P contains the sequence of characters (tokens) that precede one
    character; C is the current character.)
Step 3: Set the prefix (P) to empty (only at the start).
Step 4: Read the next character (token) from the input data buffer and put it in C.
Step 5: IF the prefix plus current (P+C) is present in the dictionary:
        Make the string (P+C) the prefix for the next iteration (P = P+C).
    ELSE
        Output the equivalent compressed code for the prefix (P) to the compressed file.
        Add the string P+C to the dictionary (as a new string and code).
        Make the current token the prefix for the next iteration (P = C).
Step 6: IF there are more symbols in the input data:
        Go to Step 4.
    ELSE
        Output the code that denotes the prefix (P) to the output compressed file.
END.