present v0.2

Improvement of Lossless Compression for JPEG Files

Irina Bocharova, Kirill Yurkov, Mikhail Bogdanov, Roman Bolshakov, Alexander Buslaev,
Yuri Konoplev, Anrew Tereskin, Oleg Finkelshteyn

ITMO

autumn 2010 - spring 2011

-: big team from ITMO :- () Compression of JPEG autumn 2010 - spring 2011 1 / 27

Agenda

Purpose

Schemes of encoder and decoder

encoding DC

encoding RUN’s and AC

Levenstein encoder

Arithmetic encoder

Results

Problems


Purpose

Implement a JPEG recoder that reduces the size of the bit stream

Requirement: bit-to-bit correspondence


Scheme of encoder


Scheme of decoder


encoding DC (DC Prediction)

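Neighboring blocks A, B, C of the current block X (figure).

$$P = \begin{cases} DC_C, & |DC_B - DC_A| < |DC_B - DC_C| \\ DC_A, & \text{otherwise} \end{cases}$$

The difference x - P is encoded by the arithmetic encoder.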
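The prediction rule on this slide can be sketched in Python as follows. This is a minimal illustration, not the team's implementation: the function names `predict_dc`, `dc_residual` and `dc_restore` are hypothetical, and the DC values are assumed to be plain integers.

```python
def predict_dc(dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return the prediction P from the DC values of neighbors A, B, C."""
    # If B is closer to A than to C, predict from C; otherwise from A.
    if abs(dc_b - dc_a) < abs(dc_b - dc_c):
        return dc_c
    return dc_a


def dc_residual(x: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Value passed to the entropy coder: x - P."""
    return x - predict_dc(dc_a, dc_b, dc_c)


def dc_restore(residual: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Decoder side: the same neighbors give the same P, so x = residual + P."""
    return residual + predict_dc(dc_a, dc_b, dc_c)


if __name__ == "__main__":
    r = dc_residual(52, dc_a=10, dc_b=11, dc_c=50)
    print(r, dc_restore(r, 10, 11, 50))
```

Because the decoder sees the same already-decoded neighbors, it recomputes the same P and recovers x exactly, which is what the bit-to-bit requirement needs.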


encoding DC (zero map, encoding of the numbers of nonzeros)

y0 y1 y2

y3 x

Context for encoding x:

$$y_0 + \lambda_1 y_1 + \lambda_2 y_2 + \lambda_3 y_3$$
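The context formula above can be illustrated with a short Python sketch. The slide does not give the values of the weights, so everything specific here is an assumption: the neighbors y0..y3 are taken to be binary zero-map flags, and the weights default to (2, 4, 8), which makes each neighbor pattern map to its own context index. The function name `zero_map_context` is hypothetical.

```python
def zero_map_context(y0: int, y1: int, y2: int, y3: int,
                     lambdas: tuple = (2, 4, 8)) -> int:
    """Context index y0 + l1*y1 + l2*y2 + l3*y3 for the symbol x.

    The default weights are an assumption chosen for illustration only.
    """
    l1, l2, l3 = lambdas
    return y0 + l1 * y1 + l2 * y2 + l3 * y3


if __name__ == "__main__":
    # Neighbors: y0 y1 y2 on the row above, y3 to the left of x.
    print(zero_map_context(1, 0, 1, 1))
```

With binary flags and these weights the context takes one of 16 values, so an adaptive model would keep 16 separate sets of statistics.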


AC blocks encoding


Runs and levels encoding

We need to encode the pairs: $(l_0, r_0), (l_1, r_1), \ldots, (l_n, r_n)$

The value n is known to the encoder. To encode the pair $(l_i, r_i)$ we construct a two-dimensional context:

$(n, \; n - i)$
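The two-dimensional context for the run/level pairs can be sketched in Python as below. It assumes the pairs of one block are given as a list indexed 0..n and that the two context components are n and n - i; the function name `run_level_contexts` is hypothetical.

```python
def run_level_contexts(pairs):
    """Attach the context (n, n - i) to each pair (l_i, r_i) of one block."""
    n = len(pairs) - 1  # pairs are indexed 0..n
    return [((n, n - i), pair) for i, pair in enumerate(pairs)]


if __name__ == "__main__":
    block = [(3, 0), (1, 2), (-1, 5)]  # made-up (level, run) pairs
    for context, pair in run_level_contexts(block):
        print(context, pair)
```

The second component counts how many pairs remain in the block, so the model can learn different statistics for the first and the last pairs.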


Arithmetic coding

Arithmetic + Adaptive model


Levenstein code

A universal code encoding the non-negative integers

It works as follows: the code of 0 is "0"; to encode a positive number:

1 Initialize the step count variable C to 1.
2 Write the binary representation of the number without the leading "1" to the beginning of the code.
3 Let M be the number of bits written in step 2.
4 If M is not 0, increment C and repeat from step 2 with M as the new number.
5 Write C "1" bits and a "0" to the beginning of the code.
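The five steps above can be sketched in Python as follows. This is an illustration only: bit strings are represented as Python `str` values for readability, and the function names `levenshtein_encode` and `levenshtein_decode` are hypothetical. The decoder is included to show that the code is uniquely decodable.

```python
def levenshtein_encode(n: int) -> str:
    """Encode a non-negative integer following steps 1-5."""
    if n == 0:
        return "0"
    code = ""
    c = 1                       # step 1: step counter
    while True:
        bits = bin(n)[3:]       # step 2: binary form without "0b" and the leading 1
        code = bits + code      # prepend to the code
        m = len(bits)           # step 3
        if m == 0:              # step 4: stop once nothing was written
            break
        c += 1
        n = m
    return "1" * c + "0" + code  # step 5


def levenshtein_decode(code: str) -> int:
    """Inverse of levenshtein_encode for a single codeword."""
    c = 0
    pos = 0
    while code[pos] == "1":     # count the leading 1 bits
        c += 1
        pos += 1
    pos += 1                    # skip the terminating 0
    if c == 0:
        return 0
    n = 1
    for _ in range(c - 1):      # each round reads n bits and restores the leading 1
        chunk = code[pos:pos + n]
        pos += n
        n = int("1" + chunk, 2)
    return n


if __name__ == "__main__":
    for value in (0, 1, 2, 3, 4, 8, 16):
        print(value, levenshtein_encode(value))
```

Small numbers get short codewords (1 is coded in two bits, 2 and 3 in four bits), which suits the rare event of introducing a previously unseen symbol.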



Some samples



Some information about samples



Results and Comparison

Picture     Size      PackJpg   PCAR
A10         842 KB    19.2 %    11.5 %
Afisha      213 KB    28.6 %    20.0 %
Bird         82 KB    17.7 %     9.4 %
Document    103 KB    29.7 %    25.4 %
Flower        5 KB    18.5 %     6.0 %
Monkey       30 KB    30.6 %    24.7 %
Portrait     63 KB    25.5 %    25.0 %



Problems (bit-to-bit)

We need to read and write JFIF (JPEG) files maintaining bitwise identity.

Two possible implementation paths:

Full parser: file → internal structures → file

Pros: very flexible, easy to process once we have the structure
Cons: implementing a writer that adheres to the bitwise identity requirement is difficult. High serialization overhead.

Stream encoder: leaves most of the non-interesting metadata as is (compressing it with general-purpose stream methods)

Pros: faster, no serialization code (the decoder reuses the JPEG header parser from the encoder), guarantees exactness in metadata
Cons: we lose flexibility and save some redundant information (e.g. standard Huffman tables)

After several attempts, we settled on the latter solution, which works for an estimated 95% of JPEG files in the wild (for those we are unable to process, a diagnostic is provided).
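The bitwise identity requirement can be checked with a simple round-trip test, sketched below in Python. This is not the team's code: `zlib` merely stands in for the "general-purpose stream methods" applied to metadata, and the names `pack_metadata`, `unpack_metadata` and `check_bitwise_identity` are hypothetical.

```python
import zlib


def pack_metadata(raw: bytes) -> bytes:
    """Compress untouched metadata bytes (zlib is an assumed stand-in)."""
    return zlib.compress(raw, 9)


def unpack_metadata(packed: bytes) -> bytes:
    """Restore the metadata bytes exactly."""
    return zlib.decompress(packed)


def check_bitwise_identity(encode, decode, original: bytes) -> bool:
    """True if decode(encode(x)) reproduces x byte for byte."""
    return decode(encode(original)) == original


if __name__ == "__main__":
    # Made-up header-like bytes, used only to exercise the check.
    sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00" + bytes(range(64)) * 4
    print(check_bitwise_identity(pack_metadata, unpack_metadata, sample))
```

Running such a check on every input file is one way to produce the diagnostic mentioned above for files that cannot be processed.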



Problems (Unknown alphabet size)

Start from an alphabet that contains a single symbol, $\Omega = \{\zeta\}$, where $\zeta$ is the escape symbol.

For each new input symbol $a = a_{t+1}$:

1 If $a \in \Omega$: encode $a$ with probability $p(a) = \frac{\tau(a)}{t+1}$.

2 If $a \notin \Omega$:

encode the escape symbol with probability $p(\zeta) = \frac{\tau(\zeta)}{t+1}$

encode $a$ with the Levenstein code

set $\Omega = \Omega \cup \{a\}$
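The escape mechanism can be sketched in Python as below. The slide does not define the counts, so two assumptions are made and marked in the code: τ(a) is the number of times a occurred among the first t symbols, and the escape symbol keeps a fixed count of 1, so that all counts sum to t + 1. The function name `escape_model_events` is hypothetical, and the sketch only lists the probabilities that would be handed to the arithmetic coder.

```python
from fractions import Fraction


def escape_model_events(symbols):
    """List the coding events and their model probabilities.

    Assumptions (not stated on the slide): tau(a) is the occurrence count
    of a so far, and the escape symbol has a fixed count of 1.
    """
    counts = {}       # tau(a) for the ordinary symbols already in the alphabet
    esc_count = 1     # assumed fixed count of the escape symbol
    events = []
    for t, a in enumerate(symbols):
        total = t + 1  # equals esc_count + sum(counts.values())
        if a in counts:
            events.append(("symbol", a, Fraction(counts[a], total)))
        else:
            # The escape is coded arithmetically; the new symbol itself
            # would then be written with the Levenstein code.
            events.append(("escape", a, Fraction(esc_count, total)))
            counts[a] = 0  # add a to the alphabet
        counts[a] += 1
    return events


if __name__ == "__main__":
    for event in escape_model_events("aba"):
        print(event)
```

Under these assumptions the escape probability shrinks as 1 / (t + 1), so introducing new symbols becomes more expensive the longer the input has been observed.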



Thanks

Questions?



References

[Rissanen, J.J.; Langdon, G.G., 1979] Arithmetic coding. IBM Journal of Research and Development, pp. 149-162.

[Levenstein V.I., 1968] About redundancy and slowdown of difference coding of natural numbers. Problems of Cybernetics, Moscow, Science, pp. 173-179.

[Krichevsky, R.E.; Trofimov V.K., 1981] The Performance of Universal Encoding. IEEE Trans. Information Theory, Vol. IT-27, No. 2, pp. 199–207.



other information

