present v0.2

Improvement of Lossless Compression for JPEG Files

Irina Bocharova, Kirill Yurkov, Mikhail Bogdanov, Roman Bolshakov, Alexander Buslaev,

Yuri Konoplev, Andrew Tereskin, Oleg Finkelshteyn

ITMO

autumn 2010 - spring 2011

-: big team from ITMO :- () Compression of JPEG autumn 2010 - spring 2011 1 / 27

Agenda

Purpose

Schemes of encoder and decoder

Encoding DC

Encoding runs and AC

Levenstein encoder

Arithmetic encoder

Results

Problems

Purpose

Implement a JPEG recoder that reduces the size of the bit stream

Requirement: bit-for-bit correspondence between the original file and the decoded file

Scheme of encoder

Scheme of decoder

Encoding DC (DC prediction)

\[
P = \begin{cases} DC_C, & |DC_B - DC_A| < |DC_B - DC_C| \\ DC_A, & \text{otherwise} \end{cases}
\]

The difference x − P is encoded by the arithmetic encoder.
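The prediction rule can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the slides do not state which neighboring blocks A, B and C denote, so their DC values are simply passed in as arguments.

```python
def predict_dc(dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return DC_C when |DC_B - DC_A| < |DC_B - DC_C|, otherwise DC_A."""
    if abs(dc_b - dc_a) < abs(dc_b - dc_c):
        return dc_c
    return dc_a


def dc_residual(x: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Value x - P that would be handed to the arithmetic encoder."""
    return x - predict_dc(dc_a, dc_b, dc_c)


def restore_dc(residual: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Decoder side: the same prediction is added back to the residual."""
    return residual + predict_dc(dc_a, dc_b, dc_c)
```

Because the decoder can compute the same prediction from already decoded neighbors, adding it back to the residual restores x exactly, which is what the bit-for-bit requirement needs.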

Encoding DC (zero map, encoding the numbers of nonzero coefficients)

[Figure: neighboring values y_0, y_1, y_2]

Context for encoding x:

\[
y_0 + \lambda_1 y_1 + \lambda_2 y_2 + \lambda_3 y_3
\]
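A context value of this form could be computed as in the sketch below. The weights, the clamping to a fixed number of context classes, and all function names are assumptions made for illustration; the slides give neither the actual weight values nor the way the sum is mapped to a context.

```python
def context_value(y, lambdas):
    """Return y0 + l1*y1 + l2*y2 + l3*y3 for neighbors y = [y0, y1, y2, y3]."""
    if len(y) != len(lambdas) + 1:
        raise ValueError("need exactly one weight per neighbor after y0")
    return y[0] + sum(lam * yi for lam, yi in zip(lambdas, y[1:]))


def context_index(value, num_contexts):
    """Hypothetical quantizer: truncate and clamp to 0 .. num_contexts - 1."""
    return max(0, min(int(value), num_contexts - 1))
```

For example, `context_value([1, 2, 3, 4], [0.5, 0.25, 0.25])` gives 3.75, which `context_index(3.75, 4)` maps to context 3.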

AC blocks encoding

Runs and levels encoding

We need to encode the pairs: (l_0, r_0), (l_1, r_1), ..., (l_n, r_n)

The value n is known to the encoder. To encode the pair (l_i, r_i) we construct a two-dimensional context:

n − i
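The sketch below shows one way to obtain such pairs from the AC coefficients of a block and to compute the n − i component of the context. It assumes that r_i counts the zeros preceding the nonzero level l_i and that trailing zeros are implied by n; the slides do not state this convention, and the function names are hypothetical.

```python
def run_level_pairs(ac):
    """Split a zigzag-ordered AC sequence into (level, run) pairs.

    Assumption: run = number of zeros before the nonzero level.
    Trailing zeros produce no pair.
    """
    pairs = []
    run = 0
    for coeff in ac:
        if coeff == 0:
            run += 1
        else:
            pairs.append((coeff, run))
            run = 0
    return pairs


def remaining_counts(pairs):
    """Return n - i for every pair index i, with pairs numbered 0 .. n."""
    n = len(pairs) - 1
    return [n - i for i in range(len(pairs))]
```

For the sequence `[5, 0, 0, -2, 1, 0, 0, 0]` this yields the pairs `(5, 0), (-2, 2), (1, 0)` and the context components 2, 1, 0.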

Arithmetic coding

Arithmetic coder + adaptive model
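As an illustration of what an adaptive model supplies to an arithmetic coder, the sketch below keeps symbol counts and returns the cumulative-frequency interval for a symbol. It is a generic textbook-style model over a fixed alphabet, assumed here for illustration only; the class name, the initial counts of 1 and the update rule are not taken from the slides.

```python
import math


class AdaptiveModel:
    """Frequency counts over symbols 0 .. alphabet_size - 1."""

    def __init__(self, alphabet_size):
        # Start every symbol at count 1 so no probability is zero.
        self.counts = [1] * alphabet_size

    def interval(self, symbol):
        """Return (low, high, total) cumulative frequencies for symbol."""
        low = sum(self.counts[:symbol])
        return low, low + self.counts[symbol], sum(self.counts)

    def update(self, symbol):
        """Adapt the model after the symbol has been coded."""
        self.counts[symbol] += 1


def ideal_code_length(low, high, total):
    """Bits an ideal arithmetic coder spends on this interval."""
    return -math.log2((high - low) / total)
```

Encoder and decoder call `update` after every symbol, so both sides always see the same counts and no frequency table has to be transmitted.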


Levenstein code

A universal code for the non-negative integers

It works as follows: the code of 0 is "0"; to encode a positive number we do the following:

1 Initialize the step counter C to 1.

2 Write the binary representation of the number without the leading "1" to the beginning of the code.

3 Let M be the number of bits written in step 2.

4 If M is not 0, increment C and repeat from step 2 with M as the new number.

5 Write C "1" bits and a "0" to the beginning of the code.
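The steps above translate directly into code. The sketch below works on strings of "0" and "1" characters for readability rather than on packed bits, and the function names are chosen here for illustration. The decoder is not described on the slide; it is the standard inverse: count the leading "1" bits, then rebuild the number level by level.

```python
def levenstein_encode(n: int) -> str:
    """Encode a non-negative integer following steps 1-5 above."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n == 0:
        return "0"
    code = ""
    c = 1                      # step 1
    while True:
        bits = bin(n)[3:]      # step 2: strip "0b" and the leading "1"
        code = bits + code
        m = len(bits)          # step 3
        if m == 0:             # step 4
            break
        c += 1
        n = m
    return "1" * c + "0" + code  # step 5


def levenstein_decode(code: str) -> int:
    """Decode one codeword from the start of a bit string."""
    c = 0
    pos = 0
    while code[pos] == "1":    # count the leading "1" bits
        c += 1
        pos += 1
    pos += 1                   # skip the terminating "0"
    if c == 0:
        return 0
    n = 1
    for _ in range(c - 1):
        chunk = code[pos:pos + n]   # next n bits, leading "1" omitted
        pos += n
        n = int("1" + chunk, 2)
    return n
```

For example, 1 is encoded as "10", 2 as "1100" and 4 as "1110000".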

Some samples

Some information about samples

Results and Comparison

Picture     Size      PackJpg    PCAR
A10         842 KB    19.2 %     11.5 %
Afisha      213 KB    28.6 %     20.0 %
Bird         82 KB    17.7 %      9.4 %
Document    103 KB    29.7 %     25.4 %
Flower        5 KB    18.5 %      6.0 %
Monkey       30 KB    30.6 %     24.7 %
Portrait     63 KB    25.5 %     25.0 %

Problems (bit-to-bit)

We need to read and write JFIF (JPEG) files while maintaining bitwise identity.

Two possible implementation paths:

Full parser: file → internal structures → file

Pros: very flexible, easy to process once we have the structure

Cons: implementing a writer that adheres to the bitwise identity requirement is difficult. High serialization overhead.

Stream encoder: leaves most of the non-interesting metadata as is (compressing it with general-purpose stream methods)

Pros: faster, no serialization code (the decoder reuses the JPEG header parser from the encoder), guarantees exactness in metadata

Cons: we lose flexibility and save some redundant information (e.g. standard Huffman tables)

After several attempts, we settled on the latter solution, which works for an estimated 95% of JPEG files in the wild (for those we are unable to process, a diagnostic is provided).
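The bitwise identity requirement can be checked mechanically with a round-trip test. In the sketch below, zlib stands in for the unnamed general-purpose stream method applied to metadata; that choice and all function names are assumptions made for illustration.

```python
import zlib


def pack_metadata(segment: bytes) -> bytes:
    """Compress a metadata segment as an opaque byte stream (zlib stand-in)."""
    return zlib.compress(segment, 9)


def unpack_metadata(packed: bytes) -> bytes:
    """Restore the metadata segment exactly as it was."""
    return zlib.decompress(packed)


def is_bit_exact(original: bytes, encode, decode) -> bool:
    """True when decode(encode(x)) reproduces x byte for byte."""
    return decode(encode(original)) == original
```

Treating metadata as opaque bytes makes exactness automatic for that part of the file, at the cost of keeping redundant content such as standard Huffman tables.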

Problems (Unknown alphabet size)

Start from an alphabet that contains a single symbol, Ω = {ζ}, where ζ is the escape symbol.

For each new input symbol a_{t+1}:

1 a ∈ Ω: encode a with probability distribution p(a) = τ(a)

2 a ∉ Ω:

encode the escape symbol with probability distribution p(a) = τ(a) / (t+1)

encode a with the Levenstein code

Ω = Ω ∪ {a}
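The escape mechanism can be sketched as follows. The model only records which coding events would be produced; the arithmetic coder and the Levenstein coder themselves are left out. The class name, the event tuples and the probability estimate (symbol count divided by the total count, with the escape count fixed at 1 so that the total after t symbols is t + 1) are assumptions made for illustration, not the authors' exact estimator.

```python
from fractions import Fraction

ESC = "ESC"  # stand-in for the escape symbol


class EscapeModel:
    """Adaptive model over an alphabet that grows as symbols appear."""

    def __init__(self):
        self.counts = {ESC: 1}  # the alphabet starts with the escape only

    def code_events(self, symbol):
        """Return the coding events for one input symbol and update counts."""
        total = sum(self.counts.values())
        if symbol != ESC and symbol in self.counts:
            events = [("model", symbol, Fraction(self.counts[symbol], total))]
            self.counts[symbol] += 1
        else:
            events = [
                ("model", ESC, Fraction(self.counts[ESC], total)),
                ("levenstein", symbol, None),  # raw value sent separately
            ]
            self.counts[symbol] = 1            # add the symbol to the alphabet
        return events
```

Feeding the symbols 7, 7, 3 produces an escape plus the Levenstein code for the first 7, a modeled 7 with probability 1/2, and another escape with probability 1/3 followed by the Levenstein code of 3.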

Thanks

Questions ?

References

[Rissanen, J.J.; Langdon, G.G., 1979] Arithmetic coding. IBM Journal of Research and Development, pp. 149-162.

[Levenstein, V.I., 1968] About redundancy and slowdown of difference coding of natural numbers. Problems of Cybernetics, Moscow, Science, pp. 173-179.

[Krichevsky, R.E.; Trofimov, V.K., 1981] The Performance of Universal Encoding. IEEE Trans. Information Theory, Vol. IT-27, No. 2, pp. 199-207.
