present v0.2



Page 1: Present v0.2

Improvement of lossless Compression for JPEG files

Irina Bocharova, Kirill Yurkov, Mikhail Bogdanov, Roman Bolshakov, Alexander Buslaev, Yuri Konoplev, Anrew Tereskin, Oleg Finkelshteyn

ITMO

autumn 2010 - spring 2011

-: big team from ITMO :- () Compression of JPEG autumn 2010 - spring 2011 1 / 27

Page 2: Present v0.2

Agenda

Purpose

Schemes of encoder and decoder

encoding DC

encoding RUN’s and AC

Levenstein encoder

Arithmetic encoder

Results

Problems


Page 3: Present v0.2

Purpose

Implement a JPEG recoder that reduces the size of the bit stream

Requirement: bit-to-bit correspondence (the decoder must restore the original JPEG file exactly)


Page 4: Present v0.2

Scheme of encoder


Page 5: Present v0.2

Scheme of decoder


Page 6: Present v0.2

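encoding DC (DC Prediction)

A, B and C are neighbouring blocks of the current block X (the layout figure is not reproduced here). The prediction P of the DC coefficient of X is

$$
P = \begin{cases} DC_C, & |DC_B - DC_A| < |DC_B - DC_C| \\ DC_A, & \text{otherwise} \end{cases}
$$

The difference $x - P$, where $x$ is the DC coefficient of X, is encoded by the arithmetic encoder.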
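The prediction rule above can be sketched in a few lines of Python. This is only an illustration of the formula on the slide: the function names `predict_dc`, `dc_residual` and `dc_restore` are hypothetical, and the geometric positions of A, B and C are left to the caller because the layout figure is not available.

```python
def predict_dc(dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return P: DC_C if |DC_B - DC_A| < |DC_B - DC_C|, otherwise DC_A."""
    if abs(dc_b - dc_a) < abs(dc_b - dc_c):
        return dc_c
    return dc_a


def dc_residual(x: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Value handed to the arithmetic encoder: x - P."""
    return x - predict_dc(dc_a, dc_b, dc_c)


def dc_restore(residual: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Decoder side: the same prediction is added back to the residual."""
    return residual + predict_dc(dc_a, dc_b, dc_c)
```

Because the decoder has already reconstructed A, B and C when it reaches X, it can form the same P and recover x without loss.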


Page 7: Present v0.2

encoding DC (zero map, numbers of nonzero encoding)

Neighbourhood of the current element x:

y0 y1 y2
y3 x

Context for encoding x:

$y_0 + \lambda_1 y_1 + \lambda_2 y_2 + \lambda_3 y_3$
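As a small illustration of the context formula, the sketch below computes the weighted sum of the four neighbours. The slides do not give the values of the weights, so the default `(2, 4, 8)` is purely an assumption chosen for the example: with 0/1 zero-map flags it makes every neighbourhood pattern map to a distinct context index from 0 to 15.

```python
def context_index(y0: int, y1: int, y2: int, y3: int,
                  lambdas=(2, 4, 8)) -> int:
    """Compute y0 + l1*y1 + l2*y2 + l3*y3.

    The weights in `lambdas` are hypothetical; the slides only name them
    lambda_1, lambda_2, lambda_3 without giving values.
    """
    l1, l2, l3 = lambdas
    return y0 + l1 * y1 + l2 * y2 + l3 * y3
```

The resulting integer would then select which adaptive probability model is used to encode x.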


Page 8: Present v0.2

AC blocks encoding


Page 9: Present v0.2

Runs and levels encoding

We need to encode the pairs $(l_0, r_0), (l_1, r_1), \dots, (l_n, r_n)$.

The value $n$ is known to the encoder. To encode the pair $(l_i, r_i)$ we construct a two-dimensional context:

$(n, \; n - i)$
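A minimal sketch of how the two-dimensional context (n, n − i) could select per-context statistics is given below. The function names and the use of plain frequency counters are assumptions made for illustration; the slides do not describe how the statistics are stored.

```python
from collections import Counter, defaultdict


def contexts_for_block(pairs):
    """Return the context (n, n - i) for each pair (l_i, r_i), i = 0..n."""
    n = len(pairs) - 1
    return [(n, n - i) for i in range(len(pairs))]


def collect_statistics(blocks):
    """Accumulate one frequency table of pairs per context (hypothetical model)."""
    stats = defaultdict(Counter)
    for pairs in blocks:
        for ctx, pair in zip(contexts_for_block(pairs), pairs):
            stats[ctx][pair] += 1
    return stats
```

For a block with three pairs the contexts are (2, 2), (2, 1) and (2, 0), so the model can distinguish the first pair of a block from the last one.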


Page 10: Present v0.2

Arithmetic coding

Arithmetic + Adaptive model


Page 11: Present v0.2

Levenstein code

A universal code encoding the non-negative integers.

It works as follows: the code of 0 is "0"; to encode a positive number we do the following:

1 Initialize the step count variable C to 1.
2 Write the binary representation of the number without the leading "1" to the beginning of the code.
3 Let M be the number of bits written in step 2.
4 If M is not 0, increment C and repeat from step 2 with M as the new number.
5 Write C "1" bits and a "0" to the beginning of the code.
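The steps above can be sketched in Python as follows. Bit strings are used instead of packed bits to keep the example readable; the function names are hypothetical, and the decoder is the straightforward inverse of the procedure rather than something taken from the slides.

```python
def levenstein_encode(n: int) -> str:
    """Encode a non-negative integer as a string of '0'/'1' characters."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n == 0:
        return "0"
    code = ""
    c = 1                          # step 1
    while True:
        bits = bin(n)[3:]          # step 2: binary form without '0b' and the leading '1'
        code = bits + code
        m = len(bits)              # step 3
        if m == 0:                 # step 4
            break
        c += 1
        n = m
    return "1" * c + "0" + code    # step 5


def levenstein_decode(bits: str, pos: int = 0):
    """Decode one integer starting at `pos`; return (value, next position)."""
    c = 0
    while bits[pos] == "1":        # count the leading '1' bits
        c += 1
        pos += 1
    pos += 1                       # skip the terminating '0'
    if c == 0:
        return 0, pos
    n = 1
    for _ in range(c - 1):         # each round reads n bits and restores the leading '1'
        n = int("1" + bits[pos:pos + n], 2)
        pos += len(bin(n)) - 3
    return n, pos
```

For example, `levenstein_encode(4)` gives `"1110000"`: three steps (C = 3), a separating "0", then the stripped binary forms of 2 and 4.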


Page 12: Present v0.2

Some samples


Page 13: Present v0.2

Some information about samples


Page 14: Present v0.2

Results and Comparison

Picture    Size     PackJpg   PCAR
A10        842 KB   19.2 %    11.5 %
Afisha     213 KB   28.6 %    20.0 %
Bird        82 KB   17.7 %     9.4 %
Document   103 KB   29.7 %    25.4 %
Flower       5 KB   18.5 %     6.0 %
Monkey      30 KB   30.6 %    24.7 %
Portrait    63 KB   25.5 %    25.0 %


Page 15: Present v0.2

Problems (bit-to-bit)

We need to read and write JFIF (JPEG) files maintaining bitwise identity.

Two possible implementation paths:

Full parser: file → internal structures → file

Pros: very flexible, easy to process once we have the structure

Cons: implementing a writer adhering to the bitwise identity requirement is difficult. High serialization overhead.

Stream encoder: leaves most of the non-interesting metadata as is (compressing it with general-purpose stream methods)

Pros: faster, no serialization code (the decoder reuses the JPEG header parser from the encoder), guarantees exactness in metadata

Cons: we lose flexibility and store some redundant information (e.g. standard Huffman tables)

After several attempts, we settled on the latter solution, which works for an estimated 95% of JPEG files in the wild (for those we are unable to process, a diagnostic is provided).
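The stream approach can be illustrated with a small round-trip sketch. It is not the team's implementation: `zlib` stands in for the unnamed "general-purpose stream method", the container layout (a 4-byte length prefix) is invented for the example, and the marker walk assumes a simple file in which every segment before SOS carries a length field and there are no fill bytes. Anything else raises an error, playing the role of the diagnostic.

```python
import zlib


def find_sos(data: bytes) -> int:
    """Return the offset of the SOS marker (0xFFDA) in a simple JPEG file."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file: SOI marker is missing")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"unsupported layout at offset {pos}")
        if data[pos + 1] == 0xDA:
            return pos
        # Segment length is big-endian and includes the two length bytes.
        seg_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + seg_len
    raise ValueError("no SOS marker found")


def pack(data: bytes) -> bytes:
    """Compress the metadata before SOS with zlib; keep the rest untouched."""
    sos = find_sos(data)
    header = zlib.compress(data[:sos], 9)
    return len(header).to_bytes(4, "big") + header + data[sos:]


def unpack(blob: bytes) -> bytes:
    """Restore the original bytes exactly."""
    size = int.from_bytes(blob[:4], "big")
    return zlib.decompress(blob[4:4 + size]) + blob[4 + size:]
```

In a real recoder the part after SOS would go to the specialized DC/AC coder instead of being copied; the point of the sketch is only that metadata treated as an opaque byte stream is restored bit for bit.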


Page 16: Present v0.2

Problems (Unknown alphabet size)

Start from an alphabet containing a single symbol, $\Omega = \{\zeta\}$, where $\zeta$ is the escape symbol.

For each new input symbol $a = a_{t+1}$:

1 If $a \in \Omega$: encode $a$ with probability $p(a) = \frac{\tau(a)}{t+1}$.

2 If $a \notin \Omega$: encode the escape symbol with probability $p(\zeta) = \frac{\tau(\zeta)}{t+1}$, encode $a$ with the Levenstein code, and set $\Omega = \Omega \cup \{a\}$.
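The escape mechanism can be sketched as a model that only reports which event the arithmetic coder would see and with what probability. Two assumptions are made here because the slides do not define τ: τ(a) is taken to be the number of times a has occurred among the first t symbols, and τ(ζ) is fixed at 1, which makes the probabilities over Ω sum to one. The function name `model_events` is hypothetical.

```python
from fractions import Fraction


def model_events(symbols):
    """Yield-like list of (kind, symbol, probability) for each input symbol.

    kind == "sym": the symbol is already in the alphabet and is coded directly.
    kind == "esc": the escape symbol is coded; the symbol itself would then be
                   sent with the Levenstein code and added to the alphabet.
    """
    counts = {}                    # tau(a) for every known symbol a
    t = 0                          # number of symbols processed so far
    events = []
    for a in symbols:
        total = t + 1              # denominator t + 1 (escape counts as 1)
        if a in counts:
            events.append(("sym", a, Fraction(counts[a], total)))
        else:
            events.append(("esc", a, Fraction(1, total)))
            counts[a] = 0
        counts[a] += 1
        t += 1
    return events
```

For the input `[5, 5, 7, 5]` the model produces an escape with probability 1, then 5 with probability 1/2, an escape with probability 1/3, and 5 with probability 1/2.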


Page 17: Present v0.2

Thanks

Questions ?


Page 18: Present v0.2

References

[Rissanen, J.J.; Langdon, G.G., 1979] Arithmetic coding. IBM Journal of Research and Development, pp. 149-162.

[Levenstein V.I., 1968] About redundancy and slowdown of difference coding of natural numbers. Problems of cybernetics, Moscow, Science, pp. 173-179.

[Krichevsky, R.E.; Trofimov V.K., 1981] The Performance of Universal Encoding. IEEE Trans. Information Theory, Vol. IT-27, No. 2, pp. 199–207.


Page 19: Present v0.2

other information
