Ripping your PDF files apart

Post on 21-Oct-2014

DESCRIPTION

iText Summit 2012 by Mark Stephens, CEO/Developer at IDRsolutions, explaining "What you really need to know about the guts of your PDF files"

TRANSCRIPT

RIPPING YOUR PDF FILES APART

What you need to know about what goes on inside your PDF files

Mark Stephens

Thursday, 29 March 12

Mark’s Bio

Working with Java and PDF since 1997
Founded IDRsolutions 1999
Speaker at Seybold, JavaOne, Business of Software
MA degree in Mediaeval History from St Andrews (how useless is that)

Ask me about Java, PDF, business or anything which happened before 1500 AD

BUT FIRST SOME KITTENS...

The support team at IDRsolutions are waiting for your call (maybe)

The PDF reference guide

Loading page 1124 of a file

Word
Read pages 1-1123 (time passes - scroll bar shrinks)
Found it (eventually)

PDF
Read the metadata refs table(s) - where do I find all the objects
Skip to page 1124

PDF (in detail)
Read the refs table(s) - where do I find all the objects
Read the Root object - points to the Pages object
Read object for page 1124 (tells me the linked font, image, content objects)
Draw it
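
The "PDF (in detail)" steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any particular viewer does it: it handles a single classic xref table (no xref streams, no incremental updates), assumes a flat page tree, and uses regular expressions where a real parser would tokenize. The helper names (`build_sample_pdf`, `read_xref`, `read_object`, `page_object`) are made up for this example.

```python
import re


def build_sample_pdf() -> bytes:
    """Build a tiny two-page PDF in memory with a correct classic xref table."""
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R 4 0 R] /Count 2 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 300 300] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def read_xref(pdf: bytes) -> tuple[dict[int, int], bytes]:
    """Step 1: start at the END of the file and read the refs table."""
    tail = pdf[-1024:]
    xref_pos = int(re.findall(rb"startxref\s+(\d+)", tail)[-1])
    if not pdf.startswith(b"xref", xref_pos):
        raise ValueError("only classic xref tables are handled in this sketch")
    header = re.compile(rb"\s*(\d+) (\d+)[ \r]*\n")  # "first count" line
    pos = xref_pos + len(b"xref")
    offsets: dict[int, int] = {}
    while (m := header.match(pdf, pos)):
        first, count = int(m.group(1)), int(m.group(2))
        pos = m.end()
        for i in range(count):
            entry = pdf[pos:pos + 20]
            if entry[17:18] == b"n":  # "n" = in use, "f" = free
                offsets[first + i] = int(entry[:10])
            pos += 20
    trailer = pdf[pdf.index(b"trailer", pos):]
    return offsets, trailer


def read_object(pdf: bytes, offsets: dict[int, int], num: int) -> bytes:
    """Jump straight to an object; nothing before it needs to be read."""
    start = offsets[num]
    return pdf[start:pdf.index(b"endobj", start)]


def ref(body: bytes, key: bytes) -> int:
    """Object number of an indirect reference such as '/Root 1 0 R'."""
    return int(re.search(rb"/" + key + rb"\s+(\d+)\s+\d+\s+R", body).group(1))


def page_object(pdf: bytes, page_index: int) -> bytes:
    """Steps 2 and 3: Root -> Pages -> the page we want (flat tree assumed)."""
    offsets, trailer = read_xref(pdf)
    root = read_object(pdf, offsets, ref(trailer, b"Root"))
    pages = read_object(pdf, offsets, ref(root, b"Pages"))
    kids = re.search(rb"/Kids\s*\[(.*?)\]", pages, re.S).group(1)
    kid_nums = [int(n) for n in re.findall(rb"(\d+)\s+\d+\s+R", kids)]
    return read_object(pdf, offsets, kid_nums[page_index])


if __name__ == "__main__":
    print(page_object(build_sample_pdf(), 1).decode())
```

The cost of reaching the second page here is the same as reaching the first: one table lookup and one seek, which is the point of the slide.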

Your PDF file is a Tree

A root linked to all the branches

The PDF reference guide

Like you have never seen it before...

You can use vi or emacs if you prefer

The PDF reference guide

End of the file

The PDF root object

Like you have never seen it before...

PDF files on the web

Isn’t having the marker at the end a problem?

Not if you create it properly

Key takeaways from the PDF structure

We do not need to load the whole file
It is equally fast to load any part of it
It is very easy to replace objects with new versions
There are certain key locations - like at the end of a file
You should not edit it in a text editor
If you want to use PDF files across the Internet, there is a special mode to make these load the most important parts first.
Lots of features need you to set up the PDF file correctly.
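
The slide does not name the "special mode" for Internet use. Assuming it means linearization (shown as "Fast Web View" in some viewers), a rough way to check whether a file was created that way is to look for the linearization dictionary near the start of the file. The function name `is_linearized` is made up for this sketch, and the check is a heuristic: it only looks for the `/Linearized` key in the first 1024 bytes and does not validate the hint tables.

```python
def is_linearized(pdf: bytes) -> bool:
    """Heuristic: a linearized file keeps its /Linearized dictionary
    within the first 1024 bytes, so a reader can find it immediately."""
    return b"/Linearized" in pdf[:1024]


# Two hand-written file beginnings, for illustration only.
plain_start = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
fast_start = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"

if __name__ == "__main__":
    print(is_linearized(plain_start))
    print(is_linearized(fast_start))
```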

Those PDF objects in more detail

All PDF objects have:
1. An ID number
2. (Optional) A set of dictionary key/value pairs
3. (Optional) A block of binary data.
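
Those three parts map directly onto what an object looks like in the file. Below is a minimal Python sketch that splits one raw object into them. The `PdfObject` class and `parse_object` function are hypothetical names for this example, and the regex approach is a simplification: a real reader would use the `/Length` entry to find the end of the binary data rather than searching for `endstream`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class PdfObject:
    number: int                  # 1. the ID number
    generation: int              #    (plus its generation number)
    dictionary: Optional[bytes]  # 2. optional dictionary, still unparsed
    stream: Optional[bytes]      # 3. optional block of binary data


OBJ = re.compile(rb"(\d+)\s+(\d+)\s+obj\s*(.*?)\s*endobj", re.S)
STREAM = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.S)


def parse_object(raw: bytes) -> PdfObject:
    """Split '<num> <gen> obj ... endobj' into its three parts."""
    m = OBJ.search(raw)
    if m is None:
        raise ValueError("no object found")
    number, generation, body = int(m.group(1)), int(m.group(2)), m.group(3)
    stream = None
    s = STREAM.search(body)
    if s:
        stream = s.group(1)
        body = body[:s.start()].strip()
    dictionary = body if body.startswith(b"<<") else None
    return PdfObject(number, generation, dictionary, stream)


if __name__ == "__main__":
    raw = b"7 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
    print(parse_object(raw))
```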

PDF images are not TIFF, PNG or JPEG

A word on colour

DeviceRGB
CalRGB
DeviceCMYK
ICC
Separation
DeviceN
DeviceGray
CalGray
Lab
Pattern
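
The slide only lists the colour space names. As a small illustration of why the choice matters, here is the simple uncalibrated DeviceCMYK to DeviceRGB conversion in Python. This is a sketch of the naive device conversion only; it does no colour management, so calibrated or ICC spaces would need a proper colour engine. The function name `cmyk_to_rgb` is made up for this example.

```python
def cmyk_to_rgb(c: float, m: float, y: float, k: float) -> tuple[float, float, float]:
    """Naive device conversion; all components are in the range 0.0 to 1.0."""
    r = 1.0 - min(1.0, c + k)
    g = 1.0 - min(1.0, m + k)
    b = 1.0 - min(1.0, y + k)
    return (r, g, b)


if __name__ == "__main__":
    # "0 0 0 1 k" in a page stream means CMYK black.
    print(cmyk_to_rgb(0, 0, 0, 1))
```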

PDF pages are ‘drawn’

0 0 0 1 k                               set CMYK color of text to black
BT                                      start of some text
/T1_0 1 Tf                              use the font defined as T1_0 elsewhere
0 Tc 0 Tw 0 Ts 100 Tz 0 Tr              set other text properties
7.5003 0 0 7.5003 272.1643 540.2979 Tm  position onscreen
(L*) Tj                                 draw the text L*
/T1_1 1 Tf                              change font
0.856 0 Td                              move to a different location onscreen
( = 100) Tj                             draw the text = 100
-0.324 -1.133 Td                        move to a different location onscreen
[(whit)6(e)] TJ                         draw the text white (adjust the spacing between t and e)
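
Because the page is a list of drawing commands, pulling the text back out means replaying those commands. The Python sketch below tokenizes the stream from the slide and collects what the text-showing operators draw. It is deliberately simplified and the names (`TOKEN`, `parse`, `shown_text`) are made up for this example: it ignores nested parentheses, hex strings, fonts and encodings. It writes the array form with `TJ`, the operator PDF defines for an array of strings and spacing adjustments, and it adds a closing `ET` to end the text block.

```python
import re

TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)"   # literal string, e.g. (L*)
    rb"|\[[^\]]*\]"            # array, e.g. [(whit)6(e)]
    rb"|/[^\s/\[\]()<>]+"      # name, e.g. /T1_0
    rb"|[-+]?\d*\.?\d+"        # number, e.g. 7.5003
    rb"|[A-Za-z*'\"]+"         # operator, e.g. Tf, Tj, TJ
)

SAMPLE = b"""0 0 0 1 k
BT
/T1_0 1 Tf
0 Tc 0 Tw 0 Ts 100 Tz 0 Tr
7.5003 0 0 7.5003 272.1643 540.2979 Tm
(L*) Tj
/T1_1 1 Tf
0.856 0 Td
( = 100) Tj
-0.324 -1.133 Td
[(whit)6(e)] TJ
ET
"""


def parse(stream: bytes):
    """Yield (operator, operands): operands come first, the operator last."""
    operands = []
    for tok in TOKEN.findall(stream):
        if tok[:1].isalpha() or tok in (b"'", b'"'):
            yield tok.decode(), operands
            operands = []
        else:
            operands.append(tok)


def shown_text(stream: bytes) -> list[str]:
    """Collect the strings drawn by Tj and TJ, in drawing order."""
    out = []
    for op, args in parse(stream):
        if op == "Tj":
            out.append(args[0][1:-1].decode("latin-1"))
        elif op == "TJ":
            parts = re.findall(rb"\(((?:\\.|[^\\()])*)\)", args[0])
            out.append(b"".join(parts).decode("latin-1"))
    return out


if __name__ == "__main__":
    print(shown_text(SAMPLE))
```

Note that the three fragments come back in drawing order, which is not guaranteed to be reading order; nothing in the stream says that "L*" and " = 100" belong to the same line.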

PDF myth - files are cross platform

Only if you create them properly...

Obfuscation for idiots!

No-one will be able to guess the secret password

20 seconds later...

And the password is....

Lastly a plea

Not all PDF creation tools are equal

In summary
