IEEE-WVU, Anchorage - 2008
The Unseen Challenge Data Sets
Anderson Rocha, Walter Scheirer, Siome Goldenstein, Terrance Boult
The Data Sets
• Two data sets are provided
– PNG: lossless compression
– JPEG: lossy compression
• Prevalence of images on the Internet
– Sources: Google Images, Yahoo Images, and Flickr
Message Sizes
• For each tool, we provide four different embedding sizes:
– Tiny: < 5% of the channel capacity
– Small: > 5% and < 15% of the channel capacity
– Medium: > 15% and < 40% of the channel capacity
– Large: > 40% of the channel capacity
• For the PNG set, the message size is explicitly stated
• For the JPEG set, the message size is NOT stated
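The four size bands can be expressed as a small lookup on the fraction of channel capacity used. This is a minimal sketch: the function name and the bit-based arguments are hypothetical, and because the slide does not say which band a ratio of exactly 5%, 15%, or 40% belongs to, the sketch assumes the larger band.

```python
def size_category(message_bits: int, capacity_bits: int) -> str:
    """Map an embedded message to Tiny/Small/Medium/Large by channel usage."""
    if capacity_bits <= 0:
        raise ValueError("capacity_bits must be positive")
    if message_bits < 0:
        raise ValueError("message_bits must not be negative")

    ratio = message_bits / capacity_bits
    # Assumption: a ratio that sits exactly on a boundary goes to the larger band.
    if ratio < 0.05:
        return "Tiny"
    if ratio < 0.15:
        return "Small"
    if ratio < 0.40:
        return "Medium"
    return "Large"


if __name__ == "__main__":
    # Hypothetical channel of 1,000 bits with four different message lengths.
    for bits in (40, 100, 300, 500):
        print(bits, size_category(bits, 1000))
```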
Message Content
• Random bit sequences
• Snippets of mp3 songs
• Plain text
• Other images
Categories
• Each set consists of clean and stego images
• Clean set
– Modified: cropping, overlay, object-appending
– Non-modified: original
• Stego set
– 4 categories for JPEG, 3 categories for PNG, one for each tool
Categories
• JPEG subcategories
– Stego: Animals, Business, Maps, Natural, Tourist, Vacation
– Clean: Misc
Clean Manipulated Images
[Figure: three example images, captioned Object Appending, Image Cropping, and Overlay]
PNG Tools
• Camaleão (http://www.ic.unicamp.br/~rocha/sci/stego)
– Simple LSB insertion/modification software
– Uses cyclic permutations and block ciphering to hide messages in LSBs
• SecurEngine (http://www.sharewareplaza.com/SecurEngine-download_4268.html)
– Incorporates 5 crypto algorithms: Blowfish, Gost, Vernam, Cast256, and Mars
– LSB encoding
PNG Tools
• Stash-It (http://www.smalleranimals.com/stash.htm)
– Windows-based stego tool
– Simple LSB insertion/modification software
– No encryption feature
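All three PNG tools are described as LSB insertion/modification software. As a rough illustration of that idea only, the sketch below overwrites the least significant bit of consecutive byte samples with message bits and reads them back. It is not the implementation of Camaleão, SecurEngine, or Stash-It: the function names are hypothetical, samples are used in plain sequential order, and no permutation or encryption is applied.

```python
def embed_lsb(samples: bytes, message: bytes) -> bytes:
    """Write the message bits (MSB first) into the LSBs of the first samples."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("message does not fit in the cover samples")

    stego = bytearray(samples)
    for index, bit in enumerate(bits):
        # Clear the lowest bit of the sample, then set it to the message bit.
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(samples: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes samples."""
    if n_bytes * 8 > len(samples):
        raise ValueError("not enough samples for the requested length")

    message = bytearray()
    for byte_index in range(n_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (samples[byte_index * 8 + bit_index] & 1)
        message.append(value)
    return bytes(message)


if __name__ == "__main__":
    cover = bytes(range(100, 132))      # 32 hypothetical pixel samples
    stego = embed_lsb(cover, b"hi")     # 16 message bits
    print(extract_lsb(stego, 2))
```

Each sample changes by at most one intensity level, which is why LSB embedding is visually unobtrusive yet leaves statistical traces.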
JPEG Tools
• F5 (http://www.inf.tu-dresden.de/~aw4)
– Resilient to the χ² (chi-square) statistical attack
– Instead of replacing LSBs directly, F5 decreases the absolute value of the DCT coefficients
– Chooses DCT coefficients randomly
– Matrix embedding
• JPHide (http://linux01.gwdg.de/~alatham)
– Uses Blowfish to generate a stream of pseudo-random control bits to define bit encodings
– Large embeddings trivial to detect
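The matrix-embedding bullet for F5 can be illustrated with the standard Hamming-style scheme in which k message bits are carried by 2^k - 1 cover bits while changing at most one of them. This is a minimal sketch on abstract bit lists with hypothetical function names. It leaves out everything in F5 that operates on DCT coefficients, such as decrementing absolute values and selecting coefficients randomly.

```python
def syndrome(bits):
    """XOR together the 1-based positions of all bits that are set."""
    result = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            result ^= position
    return result


def matrix_embed(cover_bits, message, k):
    """Embed a k-bit integer into 2**k - 1 cover bits, flipping at most one bit."""
    n = (1 << k) - 1
    if len(cover_bits) != n:
        raise ValueError(f"expected {n} cover bits for k={k}")
    if not 0 <= message < (1 << k):
        raise ValueError("message does not fit in k bits")

    stego_bits = list(cover_bits)
    # Flipping the bit at position p changes the syndrome by XOR p,
    # so p = syndrome XOR message turns the syndrome into the message.
    position = syndrome(stego_bits) ^ message
    if position:
        stego_bits[position - 1] ^= 1
    return stego_bits


def matrix_extract(stego_bits):
    """The receiver recovers the message as the syndrome of the stego bits."""
    return syndrome(stego_bits)


if __name__ == "__main__":
    cover = [1, 0, 0, 1, 1, 0, 1]       # 7 hypothetical cover LSBs, k = 3
    for message in range(8):
        stego = matrix_embed(cover, message, 3)
        changed = sum(a != b for a, b in zip(cover, stego))
        print(message, stego, matrix_extract(stego), changed)
```

With k = 3, three message bits cost at most one change in seven cover bits, compared with an expected 1.5 changes for plain LSB replacement of three bits.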
JPEG Tools
• JSteg (http://zooid.org/~paul/crypto/jsteg)
– 40-bit RC4 encryption
– Channel capacity determination
– LSB encoding in quantized DCT coefficients
• Outguess (http://www.outguess.org/detection.php)
– Preserves statistics based on frequency counts
– Seed-based iterator available to choose embedding locations
– Change minimization calculation for each seed
– Remains one of the most difficult tools to detect
PNG Data Set - Breakdown
• Training

Tool            Tiny   Small  Medium   Large
Camaleão         400     400     400     400
SecurEngine      380     387     385     380
Stash-It         399     400     400     400
Total          1,179   1,187   1,185   1,180

Category            Images
Non-modified         2,000
Append-modified        666
Crop-modified          667
Overlay-modified       667
Total                4,000

4,731 total images in the PNG stego category
4,000 total images in the PNG clean category
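The totals quoted for the PNG training set can be re-derived from the per-tool and per-category counts. The sketch below copies the counts from the breakdown above into hypothetical dictionaries and sums them; it is only a consistency check, not part of the data set tooling.

```python
# Stego counts per tool, in the order Tiny, Small, Medium, Large.
PNG_TRAIN_STEGO = {
    "Camaleão": (400, 400, 400, 400),
    "SecurEngine": (380, 387, 385, 380),
    "Stash-It": (399, 400, 400, 400),
}

# Clean counts per category.
PNG_TRAIN_CLEAN = {
    "Non-modified": 2000,
    "Append-modified": 666,
    "Crop-modified": 667,
    "Overlay-modified": 667,
}

# Sum each size column across the three tools.
column_totals = [sum(column) for column in zip(*PNG_TRAIN_STEGO.values())]
stego_total = sum(column_totals)
clean_total = sum(PNG_TRAIN_CLEAN.values())

if __name__ == "__main__":
    print(column_totals, stego_total, clean_total)
```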
PNG Data Set - Breakdown
• Testing

Tool            Tiny   Small  Medium   Large
Camaleão         250     250     250     250
SecurEngine      250     250     250     243
Stash-It         250     250     250     250
Total            750     750     750     743

2,993 total images in the PNG stego category
JPEG Data Set - Breakdown
• Training

Category           F5   JPHide    JSteg Outguess
Animals         1,732    2,127      244      436
Business        3,779        -      124       11
Maps            3,361        -      112       68
Natural         5,211    1,113      232       70
Tourist         4,968    1,721      268      160
Vacation        2,960      353      100       35
Total          22,011    5,314    1,080      780

29,185 total images in the JPEG stego category
JPEG Data Set - Breakdown
• Training

Category                 Images
Animals-Non-modified         61
Business-Non-modified        31
Maps-Non-modified            28
Natural-Non-modified         58
Tourist-Non-modified         67
Vacation-Non-modified        25
Misc-Non-modified         1,996
Misc-Append-modified        665
Misc-Crop-modified          666
Misc-Overlay-modified       662
Total                     4,259

4,259 total images in the JPEG clean category
JPEG Data Set - Breakdown
• Testing

Tool               Tiny   Small  Medium   Large
F5                  250     250     250     250
JPHide              240     322     318     101
JSteg               198     202     199     198
Outguess 0.2        481     421       -       -
Outguess 0.13       491     425       -       -
Total             1,660   1,620     767     549

4,596 total images in the JPEG stego category
Sample Usage: stegdetect
• JPEG Training Set

Set              Detected, C  Detected, I  No Steg
Clean                      -          360    3,809
F5                    22,011            0        0
JPHide                 4,506          604      204
JSteg                    638          421       21
Outguess 0.13            220           10      295
Outguess 0.2              13            5      237

Detected, C: correct algorithm detected
Detected, I: incorrect algorithm detected
Overall false detect rate for the clean image set is 8.6%
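The 8.6% figure follows from the clean row of the training results: 360 clean images were flagged out of 360 + 3,809 processed. The sketch below reproduces that arithmetic and, as an assumption about how one might summarize the stego rows, also computes the share of stego images flagged by any test. The dictionary and function names are hypothetical.

```python
# Rows copied from the training results: (Detected C, Detected I, No Steg).
# The clean row has no "correct algorithm" column, so it is stored as None.
TRAINING_RESULTS = {
    "Clean": (None, 360, 3809),
    "F5": (22011, 0, 0),
    "JPHide": (4506, 604, 204),
    "JSteg": (638, 421, 21),
    "Outguess 0.13": (220, 10, 295),
    "Outguess 0.2": (13, 5, 237),
}


def false_detect_rate(clean_row):
    """Fraction of clean images that stegdetect flagged as stego."""
    _, flagged, passed = clean_row
    return flagged / (flagged + passed)


def any_detection_rate(stego_row):
    """Fraction of stego images flagged, whether or not the tool was named correctly."""
    correct, incorrect, missed = stego_row
    return (correct + incorrect) / (correct + incorrect + missed)


if __name__ == "__main__":
    print(f"Clean false detect rate: {false_detect_rate(TRAINING_RESULTS['Clean']):.1%}")
    for name, row in TRAINING_RESULTS.items():
        if name != "Clean":
            print(f"{name}: {any_detection_rate(row):.1%} flagged")
```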
Sample Usage: stegdetect
• JPEG Testing Set

Set              Detected, C  Detected, I  No Steg
Clean                      -           79      899
F5                         0          216      784
JPHide                   333          153      495
JSteg                    444          353        0
Outguess 0.13            206           31      679
Outguess 0.2              32           37      833

Overall false detect rate for the clean image set is 8.0%
Sample Usage: stegdetect
• Detailed results for JPHide Test Set

Result          Large  Medium   Small    Tiny
Detected, C        51     249      25       8
Detected, I        47      61      35      10
Negative            3       8     262     222
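To compare the embedding sizes directly, the JPHide counts above can be turned into per-size rates of correct detection. This is a minimal sketch with hypothetical names; it treats "Detected, C" divided by the column total as the rate of interest, which is one reasonable reading of the table rather than a metric defined on the slides.

```python
# JPHide test-set counts per embedding size: (Detected C, Detected I, Negative).
JPHIDE_TEST = {
    "Large": (51, 47, 3),
    "Medium": (249, 61, 8),
    "Small": (25, 35, 262),
    "Tiny": (8, 10, 222),
}


def correct_rate(counts):
    """Share of images for which stegdetect named the correct algorithm."""
    correct, incorrect, negative = counts
    return correct / (correct + incorrect + negative)


correct_rates = {size: correct_rate(counts) for size, counts in JPHIDE_TEST.items()}

if __name__ == "__main__":
    for size, rate in correct_rates.items():
        print(f"{size}: {rate:.1%} correctly identified")
```

Under this reading, Medium embeddings are identified correctly most often, and Large embeddings are more often attributed to the wrong algorithm than Medium ones.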
Sample Usage: stegdetect
• Conclusions
– Significant differences between the results of training and testing: weaker performance overall for testing, reflecting the designed difficulty of the testing set
– Stegdetect performs poorly for large embeddings (non-intuitive), as well as small and tiny embeddings (expected)
The Unseen Challenge Data Sets
• Lossy (JPEG) and Lossless (PNG) imagery
• 3 tools for PNG set, 4 tools for JPEG set
• 4 distinct embedding sizes for PNG, varying sizes for JPEG
• Clean imagery across all sets
The Unseen Challenge Data Sets
• Valid approaches for use:
– Detection
– Detection and recovery (size or content)
– Detection and destruction
– Fusion
No standard data set exists for steg evaluation!
This set is a step in that direction!