Presented at the International Antivirus Testing Workshop 2007 by Dr. Vesselin Bontchev, Antivirus Researcher, FRISK Software.
International Antivirus Testing Workshop, Reykjavik, 2007
Maintaining a Malware Collection
Dr. Vesselin Bontchev, anti-virus researcher
FRISK Software International
Thverholt 18, IS-105 Reykjavik, ICELAND
National Laboratory of Computer Virology
Bulgarian Academy of Sciences
Acad. G. Bontchev Str., Bl. 2, 1113, BULGARIA
E-mail: [email protected]
Introduction
• The Naïve Idea of AV Product Testing
– Get a large set of files from somewhere
– Call it a “virus collection”
– It’s good if most scanners report most of the files as “something”
– It’s even better if some files are not reported by some or all of the popular scanners as anything
– Run an on-demand scanner on the set
– Classify the results by some criteria (e.g., number of detected objects) and publish them
Let’s Concentrate on the “Collection” Part
• Face It:
– Most of the collections you can get easily consist mostly of crap (Vx)
– Even the good collections contain some crap (AV)
– Or are inadequate (WLO)
• So:
– Analyze the contents and remove the crap
– Replicate the viruses
– And only then use the result for testing!
Analyzing the Contents
• Unpack the Archives
– Needs LOTS of disk space!
– Sometimes there are nested archives
– And/or encrypted ones (with passwords such as “infected”, “virus”, etc.)
– The file extensions are misleading! (Or non-existent. Morons.)
– Some files need decoding
– Basically – look at the damn things! Do not assume
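
Since the extensions cannot be trusted, one way to "look at the damn things" in bulk is to sniff the first bytes of every file. The Python sketch below is only an illustration: the function name `sniff_type` and the table `MAGIC_SIGNATURES` are made up for this example, and the table covers just a handful of well-known signatures. It is a first sorting pass, not a substitute for looking at the file.

```python
# Leading byte signatures of a few common formats (illustrative, far from complete).
MAGIC_SIGNATURES = [
    (b"MZ", "DOS/Windows executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"\x1f\x8b", "gzip stream"),
    (b"7z\xbc\xaf\x27\x1c", "7-Zip archive"),
    (b"\x7fELF", "ELF executable"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound document"),
]


def sniff_type(path, default="unknown"):
    """Guess a file's type from its leading bytes, ignoring its name entirely."""
    with open(path, "rb") as f:
        head = f.read(16)
    for signature, label in MAGIC_SIGNATURES:
        if head.startswith(signature):
            return label
    return default
```

A file called `readme.txt` that starts with `MZ` is reported as an executable, which is exactly the kind of mismatch worth a closer look. Anything reported as "unknown" still has to be examined by hand.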
More Analyzing
• Remove the Duplicates
– Of which there are LOTS! Both in the same collection and across collections
– You need a duplicate file locator
  • The commercial ones are crap! Big, slow, unreliable and inadequate
  • So, write your own
  • Beware: with the huge number of files in the contemporary collections, CRC-32 is not adequate as a hash function (due to collisions). Use MD4 (not MD5, MD2 or SHA, because MD4 is faster and is secure enough)
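
A duplicate locator of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's tool: the names `file_digest` and `find_duplicates` are made up here, and SHA-256 is swapped in as the default hash because `hashlib.new("md4")` is unavailable in some Python/OpenSSL builds. Pass `algorithm="md4"` where the local build supports it.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file's contents in chunks so large samples need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots, algorithm="sha256"):
    """Return groups of paths (under any of `roots`) that share size and digest."""
    # Pass 1: group by size; a file with a unique size cannot have a duplicate.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share their size with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, file_digest(path, algorithm))].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Grouping by size first avoids hashing most files at all, which matters with collections of this scale. Passing several roots at once covers duplicates across collections as well as within one.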
Even More Analyzing
• Remove the Corrupted Files
– Zapped beginnings
– Entry points going nowhere
– Partially disinfected stuff
– Breakpoints in the code
– Just random garbage
– Basically – stuff that doesn’t work
– If you don’t know what the above means or how to detect its presence, you’re not qualified to test AV products – find a different job, or learn it first
– Unfortunately, if you do know, that still doesn’t necessarily mean that you’re qualified to test AV products!
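
Two of these symptoms, zapped beginnings and entry points going nowhere, can be pre-screened mechanically for Windows PE files. The Python sketch below is a rough illustration only: the function name `pe_problems` is made up, it ignores file-alignment subtleties, and it reports plain DOS EXE files (which have no PE header) as a problem. Treat a non-empty result as a reason to examine the sample by hand, not as a verdict.

```python
import struct


def pe_problems(data):
    """Return a list of structural problems found in the PE image bytes `data`."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return ["no MZ header (zapped beginning?)"]
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if pe_offset + 24 > len(data) or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return ["MZ header does not lead to a PE signature"]
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    opt_offset = pe_offset + 24
    if opt_size < 20 or opt_offset + opt_size > len(data):
        return ["optional header is truncated"]
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", data, opt_offset + 16)
    table = opt_offset + opt_size
    if table + 40 * num_sections > len(data):
        return ["section table is truncated"]

    problems = []
    entry_found = entry_rva == 0  # a zero entry point is legitimate for some DLLs
    for i in range(num_sections):
        # Fields after the 8-byte name: VirtualSize, VirtualAddress,
        # SizeOfRawData, PointerToRawData.
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<IIII", data, table + 40 * i + 8
        )
        if raw_size and raw_ptr + raw_size > len(data):
            problems.append(f"section {i} extends past the end of the file")
        if vaddr <= entry_rva < vaddr + max(vsize, raw_size):
            entry_found = True
            offset_in_section = entry_rva - vaddr
            if offset_in_section >= raw_size or raw_ptr + offset_in_section >= len(data):
                problems.append("entry point has no bytes in the file")
    if not entry_found:
        problems.append("entry point is not inside any section")
    return problems
```

The other items on the list (partial disinfection, breakpoints left in the code) cannot be found this way; they require actually analyzing the code.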
Still More Analyzing
• Remove the Envelopes
– “Immunizations”
– Packers
  • Sometimes you MUST NOT remove these!
  • So, how do you know when to remove them?
– Sandwiches
And Even More Analyzing
• Remove the Non-Malware
– Utilities
– Legitimate tools used by malware
– Simulators
– False positives
– Buggy programs
– Just text files and pictures
– Build a database of unwanted crap, because you’ll keep receiving it over and over!
  • The TRASHBIN tool
– Mike will tell you more on this subject
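
The idea of a database of known junk can be sketched as a small table keyed by content hash. The Python example below is a hypothetical illustration and says nothing about how the TRASHBIN tool itself works: the function names, the `junk` table and the use of SHA-256 with SQLite are all assumptions made for this sketch.

```python
import hashlib
import sqlite3


def open_junk_db(path=":memory:"):
    """Open (or create) a database of files already judged to be non-malware."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS junk ("
        " digest TEXT PRIMARY KEY,"
        " reason TEXT NOT NULL)"
    )
    return db


def content_digest(data):
    """Identify a file by its contents, never by its name."""
    return hashlib.sha256(data).hexdigest()


def mark_as_junk(db, data, reason):
    """Record why this exact file was rejected, so the analysis is done only once."""
    db.execute(
        "INSERT OR REPLACE INTO junk (digest, reason) VALUES (?, ?)",
        (content_digest(data), reason),
    )
    db.commit()


def junk_reason(db, data):
    """Return the recorded rejection reason, or None if the file is not known junk."""
    row = db.execute(
        "SELECT reason FROM junk WHERE digest = ?", (content_digest(data),)
    ).fetchone()
    return row[0] if row else None
```

Storing the reason alongside the hash keeps the decision explainable: when the same simulator or text file shows up in the next incoming collection, `junk_reason` says both that it was rejected and why.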
And Finally…
• Separate the Viruses from the Non-Viral Malware
– Trojan Horses
– Dialers
– Password stealers
– Exploits
– Kits
– Germs
– Injectors
– Intended
Basic Rules of Thumb
• Know what you’re doing!
• Look at everything personally!
• Don’t put a file in your collection, unless you can explain why you have done so
• “A scanner reports it” or “Found it in a virus collection” are NOT good explanations!
And Now – the Real Work Begins
• Replicate All the Viruses Yourselves!
– Yes, all of them!
– Yes, yourselves!
– If you cannot replicate something, either figure out why not and then replicate it, or don’t put it in the virus collection!
– Yes, this requires multiple environments (OSes and devices), a lot of knowledge and a lot of work
– Did I ever say that maintaining a malware collection was easy?!
More About Replication
• Viruses Fail to Replicate for Various Reasons
– CPU dependency
– OS dependency
– Memory dependency
– Date/time dependency
– File system dependency
– There are many others!
– Basically – analyze the damn thing and figure out when/if it replicates
– If you cannot, you are not qualified to do AV product testing!
Order the Collection
• Separate by Malware Type
• Group by Malware Family
• Separate by Malware Variant
• Rules of Thumb:
– If two samples contain the same malware, they should be in the same directory
– If two samples contain two different malware programs, they should be in two different directories
– You need to be able to tell if two samples contain the same malware or not. Obvious, isn’t it?
– Scanners can help but are far from sufficient
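
The type / family / variant ordering maps naturally onto a directory tree, so that "same malware, same directory" holds by construction. The Python sketch below is only one possible layout: the helper names `sample_directory` and `_safe`, the root name and the sample names in the usage note are all hypothetical. Deciding which family and variant a sample belongs to remains the analyst's job.

```python
from pathlib import PurePosixPath


def _safe(name):
    """Turn a malware name component into a single harmless directory name."""
    cleaned = "".join(
        c if c.isalnum() or c in "-_." else "_" for c in name.strip()
    )
    # Reject empty names and "." / ".." so a name can never escape its parent.
    if not cleaned or set(cleaned) <= {"."}:
        raise ValueError(f"unusable name component: {name!r}")
    return cleaned


def sample_directory(root, malware_type, family, variant):
    """Build the one directory where all samples of this exact variant belong."""
    return PurePosixPath(root, _safe(malware_type), _safe(family), _safe(variant))
```

For example, `sample_directory("collection", "virus", "ExampleFamily", "A")` yields `collection/virus/ExampleFamily/A`. Two samples of the same variant always resolve to the same directory, and two different variants never do.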
Testing
• With a Collection, You Can Test:
– Detection
– Heuristic detection (Careful!)
– Disinfection
– Memory scanning (does anybody do that?)
– Identification (Difficult!)
– On-demand and on-access
– Test on various platforms
– Scanning speed? (Ooops! Never!!!)
– Forget the WLO/ItW crap!
Conclusion
• That’s All, Folks
• Simple, Isn’t It?
• Just Some Common Sense
– Which is so uncommon nowadays
• A Lot of Knowledge and Experience
• And Work, Work, Work!
• Questions?