CAP6135: Malware and Software Vulnerability Analysis. Find Software Bugs. Cliff Zou, Spring 2012



  • CAP6135: Malware and Software Vulnerability Analysis

    Find Software Bugs

    Cliff Zou, Spring 2012

  • *Acknowledgement
    This lecture is adapted from the lecture notes of Dr. Dawn Song, CS161: Computer Security

  • *Review
    Memory-safety vulnerabilities
      Buffer overflow
      Format string
      Integer overflow
    Runtime detection
      Runtime bounds checking (Purify; Jones & Kelly string bounds checking): expensive
    Runtime detection of overwrites
      StackGuard, etc.: practical, but only covers certain types of attacks
    Runtime mitigation to make attacks harder
      Randomization: practical, but not foolproof

  • *This Class: Software Bug Finding
    The iPhone story

    Black-box bug finding

    White-box bug finding

  • *iPhone Security Flaw
    Jul 2007: researchers at Independent Security Evaluators showed that they could take control of iPhones through a WiFi connection or by tricking users into visiting a Web site containing malicious code. The hack, the first reported, allowed them to tap the wealth of personal information the phones contain.

    Found by Charles Miller
      Dr. Charlie Miller presented the details of the exploit at Black Hat in Las Vegas on August 2, 2007. The slides from this talk are also available.

    For details, see:

  • *iPhone attack
    iPhone Safari downloads a malicious web page
    Arbitrary code is run with administrative privileges
      Can read SMS log, address book, call history, other data
      Can perform physical actions on the phone
        system sound and vibrate the phone for a second
        could dial phone numbers, send text messages, or record audio (as a bugging device)
      Can transmit any collected data over the network to the attacker

  • *0days Are a Hacker Obsession
    An 0day is a vulnerability that's not publicly known
    Modern 0days often combine multiple attack vectors & vulnerabilities into one exploit
      Many of these are used only once, on high-value targets
    0day statistics
      Often open for months, sometimes years

  • *How to Find a 0day?
    Step #1: obtain information
      Hardware, software information
      Sometimes the hardest step

    Step #2: bug finding
      Manual audit
      (Semi)automated techniques/tools
      Fuzz testing (focus of this lecture)

  • *The iPhone Story
    Step #1: WebKit is open source
      WebKit is an open source web browser engine. WebKit is also the name of the Mac OS X system framework version of the engine that's used by Safari, Dashboard, Mail, and many other OS X applications.

    Step #2: identify potential focus points
      From the development site: "The JavaScriptCore Tests: If you are making changes to JavaScriptCore, there is an additional test suite you must run before landing changes. This is the Mozilla JavaScript test suite."

    So we know what they use for code testing
    Use code coverage to see which portions of the code are not well tested
      Tools such as gcov and icov measure test coverage

  • *Results
    59.3% of 13622 lines in JavaScriptCore were covered
      79.3% of the main engine covered
      54.7% of Perl Compatible Regular Expressions (PCRE) covered
    Next step: focus on PCRE
      Wrote a PCRE fuzzer (20 lines of Perl)
      Ran it on a standalone PCRE parser (pcredemo from the PCRE library)
      Started getting errors: "PCRE compilation failed at offset 6: internal error: code overflow"
    Evil regular expressions crash MobileSafari
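Miller's actual fuzzer was 20 lines of Perl driving pcredemo; as a rough illustration of the same idea, here is a hypothetical Python sketch that throws random regex-shaped strings at a regex engine and records any failure mode other than a clean "malformed pattern" rejection (all names are mine, and Python's `re` stands in for PCRE; the real setup would run pcredemo in a subprocess and watch for crashes):

```python
import random
import re

# Characters a regex engine's parser has to handle, including metacharacters
ATOMS = list("abc()[]{}*+?|^$\\.0123456789,-")

def random_regex(max_len=40, rng=random):
    """Build a random, probably-malformed regular expression."""
    return "".join(rng.choice(ATOMS) for _ in range(rng.randint(1, max_len)))

def fuzz_regex_engine(iterations=1000, seed=1234):
    rng = random.Random(seed)
    suspicious = []
    for _ in range(iterations):
        pattern = random_regex(rng=rng)
        try:
            re.compile(pattern)
        except re.error:
            pass  # expected: malformed pattern cleanly rejected
        except Exception as exc:
            # unexpected failure mode -- the kind of "internal error" worth a look
            suspicious.append((pattern, exc))
    return suspicious
```

The interesting outputs are not the syntax errors (those are expected) but any pattern that makes the engine fail in an unanticipated way, like the "internal error: code overflow" message above.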

  • *The Art of Fuzzing
    Automatically generate test cases
    Many slightly anomalous test cases are input into a target interface
    The application is monitored for errors
    Inputs are generally either file based (.pdf, .png, .wav, .mpg)
    Or network based
      http, SNMP, SOAP

  • *Trivial Example
    Standard HTTP GET request:
      GET /index.html HTTP/1.1

    Anomalous requests:
      AAAAAA...AAAA /index.html HTTP/1.1
      GET ///////index.html HTTP/1.1
      GET %n%n%n%n%n%n.html HTTP/1.1
      GET /AAAAAAAAAAAAA.html HTTP/1.1
      GET /index.html HTTTTTTTTTTTTTP/1.1
      GET /index.html HTTP/
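Requests like these can be produced mechanically by perturbing one field of a well-formed template at a time; a minimal sketch (the generator and its exact perturbations are my own illustration, not a specific tool):

```python
# Well-formed request template: method, path, protocol version
TEMPLATE = ("GET", "/index.html", "HTTP/1.1")

def anomalies():
    """Yield anomalous requests, each perturbing one field of the template."""
    method, path, version = TEMPLATE
    yield " ".join(("A" * 12, path, version))                       # oversized method
    yield " ".join((method, "/" * 7 + path.lstrip("/"), version))   # repeated slashes
    yield " ".join((method, "/%n%n%n%n%n%n.html", version))         # format string
    yield " ".join((method, "/" + "A" * 13 + ".html", version))     # long file name
    yield " ".join((method, path, "HTT" + "T" * 11 + "P/1.1"))      # mangled version
    yield " ".join((method, path, "HTTP/"))                         # truncated version
```

Each output is syntactically close to a valid request, which is exactly what makes it more likely than pure random bytes to get past the first parsing stage.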

  • *Regression vs. Fuzzing
    Regression: run the program on many normal inputs, look for badness
      Goal: prevent normal users from encountering errors

    Fuzzing: run the program on many abnormal inputs, look for badness
      Goal: prevent attackers from encountering exploitable errors

  • *Approach I: Black-box Fuzz Testing
    Given a program, simply feed it random inputs and see whether it crashes
    Advantage: really easy
    Disadvantage: inefficient
      Input often requires structure; random inputs are likely to be malformed
      Inputs that would trigger a crash are a very small fraction, so the probability of getting lucky may be very low

  • *Enhancement I: Mutation-Based Fuzzing
    Take a well-formed input and randomly perturb it (flipping bits, etc.)
    Little or no knowledge of the structure of the inputs is assumed
    Anomalies are added to existing valid inputs
      Anomalies may be completely random or follow some heuristics
        e.g. remove NUL, shift character forward
    Examples: ZZUF (very successful at finding bugs in many real-world programs), GPF, ProxyFuzz, FileFuzz, Filep, etc.
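The core mutation step is tiny; a minimal sketch of bit-flipping mutation (function name and parameters are my own):

```python
import random

def mutate(data: bytes, flips: int = 8, seed: int = 0) -> bytes:
    """Flip a few random bits of a well-formed input to get an anomalous variant."""
    rng = random.Random(seed)
    buf = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(buf))   # pick a random byte
        bit = 1 << rng.randrange(8)     # pick a random bit in that byte
        buf[pos] ^= bit                 # flip it
    return bytes(buf)
```

Because the output differs from a valid input in only a handful of bits, it usually survives early parsing and exercises deeper code than purely random data would.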

  • *Example: fuzzing a PDF viewer
    Google for .pdf (about 1 billion results)
    Crawl pages to build a corpus
    Use a fuzzing tool (or script) to:
      1. Grab a file
      2. Mutate that file
      3. Feed it to the program
      4. Record if it crashed (and the input that crashed it)
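The four-step loop can be sketched as follows; `viewer_cmd` and `seed_path` are placeholders for a real target binary and a corpus file, and the mutation/crash-detection details are illustrative assumptions:

```python
import os
import random
import subprocess
import tempfile

def fuzz_once(seed_path: str, viewer_cmd: list, rng: random.Random):
    """One iteration of the grab/mutate/feed/record loop; returns a crashing file path or None."""
    data = bytearray(open(seed_path, "rb").read())      # 1. grab a file
    for _ in range(16):                                 # 2. mutate that file
        data[rng.randrange(len(data))] = rng.randrange(256)
    fd, tmp = tempfile.mkstemp(suffix=".pdf")
    os.write(fd, bytes(data))
    os.close(fd)
    try:                                                # 3. feed it to the program
        proc = subprocess.run(viewer_cmd + [tmp], timeout=5,
                              stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        crashed = proc.returncode < 0   # negative: killed by a signal (e.g. SIGSEGV)
    except subprocess.TimeoutExpired:
        crashed = True                  # a hang counts as badness too
    if crashed:
        return tmp                      # 4. record the input that crashed it
    os.unlink(tmp)
    return None
```

A real campaign would run this in a loop over the whole corpus and bucket the crashing inputs for later triage.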

  • *Mutation-Based Fuzzing In Short
    Strengths
      Super easy to set up and automate
      Little to no protocol knowledge required
    Weaknesses
      Limited by the initial corpus
      May fail for protocols with checksums, or those which depend on challenge-response, etc.

  • *Enhancement II: Generation-Based Fuzzing
    Test cases are generated from some description of the format: RFC, documentation, etc.
      Using specified protocol/file-format info
      E.g., SPIKE by Immunity

    Anomalies are added to each possible spot in the inputs
    Knowledge of the protocol should give better results than random fuzzing

  • *Generation-Based Fuzzing In Short
    Strengths
      Completeness
      Can deal with complex dependencies, e.g. checksums
    Weaknesses
      Have to have a spec of the protocol
        Good tools can often be found for existing protocols, e.g. http, SNMP
      Writing a generator can be labor intensive for complex protocols
      The spec is not the code: our goal is code testing, not spec testing

  • *Fuzzing Tools
    The hacker's job made easy:

      Input generation
      Input injection
      Bug detection
      Workflow automation

  • *Input Generation
    Existing generational fuzzers for common protocols (ftp, http, SNMP, etc.)
      Mu-4000, Codenomicon, PROTOS, FTPFuzz
    Fuzzing frameworks: you provide a spec, they provide a fuzz set
      SPIKE, Peach, Sulley
    Dumb fuzzing automated: you provide the files or packet traces, they provide the fuzz sets
      Filep, Taof, GPF, ProxyFuzz, PeachShark
    Many special-purpose fuzzers already exist as well
      ActiveX (AxMan), regular expressions, etc.

  • *Input Injection
    Simplest:
      Run the program on a fuzzed file
      Replay a fuzzed packet trace
    Modify an existing program/client
      Invoke the fuzzer at the appropriate point
    Use a fuzzing framework
      E.g. Peach automates generating COM interface fuzzers
      E.g., the PaiMei framework (a reverse engineering framework)

  • *Bug Detection
    See if the program crashed
      The type of crash can tell a lot (SEGV vs. assert failure)
    Run the program under a dynamic memory error detector (Valgrind/Purify)
      Catches more bugs, but more expensive per run
    See if the program locks up
    Roll your own checker, e.g. Valgrind skins

    Valgrind is a framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail.

  • *Workflow Automation
    Sulley, Peach, and Mu-4000 all provide tools to aid setup, running, recording, etc.
    Virtual machines can help create reproducible workloads

  • *How Much Fuzz Is Enough?
    Mutation-based fuzzers may generate an infinite number of test cases. When has the fuzzer run long enough?

    Generation-based fuzzers may generate a finite number of test cases. What happens when they're all run and no bugs are found?

  • *Code CoverageSome of the answers to these questions lie in code coverage

    Code coverage is a metric which can be used to determine how much code has been executed.

    Data can be obtained using a variety of profiling tools. e.g. gcov
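gcov gathers this data for C programs at compile time; as a language-neutral illustration of what "line coverage" means, here is a hedged Python sketch that records which lines of a function execute on a given input (names and approach are my own, using Python's trace hook rather than any particular profiling tool):

```python
import sys

def covered_lines(func, *args):
    """Run func(*args) and return the set of its line offsets that executed."""
    hits = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            # record the line's offset from the `def` line
            hits.add(frame.f_lineno - func.__code__.co_firstlineno)
        return tracer
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return hits

def target(a):
    if a > 2:     # offset 1
        a += 1    # offset 2
    return a      # offset 3
```

Running `covered_lines(target, 0)` misses the body of the `if`, while `covered_lines(target, 5)` covers every line, which is exactly the signal a fuzzer can use to tell whether a new input reached new code.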

  • *Types of Code Coverage
    Line/block coverage
      Measures how many lines of source code have been executed
    Branch coverage
      Measures how many branches in the code have been taken (conditional jmps)
    Path coverage
      Measures how many paths have been taken

  • *Example
    Requires:
      1 test case for line coverage
      2 test cases for branch coverage
      4 test cases for path coverage
        i.e. (a,b) = {(0,0), (3,0), (0,3), (3,3)}
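The slide's example code did not survive extraction; a hypothetical reconstruction consistent with the listed test cases is two independent conditionals on a and b:

```python
# Hypothetical reconstruction (the original slide's code is missing):
# two independent branches give 4 paths but only 2 branch directions each.
def example(a, b):
    x = 0
    if a > 1:   # branch 1
        x += 1
    if b > 1:   # branch 2
        x += 2
    return x
```

One input such as (3,3) executes every line; (3,3) plus (0,0) take each branch both ways; but all four combinations {(0,0), (3,0), (0,3), (3,3)} are needed to exercise all four paths.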

  • *Code Coverage
    Benefits:
      How good is this initial file?
      Am I getting stuck somewhere?
        if (packet[0x10] < 7) {
            // hot path
        } else {
            // cold path
        }
      How good is fuzzer X vs. fuzzer Y?
      Am I getting benefits from running a different fuzzer?
    Problems:
      Code can be covered without revealing bugs
      Maybe the bug logic is not revealed by any testing inputs so far

  • *Fuzzing Rules of Thumb
    Protocol-specific knowledge is very helpful
      Generational tends to beat random; better specs make better fuzzers
    More fuzzers is better
      Each implementation varies, and different fuzzers find different bugs
    The longer you run, the more bugs you may find
    Best results come from guiding the process
      Notice where you're getting stuck; use profiling!
      Code coverage can be very useful for guiding the process
    Can we do better?

  • *Approach II: Constraint-Based Automatic Test Case Generation
    Look inside the box
      Use the code itself to guide the fuzzing
    Assert security/safety properties
    Explore different program execution paths to check the security properties
    Challenges:
      1. For a given path, check whether some input can trigger the bug, i.e., violate the security property
      2. Find inputs that will go down different program execution paths

  • *Runn