address obfuscation: an efficient approach to combat a broad range of memory error exploits

Presentation for CDA6938 Network Security, Spring 2006

Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits

Authors: Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar

Published: USENIX Security Symposium 2003

Presented by: Hua Zhang


Contributions

• It systematically protects against a wide range of attacks which exploit memory programming errors

• It can be easily applied to existing legacy code without modifying the source code, or the underlying operating systems

• It can be applied selectively to protect security-critical applications without needing to change the rest of the system

• The transformation is fast and introduces only a low runtime overhead


Outline

• Introduction
• Stack Smashing
• Address Obfuscation – Transformations
• Address Obfuscation – Implementation
• Address Obfuscation – Effectiveness
• Conclusion


Introduction

• Attacks that exploit memory programming errors are among today’s most serious security threats
  – Such an attack requires the attacker to have an in-depth understanding of the internal details of the victim program
• Program Obfuscation
  – A technique to prevent such an understanding
  – Code Obfuscation
    • Prevents such an understanding and reverse engineering
  – Address Obfuscation
    • Each time the transformed code is executed, the virtual addresses of the code and data are randomized


Memory Layout of a Typical Binary

• Code
  – Machine code
  – Read only
• Data and BSS
  – Global variables
  – Initialized or uninitialized
  – Not executable
• Stack
  – Local variables
  – Parameters, return addresses
• Heap
  – Memory area used for allocations made during execution, e.g. malloc()


Stack Smashing

• A stack-allocated buffer can be intentionally overflowed to overwrite the return address
• The attacker must
  – Guess the right value to put into the forged return address
  – Guess the location of the return address on the stack relative to the overflowed buffer
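
The two guesses listed above can be illustrated with a toy model. This is a minimal Python sketch, not exploit code: the frame layout (a 16-byte buffer, then a 4-byte saved frame pointer, then a 4-byte return address), the addresses, and the names `make_frame`, `unsafe_copy`, and `read_return_address` are all assumptions made for illustration.

```python
import struct

BUF_SIZE = 16               # assumed size of the stack-allocated buffer
RET_OFFSET = BUF_SIZE + 4   # assumed layout: buffer, saved frame pointer, return address


def make_frame(return_address):
    """Build a toy frame laid out as [buffer | saved frame pointer | return address]."""
    frame = bytearray(BUF_SIZE + 8)
    struct.pack_into("<I", frame, RET_OFFSET, return_address)
    return frame


def unsafe_copy(frame, data):
    """Copy data to the start of the frame without checking BUF_SIZE, like strcpy."""
    n = min(len(data), len(frame))
    frame[0:n] = data[:n]


def read_return_address(frame):
    """Read back the 4-byte little-endian return address slot."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


frame = make_frame(0x08048000)
# Guess 1: the distance from the buffer to the return address (RET_OFFSET).
# Guess 2: the absolute address to jump to (here a made-up stack address).
payload = b"A" * RET_OFFSET + struct.pack("<I", 0xBFFFF000)
unsafe_copy(frame, payload)
print(hex(read_return_address(frame)))
```

In this model the attack works only because both guesses are right. If the relative offset or the absolute target address were randomized, the same payload would overwrite the wrong bytes or redirect control to an address that no longer holds the injected code.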


How Address Obfuscation Works

• Two ways to exploit a memory error
  – Overwriting a pointer
    • Code pointer or data pointer
    • Made to point to the address of data or code chosen by the attacker
    • Requires the attacker to know the absolute address of such data or code
  – Overwriting non-pointer data
    • Code is protected and cannot be overwritten
    • An example is overwriting the arguments to chmod and execve
    • Requires the attacker to know the relative distance between a buffer and the location of the data to be overwritten


Transformations in Address Obfuscation - 1

• Randomize the base address of memory regions
  – Randomize the base address of the stack
    • All addresses on the stack are randomized
    • Makes it very difficult to find the address of injected code and point the return address to it
  – Randomize the base address of the heap
    • Defends against attacks where code is injected into the heap and a buffer overflow is then used to point to its address
  – Randomize the starting address of dynamically-linked libraries
  – Randomize the locations of routines and static data in the executable



• Permute the order of variables/routines
  – Makes it difficult to overwrite data without corrupting other data that is critical for continued execution of the program
  – Three possible ways
    • Permute the order of local variables in a stack frame
    • Permute the order of static variables
    • Permute the order of routines in shared libraries or the routines in the executable
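
To see why permutation hurts attacks that depend on relative distances, the following minimal Python sketch lays out the locals of one frame in a random order and records the distance between a buffer and a security-sensitive variable. The variable names and sizes in `LOCALS` and the helper names `layout` and `relative_distances` are hypothetical and not taken from the paper.

```python
import random

# Hypothetical locals of one stack frame: (name, size in bytes).
LOCALS = [("buf", 64), ("is_admin", 4), ("count", 4), ("path", 32)]


def layout(variables, rng):
    """Return {name: offset} after placing the variables in a random order."""
    order = list(variables)
    rng.shuffle(order)
    offsets, offset = {}, 0
    for name, size in order:
        offsets[name] = offset
        offset += size
    return offsets


def relative_distances(runs, seed=0):
    """Collect the buf-to-is_admin distance seen over several random layouts."""
    rng = random.Random(seed)
    distances = set()
    for _ in range(runs):
        offsets = layout(LOCALS, rng)
        distances.add(offsets["is_admin"] - offsets["buf"])
    return distances


print(sorted(relative_distances(200)))
```

An overflow tuned to one distance lands on a different variable, or on none, under most of the other layouts.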



• Introduce random gaps between objects
  – Locations of objects can be randomized
  – Four possible ways
    • Introduce random padding into stack frames
    • Introduce random padding between successive malloc allocation requests
    • Introduce random padding between variables in the static area
    • Introduce gaps within routines, and add jump instructions to skip over these gaps
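
The effect of random gaps can be sketched as a simple placement routine. This is an illustrative Python model only: the function name `place_with_gaps`, the 4-byte alignment, and the gap bound are assumptions, and the paper's tool works on binaries rather than on lists of sizes.

```python
import random


def place_with_gaps(sizes, max_gap, rng, align=4):
    """Place objects one after another, inserting a random aligned gap before each."""
    addresses, cursor = [], 0
    for size in sizes:
        gap = rng.randrange(0, max_gap + 1, align)  # gap is a multiple of align
        cursor += gap
        addresses.append(cursor)
        cursor += size
    return addresses


sizes = [16, 32, 8]
print(place_with_gaps(sizes, 64, random.Random()))  # varies from run to run
print(place_with_gaps(sizes, 0, random.Random()))   # no gaps: the packed layout
```

With `max_gap` set to 0 the objects sit at fixed, predictable offsets. Any positive bound makes both the absolute positions and the distances between neighbours vary.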


Implementation Issues - 1

• When to perform the transformations
  – Compile-time, link-time, installation-time or load-time
  – Compile-time means better performance
  – Load-time does not require special compilers and linkers, and can be applied to binary programs without source code
  – Link-time is chosen by this paper


Implementation Issues - 2

• When to determine the transformation amounts
  – Transformation time
    • Best performance
    • But the randomization will be the same every time the program is executed; a possible solution is periodic re-transformation
  – Beginning of program execution
  – Continuously changing during execution
    • Most difficult to attack
    • Not good for performance
  – Transformation time is chosen by the paper


Implementation Approach - 1

• Approach
  – At the binary level
  – Inserting additional code with the LEEL binary-editing tool
    • Only rewriting routines that can be completely analyzed
  – Safe rewriting of machine code requires understanding of the complete control-flow graph, which is difficult because of
    • Data that may be intermixed with code
    • Indirect jumps and calls



• Stack base address randomization
  – By adding extra code to the text segment of the program
  – Spliced into the execution sequence by inserting a jump instruction at the beginning of the main routine
  – Decrements the stack pointer by a random number between 1 and 10^8
  – This gap is write-protected using the mprotect system call
  – An overflow beyond the base of the stack into this area will cause a crash



• DLL base address randomization
  – To prevent attacks that jump to library code
• Two options
  – Dynamically randomize library addresses using mmap
    • Implemented by a wrapper around mmap
    • Location of shared memory will be different for every execution
  – Statically randomize library addresses at link-time
    • Implemented by dynamically linking the executable with a dummy shared library
    • No change to the loader or the rest of the system



• Text/Data segment randomization
  – Prevents attacks that
    • Modify a static variable
    • Jump to existing program code
• Two approaches
  – Compile to a shared library and create a new main to load this library and call the old main
    • Code in a shared library is position-independent
    • Less efficient than its address-dependent counterpart
  – Relocate the program’s code and data at link-time
    • No performance overhead



• Random stack frame padding
  – Pushing extra storage onto the stack during the initialization phase of each subroutine
• Two issues
  – Padding size
    • Static – no runtime overhead
    • Dynamic
  – Placement of padding
    • Between the base pointer and local variables
    • Before parameters to the function



• Heap Randomization
  – Code that allocates a randomly-sized large chunk of memory is added
  – Wrapper functions are used to intercept calls to malloc
  – Dynamic memory allocation requests are randomly increased by 0 to 25%
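
The two heap measures above (a random-sized initial chunk, and requests grown by 0 to 25%) can be modelled with a toy bump allocator. This is a hedged Python sketch of the idea, not the paper's malloc wrapper: the class `RandomizedHeap`, the helper `padded_request`, the base address, and the `max_initial` bound are all assumptions.

```python
import random


def padded_request(size, rng, max_fraction=0.25):
    """Return the request size increased by a random 0 to 25%."""
    extra = int(size * rng.uniform(0.0, max_fraction))
    return size + extra


class RandomizedHeap:
    """Toy bump allocator that models heap base and request-size randomization."""

    def __init__(self, rng, base=0x08100000, max_initial=1 << 20):
        self.rng = rng
        # Models the random-sized chunk allocated once at program start.
        self.top = base + rng.randrange(max_initial)

    def malloc(self, size):
        address = self.top
        # Models the wrapper that enlarges each request before allocating.
        self.top += padded_request(size, self.rng)
        return address


heap = RandomizedHeap(random.Random())
first = heap.malloc(100)
second = heap.malloc(100)
print(hex(first), second - first)  # both values vary from run to run
```

The first value models the shifted heap base, which defeats guesses about absolute addresses. The second shows that even the distance between two consecutive allocations is no longer fixed.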


Effectiveness

• It is critical to have an estimate of the increase in attacker workload, as
  – Address obfuscation is not foolproof
  – It is a probabilistic technique
• Mathematical analysis of the effectiveness against different kinds of attacks is conducted in the paper

For example, the success rate of a single attack
  – Stack smashing – 4/(2.5 × 10^4)
  – Existing code attacks – 4 × 10^-5
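
A per-attempt success rate translates into attacker workload with elementary probability. The Python sketch below takes the existing-code figure quoted above, read as 4 × 10^-5, and assumes that attempts are independent, which holds only if each attempt faces a fresh randomization. The function names are illustrative.

```python
def expected_attempts(p):
    """Mean number of independent attempts until the first success (geometric law)."""
    return 1.0 / p


def success_within(p, attempts):
    """Probability of at least one success in the given number of independent attempts."""
    return 1.0 - (1.0 - p) ** attempts


P_EXISTING_CODE = 4e-5  # per-attempt success rate quoted on the slide

print(expected_attempts(P_EXISTING_CODE))
print(success_within(P_EXISTING_CODE, 1_000))
print(success_within(P_EXISTING_CODE, 25_000))
```

Under this model the attacker needs on the order of 25,000 attempts on average, and failed attempts typically crash the target, which makes the attack noisy. If the randomization is fixed at transformation time, the attempts are not independent and an attacker who can retry may do better than this estimate.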


Conclusion

• Strong Points
  – A comprehensive study of approaches for address obfuscation
  – A real tool is implemented
  – Mathematical analysis of the effectiveness
• Weak Points
  – Insufficient introduction of background information
  – Only some routines can be transformed
  – An attacker with access to the transformed binaries or their memory map can extract the random values


Questions?
