TRANSCRIPT
Application Intrusion Detection
Anita Jones
University of Virginia
Jan 2001 Application Intrusion Detection 2
Review of Research Project
• Application Intrusion Detection rationale:
  – an application has richer structure & semantics than the OS
  – so an intrusion leaves a potentially more evident trail
• Challenge:
  – What is the nature of that rich semantics?
  – How and when can you use it to detect intrusion?
  – Can appln. ID systems be substantially “stronger” than OS ID systems?
• Three approaches taken in this research project
[Slide 3]
Approach 1 - Unique Threats
• Develop & demonstrate a technique that enables:
  – determination of the intrusions appropriate to an appln.
  – in a form that delivers a definition of sensors that can be embedded, & relations that can be computed by monitors to detect those intrusions
  – so that generic ID monitors can be injected into the appln. code
• Work complete: Illustrated technique on two example systems
• Slides 4 - 26
[Slide 4]
State of Practice
• Assume the Operating System as the basis
• Use what an OS knows about -- OS semantics
  – users, processes, devices
  – controls on access and resource usage
• Record events in the life of the OS
• Use OS audit records

OS Intrusion Detection Systems -- OS IDS
[Slide 5]
OS IDS - the two Approaches
• Anomaly Detection
  – assume that behavior can be characterized
    • statically -- by a known, fixed data encoding
    • dynamically -- by patterns of event sequences or by threshold limits on event occurrences (e.g. system calls)
  – detect errant behavior that deviates from expected, normal behavior
• Misuse Detection
  – look for known patterns (signatures) of intrusion, typically as the intrusion unfolds
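The dynamic, threshold-based form of anomaly detection described above can be sketched in a few lines. This is an illustrative toy, not the mechanism of any system named in these slides; the event names and the scoring formula are invented for the example.

```python
from collections import Counter

def build_profile(training_events):
    """Learn the normal frequency of each event type (e.g. system calls)."""
    counts = Counter(training_events)
    total = len(training_events)
    return {event: n / total for event, n in counts.items()}

def anomaly_score(observed_events, profile, floor=1e-6):
    """Sum of relative deviations from the learned profile; event types
    never seen in training contribute a very large term."""
    counts = Counter(observed_events)
    total = len(observed_events)
    score = 0.0
    for event, n in counts.items():
        expected = profile.get(event, floor)
        score += abs(n / total - expected) / expected
    return score

profile = build_profile(["open", "read", "read", "close"] * 100)
print(anomaly_score(["open", "read", "read", "close"], profile))  # 0.0 -- matches normal behavior
print(anomaly_score(["exec", "exec", "open"], profile) > 1000)    # True -- "exec" never seen in training
```

A real detector would compare the score against a tuned threshold; here any unseen event type dominates the score by construction.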
[Slide 6]
OS IDS - the two Approaches
• Anomaly Detection
  – Static: e.g. Tripwire, Self-Nonself
  – Dynamic: e.g. NIDES, Pattern Matching (UNM)
• Misuse Detection
  – e.g. NIDES, MIDAS, STAT
• Networks are handled as “extensions”
  – i.e. use the same two approaches listed above
  – Centralized: e.g. DIDS, NADIR, NSTAT
  – Decentralized: e.g. GrIDS, EMERALD
• Few genuinely new approaches
[Slide 7]
Take a Checkpoint
• An OS exists simply to manage resources
• Systems exist to perform some application, with the OS merely as support
• Focus on the application, not the OS

The problem is to distinguish abusive behavior in the context of the application -- possibly by a legitimate user.

An OS IDS is inherently limited by the semantics of the OS.

You can’t talk about something for which you have no words!
[Slide 8]
A Complementary Approach

Assume that the OS IDS does its job.
Use the semantics of the application as a further basis for detection of intruders.

[Slide 9]
Application Intrusion Detection -- App IDS
[Slide 10]
App IDS -- What’s Possible?
• How do you define intrusion in the context of (in the semantics of) an application?
• Can an intrusion be “seen”?
  – Seen in progress?
• Can intrusive behavior be linked to users?
• Is there a richer notion of history (of intrusion)?
• Is there a richer notion of “abused system state”?
[Slide 11]
App IDS -- Guiding Questions
• Opportunity – what types of intrusions can be detected by an AppIDS?
• Effectiveness – how well can those intrusions be detected by an AppIDS?
• Cooperation – how can an AppIDS cooperate with the OS IDS to be more effective than either alone?
[Slide 12]
Case Studies
• Electronic Toll Collection
  – hierarchical
  – numerous distributed devices
  – complementary device state values
  – monitors external behavior
  – accounting component
• Health Record Management
  – non-hierarchical; modular
  – no devices beyond the controlling computer
  – limited access in app’n
  – bound by known physical & medical realities
  – no financial component
  – complex scheduling components
[Slide 13]
Electronic Toll Collection (ETC)
• Devices
  – Toll Lane
    • Tag Sensor
    • Automated Coin Basket
    • Toll Booth Attendant
    • Loop Sensor
    • Axle Reader
    • Weigh-In-Motion Scale
    • Traffic Signal
    • Video Camera
  – Vehicle
    • Tag (Active/Passive)
[Slide 14]
ETC - Hierarchy
[Diagram: ETC hierarchy -- Toll Lanes report to Toll Plazas, which, together with other devices, report to the Toll Management Center]
[Slide 15]
Need Analysis Technique
• What intrusions make sense in app’n terms?
• How do you derive them?
• Is there a disciplined analysis approach that ensures that “all” intrusions are found?
• Once an intrusion is defined, is there a way to monitor for it within the application?
• Is there a relation to the OS, and information that it has?
[Slide 16]
ETC - One Approach
• Start with the known threat categories
• Determine how they can be manifested in app’n terms
• Define app’n specific intrusions
• Determine the method an abuser would use
• Define relations based on app’n state values that can be the basis for monitoring each method

Threat Categories → Specific Intrusions → Methods → Relations
[Slide 17]
Threat Categories
• Denial of Service
• Disclosure
• Manipulation
• Masqueraders
• Replay
• Repudiation
• Physical Impossibilities
• Device Malfunctions
[Slide 18]
ETC - Appl’n Specific Intrusions
• Annoyance (3 methods)
• Steal Electronic Money (10 methods)
• Steal Vehicle (4 methods)
• Device Failure (1 method)
• Surveillance (2 methods)
Threat Categories → Specific Intrusions → Methods → Relations
[Slide 19]
ETC Intrusion - Steal Service

3 methods: no tag and cover plate; copy tag; packet filter that discards all of a tag's packets.

Rel#  Relation Description                 Execution Location   Methods
  1   Tag vs. Historical (Time) (stat)     TBP/TMC              X
  4   Tag vs. Historical (Sites) (stat)    TMC                  X
  5   Tag vs. Time (rule)                  TMC                  X
  9   Tag vs. Axles (rule)                 TBL                  X X X
 25   Unreadable Tags (stat)               TBP/TMC              X

(Each X marks a method the relation detects; relation 9 detects all three.)

3 methods, 5 relations
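A rule-based relation such as relation 9 above, “Tag vs. Axles”, amounts to a predicate over application state values. A minimal sketch follows; the record layout and field names are hypothetical, since the slides specify only which quantities the relation compares.

```python
def tag_vs_axles(tag_record, measured_axles):
    """Rule relation: the axle count measured by the toll lane's axle
    reader must match the vehicle class registered to the tag; a
    mismatch suggests a copied tag or a swapped vehicle."""
    return tag_record["registered_axles"] == measured_axles

# Hypothetical sensor readings from a toll lane.
car_tag = {"tag_id": "ABC123", "registered_axles": 2}
print(tag_vs_axles(car_tag, 2))   # True  -- passenger car, as registered
print(tag_vs_axles(car_tag, 5))   # False -- car's tag read on a 5-axle truck: alarm
```

The statistical relations (e.g. Tag vs. Historical) would instead compare current readings against accumulated per-tag history at the plaza or management center.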
[Slide 20]
Health Record Management (HRM)
• Components
  – Patient Records
  – Orders – lists of all requests for drugs, tests, or procedures
  – Schedule – schedule for rooms for patient occupancy, laboratory tests, or surgical procedures (does not include personnel)
• Users
  – doctors, laboratory technicians, and nurses
[Slide 21]
HRM - App’n Specific Intrusions
• Annoyance (4 methods)
• Steal Drugs (1 method)
• Patient Harm (6 methods)
• Surveillance (2 methods)
Threat Categories → Specific Intrusions → Methods → Relations
[Slide 22]
HRM - Patient Harm Intrusion

6 methods: Admin. Wrong Drug; Admin. Too Much of Drug; Admin. an Allergic Drug; Admin. Improper Diet; Order Needless Drugs; Perform Needless Procedure.

Rel#  Relation Description                                         Methods
  2   Drug vs. Allergy (rule)                                      X X
  5   Drug vs. Diet (rule)                                         X X
  8   Drug vs. Historical (dosage) (stat)                          X X
 24   Patient Test Results vs. Test Results (Historical) (stat)    X X X X

(Each X marks one of the six methods the relation detects.)

4 relations, 6 methods
[Slide 23]
Relate OS IDS to App IDS
• Similarities
  – detect intrusions by evaluating relations to differentiate between anomalous and normal behavior
  – centralized or decentralized (hierarchical)
  – similar threat categories
• Differences
  – anomaly detection using statistical and rule-based app’n relations
  – internal intruders/abusers
  – event-causing entity outside the system
  – resolution -- finer grain
  – tightness of thresholds
[Slide 24]
Relate OS IDS to App IDS (cont’d)
• Dependencies
  – OS IDS on App IDS
    • None
  – App IDS on OS IDS
    • basic security services
    • prevent an abuser from bypassing application control to access application components
• Cooperation
  – correlate audit/event records
  – communication
    • bi-directional
    • request-response
  – complications
    • terms of communication
    • resource usage -- lowest common denominator
[Slide 25]
Conclusion -- App IDS
• Opportunity
  – app’n semantics ARE a rich basis for detecting internal intruders (abusers)
  – CAN (define &) detect intrusions not visible to the OS
  – intrusions more readily relate to the real world!
  – monitors are similar: rule-based & statistical relations
• Effectiveness
  – grain and units of resolution much richer
  – tighter thresholds
  – less ambiguity between anomalous and normal behavior
  – tied to semantics, and therefore to correctness
[Slide 26]
Conclusion
• Technique developed
  – we have developed a systematic technique for analyzing application intrusion potentials
  – repeatable
  – requires deep understanding of the application
• Generic? -- yes and no
  – the analysis technique is generic
  – threats are a combination of generic & appln. specific
  – ID monitor software is generic
  – however, the predicates for monitoring are hand-crafted
  – upgrades require re-evaluation
[Slide 27]
Approach 2 - Signatures
• Determine whether OS-type signatures (for ID monitoring) can be adapted to applns.; demonstrate it.
  – Explore a variant of S. Forrest's signatures based on OS system call sequences
  – Explore the appln's library calls
  – Explore timing signatures
• Work will be complete by Spring:
  – Final experiments to be completed
  – Two papers to be completed
• Slides 28 - 39
[Slide 28]
Sig1: Library Call Sequences
• Use an app's calls to its language library routines (e.g. C or C++ libraries)
  – Unix system calls number about 190
  – C library routines number about 200
  – Comparable
• Build library call signatures for applications
  – wu-ftpd, apache httpd, ….
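The Forrest-style signature database, adapted here from system calls to library calls, can be sketched as a set of sliding-window call sequences. The call names and window length k below are illustrative; the project's actual instrumentation is not described in these slides.

```python
def windows(trace, k):
    """All length-k sliding-window call sequences in a trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def build_db(training_traces, k):
    """Signature database: every length-k sequence seen in normal runs."""
    db = set()
    for trace in training_traces:
        db.update(windows(trace, k))
    return db

def mismatch_rate(trace, db, k):
    """Fraction of the trace's sequences absent from the database."""
    seqs = windows(trace, k)
    return sum(1 for s in seqs if s not in db) / len(seqs)

normal_run = ["fopen", "fread", "fread", "fclose", "malloc", "free", "fopen", "fread", "fclose"]
db = build_db([normal_run], k=3)
print(mismatch_rate(["fopen", "fread", "fread", "fclose", "malloc", "free"], db, k=3))  # 0.0
print(mismatch_rate(["system", "strcpy", "system", "strcpy"], db, k=3))                # 1.0
```

The "% mismatches" figures on the later detection-results slides are exactly this kind of statistic, computed over much larger databases.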
[Slide 29]
App Calls to Language Library
• Status
  – Able to build a robust database
  – C library routines number about 200 -- comparable to Unix system calls
• Library call signatures built for applications
  – wu-ftpd, apache httpd [Next app: mysql]
[Slide 30]
Robust Database is Derived

Different streams   Total seq   New seq   DB size
        3               631        196       196
        6              1155        106       302
        9              1893         56       358
       12              3888         63       421
       14              4474         87       508
       18              5533          4       512
       21              6083          6       518
       24              7053         53       571
       27              8428          0       571

[Chart: Signature Database - ps -- DB size vs. total sequences scanned (0-10000); growth levels off at 571]
[Slide 31]
Signature Length Has Little Effect

Sequence length k   Largest minimum Hamming distance   Normalized anomaly signal ~SA
        2                          2                            1
        3                          3                            1
        4                          4                            1
        5                          4                            0.8
        6                          5                            0.833
        7                          6                            0.857
        8                          7                            0.875
        9                          8                            0.889
       10                          8                            0.8
       11                          9                            0.818
       12                         10                            0.833
       13                         11                            0.846
       14                         12                            0.857
       15                         13                            0.867
       16                         14                            0.875
       17                         14                            0.824
       18                         15                            0.833
       19                         16                            0.842
       20                         16                            0.8
        …                          …                            …

[Chart: Normalized anomaly signal vs. sequence length k for the phf and nph-test-cgi intrusions]
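The ~SA values in the table are consistent with taking the largest minimum Hamming distance between observed call sequences and database signatures, normalized by the sequence length k. A sketch under that reading; the database contents here are invented for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def normalized_anomaly_signal(trace, db, k):
    """~SA: max over the trace's length-k windows of the minimum Hamming
    distance to any database signature, divided by k."""
    wins = [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]
    return max(min(hamming(w, sig) for sig in db) for w in wins) / k

db = {("open", "read", "close")}
print(normalized_anomaly_signal(["open", "read", "close"], db, 3))  # 0.0 -- known sequence
print(normalized_anomaly_signal(["exec", "dup2", "exec"], db, 3))   # 1.0 -- nothing matches
```

Dividing by k is what makes the signal roughly flat across sequence lengths, which is the point of this slide.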
[Slide 32]
Detection Results (1/2)
• Two buffer-overflow cases based on ftpd
Application – Buffer Overflow I    DB size   # mismatches   % mismatches   ~SA
ftpd – Library Call                 1285        467            3.5         0.7
ftpd – System Call                  1004        348            3.1         0.6

Application – Buffer Overflow II   DB size   # mismatches   % mismatches   ~SA
ftpd – Library Call                 1285        569            2.7         0.6
ftpd – System Call                  1004        501            4.0         0.6
[Slide 33]
Detection Results (2/2)
• DOS case based on vi
Application – vi          # of libcalls   DB size   # mismatches   % mismatches   ~SA
First run – Normal            3845          868
Second run – Intrusion        3857          911        101            2.6         0.6
[Slide 34]
Sig2: Time-based Signatures
• Time sequences -- monitor timing behavior of sequences demarcated by application system calls or language library invocations
• Time sequences consist of time stamps:
  – the relative time between system calls, or
  – the relative time within system calls
  – (idle time of the monitored application may be excluded or not)
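A minimal sketch of a temporal signature built from such time stamps: learn the per-position mean and standard deviation over normal runs, then flag stamps that deviate by several standard deviations. The timing values and the 3-sigma rule are assumptions for illustration, not the project's stated method.

```python
import statistics

def temporal_profile(normal_runs):
    """Per-position (mean, std) of relative time stamps across normal runs.
    Each run is a list of relative times between consecutive calls."""
    return [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*normal_runs)]

def deviations(run, profile, n_sigma=3.0):
    """Indices whose time stamp falls outside mean +/- n_sigma * std."""
    return [i for i, (t, (mu, sigma)) in enumerate(zip(run, profile))
            if abs(t - mu) > n_sigma * max(sigma, 1e-9)]

normal_runs = [[34, 88, 24], [32, 84, 23], [33, 86, 23], [33, 85, 24]]
profile = temporal_profile(normal_runs)
print(deviations([33, 86, 24], profile))   # []  -- consistent with normal timing
print(deviations([33, 300, 24], profile))  # [1] -- e.g. injected code running inside a call
```

High-variance positions (such as the open column on a later slide) would be dropped from the profile before monitoring.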
[Slide 35]
Results (1)
• Distribution of the temporal signature
  – Usually, the frequency distribution for a time stamp is a normal distribution.
• Definition of a robust database
  – High-variance time stamps may be excluded.
  – After excluding high-variance time stamps, the robust database should still include more than 90%.
[Slide 36]
Typical Features of the Time Stamp
• Frequency distribution of time stamps for a system call (length-10 sequence)
• Exclude high variance: open; sequence case 11

[Histogram: Distribution of relative time stamps (r2) of system call getpid -- Mean = 65, Std. Dev = 13.08, N = 2758]

Sequence length: 10

Call        __sysctl   open     fstat     seteuid   socket    setsockopt  bind     getpid    seteuid   setsockopt
case 1        34         88       24        22        43         21         27       62        20        19
case 2        32         84       23        23        37         21         27       61        19        19
case 3        33         86       23        22        37         23         27       60        19        19
case 4        33         56       24        22        38         21         27       66        20        20
  :           33         55       25        23        41         22         28       60        20        19
  :           34         50       23        23        40         23         28       62        21        20
              34         83       25        23        41         23         29       63        20        20
              33         88       24        24        42         23         29       64        21        20
              35         54       24        23        43         22         28       61        20        20
              36         55       23        23        41         24         27       61        21        19
case 11       34         52       24        50        40         22         27       62        22        19
case 12       33         53       23        23        41         22         27       62        20        19
Mean          33.666     67.0     23.750    25.083    40.333     22.250     27.583   62.0      20.250    19.416
Std. dev.      1.0274    16.722    0.72169   7.5328    1.97203    0.92421    0.7592   0.59512   0.82916   0.49301
[Slide 37]
Results (2)
• In the absence of intrusion
  – temporal signatures of applications are very similar
    • . . . under changes of environment, e.g., workload, or the speed and throughput of the network used by the application
• In the presence of intrusion
  – Certain buffer overflow intrusions cannot be detected using system call sequences as a signature.
  – A difference between the temporal signature of the intrusion and that of normal applications can be observed.
[Slide 38]
Buffer Overflow Intrusion Detection
• Compare the system call signature to the temporal signature
• Only the latter can detect the intrusion

[Figure: normal vs. buffer-overflow runs compared under both signatures]
[Slide 39]
Conclusions -- Signatures
• Applications have a rich surface structure when defined in their own terms -- not OS terms
• I.e. signatures based on:
  – calls to classic language libraries (e.g. C, C++)
  – timing sequences
• We have demonstrated that these signatures are a basis for intrusion detection in appln.s
[Slide 40]
Immediate Milestones
• Complete demo experiments/measurements on actual intrusion detection using
  – signatures based on language library calls, and
  – temporal signatures
• Paper on each by Spring
  – “Temporal Signatures for Intrusion Detection” by Song Li & Anita Jones
  – “Intrusion Detection Using Library Calls” by Yu Lin & Anita Jones
[Slide 41]
Approach 3 - Progress Patterns
• Determine if it is possible (for some classes of appln.s) to define patterns of progress
  – that can be monitored to ensure that the appln. is proceeding as expected
  – Embed sensors on both the control structure and data products
  – Use prior work on appln. threat identification as a starting point
• Work will be complete by August 2001
  – See revised budget
• Slides 42 - 44
[Slide 42]
Progress Patterns
• Progress pattern: a characterization of an appln. that highlights intermediate milestones as the appln. makes progress toward a conclusion, sometimes represented as a constructed data product
• E.g. alternative formulations:
  – finite state machines
  – fault trees
  – . . .
• The challenge is to identify (& be able to detect) the appln. moving through the milestones/stages
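One of those formulations, a finite state machine, can be sketched as a transition table plus a monitor that alarms when a reported milestone is not legal progress from the current state. The milestone names below are hypothetical, not drawn from any real application.

```python
# Hypothetical milestones for a cyclic order-building application.
TRANSITIONS = {
    ("start", "collect_requests"): "requests_collected",
    ("requests_collected", "allocate_resources"): "resources_allocated",
    ("resources_allocated", "emit_order"): "order_built",
    ("order_built", "reset"): "start",
}

def monitor(milestones, state="start"):
    """Progress-pattern monitor: follow reported milestones through the
    FSM and alarm on any transition the pattern does not allow."""
    for event in milestones:
        nxt = TRANSITIONS.get((state, event))
        if nxt is None:
            return f"ALARM: '{event}' is not valid progress from '{state}'"
        state = nxt
    return f"ok: reached '{state}'"

print(monitor(["collect_requests", "allocate_resources", "emit_order"]))  # ok: reached 'order_built'
print(monitor(["collect_requests", "emit_order"]))  # alarm -- the allocation step was skipped
```

Sensors embedded in the code and on data products would supply the milestone events; the transition table is the per-application, hand-crafted part.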
[Slide 43]
Appln.s with Progress Patterns
• Many military appln.s are cyclic -- they repeatedly perform a function or build a data product. E.g.:
  – construction of the air tasking order
  – building a transportation convoy order (inventory)
  – building a tank refueling operation for a suite of tank trucks
• Challenge: specify when and in what form “progress” can be characterized and monitored
• Request to DARPA: Help us acquire the specifications for some military software system that will have progress patterns
[Slide 44]
Progress Patterns: Immediate Milestones
• Adapt the analysis technique developed earlier for threat determination (April)
• Apply it to several applications to define progress for those applications (April, May, June)
• Determine how to derive a generic-as-possible monitoring system to monitor for progress
  – what analysis tools (e.g. fault tree decision tools) can be used to derive the progress characterization?
  – embedding of sensors in code & on data
  – potential for generic monitor code
• Document initial results (June)
[Slide 45]
Immediate DARPA Questions
See Following Slides
[Slide 46]
Immediate Milestones
• Complete demo experiments/measurements on actual intrusion detection using
  – signatures based on language library calls, and
  – temporal signatures
• Paper on each by Spring
  – “Temporal Signatures for Intrusion Detection” by Song Li & Anita Jones
  – “Intrusion Detection Using Library Calls” by Yu Lin & Anita Jones
[Slide 47]
Immediate Milestones (cont)
• Complete initial analysis of progress patterns
• Acquire specifications of a military system through DARPA
• Apply the progress pattern approach to several example systems
• Milestones
  – Early analysis (April)
  – Case studies (April, May, June)
  – Final report and paper (August)
[Slide 48]
Exactly Who Works on Project
• Faculty: Anita Jones
• Graduate Research Assistants (full-time graduate students)
  – Yu Lin
  – Song Li
  – Rick Li