Mitigating the Insider Threat using High-dimensional Search and Modeling

Presenter: Eric van den Berg, [email protected], December 14, 2005

Team: Shambhu Upadhyaya, Hung Ngo (SUNY Buffalo); Muthu Muthukrishnan, Raj Rajagopalan (Rutgers)

DARPA IPTO Program:

Self Regenerative Systems (SRS)

Program Manager: Lee Badger

PI: Eric van den Berg

SRS PI meeting, December 2005

Outline

Project overview
Results of Red Team Exercise
Success against program metrics
Lessons learned and insights for system improvement
Future work / Next steps


Project overview

Project goal: to build a system that defends critical services and resources against insiders, which
– Correlates large numbers of sensor measurements
– Synthesizes appropriate pro-active responses

What is done today?
– Reactive systems: detect attacks late in cycle
– Anomaly detection systems: few streams for correlation
– Human-based systems: not scalable
– Collateral damage may be large


Project overview (continued)

Technical Approach
– Large network of sensors, to let insider trigger alerts
– High dimensional network state description using sensor alerts
– Search engine finds top-K past states similar to sensor snapshot
– Insider modeler and analyzer tool used to identify attack points, train search engine, guide sensor placement
– Response engine to analyze impact on critical services and synthesize reconfiguration response

Technical Challenges
– Testing SVD-based search technology in a new domain
– New ‘Insider analyzer’ key-challenge graph problem is hard
– Training search engine, labeling and annotating states
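The top-K search over past network states can be illustrated with a minimal sketch. It assumes each state is a vector of per-sensor alert counts and that similarity is cosine similarity in a truncated-SVD subspace; the slides only say the search is SVD-based, so the function names (`build_index`, `top_k`), the rank, and the sample data are all hypothetical.

```python
import numpy as np

def build_index(states, rank):
    """Project past state vectors (rows) onto the top `rank` right singular vectors."""
    A = np.asarray(states, dtype=float)
    # Thin SVD: A = U @ diag(s) @ Vt
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    basis = Vt[:rank]        # shape (rank, n_sensors)
    reduced = A @ basis.T    # shape (n_states, rank)
    return basis, reduced

def top_k(snapshot, basis, reduced, k):
    """Return (index, cosine similarity) for the k past states closest to the snapshot."""
    q = np.asarray(snapshot, dtype=float) @ basis.T
    norms = np.linalg.norm(reduced, axis=1) * np.linalg.norm(q)
    sims = (reduced @ q) / np.where(norms == 0.0, 1.0, norms)
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]

# Hypothetical repository: rows are past states, columns are sensor alert counts.
past_states = [
    [5, 0, 0, 1],   # state 0
    [0, 4, 3, 0],   # state 1
    [4, 0, 1, 1],   # state 2
    [0, 3, 4, 0],   # state 3
]
basis, reduced = build_index(past_states, rank=2)
matches = top_k([6, 0, 0, 1], basis, reduced, k=2)
```

In a deployment along these lines, the labels attached to the returned states (attack or normal) would be what the response engine acts on.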


Architecture

[Figure: system architecture. Components shown: Sensors (four), Normalizer, Filter, Aggregator, Network State Repository, Search Engine (High-Dimensional Search), Response Engine, Insider Modeler and Analyzer, Post-processing, and the Network. Labeled data flows: host scans, audit scans, traffic measurements, state data, top-K list, refined queries, reconfiguration, organizational data, labels and filters for states.]


Insider analyzer and modeler tool (MAPIT)

[Figure: MAPIT engine. Labels shown: network entity rules, cost rules, network topology, vulnerabilities, authentication mechanism, social engineering awareness, key challenge graph, perform sensitivity analysis, defense-centric approach, feedback.]


Telcordia Testbed


Scenario 1 – Exploiting a Vulnerability (KCG)


MAPIT next steps

Integrate with detection system:
– MAPIT can run e.g. once a day, based on network configuration update
Can recommend sensor (re-)positioning
Refine costing models
Improve heuristics for closer to optimal attack sequence prediction
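To make the attack sequence prediction problem concrete, here is a minimal sketch of an exact search on a toy key-challenge graph. The model is an assumption, not taken from the slides: nodes may hold keys, each edge carries one key challenge with a low cost if the attacker holds the key and a high cost otherwise, and visiting a node collects its keys. The function `cheapest_attack`, the host names, and the costs are hypothetical. The search runs over (node, keys held) pairs, so it grows exponentially with the number of keys, which is one way to see why heuristics are needed on realistic networks.

```python
import heapq
from itertools import count

def cheapest_attack(edges, keys_at, start, target, start_keys=()):
    """Exact search over (node, keys held) states; exponential in the number of keys."""
    tie = count()  # tie-breaker so the heap never has to compare sets
    first = (start, frozenset(start_keys) | frozenset(keys_at.get(start, ())))
    best = {first: 0}
    heap = [(0, next(tie), first, [start])]
    while heap:
        cost, _, (node, keys), path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if cost > best.get((node, keys), float("inf")):
            continue  # stale heap entry
        for u, v, key, with_key, without_key in edges:
            if u != node:
                continue
            step = with_key if key in keys else without_key
            nxt = (v, keys | frozenset(keys_at.get(v, ())))
            if cost + step < best.get(nxt, float("inf")):
                best[nxt] = cost + step
                heapq.heappush(heap, (cost + step, next(tie), nxt, path + [v]))
    return None

# Hypothetical network: (from, to, key challenged, cost with key, cost without key)
edges = [
    ("user_pc", "file_server", "admin_pw", 1, 50),
    ("user_pc", "mail_server", "user_pw", 1, 20),
    ("mail_server", "file_server", "admin_pw", 1, 50),
]
keys_at = {"user_pc": ["user_pw"], "mail_server": ["admin_pw"]}
result = cheapest_attack(edges, keys_at, "user_pc", "file_server")
```

In this toy case the cheapest sequence goes through the mail server to pick up the administrator key first, rather than attacking the file server directly.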


Red Team Experiment

Red Team given account on Telcordia testbed

Given information about malicious target files:
– Directory tree containing target file
– Keyword in file
– Mimic ‘moderately informed’ inside attacker

Red Team success: read/modify contents of target file

Blue Team success: block network access before read/modify

10 ‘malicious goals’, 10 ‘non-malicious goals’.


Success against program metric

Metric: thwart or delay >= 10% of malicious insider attacker goals

Results of Red Team exercise: thwarted 4 out of 9 attacks
– Without building additional ‘history’ after attacks
– Implemented only binary response


Lessons learned from experiment

Success: current system can thwart moderately fast insider attacks
– Designed originally for slower attacks
Sensor configuration
– Better configuration based on amount of training data
Response
– Better interaction between search and response
– Desirable: more varied response


Insights for system improvement

Automatic state generation
– E.g.: exchange state definitions among hosts
Sensor configuration
– Can we make sensor configuration more automatic?
– Sensor configuration and selection e.g. guided by amount of available training data
Response generation
– Include e.g. a local ‘preliminary’ response which can be validated by central search/response system


Additional tests

How does the system perform in terms of detection rate / false alarm rate
– If we build ‘known attack’ state repository?
– If we add history under ‘normal operation’?
– If we re-configure sensors?

All these can help mitigate false alarms
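For reference, the two rates can be computed from labeled outcomes as in the sketch below. It assumes the usual definitions (detection rate is detected attacks over all attacks; false alarm rate is alarms on non-attacks over all non-attacks). The function name `rates` is hypothetical and the sample outcomes are made up for illustration; they are not results from the exercise.

```python
def rates(outcomes):
    """outcomes: list of (was_attack, alarm_raised) pairs, one per goal attempt."""
    tp = sum(1 for attack, alarm in outcomes if attack and alarm)
    fn = sum(1 for attack, alarm in outcomes if attack and not alarm)
    fp = sum(1 for attack, alarm in outcomes if not attack and alarm)
    tn = sum(1 for attack, alarm in outcomes if not attack and not alarm)
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate

# Made-up outcomes for illustration only.
example = [(True, True), (True, False), (True, True), (True, True),
           (False, False), (False, True), (False, False), (False, False)]
detection_rate, false_alarm_rate = rates(example)
```

Running the same computation before and after each change (known-attack repository, added normal history, re-configured sensors) would show which of them moves the false alarm rate.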


Improving on the Phase I metrics

It appears possible to thwart / delay a larger range of insider attacks:

Refine response to delay / thwart fast attacks
– Implement host-based methods for e.g. delay until detection decision is reached
No inherent limitation in detection or analysis system to include other sources of information
– Location access, biometrics, audio/visual

Multi-stage attack allows for better detection / response


Performance increase challenges

So far only detect insider attacks which leave a trace on the network

– Collusion, social engineering…
Can detect attacks which are significantly different from ‘normal behavior’
– Easier for insiders to mimic / change normal behavior?
Implemented one response mechanism
– Variety of responses (e.g. key-challenges) possible for various levels of attack / detection confidence
– Local response helps thwart or delay fast attacks


Sketch-based anomaly detector

Streaming data model
– Large data volume and speed: in backbone 1 billion packets/hour/router
– Large data domain: IPv4: 2^32 addresses, IPv6: 2^128
Consequences:
– Can scan data (at most) once
– Need small-space structure to summarize data
– Hard to store O(n) data points when n=2^32; cannot store at 2^128
Idea: build synopsis data structure for IP-packets
– CM-sketches, deltoid group-testing
Detect attacks based on changes in traffic volume
– Currently: traffic to destination IP address (likely targets)

Can detect attacks exhibiting large changes in packet distribution
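A Count-Min (CM) sketch, one of the synopsis structures named above, can be sketched in a few lines. This is a generic illustration rather than the project's implementation: the width, depth, and the use of salted BLAKE2b as the hash family are assumptions. Each packet updates one counter per row in a single pass, and the estimate for a destination is the minimum over its counters, which can overcount on collisions but never undercounts.

```python
import hashlib

class CountMinSketch:
    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row selects the column to update or read.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Minimum over rows: collisions can only inflate a counter.
        return min(self.rows[row][col] for row, col in self._cells(item))

# Hypothetical packet stream keyed by destination IP address.
sketch = CountMinSketch(width=256, depth=4)
for dst in ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.5"]:
    sketch.add(dst)
```

Memory is fixed at width times depth counters regardless of how many distinct addresses appear, which is the property that matters for a 2^32 or 2^128 address domain.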


Test of anomaly detector

Based on week 2 of 1999 MITLL data
– From inside sniffer
Traffic volume based anomaly detection
– Ipsweep, portsweep, phf, httptunnel, etc.
Detects targets of all four above attacks
– Does give additional big changes ~1%, not attacks

Both periodic and instantaneous, relative and absolute change detection

Sub-linear space sketch methods give results nearly as good as full space methods
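The difference between absolute and relative change detection can be shown with a small sketch. Exact per-destination counters stand in for sketch estimates here to keep the example short; the function name `flag_changes`, the thresholds, and the traffic counts are hypothetical.

```python
from collections import Counter

def flag_changes(previous, current, abs_threshold, rel_threshold):
    """Return (absolute_hits, relative_hits): destinations whose packet count changed a lot."""
    absolute_hits, relative_hits = set(), set()
    for dst in set(previous) | set(current):
        before, after = previous.get(dst, 0), current.get(dst, 0)
        delta = abs(after - before)
        if delta >= abs_threshold:
            absolute_hits.add(dst)
        # Divide by at least 1 so previously unseen destinations do not divide by zero.
        if delta / max(before, 1) >= rel_threshold:
            relative_hits.add(dst)
    return absolute_hits, relative_hits

# Hypothetical packet counts per destination in two consecutive periods.
previous = Counter({"10.0.0.1": 1000, "10.0.0.2": 5})
current = Counter({"10.0.0.1": 1100, "10.0.0.2": 60})
absolute_hits, relative_hits = flag_changes(
    previous, current, abs_threshold=100, rel_threshold=2.0
)
```

The busy host is flagged only by the absolute test and the quiet host only by the relative test, which suggests why the two kinds of detection are used together.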


Sketch-based anomaly detection: Next steps

Use small-space sketches to profile insider resource usage
– E.g. file accesses, user commands
Change detection methods for sketches
– Combine various methods to improve efficiency: instantaneous vs periodic, absolute vs relative
Apply sketches to detect change in traffic burstiness


Detecting multistage attacks

How to represent time evolution in multi-stage attacks?
Like learning attacks from documented historical network states, we can also document attack precursors or attack stages
– Full attack now represented as a sequence of network state vectors
– Robust against slow attacks: no explicit dependence on time
– Would like to make ‘precursor’ attack stage annotation (semi-) automatic
Approaches to automatic precursor/state classification
– State sharing
– Remember occurrences of previous stages
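One way to picture an attack as a sequence of state vectors with no explicit dependence on time is the sketch below. It assumes each stage has a reference state vector and that a snapshot matches a stage when their cosine similarity exceeds a threshold; the class `StageTracker`, the threshold, and the vectors are hypothetical. The tracker only remembers how many stages have been seen, so arbitrary delays or unrelated snapshots between stages do not reset it.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class StageTracker:
    def __init__(self, stages, threshold):
        self.stages = stages      # ordered reference state vectors, one per attack stage
        self.threshold = threshold
        self.next_stage = 0       # number of stages already observed

    def observe(self, snapshot):
        """Advance if the snapshot matches the next expected stage; True once all are seen."""
        if (self.next_stage < len(self.stages)
                and cosine(snapshot, self.stages[self.next_stage]) >= self.threshold):
            self.next_stage += 1
        return self.next_stage == len(self.stages)

# Hypothetical three-stage attack with unrelated snapshots in between.
tracker = StageTracker(stages=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], threshold=0.9)
stream = [[1, 0, 0], [1, 1, 1], [0, 1, 0], [1, 1, 1], [1, 1, 1], [0, 0, 1]]
results = [tracker.observe(s) for s in stream]
```

A response could be triggered before the final stage, for example once `next_stage` reaches a chosen precursor stage.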


Future work / next steps

Enable local informed response
– Share state information among hosts/search engines
– Local preliminary response (e.g. delay) helps against fast insider attacks
Integrate MAPIT insider analysis tool with response engine
– Share configuration information for periodic static analysis of insider attack vulnerabilities
Integrate other SRS technologies
– Sophisticated sensors / response enablers
– Large scale system diagnosis / situational awareness
