logic-based, data-driven enterprise network security analysis
Logic-based, data-driven enterprise network security analysis. Xinming (Simon) Ou, Assistant Professor, CIS Department, Kansas State University. COS 598D: Formal Methods in Networking, Princeton University, March 08, 2010.
Logic-based, data-driven enterprise network security analysis
Xinming (Simon) Ou
Assistant Professor
CIS Department
Kansas State University
COS 598D: Formal Methods in Networking
Princeton University
March 08, 2010
Self Introduction
• Brief Bio
– PhD, Princeton University, 2005
– Post-doc, Purdue CERIAS, Idaho National Laboratory, 2006
– Assistant Professor, Kansas State University, 2006-now
• Research Interests
– Computer and network security, especially formal and quantitative analysis
– Programming languages, formal methods
• Research Group
– Argus: http://people.cis.ksu.edu/~xou/argus/
Overview of the two lectures
• Lecture One
– Datalog model for network attacks
– SLG resolution for Datalog evaluation
– Exhaustive proof generation for Datalog
• Lecture Two
– Formulating the security hardening problem as a SAT solving problem
– Applying MinCostSAT to achieve optimal security configuration
– Open research problems
Cyber Defender’s Life
Security advisories
Apache 1.3.4 bug!
Vulnerability reports
Network configuration
IDS alerts
Users and data assets
Reasoning System
Automated Situation Awareness
Multi-step Attacks
[Figure: network diagram. Zones: Internet, demilitarized zone (DMZ), Corporation. Firewalls: Firewall 1, Firewall 2. Hosts: webServer, workStation, fileServer (webPages, sharedBinary). Attack steps: buffer overrun, NFS shell, Trojan horse.]
Two Questions
• Are there potential attack paths in the system?
– How can they happen?
– How can they be addressed in an optimal way?
• Are there attacks that are going on/have succeeded in the system?
– How do you know?
– How to counter the attack?
What we are going to focus on
MulVAL
Datalog Rules from Security Experts
Vulnerability Scanner
Analyzer
Could root be compromised on any of the machines?
Ou, Govindavajhala, and Appel. USENIX Security 2005
Answers
Network Analyzer
Vulnerability information (e.g. NIST NVD)
Network reachability information
Vulnerability definition (e.g. OVAL, Nessus Scripting Language)
User information
Vulnerability Scanner
Network config (firewall analyzer)
Host access-control lists
reachable(internet, webServer, tcp, 80)
reachable(webServer, fileServer, nfs, -)
…
Host config scanner
File permissions
fileOwner(webServer, /bin/apache, root)
fileAttr(webServer, /bin/apache, r,w,x,r,0,0,r,0,0)
Host-based vulnerability scanner
Installed software
vulExists(webServer, 'CVE-2006-3747', httpd).
vulExists(dbServer, 'CVE-2009-2446', mySQL).
… …
US-CERT
NVD
Apache 1.3.4 bug!
Security advisories
vulProperty('CVE-2006-3747', remote, privEscalation).
vulProperty('CVE-2009-2446', remote, privEscalation).
… …
Security expert
Datalog Rules
execCode(Host, PrivilegeLevel) :-
    vulExists(Host, Program, remote, privilegeEscalation),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
    networkAccess(Host, Protocol, Port).
Linux security behavior
Windows security behavior
Common attack techniques
The rules are completely independent of any site-specific settings.
Rule for NFS
[Figure: dmz and corp zones. Hosts: webServer, fileServer (webPages, sharedBinary). Attack step: NFS shell.]
accessFile(Server, Access, Path) :-
nfsExport(Server, Path, Access, Client),
reachable(Client, Server, nfs, -),
execCode(Client, _Perm).
Rule for Trojan Horse
[Figure: corp zone. Hosts: workStation, fileServer (webPages, projectPlan, sharedBinary). Attack step: Trojan horse.]
execCode(H, User) :-
    accessFile(H, write, Path),
    fileOwner(H, Path, User).
Deducing new facts
execCode(Host, PrivilegeLevel) :-
    vulExists(Host, Program, remote, privilegeEscalation),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
    networkAccess(Host, Protocol, Port).
[Figure: internet, Firewall 1, dmz, webServer]
vulExists(webServer, httpd, remote, privilegeEscalation).
serviceRunning(webServer, httpd, tcp, 80, apache).
networkAccess(webServer, tcp, 80).
execCode(webServer, apache).
Oops!
From Vulnerability Scanner & NVD
From Vulnerability Scanner
Derived
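The deduction on this slide can be imitated outside a Datalog engine. What follows is a minimal Python sketch, not MulVAL's implementation: the facts are plain tuples, and `derive_exec_code` is a hypothetical helper that joins the three body predicates of the `execCode` rule on their shared variables.

```python
# Illustrative only: the three facts from the slide, encoded as tuples.
vul_exists = {("webServer", "httpd", "remote", "privilegeEscalation")}
service_running = {("webServer", "httpd", "tcp", 80, "apache")}
network_access = {("webServer", "tcp", 80)}


def derive_exec_code(vul_exists, service_running, network_access):
    """Apply the execCode rule once by joining its three body predicates."""
    derived = set()
    for host, program, locality, effect in vul_exists:
        if (locality, effect) != ("remote", "privilegeEscalation"):
            continue
        for s_host, s_program, protocol, port, privilege in service_running:
            # Host and Program must agree between the first two subgoals.
            if (s_host, s_program) != (host, program):
                continue
            # Protocol and Port must agree with a networkAccess fact.
            if (host, protocol, port) in network_access:
                derived.add((host, privilege))
    return derived


exec_code = derive_exec_code(vul_exists, service_running, network_access)
print(exec_code)
```

Each derived pair corresponds to one `execCode(Host, PrivilegeLevel)` fact; removing any one of the three input facts leaves the result empty.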
Advantages of using Prolog
• Prolog’s goal-oriented evaluation is potentially more efficient.
• Prolog provides more programming flexibility.
Can we evaluate Datalog programs in Prolog?
However…
• Prolog as a programming language cannot be directly used to evaluate Datalog
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
parent(bill,mary).
parent(mary,john).
?- ancestor(X,Y).
However…
• Prolog as a programming language cannot be directly used to evaluate Datalog
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
parent(bill,mary).
parent(mary,john).
?- ancestor(X,Y).
However…
• Prolog as a programming language cannot be directly used to evaluate Datalog
ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
ancestor(X,Y) :- parent(X,Y).
parent(bill,mary).
parent(mary,john).
?- ancestor(X,Y).
Problem of SLD resolution
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
parent(bill,mary).
parent(mary,john).
ancestor(X, Y).
  parent(X,Y).
    X=bill, Y=mary: Success
    X=mary, Y=john: Success
  parent(X,Z), ancestor(Z,Y).
    X=bill, Z=mary: ancestor(mary,Y).
      parent(mary,Y).
        Y=john: Success
      parent(mary,Z2), ancestor(Z2,Y).
        Z2=john: ancestor(john,Y). … Failure
    X=mary, Z=john: ancestor(john,Y). … Failure
Problem of SLD resolution
ancestor(X, Y).
ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
ancestor(X,Y) :- parent(X,Y).
parent(bill,mary).
parent(mary,john).
ancestor(Z, Y), parent(X, Z).
ancestor(Z1, Y), parent(Z, Z1), parent(X, Z).
ancestor(Z2, Y), parent(Z1, Z2), parent(Z, Z1), parent(X, Z).
…
Problem of SLD resolution
• Under SLD resolution, termination of cyclic Datalog programs depends not only on the logical semantics but also on the order of the clauses and subgoals.
– This creates problems because such cyclic rules are commonplace in network security analysis.
  • e.g. after compromising one machine, the attacker can use it as a stepping stone to compromise another.
– Datalog is a declarative language; thus order should not matter.
– Evaluation of a pure Datalog program should always terminate, because the number of tuples is bounded.
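The order dependence can be reproduced with a small Python sketch. It is an analogy rather than a Prolog engine: the hypothetical generators `ancestor_right` and `ancestor_left` mimic depth-first, left-to-right evaluation of the two clause orderings from the preceding slides, and Python's recursion limit stands in for non-termination.

```python
# parent(bill,mary).  parent(mary,john).
PARENT = [("bill", "mary"), ("mary", "john")]


def ancestor_right(x):
    """ancestor(x, Y) with the recursive subgoal placed after parent/2."""
    # ancestor(X,Y) :- parent(X,Y).
    for p, c in PARENT:
        if p == x:
            yield (x, c)
    # ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
    for p, z in PARENT:
        if p == x:
            for _, y in ancestor_right(z):
                yield (x, y)


def ancestor_left(x=None):
    """Same relation, but the recursive subgoal is tried first."""
    # ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
    for z, y in ancestor_left(None):  # recurses before consulting any fact
        for p, c in PARENT:
            if c == z and (x is None or p == x):
                yield (p, y)
    # ancestor(X,Y) :- parent(X,Y).   (never reached)
    for p, c in PARENT:
        if x is None or p == x:
            yield (p, c)


right_answers = list(ancestor_right("bill"))
print(right_answers)

try:
    list(ancestor_left("bill"))
    left_outcome = "terminated"
except RecursionError:
    left_outcome = "infinite regress"
print(left_outcome)
```

Both functions encode the same logical relation; only the position of the recursive call differs, which is exactly the property a declarative language should not be sensitive to.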
Bottom-up Evaluation
Semi-naïve Evaluation:
Step (1) (base case): ancestor(bill,mary), ancestor(mary,john)
Step (2)
Iteration 1: ancestor(bill,john)
Iteration 2: no new tuples (“fixpoint”)
ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
ancestor(X,Y) :- parent(X,Y).
parent(bill,mary).
parent(mary,john).
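The iterations listed on this slide can be traced with a short Python sketch of semi-naïve evaluation, specialised by hand to the `ancestor` program. The function name `semi_naive_ancestor` and the `rounds` log are illustrative choices, not part of any Datalog system.

```python
PARENT = {("bill", "mary"), ("mary", "john")}


def semi_naive_ancestor(parent):
    """Compute ancestor/2 bottom-up, joining only the newest tuples."""
    # Base case: ancestor(X,Y) :- parent(X,Y).
    total = set(parent)
    delta = set(parent)
    rounds = []
    while delta:
        rounds.append(sorted(delta))
        # Recursive rule: ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
        # Only tuples derived in the previous round (delta) are joined.
        new = {(x, y) for (z, y) in delta for (x, z2) in parent if z2 == z}
        delta = new - total
        total |= delta
    return total, rounds


ancestors, rounds = semi_naive_ancestor(PARENT)
for i, newly_derived in enumerate(rounds):
    print(i, newly_derived)
```

The loop stops as soon as a round derives nothing new, which is the fixpoint; termination follows from the finite number of possible tuples, regardless of clause order.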
SLG Resolution
• Goal-oriented evaluation
• Predicates can be “tabled”
– A table stores the evaluation results of a goal.
– The results can be re-used later, i.e. dynamic programming.
– Entering an active table indicates a cycle.
– Fixpoint operation is taken at such tables.
• The XSB system implements SLG resolution
– Developed by Stony Brook (http://xsb.sourceforge.net/).
– Provides full ISO Prolog compatibility.
SLG resolution example
ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
ancestor(X,Y) :- parent(X,Y).
parent(bill,mary).
parent(mary,john).
ancestor(X, Y). (generator node: new table created for ancestor(X,Y))
  ancestor(Z, Y), parent(X, Z). (active node: resolve ancestor(Z,Y) against the results in the table for ancestor(X,Y))
    Z=bill, Y=mary: parent(X, bill). Failure
    Z=mary, Y=john: parent(X, mary). X=bill: Success
    Z=bill, Y=john: parent(X, bill). Failure
  parent(X,Y).
    X=bill, Y=mary: Success
    X=mary, Y=john: Success
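The tabling idea can be approximated in a few lines of Python. This is a deliberately simplified sketch and not the SLG algorithm as implemented in XSB: there is no suspension and resumption of consumers, only a table that is re-read until it stops growing. The names `table` and `solve_ancestor` are hypothetical.

```python
PARENT = {("bill", "mary"), ("mary", "john")}
table = {}  # goal -> set of answers found so far


def solve_ancestor():
    """Answer ancestor(X,Y) for the left-recursive program using a table."""
    goal = "ancestor(X,Y)"
    if goal in table:
        # Re-entering an existing table signals a cycle: consume the
        # stored answers instead of expanding the goal again.
        return set(table[goal])
    table[goal] = set()  # generator node: create the table
    changed = True
    while changed:  # fixpoint taken at this table
        changed = False
        answers = set()
        # ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
        for z, y in solve_ancestor():  # active node: reads the table
            answers |= {(x, y) for (x, z2) in PARENT if z2 == z}
        # ancestor(X,Y) :- parent(X,Y).
        answers |= PARENT
        if not answers <= table[goal]:
            table[goal] |= answers
            changed = True
    return table[goal]


result = solve_ancestor()
print(sorted(result))
```

Unlike the depth-first evaluation shown earlier, the left-recursive clause order no longer causes an infinite regress, because the recursive subgoal is answered from the table.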
SLG in MulVAL
netAccess(H2, Protocol, Port) :-
execCode(H1, User),
reachable(H1, H2, Protocol, Port).
netAccess(…): possible instantiations (table for goal)
execCode(…): possible instantiations (table for first subgoal)
from input tuples
SLG complexity for Datalog
• Total time is dominated by the rule that has the maximum number of instantiations
– Time for computing one table = computation of the subgoals + retrieving information from input tuples + matching results in the rules’ bodies
– Time for computing all tables = retrieving information from input tuples + matching results in the rules’ bodies
• See “On the Complexity of Tabled Datalog Programs” http://www.cs.sunysb.edu/~warren/xsbbook/node21.html
MulVAL complexity in SLG
execCode(Attacker, Host, User) :-
    vulExists(Host, _, Program, remote, privilegeEscalation),
    networkService(Host, Program, Protocol, Port, User),
    netAccess(Attacker, Host, Protocol, Port).
Scale with network size
O(N) different instantiations
MulVAL complexity in SLG
netAccess(Attacker, H2, Protocol, Port) :-
    execCode(Attacker, H1, _),
    reachable(H1, H2, Protocol, Port).
Scale with network size
O(N^2) different instantiations
Complexity of MulVAL
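A rough way to see where the N and N^2 counts come from is to enumerate rule instantiations directly. The Python sketch below does this under simplifying assumptions that are not stated on the slides: a single attacker, one vulnerable service per host, a single protocol and port, and a fully connected network. `count_instantiations` is a hypothetical helper.

```python
from itertools import product


def count_instantiations(n_hosts):
    """Count ground instances of the two rule bodies for one attacker."""
    hosts = [f"h{i}" for i in range(n_hosts)]
    # execCode rule: the only host-valued variable is Host, so there is
    # one instance per host.
    exec_code_instances = len(hosts)
    # netAccess rule: H1 and H2 range over hosts independently, so there
    # is one instance per (H1, H2) pair.
    net_access_instances = sum(1 for _ in product(hosts, hosts))
    return exec_code_instances, net_access_instances


for n in (10, 100, 1000):
    print(n, count_instantiations(n))
```

Under these assumptions the `netAccess` rule dominates, which matches the quadratic bound given for the whole analysis.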
Datalog proof generation
• In security analysis, we want to know not only what attacks could happen, but also how they can happen
– Thus, we need more than a yes/no answer for queries.
– We need the proofs for the true queries, which in the case of security analysis will be attack paths.
– We also want to know all possible attack paths; thus we need exhaustive proof generation.
An obvious approach
execCode(Host, PrivilegeLevel) :-
    vulExists(Host, Program, remote, privilegeEscalation),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
    networkAccess(Host, Protocol, Port).
execCode(Host, PrivilegeLevel, Pf) :-
    vulExists(Host, Program, remote, privilegeEscalation, Pf1),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel, Pf2),
    networkAccess(Host, Protocol, Port, Pf3),
    Pf=(execCode(Host, PrivilegeLevel), [Pf1, Pf2, Pf3]).
This will break the bounded-term property and result in non-termination for cyclic Datalog programs
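The loss of the bounded-term property can be observed on a toy example. The Python sketch below uses a hypothetical two-host cycle (not taken from the slides) and runs bottom-up rounds in which every derived fact carries its proof term, in the spirit of the `Pf` argument above.

```python
# Hypothetical two-host cycle: each host can be compromised from the other.
#   exec(a) :- foothold.    exec(b) :- exec(a).    exec(a) :- exec(b).
RULES = [("exec(a)", ["foothold"]),
         ("exec(b)", ["exec(a)"]),
         ("exec(a)", ["exec(b)"])]


def grow(rounds):
    """Run bottom-up rounds, storing a proof term with every fact."""
    proved = {("foothold", "given")}  # (fact, proof term) pairs
    sizes = []
    for _ in range(rounds):
        new = set(proved)
        for head, body in RULES:
            subgoal = body[0]  # every rule here has a single subgoal
            for fact, proof in proved:
                if fact == subgoal:
                    # The proof of the head nests the proof of the body.
                    new.add((head, (head, proof)))
        proved = new
        facts = {fact for fact, _ in proved}
        sizes.append((len(facts), len(proved)))
    return sizes


sizes = grow(6)
print(sizes)  # (distinct facts, distinct fact/proof pairs) per round
```

The number of distinct facts stops at three, while the number of fact/proof pairs keeps growing with every round, so an evaluation that tables proof terms never reaches a fixpoint.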
MulVAL Attack-Graph Toolkit
[Figure: toolkit pipeline. Inputs in Datalog representation: machine configuration, network configuration, security advisories. Datalog rules (translated rules) → XSB reasoning engine → Datalog proof steps → Graph Builder → Datalog proof graph.]
Ou, Boyer, and McQueen. ACM CCS 2006
Joint work with Idaho National Laboratory
Stage 1: Record Proof Steps
netAccess(H2, Protocol, Port, ProofStep) :-
    execCode(H1, User),
    reachable(H1, H2, Protocol, Port),
    ProofStep = because('multi-hop network access',
                        netAccess(H2, Protocol, Port),
                        [execCode(H1, User), reachable(H1, H2, Protocol, Port)]).
Proof step
Stage 2: Build the Exhaustive Proof
because('multi-hop network access', netAccess(fileServer, rpc, 100003), [execCode(webServer, apache), reachable(webServer, fileServer, rpc, 100003)])
[Figure: proof graph fragment with nodes numbered 0 to 3. Node 1 is the rule node "multi-hop network access"; the other nodes are netAccess(fileServer, rpc, 100003), execCode(webServer, apache), and reachable(webServer, fileServer, rpc, 100003).]
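The second stage can be pictured as a single pass over the recorded proof steps. The Python sketch below is an illustration under assumptions, not the toolkit's Graph Builder: each step is a `(rule, head, body)` triple mirroring `because(...)`, node numbers are assigned in order of first appearance, and edges point from a derived fact to the rule node and from the rule node to its body facts. `build_proof_graph` is a hypothetical name.

```python
# Each recorded step: (rule name, derived fact, list of body facts).
steps = [
    ("multi-hop network access",
     "netAccess(fileServer, rpc, 100003)",
     ["execCode(webServer, apache)",
      "reachable(webServer, fileServer, rpc, 100003)"]),
]


def build_proof_graph(steps):
    """Return (node ids, edges); one rule node is created per proof step."""
    ids = {}    # node label -> integer id (dict lookup: constant time)
    edges = []  # (from id, to id)

    def node(label):
        # Reuse the id of a known label, otherwise assign the next one.
        return ids.setdefault(label, len(ids))

    for i, (rule, head, body) in enumerate(steps):
        head_id = node(head)
        rule_id = node(("rule", i, rule))  # distinct node per step
        edges.append((head_id, rule_id))
        for fact in body:
            edges.append((rule_id, node(fact)))
    return ids, edges


ids, edges = build_proof_graph(steps)
print(edges)
```

Because every node is found or created through a dictionary lookup, the work per proof step is proportional to the size of its body, which is the property a constant-time table lookup argument relies on.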
Complexity of Proof Building
• O(N^2) to complete Datalog evaluation
– With proof steps generated
• O(N^2) to build a proof graph from proof steps
– Need to build O(N^2) graph components
– Building of one component
  • Find the predecessor: table lookup
  • Find the successors: table lookup
Total time: O(N^2), if table lookup is constant time
Logical Attack Graphs
[Figure: logical attack graph with numbered nodes. Legend: OR node, AND node, ground fact. Node labels follow.]
execCode(attacker, workStation, root)
Trojan horse installation
accessFile(attacker, workStation, write, /usr/local/share)
NFS semantics
networkService(webServer, httpd, tcp, 80, apache)
vulExists(webServer, CAN-2002-0392, httpd, remoteExploit, privEscalation)
netAccess(attacker, webServer, tcp, 80)
Remote exploit
execCode(attacker, webServer, apache)
accessFile(attacker, fileServer, write, /export)
NFS shell
Performance and Scalability
[Figure: CPU time (sec, 0.01 to 10000) versus number of hosts (1 to 1000), both on logarithmic axes, for four topologies: fully connected, partitioned, ring, star.]
Related Work
• Sheyner’s attack graph tool (CMU)
– Based on model-checking
• Cauldron attack graph tool (GMU)
– Based on graph-search algorithms
• NetSPA attack graph tool (MIT LL)
– Graph-search based on a simple attack model
Advantages of the Logic-programming Approach
• Publishing and incorporation of knowledge/information through well-understood logical semantics
• Efficient and sound analysis by leveraging the reasoning power of well-developed logic-deduction systems
Next Lecture
• How to make use of the proof graph
– Optimizing mitigation measures through SAT solving
• Open problems
– Uncertainty in reasoning