Data-Driven Security

Sridhar Rajagopalan, Apigee

Upload: apigee-google-cloud

Post on 16-Apr-2017

Security in the context of APIs = Adaptive and Data Driven

Source: Incapsula

Velocity and Exposure to Abuse are two sides of the same coin.

Exposure: Undesired Uses, KPI Data Pollution, Cost Increases, Attacks

Velocity: Integration, Things, Quality Improvements, DevOps

How can you make sense in a Fishmarket?

Apigee Sense: In a nutshell

Figure labels: Bot Attack Stopped; Legitimate Traffic; sense; data signatures

A global processing pipeline for data flowing through Apigee Edge, with a feedback loop that allows traffic shaping on Edge.

Collect + Analyze + Act

Collect: We collect over 1 billion records each day from traffic running through Apigee Edge. This data is collected at over 1,000 different API endpoints (servers) and delivered to the data lake with less than 5 minutes of end-to-end latency by a high-throughput, fully distributed data flow engine. There is negligible data loss within this system. The system is designed for better than 99.99% availability.

These records represent API calls in a large number of industry segments: Hospitality, Telco, Retail, Healthcare, Manufacturing, and more.

Figure labels: Apigee Edge; Data Lake

Thousands of Servers, globally distributed. Running a highly available Managed API Service.

Over a billion API calls per day served with 99.99% availability

Over a Terabyte of data stored each day. Globally distributed. Accessible from a high throughput analysis system. Managed for a 90 day or greater retention period.

High throughput data flow engine.
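To put the Collect figures in perspective, here is a short back-of-the-envelope calculation. It uses only the numbers quoted above; since the slides say "over", the results are lower bounds. The decimal terabyte (10^12 bytes) and the 365-day year are assumptions made for the arithmetic.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slides (all stated as "over", so treat them as floors).
records_per_day = 1_000_000_000   # "over 1 Billion records each day"
bytes_per_day = 10**12            # "over a Terabyte of data stored each day" (decimal TB assumed)
retention_days = 90               # "90 day or greater retention period"
availability = 0.9999             # "99.99% availability"

# Average ingest rate the data flow engine has to sustain.
records_per_second = records_per_day / SECONDS_PER_DAY

# Implied average size of one stored record.
bytes_per_record = bytes_per_day / records_per_day

# Minimum volume held in the data lake at steady state.
retained_terabytes = bytes_per_day * retention_days / 10**12

# Downtime budget implied by the availability target (365-day year assumed).
downtime_minutes_per_year = (1 - availability) * 365 * 24 * 60

print(f"average ingest rate : {records_per_second:,.0f} records/s")
print(f"average record size : {bytes_per_record:,.0f} bytes")
print(f"retained volume     : {retained_terabytes:,.0f} TB")
print(f"downtime budget     : {downtime_minutes_per_year:.2f} min/year")
```

That works out to roughly 11,574 records per second on average, about 1 KB per record, at least 90 TB retained, and a downtime budget of about 52.56 minutes per year.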

Analyze: The data in the data lake is automatically analyzed by a large cluster using machine learning algorithms. The results are stored back into the data lake. The cluster runs algorithms that consider all of the data, not just the data belonging to any one customer. These algorithms consider data seen over large time windows (24 hours or more). This lets our customer network benefit from network effects: an attack on any one of our customers is used to learn from and defend all of our customers.

The cluster is designed to do this with low latency (a few minutes) between when data is available and result computation is completed. The cluster is able to auto-scale to process more data when data rates are higher, and scale down to keep costs under control when data rates are lower.
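The scale-up and scale-down behavior can be pictured with a minimal sketch. The slides do not describe the actual scaling policy, so the function name `workers_needed`, the per-worker capacity, and the bounds below are all hypothetical, chosen only to show the idea of sizing the cluster to the incoming data rate while capping cost.

```python
import math


def workers_needed(records_per_minute, records_per_worker_minute,
                   min_workers=2, max_workers=200):
    """Return a cluster size for the current data rate (hypothetical policy).

    records_per_worker_minute is an assumed per-worker throughput;
    min_workers keeps the analysis running when traffic is low, and
    max_workers caps cost when traffic spikes.
    """
    if records_per_worker_minute <= 0:
        raise ValueError("records_per_worker_minute must be positive")
    raw = math.ceil(records_per_minute / records_per_worker_minute)
    return max(min_workers, min(max_workers, raw))


# Higher data rate: scale up so results stay a few minutes behind the data.
print(workers_needed(1_000_000, 50_000))
# Lower data rate: scale down to the floor to keep costs under control.
print(workers_needed(10_000, 50_000))
```

The floor and ceiling make the timeliness versus cost trade-off explicit: the floor bounds latency at quiet times, the ceiling bounds spend at peak.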

Data Lake

Analysis Cluster

Machine learning algorithms run both “per customer” and “global” analyses, then interpret the combined results in a per-customer context.
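A toy illustration of combining per-customer and global analysis follows. It is not the Sense algorithm: a plain request-count threshold stands in for the machine learning step, and the record shape `(customer, client)`, the function name `analyze`, and the threshold value are assumptions made for the example.

```python
from collections import Counter, defaultdict


def analyze(records, per_customer_threshold=100):
    """records: iterable of (customer, client) pairs from one time window."""
    # Per-customer pass: how often each client called each customer's APIs.
    per_customer = defaultdict(Counter)
    for customer, client in records:
        per_customer[customer][client] += 1

    # Global pass: clients that look abusive at any one customer.
    globally_suspect = {
        client
        for counts in per_customer.values()
        for client, n in counts.items()
        if n >= per_customer_threshold
    }

    # Interpret the combined result in each customer's own context:
    # report only clients that customer has actually seen.
    findings = {}
    for customer, counts in per_customer.items():
        findings[customer] = {
            client: ("local" if n >= per_customer_threshold
                     else "seen attacking another customer")
            for client, n in counts.items()
            if client in globally_suspect
        }
    return findings


window = [("a", "bot")] * 150 + [("b", "bot")] * 3 + [("b", "user")] * 5
print(analyze(window))
```

Here customer `b` has seen only three calls from `bot`, too few to stand out locally, yet the client is still surfaced because of what it did to customer `a`. That is the network effect described above.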

The cluster scales to balance the needs for timeliness and cost.

Terabytes of data move between the cluster and the data lake each day.

Act: The results are presented on a dashboard. A Monitoring Engine also generates actionable alerts when attacks are detected. The dashboard shows a drill-down view of every attack. Any action taken at the dashboard is stored back in the data lake.

Actions are then read and used to shape the traffic running through Apigee Edge. Other than enabling the Sense service, there is no footprint on the Edge API Proxy. This means that we can effectively separate the concerns around security and defense of the API from those around programming and delivering the API program.
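The separation of concerns can be sketched as follows. This is an illustration only, not Apigee Edge's implementation: the action names, the `shape`, `api_proxy`, and `handle` functions, and the request dictionary are all hypothetical. The point is that the proxy logic contains no security code, and the shaping step is driven entirely by stored actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    forward: bool
    reason: str


def shape(client_id, actions):
    """Look up the stored action for a client; unknown clients are allowed."""
    action = actions.get(client_id, "allow")
    if action == "block":
        return Decision(False, "blocked by stored action")
    return Decision(True, action)


def api_proxy(request):
    """Stand-in for the customer's API proxy: no security logic in here."""
    return f"200 OK for {request['path']}"


def handle(request, actions):
    """Shaping runs in front of the proxy, outside its development path."""
    decision = shape(request["client"], actions)
    if not decision.forward:
        return "403 Forbidden"
    return api_proxy(request)


# Actions as they might be read back from the data lake (hypothetical shape).
stored_actions = {"bot": "block"}
print(handle({"client": "bot", "path": "/orders"}, stored_actions))
print(handle({"client": "user", "path": "/orders"}, stored_actions))
```

Changing `stored_actions` changes what traffic reaches the proxy without touching or redeploying `api_proxy`.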

Figure labels: Data Lake; Apigee Edge; Dashboard and Monitoring

Traffic shaping on Apigee Edge is implemented outside the mainline API proxy development and deployment path in order to separate the concerns around security from those around delivering the API program.

Alerting will watch for you. Drill down so that you know who is hitting you and how. Act so that you can stop or manage them. Maintain history for audit purposes.

Thank You