"Securing eCommerce with Data Metrics". Corey Benninger, Etsy
DESCRIPTION
While application logging and forensic data have long been important after a security incident, they are not frequently used for proactive security. This talk explores ways that application logging, data, and metrics can be used to create effective defenses for web applications. We query Hadoop for the actual threshold numbers used for detecting attacks, proactively monitor for phishing attacks based on our own web server logs, and respond in real time to cross-site scripting attacks by hooking JavaScript methods, among other security countermeasures mined from big data. This presentation will help you build new defense strategies for your applications based on the data you are able to collect.
TRANSCRIPT
Securing eCommerce with Data Metrics
Corey Benninger
Founded in 2005, Etsy is an e-commerce website focused on handmade and vintage items, as well as art and craft supplies.
Continuous Deployment
• Average 35 deploys to production a day
• About 10,000 lines of code a day
Corey Benninger, Senior Software Security Engineer, @0xb3nn
https://www.etsy.com/listing/92868829/the-oh-my-orange-elephant-designer-wall
Overview
• Collecting Metrics
• Viewing Metrics
• Taking Action
• Case Studies
First, a thesis
The security posture of your application is directly
proportional to how much you know about your
application.
https://www.etsy.com/listing/96459220/stick-your-head-in-the-sand-pen-ink
Looks fine from here
Data Collection
Application Stats
https://github.com/etsy/statsd
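    // Flag requests whose URL matches the XSS pattern (preg_match returns 1 on a match).
    if (preg_match(self::PATTERNXSS, $this->url) === 1) {
        // Log the attempt for later analysis.
        $msg = "attacktype=XSS url=" . $this->url;
        Logger::log_info($msg, 'SECURITY');
        // Count it so the attempt shows up on a graph.
        StatsD::increment('security.potential_xss');
        // Add the XSS weight to this client's rate limit; drop the request once it is exceeded.
        if (!$this->rate->checkIncrement(self::XSS_WEIGHT)) {
            $this->drop_request = true;
        }
    }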
StatsD
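To make the `StatsD::increment` call above less opaque, here is a minimal sketch of what a counter increment looks like on the wire. StatsD counters are plain-text UDP datagrams of the form `name:value|c`, and 8125 is the daemon's default port. The function names `format_counter` and `increment` are illustrative, not part of any Etsy client library.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a StatsD counter datagram, e.g. b'security.potential_xss:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def increment(name: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one counter increment over UDP, fire-and-forget.

    Host and port are assumptions for a locally running StatsD daemon.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name), (host, port))
    except OSError:
        # Metrics must never break the request that is being measured.
        pass
    finally:
        sock.close()
```

Because the transport is UDP, a slow or missing metrics daemon does not hold up the request, which is what makes it cheap to instrument security checks this liberally.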
We <3 Graphs
https://github.com/etsy/dashboard
Is this normal?
Smoothing Data
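One common way to smooth a noisy graph is a trailing moving average. The slides do not say which smoothing function was used, so the sketch below is only an illustration of the idea; `moving_average` and its `window` parameter are hypothetical names.

```python
from collections import deque


def moving_average(values, window=5):
    """Return the trailing moving average of `values`.

    Each output point is the mean of the last `window` inputs
    (fewer at the start, before the window has filled).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    smoothed = []
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # the oldest point is about to fall out
        recent.append(value)
        total += value
        smoothed.append(total / len(recent))
    return smoothed
```

A single spike such as `[10, 10, 100, 10, 10]` with `window=3` becomes `[10.0, 10.0, 40.0, 40.0, 40.0]`: the spike is still visible but no longer dominates the graph.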
Get Historical
Log Analysis
https://www.etsy.com/listing/130330032/all-the-things-internet-meme-embroidered
Log all the things
• Event logs
• Visit logs
• Error logs
• Mail logs
• API logs
• Search logs
• DNS logs...
Splunk
Databases
https://www.etsy.com/listing/128620213/vintage-happiness-is-a-humongousl
Databases
• Relational (row) database
• Columnar (column) database
• MapReduce (clustered data processing)
Awesome Data Team
• 160 nodes
• 3840 cores
• 15 TB of RAM
• 960 TB storage
• Ad-hoc analysis of a large dataset
• Needs to be fast (or scalable)
• Might not do it more than once (for a data set)
Why: Analytics
SuperBIT
Case Studies
Full Site SSL: resource cost
Goal
Full-site SSL for all Etsy sellers
Opt In
    analytics_cascade do
      analytics_flow do
        analytics_source 'event_logs'
        tap_db_snapshot 'users_index'

        assembly 'event_logs' do
          group_by 'user_id', 'scheme' do
            count 'value'
          end
        end

        assembly 'users_index' do
          project 'user_id', 'is_seller'
        end

        assembly 'ssl_traffic' do
          project 'user_id', 'is_seller', 'scheme', 'value'
          group_by 'is_seller', 'scheme' do
            count 'value'
          end
        end

        analytics_sink 'ssl_traffic'
      end
    end
Incident Response: for web attacks
https://www.etsy.com/listing/152084181/boba-fett-the-good-the-bad-and-the
Finding Vulns
Bug Bounty Program
• Launched Sept 2012
• Reward: $500 - $2000
Needle in a haystack
• URL Patterns
• IP Addresses
Simple Patterns
    analytics_cascade do
      analytics_flow do
        analytics_source 'access_logs'

        assembly 'incident_response' do
          query_event 'timestamp', 'request_uri', 'useragent', 'ip'
          where '"/bad_url.php".equals(request_uri:string)'
          group_by 'url' do
            count 'value'
          end
        end

        analytics_sink 'incident_response'
      end
    end
When to Alert: setting thresholds
• Per time period, count password resets
• Sort the amounts
• Discard outliers
• Average remaining
• Compare with past known attacks
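The steps above amount to a trimmed mean compared against attack history. The sketch below is one possible reading, not the query Etsy ran: the 10% trim fraction and the choice of the midpoint between the baseline and the smallest known attack are assumptions made for illustration, and both function names are hypothetical.

```python
def baseline(counts, trim_fraction=0.1):
    """Sort the per-period counts, discard outliers at both ends, average the rest."""
    if not counts:
        raise ValueError("need at least one per-period count")
    ordered = sorted(counts)
    # Number of points to drop from each end; always keep at least one point.
    k = min(int(len(ordered) * trim_fraction), (len(ordered) - 1) // 2)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def suggest_threshold(counts, known_attack_counts, trim_fraction=0.1):
    """Place the alert threshold halfway between normal traffic and the
    smallest past attack (an assumed heuristic, not from the talk)."""
    normal = baseline(counts, trim_fraction)
    smallest_attack = min(known_attack_counts)
    return (normal + smallest_attack) / 2
```

For example, with hourly password-reset counts of `[4, 5, 5, 6, 5, 4, 6, 5, 5, 90]`, the `90` is discarded as an outlier and the baseline is `5.125`. If past attacks peaked at 60 and 120 resets per hour, the suggested threshold is `32.5625`.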
Big Data Answer
Collusion Fraud
The Price is Wrong?
Overpaid
Analysis
• Check for metadata
• Exact hash and fuzzy hashing
• Analysis of key properties (shadows, patterns, shading...)
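To illustrate the difference between exact and fuzzy hashing of images, here is a small sketch. The talk does not name the fuzzy hash that was used, so a simple "average hash" stands in for it here. The input is assumed to be an already downscaled grayscale pixel grid, and all function names are hypothetical.

```python
import hashlib


def exact_hash(data: bytes) -> str:
    """Exact match: any single changed byte yields a different digest."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels) -> int:
    """Fuzzy match: one bit per pixel, set when the pixel is brighter than the mean.

    `pixels` is a 2D list of grayscale values, e.g. an image shrunk to 8x8.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests a near-duplicate."""
    return bin(a ^ b).count("1")
```

A re-saved or slightly brightened copy of a listing photo changes the exact hash completely, but its average hash stays within a few bits of the original. That is why the two checks complement each other.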
Grow Stronger
• Detection is timely (hashing ~1ms)
• Each new data point helps for analysis
Phishing Attack: reactive to proactive
Not Etsy
Reactive
source="access_logs" client_ip=10.163.2.3 | transaction request_uri
Incident Response
Normal
Proactive
Scanners: low hanging fruit
Bad Deploy?
https://www.etsy.com/listing/162962424/robot-dress-up-costume
Block Only Bad Bots
• Allow legitimate users (including API requests)
• Allow search engines
• Allow our own scans
Asimov?
Bad Bots
False Positives
https://www.etsy.com/listing/159148839/robot-card-trust-no-one
Bad Bots
Disobey 404 Time Announce
Detection
• Nick Galbreath at DefCon 20: “LibInjection” for detecting SQLi
• Does it parse as SQL? Yes, then it’s SQL
• Do you have “.aspx” files? No, then why is someone requesting one?
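The ".aspx" question above can be turned into a very cheap check: flag any request for a file extension the site never serves. The sketch below shows the idea only. The extension list is a made-up example, and `requests_unserved_extension` is a hypothetical helper, not code from the talk.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list: extensions this (PHP-based) site never serves.
UNSERVED_EXTENSIONS = {".aspx", ".asp", ".jsp", ".cgi"}


def requests_unserved_extension(url: str) -> bool:
    """True when the URL path ends in an extension the site does not serve.

    Only the path is inspected, so a query string that merely mentions
    "login.aspx" does not trigger the check.
    """
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in UNSERVED_EXTENSIONS
```

A hit is not proof of an attack on its own, but no legitimate link on the site produces one, so it is a low-noise signal of scanner traffic.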
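    // Flag requests whose URL matches the XSS pattern (preg_match returns 1 on a match).
    if (preg_match(self::PATTERNXSS, $this->url) === 1) {
        // Log the attempt for later analysis.
        $msg = "attacktype=XSS url=" . $this->url;
        Logger::log_info($msg, 'SECURITY');
        // Count it so the attempt shows up on a graph.
        StatsD::increment('security.potential_xss');
        // Add the XSS weight to this client's rate limit; drop the request once it is exceeded.
        if (!$this->rate->checkIncrement(self::XSS_WEIGHT)) {
            $this->drop_request = true;
        }
    }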
Log and Limit
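The `checkIncrement(self::XSS_WEIGHT)` call suggests a weighted rate limit, where each suspicious event adds its weight to a per-client score and requests are dropped once the score exceeds a budget. The slides do not show how Etsy implemented it, so the sketch below is only one simple way to get that behavior, using a fixed time window. The class name, the parameters, and the per-client keying are all assumptions.

```python
import time


class WeightedRateLimit:
    """Fixed-window limiter: each event adds `weight` to the client's score."""

    def __init__(self, budget, window_seconds, clock=time.monotonic):
        self.budget = budget
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._window_start = {}
        self._score = {}

    def check_increment(self, client, weight):
        """Add `weight` for `client`; return False once the budget is exceeded."""
        now = self.clock()
        start = self._window_start.get(client)
        if start is None or now - start >= self.window_seconds:
            # First event from this client, or the old window expired: start over.
            self._window_start[client] = now
            self._score[client] = 0
        self._score[client] += weight
        return self._score[client] <= self.budget
```

Weighting lets a high-confidence signal such as an XSS pattern use up the budget faster than a low-confidence one such as a stray 404, so obvious attackers are cut off quickly while ordinary users rarely notice the limit.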
439 - Not Handmade
Check the Graphs
Conclusions
• Instrument your application, log everything
• Get familiar with data resources: people and tools
• Use your data to help drive security alerts, investigations, and actions
Thanks! http://codeascraft.com