TRANSCRIPT
Detecting Malicious Activity and Malware on a Large Network
Brandon Enright – Cisco Computer Security Response Team
About me:
• Hacker
• Problem seeker and solver
• Linux user
• Extreme twisty puzzle enthusiast
• Nmap evangelist
• Armchair physicist
• Mad scientist
• Crypto nerd
About Cisco:
400 Sites
In 100 Countries
About Cisco:
2010 numbers -- Doesn’t include or
About Cisco:
40,000 routers and switches on the network
About Cisco:
… wait... WTF? 40,000 routers on the network?
Yeah. It’s the Cisco way:
“Is there any chance a router or switch will kinda sorta almost maybe solve part
of my problem?”
The answer is yes. Install a router.
Big Networks are Hard
Big networks are hard:
• Every version of software under the sun
• BYOD (Bring Your Own Device)
• Every version of smart (and dumb) phone ever made
• Thousands of VPN users at all times
• The sun never sets on the network – no down time
• Network logs exceed the size manageable by single-system solutions (> 1Tb / day)
How do you know if you have a big network:
Can you memorize all of your public IP prefixes?
Cisco’s Primary AS announces 74 IPv4 prefixes (1.17M IPs)
If you’re going to do security right you need a LOT of data:
Network layer:
• NetFlow
• Transparent web proxy logs
• IDS alerts
Host level:
• HIPS logs
• AV logs
• IR agent
IT Infrastructure:
• DHCP logs
• DNS logs
• RPZ / Sinkhole logs
• VPN logs
• AAA logs
• Syslog
… and you’re going to need a place to store and search that data
Data
• If you don’t have easy access to almost all of your data in one place you won’t use your data to its fullest
And “Big Data” will solve all of your problems…
And SIEM vendors correlate!
Correlation
WTF is correlation?
If you’re dumb you think:
If you’re smart you think:
If you’re a marketer you think:
This is what correlation actually is:
Web Proxy               HIPS
timestamp (date)        timestamp (date)
source IP               source IP
source port             source port
destination IP          destination IP
destination port(s)     destination port(s)
URL                     hostname
IP reputation           nbtname
request type            sourcetype
referer                 eventsource
User Agent              alerttype
Correlation is just a union, join, intersection, or other basic relation between common fields in different data sets
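To make that concrete, here is a minimal Python sketch of a join between the two data sets on a shared field. The records, field names, and the `join_on` helper are hypothetical and heavily simplified; real proxy and HIPS logs carry many more fields.

```python
from collections import defaultdict

# Hypothetical, simplified records for illustration only.
proxy_events = [
    {"source_ip": "10.1.1.5", "url": "http://example.test/a.php"},
    {"source_ip": "10.1.1.9", "url": "http://example.test/b"},
]
hips_events = [
    {"source_ip": "10.1.1.5", "alerttype": "suspicious process"},
]

def join_on(left, right, key):
    """Inner join two lists of dicts on a field they have in common."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({"left": row, "right": match})
    return joined

correlated = join_on(proxy_events, hips_events, "source_ip")
for pair in correlated:
    print(pair["left"]["source_ip"], pair["left"]["url"], pair["right"]["alerttype"])
```

In practice you would usually also constrain the join by a time window, since IP addresses get reassigned.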
Your will beat anyday.
Fortunately not all hope is lost:
• SIEM “solutions” are almost entirely marketing hype but they are a reasonable way to get at your data
• “Big Data” doesn’t mean anything concrete but big data systems do help you get at your data quickly and easily
This presentation is about going beyond the marketing and canned reports to find
malicious activity on your network.
Gold mining your logs
Investigative versus High Fidelity reports:
• High Fidelity reports are ones that have no realistic chance of producing a false positive and can be fully automated by a computer. No human being needs to “spot-check” the results.
• Investigative reports are pretty much everything else. The goal is always for maximum fidelity but it’s generally not feasible to build a report with perfect results.
The High Fidelity intuition trap:
Be careful labeling a report “High Fidelity”. Bayes’ Theorem is an unforgiving mistress. Presumably you have tons of logs, and at that volume the seemingly unlikely happens frequently.
Wikipedia on Bayes’ Theorem:
You have a drug test that produces 99% true positive results for drug users and 99% true negative results for non-drug users.
Suppose that 0.5% of people are users of the drug.
If a randomly selected individual tests positive, what is the probability he or she is a user?
33.2% (the other 66.8% of positive results are false positives)
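The arithmetic behind that number is short enough to check yourself. This is a small sketch using the figures from the example above; the function and parameter names are mine, not from the slide.

```python
def posterior(prior, true_positive_rate, true_negative_rate):
    """P(condition | positive test) by Bayes' Theorem."""
    false_positive_rate = 1.0 - true_negative_rate
    p_positive = (true_positive_rate * prior
                  + false_positive_rate * (1.0 - prior))
    return true_positive_rate * prior / p_positive

# 0.5% of people are users; the test is 99% accurate in both directions.
p_user = posterior(prior=0.005, true_positive_rate=0.99, true_negative_rate=0.99)
print(f"P(user | positive) = {p_user:.1%}")  # prints 33.2%
```

Swap "drug user" for "infected host" and "test" for "report" and the same math applies to your alerts: the rarer the bad thing, the more a small error rate dominates the results.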
Talk scope:
This talk is not about “100% effective” ways of finding malicious activity.
Instead it’s about giving you the investigative ideas that should get you started.
HTTP is the Internet
Ask any user and they’ll tell you…
HTTP as a data source:
• To most users, if HTTP is broken then the Internet is useless
• Organizations pretty much universally allow HTTP out
• Even hosts with an RFC1918 address often use HTTP proxies
• The browser and all of its plugins is one of the biggest attack surfaces used by everyone
• HTTP is so ubiquitous it’s practically a transport protocol now
All of these factors (and others) have come together to make the web the most common malware delivery mechanism and HTTP the most common command and control mechanism.
And that makes your HTTP logs one of your most valuable data sources for finding malicious activity!
Web Browsers vs Everything Else:
There are certain things web browsers always do:
• Set a User-Agent: header
• Set a Referer: header when appropriate
• Use HTTP/1.1
• Lots of other idiosyncrasies like “Accept:” and “Connection:”
Start by querying for things that don’t match web browser behavior.
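As a rough illustration of that first pass, the sketch below flags requests that miss the basic browser behaviors listed above. The request structure and the `looks_like_browser` helper are assumptions for the example; your proxy's log fields will differ, and plenty of legitimate software (updaters, API clients) will also be flagged.

```python
def looks_like_browser(request):
    """request: dict with 'version' (e.g. 'HTTP/1.1') and 'headers' (dict).

    Checks only the coarse behaviors named on the slide.
    """
    headers = {name.lower(): value for name, value in request["headers"].items()}
    if not headers.get("user-agent"):
        return False
    if request["version"] != "HTTP/1.1":
        return False
    if "accept" not in headers or "connection" not in headers:
        return False
    return True

# Hypothetical parsed requests.
requests = [
    {"version": "HTTP/1.1",
     "headers": {"Host": "example.test", "User-Agent": "Mozilla/5.0",
                 "Accept": "*/*", "Connection": "keep-alive"}},
    {"version": "HTTP/1.0",
     "headers": {"Host": "example.test"}},
]

flagged = [r for r in requests if not looks_like_browser(r)]
print(len(flagged), "request(s) did not behave like a browser")
```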
Web Browsers vs Everything Else (continued):
This activity did not come from browsers:
pwned (click fraud)
Web Browsers are quirky but consistent:
Within a browser version (and often a whole browser family) the quirks stay the same:
• Header order is consistent
• Parameter lists for headers like Accept-Encoding: are generally static
• Header capitalization is consistent
Fake Chrome request (header order is wrong):
GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like […]
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Connection: keep-alive
Quirks are very hard for malware to emulate!
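A header-order check can be sketched in a few lines. The `CHROME_ORDER` profile below is an assumption for illustration, not a measured fingerprint; build real profiles from known-good traffic on your own network. The observed order is the one from the fake Chrome request above.

```python
# Assumed header order for a Chrome build of this era (illustrative only).
CHROME_ORDER = ["host", "connection", "accept", "user-agent",
                "accept-encoding", "accept-language"]

def order_matches(observed, expected):
    """True if the observed headers that appear in the profile
    occur in the same relative order as in the profile."""
    rank = {name: i for i, name in enumerate(expected)}
    positions = [rank[h.lower()] for h in observed if h.lower() in rank]
    return positions == sorted(positions)

fake_request = ["Host", "User-Agent", "Accept-Encoding",
                "Accept-Language", "Accept", "Connection"]

print(order_matches(fake_request, CHROME_ORDER))  # False
```

Headers that are not in the profile are ignored rather than treated as violations, so optional headers such as Cookie do not cause false alarms.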
If the browser tells you something, check its story out:
Nice try but that isn’t anywhere close to IE’s User-Agent string.
Sometimes it’s worthwhile to dig even deeper with fact-checking:
Legitimate IE User-Agent strings:
• Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)
• Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)
• Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
• Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C)
Is there any sort of consistency between Mozilla version, IE version, Windows version, and Trident version?
Fact-checking User-Agent strings (continued):
First extract the (sub)fields for processing:
• | rex field=cs_useragent "^Mozilla/(?<mozver>[\d.]+)"
• | rex field=cs_useragent "MSIE (?<iefullver>(?<iever>\d+(\.\d)?)[\d.]*)"
• | rex field=cs_useragent "Trident/(?<triver>[\d.]+)"
• | rex field=cs_useragent "Windows NT (?<ntfullver>(?<ntver>\d+(\.\d)?)[\d.]*)"
In machine learning parlance this is feature extraction.
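If you are not working in Splunk, the same extraction is easy to reproduce. This Python sketch is a simplified stand-in for the rex commands above: it pulls out only the short version fields (not `iefullver` or `ntfullver`), and the `extract_features` helper is my own naming.

```python
import re

# One pattern per feature, mirroring the rex field names.
PATTERNS = {
    "mozver": re.compile(r"^Mozilla/([\d.]+)"),
    "iever": re.compile(r"MSIE (\d+(?:\.\d+)?)"),
    "triver": re.compile(r"Trident/([\d.]+)"),
    "ntver": re.compile(r"Windows NT (\d+(?:\.\d+)?)"),
}

def extract_features(user_agent):
    """Return a dict of version strings; missing tokens map to None."""
    features = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(user_agent)
        features[name] = match.group(1) if match else None
    return features

ua = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
features = extract_features(ua)
print(features)
```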
Fact-checking User-Agent strings (continued):
Building a contingency table between the Mozilla and IE version:
index=wsa msie cs_useragent="Mozilla/*msie*" | dedup host | rex field=cs_useragent "^Mozilla/(?<mozver>[\d.]+)" | rex field=cs_useragent "MSIE (?<iefullver>(?<iever>\d+(\.\d+)?)[\d.]*)" | rex field=cs_useragent "Trident/(?<triver>[\d.]+)" | contingency mozver iever
Machine learning models automate this sort of analysis.
mozver \ iever      7     9    10     6    8  5.5   5   4  11   1
4               16912   520    24  1052  902  504  30   0   1   0
5                   0  8510  2842     0    7    0   0   0   0   1
3                   0     0     0     0    0    0   0  27   0   0
TOTAL           16912  9030  2866  1052  909  504  30  27   1   1
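A contingency table is nothing more than a count of how often each pair of values occurs together. Here is a minimal sketch of the idea; the sample hosts are made up and the `contingency` helper is hypothetical, not a reimplementation of the Splunk command.

```python
from collections import Counter

def contingency(rows, row_field, col_field):
    """Count co-occurrences of (row_field, col_field) values."""
    counts = Counter()
    for row in rows:
        counts[(row[row_field], row[col_field])] += 1
    return counts

# Hypothetical per-host features, one dict per deduplicated host.
hosts = [
    {"mozver": "4.0", "iever": "7.0"},
    {"mozver": "4.0", "iever": "7.0"},
    {"mozver": "5.0", "iever": "9.0"},
    {"mozver": "3.0", "iever": "4.0"},
]

table = contingency(hosts, "mozver", "iever")
for (moz, ie), n in sorted(table.items()):
    print(f"Mozilla/{moz} with MSIE {ie}: {n}")
```

The cells with large counts show you what normal looks like; the sparse cells are where to start digging.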
Fact-checking User-Agent strings (continued):
Building a contingency table between the IE and Trident version:
index=wsa msie cs_useragent="Mozilla/*msie*" | dedup host | rex field=cs_useragent "^Mozilla/(?<mozver>[\d.]+)" | rex field=cs_useragent "MSIE (?<iefullver>(?<iever>\d+(\.\d+)?)[\d.]*)" | rex field=cs_useragent "Trident/(?<triver>[\d.]+)" | contingency iever triver
iever \ triver      5     6     4   7  3.1
7                6075  3891   574  55    2
9                4784    22     0   1    0
10                  1  1727     0   0    0
6                   0     0     3   0    0
8                   5     1   453   0    0
5.5                 0     0     0   0    0
5                   0     0     0   0    0
4                   0     0     0   0    0
11                  0     0     0   0    0
TOTAL           10865  5641  1034  56    2
Fact-checking User-Agent strings (continued):
Build other contingency tables and then put the logic together:
index=wsa msie cs_useragent="Mozilla/*msie*" (NOT cs_useragent="*iemobile*")
| dedup host
| rex field=cs_useragent "^Mozilla/(?<mozver>[\d.]+)"
| rex field=cs_useragent "MSIE (?<iefullver>(?<iever>\d+(\.\d)?)[\d.]*)"
| rex field=cs_useragent "Trident/(?<triver>[\d.]+)"
| rex field=cs_useragent "Windows NT (?<ntfullver>(?<ntver>\d+(\.\d)?)[\d.]*)"
| search (NOT cs_useragent="Mozilla/*(compatible;*")
  OR ((mozver < 4) OR (mozver > 5))
  OR ((iever < 6) OR (iever > 10))
  OR ((ntver < 5) OR (ntver > 6.3))
  OR ((mozver="4.0" AND (iever > 8)) OR (mozver="5.0" AND (iever < 9)))
  OR ((triver < 4) OR (triver > 7))
  OR ((iever < 7 AND (triver="*")) OR (iever="8.0" AND (NOT triver="4.0")) OR (iever="9.0" AND (NOT triver="5.0")) OR (iever="10.0" AND (NOT triver="6.0")))
  OR (iefullver="*.*.*" OR ntfullver="*.*.*")
  OR (NOT ntver="*")
  OR ((iever="6.0" AND ntver > 5.1) OR (iever="7.0" AND ntver < 5.1) OR (iever="8.0" AND ntver < 5.1) OR (iever="9.0" AND ntver < 6) OR (iever="10.0" AND ntver < 6))
Logic similar to this is built automatically with machine learning.
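For readers who find the search clause hard to follow, here is a Python sketch of a subset of the same rules (the Mozilla, IE, and Trident checks only). The function name, argument types, and reason strings are my own; the thresholds come from the query above.

```python
# IE version -> Trident version the query expects to see with it.
EXPECTED_TRIDENT = {8.0: "4.0", 9.0: "5.0", 10.0: "6.0"}

def inconsistencies(mozver, iever, triver=None):
    """Return the reasons a (Mozilla, MSIE, Trident) combination looks forged.

    mozver and iever are floats; triver is a string, or None if absent.
    """
    reasons = []
    if not 4 <= mozver <= 5:
        reasons.append("Mozilla version out of range")
    if not 6 <= iever <= 10:
        reasons.append("IE version out of range")
    if (mozver == 4.0 and iever > 8) or (mozver == 5.0 and iever < 9):
        reasons.append("Mozilla/IE version mismatch")
    if iever < 7 and triver is not None:
        reasons.append("Trident token on IE older than 7")
    expected = EXPECTED_TRIDENT.get(iever)
    if expected is not None and triver != expected:
        reasons.append("IE/Trident version mismatch")
    return reasons

print(inconsistencies(5.0, 9.0, "5.0"))   # consistent, so no reasons
print(inconsistencies(4.0, 10.0, "4.0"))  # two mismatches
```

Returning reasons instead of a bare boolean makes the report easier to triage, because the analyst sees which rule fired.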
Fact-checking User-Agent strings (continued):
When you see it, you know it’s bad:
So ask yourself, what would “bad” look like?
index=wsa dosexec java (NOT cs_url="*.exe") | dedup host
http://pacsd.melinert.org/r0vTmK-0OfJB07ey/20hdj/80XDJH0/PJd-A0xNrk/15DH1/0zz-gb1/2TWd/0LNuV/0iaBa_12TNk0-UlY_n08rz-T0Uay/90xxmM0B-r880PHRM_0m3TB0_9ZzP0fO_JA0zwxW-0Hh-e50BKiA0mcHu/0Y_jmM0iN-jt02XM_00oD4f0H_mOM0QZTp_17BW30YfWI-0IWU9_0p-FkN0_kqeh_0mNey0MN-go0/QoTO0p_rWJ0/xhoB_0q4/Vy/0XouZ-02op-F0l8b/S0g2_NE15_dkL0QAB_50VvS_d15L0_20nD5k/14Jra-0w1/Rs_0yn7/H0J-Lts07-GmE0s7M_d0_zkD00_qEd/Y0u5ER/ZTVyJa0mSV.exe?IeLtBYmZ4cZC=73b6a&h=11
http://www2.nq8x6r92.4pu.com/?90xcqmmo=XZPlx67S5dSU5qHcc6NqZHBnntfl2arRn6RuqGaja5Rpkp7X5KqeoqalabFnq2xoX5zroKKdnaeU0praqK%2BLhoRW6MzVqqCV4ZU%3D&h=16
If it looks bad… turn it into a specific query:
index=wsa dosexec java (NOT cs_url="*.exe")
| regex cs_url="^http://[^/]+(/[a-zA-Z0-9_-]+){8,32}[^\?]+\?[a-zA-Z0-9]+=[a-zA-Z0-9]{4,8}&h=\d+$"
[Diagram: “Connect the dots” between patterns, domains, and IPs]
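It is worth testing a pattern like this against the samples that inspired it before you run it over a day of logs. The sketch below applies the regex from the query above to the two example URLs shown earlier; only the first, with its long chain of path segments, should match.

```python
import re

# Same pattern as the Splunk regex command above.
EXPLOIT_KIT_URL = re.compile(
    r"^http://[^/]+(/[a-zA-Z0-9_-]+){8,32}[^\?]+"
    r"\?[a-zA-Z0-9]+=[a-zA-Z0-9]{4,8}&h=\d+$"
)

url_long_path = (
    "http://pacsd.melinert.org/r0vTmK-0OfJB07ey/20hdj/80XDJH0/PJd-A0xNrk/"
    "15DH1/0zz-gb1/2TWd/0LNuV/0iaBa_12TNk0-UlY_n08rz-T0Uay/"
    "90xxmM0B-r880PHRM_0m3TB0_9ZzP0fO_JA0zwxW-0Hh-e50BKiA0mcHu/"
    "0Y_jmM0iN-jt02XM_00oD4f0H_mOM0QZTp_17BW30YfWI-0IWU9_0p-FkN0_kqeh_0mNey0MN-go0/"
    "QoTO0p_rWJ0/xhoB_0q4/Vy/0XouZ-02op-F0l8b/"
    "S0g2_NE15_dkL0QAB_50VvS_d15L0_20nD5k/14Jra-0w1/Rs_0yn7/"
    "H0J-Lts07-GmE0s7M_d0_zkD00_qEd/Y0u5ER/ZTVyJa0mSV.exe"
    "?IeLtBYmZ4cZC=73b6a&h=11"
)
url_query_only = (
    "http://www2.nq8x6r92.4pu.com/?90xcqmmo=XZPlx67S5dSU5qHcc6NqZHBnntfl2arRn6"
    "RuqGaja5Rpkp7X5KqeoqalabFnq2xoX5zroKKdnaeU0praqK%2BLhoRW6MzVqqCV4ZU%3D&h=16"
)

matches_long_path = bool(EXPLOIT_KIT_URL.match(url_long_path))
matches_query_only = bool(EXPLOIT_KIT_URL.match(url_query_only))
print(matches_long_path, matches_query_only)
```

The second URL needs its own pattern; one query per family of "bad-looking" URLs keeps each pattern tight.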
So what else looks bad?
How about a POST to an IP address running a PHP script that takes a parameter with no Referer?
index=wsa post php cs_url="*.php?*" "ip address" (NOT cs_referer="*") cs_method="POST"
| regex cs_url="^http://(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/"
| dedup s_ip
| dedup host
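The same test is simple to express outside Splunk. This sketch uses Python's standard library to check for an IPv4 literal instead of a hand-written octet regex; the `is_suspicious_post` helper and its arguments are hypothetical, and the criteria are the ones named in the question above.

```python
import ipaddress
from urllib.parse import urlsplit

def is_suspicious_post(method, url, referer):
    """POST, no Referer, host is an IPv4 literal, PHP script with parameters."""
    if method != "POST" or referer:
        return False
    parts = urlsplit(url)
    try:
        ipaddress.IPv4Address(parts.hostname or "")
    except ValueError:
        # Host is a name (or missing), not a dotted-quad address.
        return False
    return parts.path.endswith(".php") and bool(parts.query)

print(is_suspicious_post("POST", "http://184.72.43.99/index.php?broswer=7020449211b5ce1d1", None))
print(is_suspicious_post("POST", "http://www.example.test/index.php?id=1", None))
```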
Here is what turns up:
http://184.72.43.99/index.php?broswer=7020449211b5ce1d1
http://64.62.146.102/showthread.php?t=256534570
Build Patterns
[Diagram: “Connect the dots” between patterns, domains, and IPs]
Humans are not precision machines
Do X every Y seconds:
Human: Uh okay…?
Computer: No problem.
This is not a human
Time deltas and statistics are your friend:
Find the time gaps, do statistics, profit.
index=wsa POST (NOT (cs_referer="*")) (NOT "TCP_MEM_HIT") (x_wbrs_score < 0.0)
| rex field=cs_url "http:\/\/(?<domain>[^\/]+)"
| strcat host "_to_" domain sd
| streamstats current=f last(_time) as next_time by host
| eval gap = next_time - _time
| stats count avg(gap) as avgg var(gap) as varg values(domain) as domain by sd
| eval varavg = (varg / avgg)
| search (count >= 10) (avgg > 10) (varavg < 0.05)
| table domain count avgg varg varavg
| sort varavg
To be honest, Splunk is not the right tool for the job here.
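A general-purpose language makes the gap statistics more explicit. Below is a minimal sketch for one host-to-domain pair, assuming you already have its request timestamps as epoch seconds. The thresholds mirror the query above (at least 10 events, average gap over 10 seconds, variance divided by mean under 0.05); the function name and sample data are made up.

```python
from statistics import mean, variance

def is_periodic(timestamps, min_count=10, min_avg_gap=10, max_dispersion=0.05):
    """True if the gaps between requests are suspiciously regular."""
    # variance() needs at least two gaps, hence at least three timestamps.
    if len(timestamps) < max(min_count, 3):
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    avg_gap = mean(gaps)
    if avg_gap <= min_avg_gap:
        return False
    return variance(gaps) / avg_gap < max_dispersion

# Hypothetical data: a beacon every 60 seconds versus bursty human browsing.
beacon = [1000 + 60 * i for i in range(20)]
human = [0, 5, 70, 75, 300, 310, 900, 905, 2000, 2003, 4000]

print(is_periodic(beacon), is_periodic(human))
```

Real beacons often add jitter, so treat the dispersion threshold as something to tune against your own traffic.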
The top periodic activities table:
Periodic activity by itself only says non-human, not malicious. Must be coupled with additional analysis.
What does machine-generated activity look like?
[Plot axes: Second of Minute, Minute of Hour]
Check out “Detecting and Analyzing Automated Activity on Twitter” by Chao Michael Zhang and Vern Paxson
DNS is the Lifeblood of Everything
You should capture DNS queries:
• Humans use names
• Domain names are very inexpensive
• Provides a layer of indirection which increases resiliency
• Makes simple blocking a bit harder
• Allows things like Fast Flux
If you don’t capture answers you can use DNSDB:
https://www.dnsdb.info/
Set operations can give you the context you need:
DNS is a good starting-point for detection but often is just the tip of the iceberg of data contained in your other logs.
[Venn diagram of three sets: Bad Machine 1, Bad Machine 2, and Known-Good Machine, with a region labeled “Probably Bad Stuff”]
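One way to read the diagram, sketched with Python sets: take the domains that two known-bad machines both looked up, then subtract anything a known-good machine also looked up. The domain names are placeholders, and this particular combination of intersection and difference is my interpretation of the picture.

```python
# Hypothetical sets of domains queried by each machine.
bad_machine_1 = {"evil.example", "cdn.example", "update.example"}
bad_machine_2 = {"evil.example", "cdn.example", "mail.example"}
known_good_machine = {"cdn.example", "mail.example", "search.example"}

# Common to both bad machines, never seen from the known-good one.
probably_bad = (bad_machine_1 & bad_machine_2) - known_good_machine
print(probably_bad)
```

What is left is a short list worth pivoting on in your proxy, NetFlow, and host logs.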
Follow the DNS graph:
[Graph: a bad domain resolves to bad IPs, which in turn lead to more bad domains]
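Following the graph is a breadth-first walk over (domain, IP) resolution pairs, whether they come from your own DNS logs or a passive DNS source. The sketch below is a minimal version under that assumption; the `expand` helper, the hop limit, and the sample data (documentation-range addresses and `.example` names) are all hypothetical.

```python
from collections import defaultdict, deque

def expand(seed, resolutions, max_hops=2):
    """Walk outward from a seed domain or IP.

    resolutions: iterable of (domain, ip) pairs.
    Returns {node: hops_from_seed} for everything reached.
    """
    neighbors = defaultdict(set)
    for domain, ip in resolutions:
        neighbors[domain].add(ip)
        neighbors[ip].add(domain)
    seen = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for other in neighbors[node]:
            if other not in seen:
                seen[other] = seen[node] + 1
                queue.append(other)
    return seen

resolutions = [
    ("bad1.example", "192.0.2.10"),
    ("bad2.example", "192.0.2.10"),
    ("bad2.example", "192.0.2.20"),
    ("bad3.example", "192.0.2.20"),
    ("unrelated.example", "198.51.100.5"),
]

reached = expand("bad1.example", resolutions, max_hops=2)
print(sorted(reached))
```

Keep the hop limit small and review the results: shared hosting and CDNs mean a neighbor in the graph is a lead, not a verdict.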
The Moral:
If you have a of data
it,you should
you will find
Questions?