securerank ping-opendns
![Page 1: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/1.jpg)
Big Data for Security
Ping @opendns.com
![Page 2: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/2.jpg)
Umbrella Security Lab @OpenDNS
• 100+ sensors across 200+ countries
• 200 million unique registered domain names
• 40 million active users
• 50 billion daily DNS requests
![Page 3: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/3.jpg)
![Page 4: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/4.jpg)
The Platform
HDFS, HBase
Kafka, Storm
native MR
Pig, Hive
Python
R
Production
Backup
Analytics
![Page 5: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/5.jpg)
![Page 6: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/6.jpg)
google.com
labs.kaspersky.com
gpioegjrhsf.ws
pbqxdwwv.ws
usirk.ws
dncdh.nl
hzkfooak.cn
jflyyruea.com
15.83.5.1
128.13.18.67
154.1.32.15
62.8.20.54
Transaction View of DNS Lookups
![Page 7: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/7.jpg)
![Page 8: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/8.jpg)
DNS vs. Retail
– Amazon’s Collaborative Filtering
– Apriori algorithm (frequent item set mining)
![Page 9: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/9.jpg)
Modeling Methodologies
Data abstraction/representation (link graph, social graph …)
Behavior abstraction (random walk)
Reasoning tech (generative, empirical, iterative, recursive …)
![Page 10: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/10.jpg)
[Figure: client IP and domain nodes, two of them marked "?"]
![Page 11: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/11.jpg)
PageRank
• The more a page is linked by good pages, the higher it is ranked
• One type of node
• One node can have both inlinks and outlinks
• Most nodes link to a limited number of other nodes
• Pages are not classified
DNS transactions
• The less a domain is visited by good clients, the higher the chance it is bad
• Two types of node
• A node is either visiting or being visited, but never both
• There are super nodes that link to millions of other nodes
• Domains are classified as benign, malicious, or unknown
![Page 12: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/12.jpg)
PageRank
• Damping factor (the user gets bored)
• Random sinks and cycles
• PageRank values are numbers between 0 and 1 that sum to one
• Linkage matrix N×N (N being the total number of pages)
DNS transactions
• Domains visited by more good visitors are ranked high (inlink): assign a “positive” initial value
• Visitors visiting more good domains are ranked high (outlink): assign a “positive” initial value
• Linkage matrix N×M (N being the total number of domains, M being the total number of IPs)
• Potentially, we can consider query count as linkage weight
![Page 13: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/13.jpg)
Recursive definition

$$r(dn)_{t+1} = \sum_{ip \,\in\, \text{visitors of } dn} \frac{r(ip)_t}{L(ip)}$$

The sum runs over all ips visiting domain dn. r(ip)_t is the rank of ip at time t, and L(ip) is the total number of domains ip connects to (in a certain time window).

$$r(ip)_{t+1} = \sum_{dn \,\in\, \text{domains visited by } ip} \frac{r(dn)_t}{L(dn)}$$

The sum runs over all domains dn visited by ip. r(dn)_t is the rank of dn at time t, and L(dn) is the total number of ips visiting domain dn (not variant by time).

The denominator gives the marginal (the sum of the counts of the conditioning variable co-occurring with anything else).
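A small worked example may make the two updates concrete. The graph and starting ranks are hypothetical, chosen only for illustration: ip1 visits d1 and d2, and ip2 visits only d1, so L(ip1) = 2, L(ip2) = 1, L(d1) = 2, L(d2) = 1. Assume r(ip1)_t = 1, r(ip2)_t = -1, r(d1)_t = 1, and r(d2)_t = 0.

```latex
\begin{aligned}
r(d_1)_{t+1}  &= \frac{r(ip_1)_t}{L(ip_1)} + \frac{r(ip_2)_t}{L(ip_2)} = \frac{1}{2} + \frac{-1}{1} = -0.5 \\
r(d_2)_{t+1}  &= \frac{r(ip_1)_t}{L(ip_1)} = \frac{1}{2} = 0.5 \\
r(ip_1)_{t+1} &= \frac{r(d_1)_t}{L(d_1)} + \frac{r(d_2)_t}{L(d_2)} = \frac{1}{2} + \frac{0}{1} = 0.5 \\
r(ip_2)_{t+1} &= \frac{r(d_1)_t}{L(d_1)} = \frac{1}{2} = 0.5
\end{aligned}
```

In this toy case d1 drops below zero because it is visited by the negatively ranked ip2, while ip2 is pulled up by visiting a well-ranked domain.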
![Page 14: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/14.jpg)
Tasks
• Recursive definition
• Build linkage matrix
• Initialization
• Iterating
• Test for convergence
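The tasks above can be sketched end to end in a few lines of single-machine Python. This is only an illustration under stated assumptions, not the Hadoop implementation: the function names, the seed values, the 0.0 default for unseeded nodes, the tolerance, and the iteration cap are all hypothetical. The cap matters because both updates read the previous step's ranks, and on a two-sided graph that can oscillate rather than settle for some initial values.

```python
from collections import defaultdict


def build_links(query_log):
    """Build both sparse link lists from (client_ip, domain) pairs."""
    ip_links, dn_links = defaultdict(set), defaultdict(set)
    for ip, dn in query_log:
        ip_links[ip].add(dn)
        dn_links[dn].add(ip)
    return ip_links, dn_links


def initialize(links, seeds):
    """Seeded nodes get their seed value; all others start at 0.0 (assumed)."""
    return {node: seeds.get(node, 0.0) for node in links}


def update(src_rank, src_links, dst_links):
    """r(dst) = sum over linked src nodes of r(src) / L(src)."""
    return {dst: sum(src_rank[s] / len(src_links[s]) for s in srcs)
            for dst, srcs in dst_links.items()}


def secure_rank(query_log, seeds, tol=1e-6, max_iters=50):
    ip_links, dn_links = build_links(query_log)
    ip_rank = initialize(ip_links, seeds)
    dn_rank = initialize(dn_links, seeds)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        # Both updates read the ranks from the previous step (time t).
        new_dn = update(ip_rank, ip_links, dn_links)
        new_ip = update(dn_rank, dn_links, ip_links)
        delta = max(
            [abs(new_dn[d] - dn_rank[d]) for d in new_dn]
            + [abs(new_ip[i] - ip_rank[i]) for i in new_ip],
            default=0.0,
        )
        ip_rank, dn_rank = new_ip, new_dn
        if delta < tol:  # convergence test
            break
    return ip_rank, dn_rank, iterations


# Hypothetical query log and seeds, for illustration only.
log = [("10.0.0.1", "a.example"), ("10.0.0.1", "b.example"),
       ("10.0.0.2", "a.example")]
seeds = {"10.0.0.1": 1.0, "10.0.0.2": 1.0, "a.example": 1.0, "b.example": 1.0}
ip_rank, dn_rank, iterations = secure_rank(log, seeds)
```

With these uniform seeds the ranks settle well before the cap. Seeding only one side of the graph is a case where the change between steps may never fall below the tolerance, so the loop would stop at `max_iters` instead.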
![Page 15: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/15.jpg)
Link analysis – build sparse linkage matrix (row-wise)
input
  query log (each entry: client ip to hostnames)
output
  dn -> ip ip ip ip
  ip -> dn dn dn dn
//STRIPE DESIGN
//map job: parse query entry, filter bad hostnames, convert hostname to domain
  emit [key(domain), value(ip)]
  emit [key(ip), value(domain)]
//reduce job:
  emit [key(domain), value(ip ip ip)]
  emit [key(ip), value(domain domain domain)]
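The map and reduce steps for building the link lists could look roughly like the following in plain Python, with the shuffle phase emulated by a dictionary. The blocklist contents, the function names, and the "last two labels" rule for turning a hostname into a domain are assumptions made for this sketch. A real deployment would need a public-suffix list to handle names such as `example.co.uk`. Duplicate queries are collapsed here, which matches an unweighted linkage matrix.

```python
from collections import defaultdict

# Hypothetical blocklist of hostnames to discard before linking.
BAD_HOSTNAMES = {"localhost"}


def map_query(entry):
    """Map step: emit (domain, ip) and (ip, domain) for one query log entry."""
    ip, hostname = entry
    hostname = hostname.lower().rstrip(".")
    if hostname in BAD_HOSTNAMES or "." not in hostname:
        return  # filter bad hostnames
    # Naive hostname-to-domain conversion (assumption): keep the last two labels.
    domain = ".".join(hostname.split(".")[-2:])
    yield domain, ip
    yield ip, domain


def reduce_links(pairs):
    """Reduce step: group values by key into one deduplicated link list."""
    grouped = defaultdict(set)
    for key, value in pairs:
        grouped[key].add(value)
    return {key: sorted(values) for key, values in grouped.items()}


def build_link_lists(query_log):
    pairs = (pair for entry in query_log for pair in map_query(entry))
    return reduce_links(pairs)


# Hypothetical query log entries: (client ip, hostname).
query_log = [
    ("10.0.0.1", "www.example.com"),
    ("10.0.0.2", "mail.example.com"),
    ("10.0.0.1", "example.org"),
    ("10.0.0.1", "localhost"),
]
links = build_link_lists(query_log)
```

Emitting both directions from the same mapper means a single pass over the query log yields the dn -> ip rows and the ip -> dn rows together.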
![Page 16: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/16.jpg)
Iterating – MapReduce iteration #n
map
– input
  key(domain), value(pagerank ip ip ip)
  or key(ip), value(pagerank dn dn dn dn)
– output
  key(ip/domain), value(x = pagerank / linklist.size())
reduce
– input
  key(domain/ip), values(x) //x as defined above
  key(domain/ip), value(x ip ip ... ip)
– output
  key(domain/ip), value(Σx ip ip ip)
![Page 17: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/17.jpg)
Hadoop Implementation
• MapReduce job #1 – Building link lists
• Iterate MapReduce job #2 – Security ranking
• MapReduce job #3 – Sorting
![Page 18: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/18.jpg)
Hadoop Job 1 – linkage creation, domain (or ip) mappings
Input
Querylog
Mapper output
key     value
Domain  IP
IP      Domain
Reducer output
key     value (rank, previous rank, links)
IP      1.0 1.0 d d d d
Domain  1.0 1.0 ip ip ip ip
![Page 19: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/19.jpg)
Hadoop Job 2 – Security Ranking (SR)
Input
key   value
IP1   2.3, 1.0, d1, d2, d3
IP2   -9.5, 1.0, d1, d3
d1    24, 1.0, IP1, IP2
Mapper output
key   value
d1    “rank” 2.3/(num_of_links=3)
d1    “rank” -9.5/(num_of_links=2)
d2    “rank” 2.3/(num_of_links=3)
d3    “rank” 2.3/(num_of_links=3)
d3    “rank” -9.5/(num_of_links=2)
IP1   “links” 2.3, 1.0, d1, d2, d3
IP2   “links” -9.5, 1.0, d1, d3
Reducer output
key   value
d1    2.3/3 + -9.5/2, 24, IP1, IP2
Updating security rank
SR = Σ SRi/K, summed over each linked entity i, K being the number of outlinks of entity i
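One ranking iteration can be emulated in plain Python to check the arithmetic on this slide. The rows for IP1, IP2, and d1 use the slide's values. The rows for d2 and d3 are hypothetical additions so that every key has a link record. The function names and the "rank"/"links" tagging scheme are assumptions for this sketch, not the actual Hadoop job.

```python
from collections import defaultdict


def map_rank(key, value):
    """Map step: send an equal share of this node's rank to each linked node."""
    rank, _previous_rank, links = value
    for target in links:
        yield target, ("rank", rank / len(links))
    # Pass the node's own record through so the reducer keeps its link list.
    yield key, ("links", value)


def reduce_rank(key, tagged_values):
    """Reduce step: SR = sum of incoming shares; keep old rank and links."""
    total, record = 0.0, None
    for tag, payload in tagged_values:
        if tag == "rank":
            total += payload
        else:
            record = payload
    if record is None:
        raise ValueError(f"no link record for {key}")
    old_rank, _previous_rank, links = record
    return key, (total, old_rank, links)


def run_iteration(table):
    shuffled = defaultdict(list)  # emulates the shuffle phase
    for key, value in table.items():
        for out_key, tagged in map_rank(key, value):
            shuffled[out_key].append(tagged)
    return dict(reduce_rank(k, v) for k, v in shuffled.items())


table = {
    "IP1": (2.3, 1.0, ["d1", "d2", "d3"]),
    "IP2": (-9.5, 1.0, ["d1", "d3"]),
    "d1": (24, 1.0, ["IP1", "IP2"]),
    "d2": (1.0, 1.0, ["IP1"]),         # hypothetical row
    "d3": (1.0, 1.0, ["IP1", "IP2"]),  # hypothetical row
}
new_table = run_iteration(table)
```

For d1 this reproduces the reducer row shown on the slide: the new rank is 2.3/3 + -9.5/2, the previous rank slot holds 24, and the link list IP1, IP2 is carried forward for the next iteration.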
![Page 20: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/20.jpg)
Risks/Issues
• Behavior changes. A machine can be infected at any minute. Is a day or an hour a good window to measure the “cleanness” of a client?
• Noise
• A client IP is not necessarily a single user or machine (e.g., school Wi-Fi, where no consistent client visiting behavior can be obtained). Are these IPs introducing noise, or are they the ones bringing in the most likely malicious connections?
• Massive detection: is it massive FP (false positives)?
![Page 21: Securerank ping-opendns](https://reader034.vdocuments.site/reader034/viewer/2022042522/55a28a6b1a28abe5748b4794/html5/thumbnails/21.jpg)
Take-away
• Graph-based discovery
• Take a different view of your data
• Machine learning at a different scale