Understanding Old Malware Tricks to Find New Malware Families
![Page 1: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/1.jpg)
50 Thousand Needles in 5 Million Haystacks
Understanding Old Malware Tricks to Find New Malware Families
Veronica Valeros, Karel Bartoš, Lukáš Machlica {vvaleros, kbartos, lumachli}@cisco.com
Cognitive Threat Analytics Cisco Systems, Inc.
![Page 2: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/2.jpg)
Veronica Valeros
Malware Researcher Threat Intel Analyst
Co-Founder MatesLab Hackerspace (ARG)
@verovaleros
Karel Bartoš
Network Security Researcher
PhD Candidate, CS, ČVUT (CZ)
Lukáš Machlica
Network Security Researcher
PhD in Cybernetics, ZCU (CZ)
![Page 3: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/3.jpg)
Source: ‘Mapping Mirai: A Botnet Case Study’ (@MalwareTechBlog)
![Page 4: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/4.jpg)
Source: ‘Mapping Mirai: A Botnet Case Study’ (@MalwareTechBlog)
admin:admin
![Page 5: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/5.jpg)
![Page 6: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/6.jpg)
ACTORS
![Page 7: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/7.jpg)
![Page 8: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/8.jpg)
![Page 9: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/9.jpg)
[Figure: grid of question marks]
![Page 10: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/10.jpg)
“Need to communicate” > Our competitive advantage
![Page 11: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/11.jpg)
Big Data is no longer a problem nor a challenge
![Page 12: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/12.jpg)
Machine Learning gives us the perfect mechanism
![Page 13: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/13.jpg)
![Page 14: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/14.jpg)
[Figure: a few scattered question marks]
![Page 15: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/15.jpg)
NETWORK TRAFFIC
MACHINE LEARNING
THREAT HUNTING
![Page 16: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/16.jpg)
Transparency + Collaboration
![Page 17: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/17.jpg)
4 CHALLENGES!
![Page 18: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/18.jpg)
![Page 19: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/19.jpg)
![Page 20: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/20.jpg)
![Page 21: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/21.jpg)
4 CHALLENGES!
• HIGH DYNAMICS OF MALWARE
• I FIND YOUR LACK OF LABELS DISTURBING
• LARGE SCALE TRAINING
• ACTIVE LEARNING LOOP
![Page 22: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/22.jpg)
[Pipeline diagram labels: NETWORK TRAFFIC, LABELING, MALWARE DYNAMICS (1), MISTAKES IN LABELS (2), LARGE SCALE TRAINING (3), ACTIVE LEARNING (4), CLASSIFICATION, LEARNING LOOP, INCIDENTS]
![Page 23: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/23.jpg)
[Pipeline diagram with challenge 1 highlighted: MALWARE DYNAMICS]
![Page 24: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/24.jpg)
Change in Malicious Code or Payload
![Page 25: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/25.jpg)
Countermeasure – Proxy Log Records
Time | IP | URL | UserAgent | …
2:45:00 | 54.62.37.10 | http://www.google.com | Mozilla (… | …
2:45:01 | 68.62.37.10 | http://www.yahoo.com | Mozilla (… | …
2:45:02 | 22.62.37.10 | http://www.cnn.com | Chrome (… | …
2:45:05 | 59.62.37.10 | http://xyfsfnweinfsfe.ru | Mozilla (… | …
...
[Figure labels: One flow; HTTP(S); Proxy Logs]
![Page 26: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/26.jpg)
02:45:01 185.163.200.15 http://chgk.ge/logo.gif?a99b4ded=-1449439763
02:45:04 52.28.249.128 http://gim8.pl/images/logos.gif?ed13fa9b=-1269831060
02:45:09 195.157.15.100 http://althawry.org/images/logosa.gif?ed13e406=-1587317730
02:48:59 192.185.190.9 http://www.bijibali.com/images/logof.gif?2f18696=148149186
02:49:04 178.162.203.226 http://kukutrustnet987.info/home.gif?228c09=9056292
02:49:07 173.193.19.14 http://173.193.19.14/images/xs.jpg?250695=9706068
Network Traffic Variability Thinking
Columns: Time, Server IP Address, Hostname, URL Path / Resource / Parameters
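The log lines above change hostname, path, resource name and parameter name on every request, yet they keep the same overall shape. A minimal sketch of that observation follows. The `shape` summary (file extension, parameter count, whether all values are numeric) is a hypothetical illustration and not one of the features used in the talk.

```python
from urllib.parse import urlsplit, parse_qsl

def shape(url):
    """Hypothetical structural summary of a URL: (extension, #params, all values numeric)."""
    parts = urlsplit(url)
    resource = parts.path.rsplit("/", 1)[-1]
    ext = resource.rsplit(".", 1)[-1] if "." in resource else ""
    params = parse_qsl(parts.query, keep_blank_values=True)
    numeric = all(value.lstrip("-").isdigit() for _, value in params)
    return (ext, len(params), numeric)

# URLs copied from the slide above.
flows = [
    "http://chgk.ge/logo.gif?a99b4ded=-1449439763",
    "http://gim8.pl/images/logos.gif?ed13fa9b=-1269831060",
    "http://althawry.org/images/logosa.gif?ed13e406=-1587317730",
    "http://www.bijibali.com/images/logof.gif?2f18696=148149186",
    "http://kukutrustnet987.info/home.gif?228c09=9056292",
]

hostnames = {urlsplit(u).hostname for u in flows}
shapes = {shape(u) for u in flows}
print(len(hostnames), "distinct hostnames,", len(shapes), "distinct shape(s)")
```

Matching on exact hostnames or paths would miss each new request, while a structural view groups them together.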
![Page 27: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/27.jpg)
Extracted Features
HTTP(S) based (~600):
• URL based – Related to n-grams of letters, distribution of characters, …
• HTTP(S) request based – HTTP status, Ports, Up/Down Bytes, Referrer, User Agent, Mime Type, …
Offline per domain/IP (~20):
• Global stats – user-domain popularity, …
• External sources – domain age (WHOIS), …
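As an illustration of the URL-based group, the sketch below computes a few character-distribution and n-gram statistics for a hostname. The specific features (`vowel_ratio`, `digit_ratio`, bigram counts) are assumptions chosen for the example; the slide does not list the actual ~600 features.

```python
from collections import Counter

def char_ngrams(text, n=2):
    """Count overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def url_features(hostname):
    """A few hypothetical URL-based features built from characters and bigrams."""
    letters = [c for c in hostname if c.isalpha()]
    vowels = sum(c in "aeiou" for c in letters)
    digits = sum(c.isdigit() for c in hostname)
    bigrams = char_ngrams(hostname, 2)
    return {
        "length": len(hostname),
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
        "digit_ratio": digits / len(hostname) if hostname else 0.0,
        "distinct_bigrams": len(bigrams),
        "max_bigram_count": max(bigrams.values(), default=0),
    }

# Hostnames taken from the proxy log example earlier in the deck.
for name in ("www.google.com", "xyfsfnweinfsfe.ru"):
    print(name, url_features(name))
```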
![Page 28: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/28.jpg)
Legacy Approach
Network connection (flow) → Feature vector → Classification by a classifier trained on flows
[Figure: scatter plot of positive (+) and negative (_) flow feature vectors]
![Page 29: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/29.jpg)
![Page 30: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/30.jpg)
Normalized Entropy of Feature Values for 32 Malware Categories
[Heat map: x-axis Features (1 to 14), y-axis Malware Categories, color scale 0 to 1]
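The slide does not say how the entropy is normalized. One common choice, assumed here, is to divide the Shannon entropy of the observed values by the log of the number of samples, so that a feature that never changes within a category scores 0 and a feature that differs on every flow scores 1.

```python
import math
from collections import Counter

def normalized_entropy(values):
    """Shannon entropy of the value counts divided by log(number of samples).

    Assumed normalization, not confirmed by the talk:
    0.0 = the feature is constant, 1.0 = every sample has a different value.
    """
    total = len(values)
    counts = Counter(values)
    if total < 2 or len(counts) == 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(total)

print(normalized_entropy(["gif", "gif", "gif", "gif"]))   # constant feature
print(normalized_entropy(["a.ru", "b.pl", "c.ge", "d.org"]))  # always different
```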
![Page 31: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/31.jpg)
Legacy vs. Proposed Bag Representation
Flow-Based: Network connection (flow) → Feature vector → Classification by a classifier trained on flows. Not suitable for representing mw.
Bag-Based: Network connection (flow) → Feature vector → Bag of flows → Classification
[Figures: scatter plots of positive (+) and negative (_) samples for each approach]
![Page 32: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/32.jpg)
Poweliks
hxxp://31.184.194.39/query?version=1.7&sid=793&builddate=114&q=nitric+oxide+side+effects&ua=Mozilla%2F5 . . . &lr=7&ls=0
hxxp://31.184.194.39/query?version=1.7&sid=793&builddate=114&q=weight+loss+success+stories&ua=Mozilla%2F5 . . . &lr=0&ls=0
hxxp://31.184.194.39/query?version=1.7&sid=793&builddate=114&q=shoulder+pain&ua=Mozilla%2F5 . . . &lr=7&ls=2
hxxp://31.184.194.39/query?version=1.7&sid=793&builddate=114&q=cheap+car+insurance&ua=Mozilla%2F5 . . . &lr=7&ls=2
Zeus
hxxp://130.185.106.28/m/IbQFdXVjiriLva4KHeNpWCmThrJBn3f34HNwsLVVsUmLXtsumSSPe/zzXtIu9SzwjI9zKlxdE . . . 3RqvGzKN5
hxxp://130.185.106.28/m/IbQJFUVjgZn4vx4KHeNpWCmThrJBn3f34HNwsLVVsUmLfkoPaSS+S+zzXtIu9SzwjI9zKlxdE . . . 3vKwmk0oUi
hxxp://130.185.106.28/m/IbQJFUVjiJwJBX4KHeNpWCmThrJBn3f34HNwsLVVsUmKH7ue2STvSkzzXtIu9SzwjI9zKlxdE . . . 3vKwmk0oUi
hxxp://130.185.106.28/m/IbQNtVVji5/7Yp4KHeNpWCmThrJBn3f34HNwsLVVsUmLz4sO6YRvOjzzXtIu9SzwjI9zKlxdE . . . 3zB9057quqv
Legitimate traffic 1
hxxp://www.cnn.com/.element/ssi/auto/4.0/sect/MAIN/markets wsod expansion.html
hxxp://www.cnn.com/.a/1.73.0/assets/sprite-s1dced3ff2b.png
hxxp://www.cnn.com/.element/widget/video/videoapi/api/latest/js/CNNVideoBootstrapper.js
hxxp://www.cnn.com/jsonp/video/nowPlayingSchedule.json?callback=nowPlayingScheduleCallbackWrapper& =1422885578476
Legitimate traffic 2
Asterope
hxxp://194.165.16.146:8080/pgt/?ver=1.3.3398&id=126&r=12739868&os=6.1—2—8.0.7601.18571&res=4—1921—466&f=1
hxxp://194.165.16.146:8080/pgt/?ver=1.3.3398&id=126&r=15425581&os=6.1—2—8.0.7601.18571&res=4—1921—516&f=1
hxxp://194.165.16.146:8080/pgt/?ver=1.3.3398&id=126&r=27423103&os=6.1—2—8.0.7601.18571&res=4—1921—342&f=1
hxxp://194.165.16.146:8080/pgt/?ver=1.3.3753&id=126&r=8955018&os=6.1—2—8.0.7601.18571&res=4—1921—319&f=1
Click-fraud, malvertising-related botnet
hxxp://directcashfunds.com/opntrk.php?tkey=024f9730e23f8553c3e5342568a70300&[email protected]
hxxp://directcashfunds.com/opntrk.php?tkey=c1b6e3d50632d4f5c0ae13a52d3c4d8d&[email protected]
hxxp://directcashfunds.com/opntrk.php?tkey=7c9a843ce18126900c46dbe4be3b6425&[email protected]
hxxp://directcashfunds.com/opntrk.php?tkey=c1b6e3d50632d4f5c0ae13a52d3c4d8d&[email protected]
DGA
![Page 33: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/33.jpg)
Parallel to Action Recognition
![Page 34: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/34.jpg)
Invariant Representation* – Overview
[Diagram: web logs → flows 1…N grouped into a bag per user:hostname → flow-based feature vectors (features 1…M) → feature values histogram, locally-scaled self-similarity matrix, feature differences histogram → combined final feature vector]
1 – create bag + extract flow-based feature vectors
2 – create feature values histogram
3 – create self-similarity matrix
4 – create feature differences histogram
5 – combine into final feature vector
* Karel Bartoš, Michal Sofka, Vojtěch Franc: Optimized Invariant Representation of Network Traffic for Detecting Unseen Malware Variants. USENIX Security 2016
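The five steps can be sketched in plain Python as follows. This is a simplified illustration, not the method from the cited paper: the fixed bin count, the fixed `value_range`, and scaling each self-similarity matrix by its largest entry are all assumptions made to keep the example short.

```python
def histogram(values, bins, lo, hi):
    """Normalized histogram of a non-empty list over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        counts[max(0, min(idx, bins - 1))] += 1
    return [c / len(values) for c in counts]

def difference_histogram(x, bins=4):
    """Steps 3 and 4: self-similarity matrix of one feature, scaled, then histogrammed."""
    diffs = [abs(a - b) for a in x for b in x]   # flattened N x N matrix
    scale = max(diffs)                            # assumed "local" scaling
    scaled = [d / scale if scale else 0.0 for d in diffs]
    return histogram(scaled, bins, 0.0, 1.0)

def bag_representation(bag, bins=4, value_range=(0.0, 1.0)):
    """bag: non-empty list of flow feature vectors (step 1 already done)."""
    final = []
    for m in range(len(bag[0])):
        x = [flow[m] for flow in bag]
        final += histogram(x, bins, *value_range)   # step 2
        final += difference_histogram(x, bins)      # steps 3 and 4
    return final                                     # step 5

print(bag_representation([[0.1, 0.5], [0.2, 0.9], [0.4, 0.5]]))
```

In this sketch the difference histogram stays the same when all values of a feature are shifted or multiplied by a constant, and when the flows are reordered, which is the kind of invariance the slide title refers to.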
![Page 35: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/35.jpg)
[Pipeline diagram with challenge 2 highlighted: MISTAKES IN LABELS]
![Page 36: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/36.jpg)
Weak Labels for Training?
Small amount of reliable labels
+ + + –
![Page 37: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/37.jpg)
Weak Labels for Training?
Large amount of weak labels (blacklists, feeds)
Can we use them for training? Mistakes?
+ + + – + + + – – – ++ + +– – –
![Page 38: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/38.jpg)
Multiple Instance Learning (MIL)
+/- shows if a flow is malicious or legitimate
Existing approaches: blacklists, feeds → flow-based labeling (mistakes in flow-based labeling) → classifier trained on flows → classify flows
Proposed MIL approaches: blacklists, feeds → create bags of flows → classifier trained on bags → classify flows
[Figures: flow sequences marked + and -, bags marked M, and scatter plots of positive and negative samples]
![Page 39: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/39.jpg)
![Page 40: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/40.jpg)
![Page 41: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/41.jpg)
MIL*
* Vojtěch Franc, Michal Sofka, Karel Bartoš: Learning detector of malicious network traffic from weak labels. ECML-PKDD 2015
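A minimal sketch of the bag-level labeling idea, assuming bags are keyed by user and hostname and that a blacklist hit marks the whole bag as positive rather than each flow. The record fields, user names and blacklist contents are hypothetical, and the learning algorithm from the cited paper is not reproduced here.

```python
from collections import defaultdict

def make_bags(flows):
    """Group flow records into bags keyed by (user, hostname)."""
    bags = defaultdict(list)
    for flow in flows:
        bags[(flow["user"], flow["hostname"])].append(flow)
    return dict(bags)

def label_bags(bags, blacklist):
    """Weak label per bag: True if the bag's hostname appears on the blacklist.

    The label says only that the bag is suspicious as a whole; it makes no
    claim about which individual flows inside it are malicious.
    """
    return {key: key[1] in blacklist for key in bags}

# Hypothetical records; hostnames reuse examples from the proxy log slide.
flows = [
    {"user": "alice", "hostname": "xyfsfnweinfsfe.ru", "path": "/a"},
    {"user": "alice", "hostname": "xyfsfnweinfsfe.ru", "path": "/b"},
    {"user": "alice", "hostname": "www.cnn.com", "path": "/"},
    {"user": "bob", "hostname": "www.cnn.com", "path": "/"},
]
bags = make_bags(flows)
labels = label_bags(bags, blacklist={"xyfsfnweinfsfe.ru"})
print(labels)
```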
![Page 42: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/42.jpg)
Wide Classification Capability
[Diagram: same bag representation pipeline as in the Invariant Representation overview]
Cryptowall
hxxp://ukhealer.net/u/?a=KELQFAJusqu6Gd33DB0T1zATPwoXsmYFciyO9THSYS7na3zZfVczZ8GzHHydLYn8hVyiy1l0...
hxxp://sethealer.com/u/?a=L4ZTRAn2VVC9F -BkobTaxsNyaqCKxReHIOOWoVFd–YZxFkES4Y mBgSCaN 1K1rWdeM...
hxxp://sethealer.net/u/?a=qF1coIn2VVE3OFYDC1NXrm24fgDShSqjFsut7gMXRymFe3zZuFTQPw1lI4X6t2MQIMntv2It...
http://zerosumstudio.com/img5.php?z=smnk91cpnmd
http://zerosumstudio.com/img5.php?z=sd04vutaog
http://zerosumstudio.com/img5.php?z=snmofp2ye0x
Goznym Banking Trojan
Rig Exploit Kit
http://ds.revivefl.org/?x3qJc7iZLBrGAoc=13SKfPrfJxzFGMSUb-nJDa9BMEX…
http://ds.revivefl.org/index.php?x3qJc7iZLBrGAoc=13SKfPrfJxzFGMSUb-nJ…
http://ds.revivefl.org/index.php?h4SGhKrXCJ-ofSih17OIFxzsmTu2KV_Opqxv…
http://carvezine.com/chdcm.php?gvo=miovlbwds&pbhpuo=03625369&oo…
http://carvezine.com/nnlgz.php?pmgl=wyynckyuok&nlm=vzhuhjachy&nptr…
http://carvezine.com/syajsg.php?lmtzkhhj=xmilzwgox&djax=1PhY8FI2NioY…
![Page 43: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/43.jpg)
[Pipeline diagram with challenge 3 highlighted: LARGE SCALE TRAINING]
![Page 44: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/44.jpg)
How to Sample Training Data
1 day = 10+ billion requests (millions of users and devices)
![Page 45: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/45.jpg)
![Page 46: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/46.jpg)
How to Sample Training Data
1. RANK the traffic according to the SCORE of statistical classifiers
2. Take only TOP-N samples as the NEGATIVEs
3. COLLECT samples across longer time period
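The three steps can be sketched as below. The sketch assumes that a higher score ranks first and that the top-N cut is applied per day before pooling; the slide does not state either detail, and `score` stands in for whatever statistical classifiers produce the ranking.

```python
import heapq

def sample_negatives(days, score, top_n):
    """Pool the top_n highest-scoring unlabeled samples from each day.

    days:  iterable of per-day sample collections (step 3: longer time period)
    score: function mapping a sample to its classifier score (step 1: rank)
    top_n: how many samples to keep per day (step 2: top-N as negatives)
    """
    pool = []
    for day in days:
        pool.extend(heapq.nlargest(top_n, day, key=score))
    return pool

# Toy example where each sample is just its own score.
days = [[0.1, 0.9, 0.5], [0.7, 0.2]]
print(sample_negatives(days, score=lambda s: s, top_n=2))
```

Keeping only a fixed number of samples per day bounds the training set size no matter how many billions of requests arrive.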
![Page 47: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/47.jpg)
![Page 48: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/48.jpg)
![Page 49: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/49.jpg)
Positive-Unlabeled Training*
* X. Li, B. Liu, S.K. Ng: Negative Training Data can be Harmful to Text Classification, 2010, EMNLP
POOL OF NEGATIVEs
[Figure legend: spy/known positive; malicious duck; benign duck]
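The "spy" in the legend suggests the spy technique from positive-unlabeled learning: known positives are hidden in the unlabeled pool, and their scores show where real positives tend to land. The sketch below is one common way to use spies and may differ from what the talk actually does; `spy_recall` is a hypothetical parameter.

```python
def reliable_negatives(unlabeled_scores, spy_scores, spy_recall=0.9):
    """Indices of unlabeled samples that score below almost all spies.

    The threshold is set so that roughly a spy_recall fraction of the spies
    (known positives hidden in the pool) score at or above it. Unlabeled
    samples below the threshold are treated as reliable negatives.
    """
    ranked = sorted(spy_scores, reverse=True)
    keep = max(1, int(len(ranked) * spy_recall))
    threshold = ranked[keep - 1]
    return [i for i, s in enumerate(unlabeled_scores) if s < threshold]

spies = [0.9, 0.8, 0.7, 0.2]        # scores of hidden known positives
pool = [0.95, 0.6, 0.1, 0.75]       # scores of unlabeled samples
print(reliable_negatives(pool, spies, spy_recall=0.75))
```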
![Page 50: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/50.jpg)
Positive-Unlabeled Training*
* X. Li, B. Liu, S.K. Ng: Negative Training Data can be Harmful to Text Classification, 2010, EMNLP
[Figure: samples ordered by score/maliciousness, with the NEGATIVEs region marked]
![Page 51: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/51.jpg)
Classifiers
![Page 52: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/52.jpg)
• SVM alternative – MIL representations: Challenge #1
• Random Forests
• Randomness involved = more robust against evasion
• Multi-class classification
• Easy to train and run efficiently on big data
• Used prevalently for per-vector classification
![Page 53: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/53.jpg)
Neural Networks*
* T. Pevny, P. Somol: Discriminative models for multi-instance problems with tree structure, 2016, 9th ACM Workshop on AISec with the 23rd ACM Conference on Computer and Communications Security (CCS)
Used to classify the traffic on the level of users
[Diagram: input layer → 1st hidden layer → 2nd hidden layer → 3rd hidden layer]
![Page 54: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/54.jpg)
![Page 55: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/55.jpg)
[Two figures, each with regions labeled Malicious and Legitimate]
![Page 56: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/56.jpg)
Phishing + obfuscated URLs
Click-fraud + obfuscated URLs
C&C + DGAs + obfuscated URLs
C&C using DGAs
C&C + obfuscated URLs
![Page 57: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/57.jpg)
[Pipeline diagram with challenge 4 highlighted: ACTIVE LEARNING]
![Page 58: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/58.jpg)
Active Learning – exploration vs. exploitation
Categories: RANSOMWARE, INFORMATION STEALER, CLICK FRAUD, DATA EXFILTRATION, AD INJECTOR, EXPLOIT KIT, BANKING TROJAN, …
[Diagram labels: detections; partial labeling; newly labeled + in/correctly classified samples]
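One way to read "exploration vs. exploitation" when choosing what an analyst labels next: exploit by confirming the most confident detections, and explore by inspecting the samples the classifier is least sure about. The sketch below assumes each sample carries a probability of being malicious; the split sizes and the 0.5 uncertainty midpoint are hypothetical.

```python
def select_for_labeling(scored, n_exploit, n_explore):
    """Pick sample ids for analyst review.

    scored: list of (sample_id, probability_malicious)
    Returns (exploit_ids, explore_ids):
      exploit = highest-probability detections,
      explore = remaining samples closest to probability 0.5.
    """
    by_confidence = sorted(scored, key=lambda s: s[1], reverse=True)
    exploit = by_confidence[:n_exploit]
    rest = by_confidence[n_exploit:]
    explore = sorted(rest, key=lambda s: abs(s[1] - 0.5))[:n_explore]
    return [s[0] for s in exploit], [s[0] for s in explore]

scored = [("a", 0.99), ("b", 0.52), ("c", 0.10), ("d", 0.45), ("e", 0.90)]
print(select_for_labeling(scored, n_exploit=2, n_explore=2))
```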
![Page 59: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/59.jpg)
CTA Classification and Learning Loop
![Page 60: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/60.jpg)
CTA Classification and Learning Loop
* We use Spark to train the model (with 200 CPUs it takes approx. 1-2 days)
[Diagram label: detections]
![Page 61: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/61.jpg)
• Automatic retraining of the classifier
• based on new samples in already observed categories
• when a new category is discovered
[Timeline: repeated cycles of storing and performance evaluation over time]
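The two retraining triggers can be expressed as a small check. This is only a sketch: the `min_new_per_category` threshold is a hypothetical value, and the slide does not say how many new samples are needed before a retrain.

```python
from collections import Counter

def should_retrain(known_categories, new_labels, min_new_per_category=100):
    """Decide whether to retrain from newly labeled samples.

    known_categories: set of category names the classifier was trained on
    new_labels:       list of category names of newly labeled samples
    Retrain when a previously unseen category appears, or when any known
    category has gathered at least min_new_per_category new samples.
    """
    counts = Counter(new_labels)
    if any(category not in known_categories for category in counts):
        return True
    return any(n >= min_new_per_category for n in counts.values())

known = {"RANSOMWARE", "CLICK FRAUD"}
print(should_retrain(known, ["CLICK FRAUD"] * 3))
print(should_retrain(known, ["DNS CHANGER"]))
```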
![Page 62: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/62.jpg)
Real Life Example
![Page 63: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/63.jpg)
![Page 64: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/64.jpg)
![Page 65: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/65.jpg)
DNSChanger
![Page 66: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/66.jpg)
1 week
![Page 67: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/67.jpg)
© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
![Page 68: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/68.jpg)
C&C
DNS CHANGER TROJAN
Thanks to Ross Gibb (Cisco AMP Threat Grid)
![Page 69: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/69.jpg)
![Page 70: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/70.jpg)
C&C
FILE SERVER
C&C
MAMBA TROJAN (stage 1)
MAMBA TROJAN (stage 2)
DNS CHANGER TROJAN
ADWARE
Thanks to Ross Gibb (Cisco AMP Threat Grid)
![Page 71: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/71.jpg)
http://finhoome.info/u/?a=kLz-Yckq(..)&c=fPOnv(..)&r=987(..)
http://domenjob.com/u/?a=D-2n5k7(..)&c=vTB5(..)&r=589(..)
http://domenjob.net/u/?a=qk7BKV9(..)&c=m6V(..)&r=327(..)
http://listcool.net/u/?q=jW6H5obe2(..)&c=be2G(..)&r=684(..)
http://listcool.info/u/?q=J5DM4nrA(..)&c=rASU(..)&r=911(..)
http://usafun.info/u/?q=S42YFQPC(..)&c=YFQP(..)&r=769(..)
http://realget.info/u/?a=fDrS_9vLG(..)&c=GM0-(..)&r=528(..)
http://alwaysweb.info/u/?a=G3ZGb(..)&c=wNR4(..)&r=781(..)
![Page 72: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/72.jpg)
Ads Gone Rogue
![Page 73: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/73.jpg)
http://ddprxzxnhzbq.com/cVERg.htm?l=1475373009865&p=3&r=313449&v=0.0013&h=0&k=&z=https%3A%2F%2Fwww.google.com%2F&f=1920,1080,1,1920,1080
http://avrdpbiwvwyt.com/asddsf.php?_750352&v=direct&siteId=1299913&minBid=0.0&popundersPerIP=10&default=http%3A%2F%2Fonclickads.net%2Fafu.php%3Fzoneid%3D699158&docref=&s=
http://eeqabqioietkquydwxfgvtvpxpzkuilfcpzkplhcckoghwgacb.com/kKifq.htm?k=1474870164116&n=3&a=24892&c=0&x=0&m=&t=http%3A%2F%2Fvidup.me%2Fsw6jvd1g05ye&y=1440,900,1,1440,900
http://cdrjblrhsuxljwesjholugzxwukkerpobmonocjygnautvzjjm.com/TV.asp?q=1474398263431&e=3&s=1153313&f=0&l=0&k=&o=http%3A%2F%2Fonbokep.com%2Ftag%2Fvideo-bokep-indo%2F26177%2Fpage%2F3&w=720,1280,1,720,1280
![Page 74: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/74.jpg)
http://www.xxyafiswqcqz.com/ZITVk.swf
http://www.xxyafiswqcqz.com/JQk.swf
http://www.xxyafiswqcqz.com/a.swf
http://www.higygtvnzxad.com/Ysmcib.swf
http://www.higygtvnzxad.com/UCTpvP.swf
http://www.lbypppwfvagq.com/qPDps.swf
http://www.lbypppwfvagq.com/ciGkKZ.swf
Thanks to Ross Gibb (Cisco AMP Threat Grid) & Avast Team
> popads.net
Flash file (4.2 KB)
Zlib compressed
RIG Exploit Kit
bbd6e73b1eb7415c563c5ab34b4992b5
![Page 75: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/75.jpg)
![Page 76: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/76.jpg)
hxbvbmxv.comrpczohkv.comaiypulgy.comovgzbnjj.com
igupodzh.comfluohbiy.comgggemaop.comajgffcat.comossdqciz.com
bmqnguru.comwrmcfyzl.comokmuxdbq.comcmpsuzvr.comlkrcapch.comxbynkkqi.com
zoowknbw.comizwsvyqv.comcomgnnyx.comnefxtwxk.comnfniziqm.comxttrofww.comdmdcpvgu.comhffmzplu.comawsatstb.comccdkyvyw.com
kgkjlivo.comnegdrvgo.comazeozrjk.comsriaqmzx.comjlflzjdt.combircgizd.com
asqamasz.commnusvlgl.comruovcruc.comvrqajyuu.comwgefjuno.comwwgjtcge.comervpgpxr.comzycvyudt.com
imyqdbxq.comwzueqhwf.comuxyofgcf.com
wkhychiklhdglppaeynvntkublzecyyymosjkiofraxechigon.com dwentymgplvrizqhieugzkozmqjxrxcyxeqdjvcbjmrhnkguwk.com xhojlvfznietogsusdiflwvxpkfhixbgdxcnsdshxwdlnhtlih.com nrgpugas.com
tmdcfkxcckvqbqbixszbdyfjgusfzyguvtvvisojtswwvoduhi.com dkrhsftochvzqryurlptloayhlpftkogvzptcmjlwjgymcfrmv.com htllanmhrnjrbestmyabzhyweaccazvuslvadtvutfiqnjyavg.com sjpexaylsfjnopulpgkbqtkzieizcdtslnofpkafsqweztufpa.com hvukouhckryjudrawwylpboxdsonxhacpodmxvbonqipalsprb.com
cdqmeyhqrwinofutpcepbahedusocxqyfokvehqlqpusttfwve.com bmjccqfxlabturkmpzzokhsahleqqrysudwpuzqjbxbqeakgnf.com grceweaxhbpvclyxhwuozrbtvqzjgbnzklvxdezzficwjnmfil.com mopvkjodhcwscyudzfqtjuwvpzpgzuwndtofzftbtpdfszeido.com rdzxpvbveezdkcyustcomuhczsbvteccejkdkfepouuhxpxtmy.com rhfvzboqkjfmabakkxggqdmulrsxmisvuzqijzvysbcgyycwfk.com
hyvsquazvafrmmmcfpqkabocwpjuabojycniphsmwyhizxgebu.com uyqzlnmdtfpnqskyyvidmllmzauitvaijcgqjldwcwvewjgwfj.com farkkbndawtxczozilrrrunxflspkyowishacdueiqzeddsnuu.com pxarwmerpavfmomfyjwuuinxaipktnanwlkvbmuldgimposwzm.com
djzmpsingsrtfsnbnkphyagxdemeagsiabguuqbiqvpupamgej.com uupqrsjbxrstncicwcdlzrcgoycrgurvfbuiraklyimzzyimrq.com pzgchrjikhfyueumavkqiccvsdqhdjpljgwhbcobsnjrjfidpq.com
ngmckvucrjbnyybvgesxozxcwpgnaljhpedttelavqmpgvfsxg.com snpevihwaepwxapnevcpiqxrsewuuonzuslrzrcxqwltupzbwu.com lytpdzqyiygthvxlmgblonknzrctcwsjycmlcczifxbkquknsr.com
hndesrzcgjmprqbbropdulvkfroonnrlbpqxhvprsavhwrfxtv.com dobgfkflsnmpaeetycphmcloiijxbvxeyfxgjdlczcuuaxmdzz.com zfrzdepuaqebzlenihciadhdjzujnexvnksksqtazbaywgmzwl.com bjzcyqezwksznxxhscsfcogugkyiupgjhikadadgoiruasxpxo.com
ptiqsfrnkmmtvtpucwzsaqonmvaprjafeerwlyhabobuvuazun.com cdrjblrhsuxljwesjholugzxwukkerpobmonocjygnautvzjjm.com wfiejyjdlbsrkklvxxwkferadhbcwtxrotehopgqppsqwluboc.com
gkgdqahkcbmykurmngzrrolrecfqvsjgqdyujvgdrgoezkcobq.com huzmweoxlwanzvstlgygbrnfrmodaodqaczzibeplcezmyjnlv.com ooyhetoodapmrjvffzpmjdqubnpevefsofghrfsvixxcbwtmrj.com
wdcxuezpxivqgmecukeirnsyhjpjoqdqfdtchquwyqatlwxtgq.com lkjmcevfgoxfbyhhmzambtzydolhmeelgkotdllwtfshrkhrev.com hpxxzfzdocinivvulcujuhypyrniicjfauortalmjerubjgaja.com ynrbxyxmvihoydoduefogolpzgdlpnejalxldwjlnsolmismqd.com
wsfqmxdljrknkalwskqmefnonnyoqjmeapkmzqwghehedukmuj.com plmuxaeyapbqxszavtsljaqvmlsuuvifznvttuuqfcxcbgqdnn.com eeqabqioietkquydwxfgvtvpxpzkuilfcpzkplhcckoghwgacb.com swckuwtoyrklhtccjuuvcstyesxpbmycjogrqkivmmcqqdezld.com wwgdpbvbrublvjfbeunqvkrnvggoeubcfxzdjrgcgbnvgcolbf.com tabeduhsdhlkalelecelxbcwvsfyspwictbszchbbratpojhlb.com vpsotshujdguwijdiyzyacgwuxgnlucgsrhhhglezlkrpmdfiy.com
rmetgarrpiouttmwqtuajcnzgesgozrihrzwmjlpxvcnmdqath.com rffqzbqqmuhaomjpwatukocrykmesssfdhpjuoptovsthbsswd.com hgbmwkklwittcdkjapnpeikxojivfhgszbxmrjfrvajzhzhuks.com
cphxwpicozlatvnsospudjhswfxwmykgbihjzvckxvtxzfsgtx.com mfmikwfdopmiusbveskwmouxvafvzurvklwyfamxlddexgrtci.com sncpizczabhhafkzeifklgonzzkpqgogmnhyeggikzloelmfmd.com gxgnvickedxpuiavkgpisnlsphrcyyvkgtordatszlrspkgppe.com rnrbvhaoqzcksxbhgqtrucinodprlsmuvwmaxqhxngkqlsiwwp.com bzjtjfjteazqzmukjwhyzsaqdtouiopcmtmgdiytfdzboxdann.com bkmmlcbertdbselmdxpzcuyuilaolxqfhtyukmjkklxphbwsae.com
syorlvhuzgmdqbuxgiulsrusnkgkpvbwmxeqqcboeamyqmyexv.com xhwtilplkmvbxumaxwmpaqexnwxypcyndhjokwqkxcwbbsclqh.com yyuztnlcpiym.com
qpiyjprptazz.com iibcejrrfhxh.com olctpejrnnfh.com
zrxgdnxneslb.com wijczxvihjyu.com xwwkuacmqblu.com nbbljlzbbpck.com bvobtmbziccr.com
dobjgpqzygow.com wepmmzpypfwq.com apgjczhgjrka.com ddprxzxnhzbq.com
xmmnwyxkfcavuqhsoxfrjplodnhzaafbpsojnqjeoofyqallmf.com avrdpbiwvwyt.com zbrkywjutuxu.com cuguwxkasghy.com eovkzcueutgf.com qixlpaaeaspr.com elbeobjhnsvh.com cymuxbcnhinm.com
tgrmzphjmvem.com iohaqrkjddeq.com dxigubtmyllj.com ifvetqzfiawg.com qxxyzmukttyp.com
ozhwenyohtpb.com jgqkrvjtuapt.com
Time axis (day/month, two-week ticks): 24/07, 07/08, 21/08, 04/09, 18/09, 02/10, 16/10, 30/10
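
The domain names on this slide appear to fall into a few fixed label lengths (8, 12 and 50 characters), each made of random-looking lowercase letters. As a minimal sketch of how such a listing could be bucketed for inspection, the Python below groups domains by label length and computes a per-label character entropy. The function names and the use of Shannon entropy are assumptions for illustration only; the slides do not state how the authors grouped or scored these domains.

```python
import math
from collections import Counter, defaultdict


def label_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of the characters in a label."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def group_by_label_length(domains):
    """Bucket domains by the length of the part before the final dot."""
    groups = defaultdict(list)
    for domain in domains:
        label = domain.lower().rsplit(".", 1)[0]
        groups[len(label)].append(domain)
    return dict(groups)


# A few domains copied from the slide, one or two per apparent length class.
sample = [
    "kgkjlivo.com",
    "nrgpugas.com",
    "yyuztnlcpiym.com",
    "qpiyjprptazz.com",
    "wkhychiklhdglppaeynvntkublzecyyymosjkiofraxechigon.com",
]

groups = group_by_label_length(sample)
for length in sorted(groups):
    for domain in groups[length]:
        label = domain.rsplit(".", 1)[0]
        print(length, domain, round(label_entropy(label), 2))
```

Length and entropy alone are weak signals and would misfire on many legitimate names, so a sketch like this is only useful for a first pass over a listing, not as a detector.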
![Page 77: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/77.jpg)
Automated Threat Hunting
![Page 78: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/78.jpg)
Initial Malicious Traffic (training)
New Malicious Traffic
![Page 79: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/79.jpg)
3x more detections
![Page 80: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/80.jpg)
Tracking hundreds of malicious campaigns
![Page 81: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/81.jpg)
So
![Page 82: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/82.jpg)
Designed representation invariant to rapidly evolving malware
![Page 83: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/83.jpg)
Large-scale learning from weak labels
![Page 84: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/84.jpg)
Active learning for automated malware tracking
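
The slide names active learning but does not say which query strategy is used. As a hedged illustration of the general idea, the sketch below uses uncertainty sampling, one common strategy: the samples whose classifier score is closest to 0.5 are sent to an analyst first. The function name, the score dictionary and the flow identifiers are all hypothetical.

```python
def select_for_review(scores, budget):
    """Return the ids of the `budget` samples whose score is closest to 0.5.

    `scores` maps a sample id to a classifier's estimated probability that
    the sample is malicious. Samples near 0.5 are the ones the classifier is
    least sure about, so labeling them is assumed to be the most informative.
    """
    ranked = sorted(scores, key=lambda sample_id: abs(scores[sample_id] - 0.5))
    return ranked[:budget]


# Hypothetical scores for four network flows.
scores = {"flow-a": 0.97, "flow-b": 0.52, "flow-c": 0.08, "flow-d": 0.41}
picked = select_for_review(scores, budget=2)
print(picked)
```

In a full loop, the analyst's labels for the picked samples would be added to the training set and the classifier retrained before the next round.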
![Page 85: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/85.jpg)
Millions of users
Tens of thousands of infections
![Page 86: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/86.jpg)
What got us here won’t take us there.
![Page 87: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/87.jpg)
Thanks!
Lukáš Machlica
Veronica Valeros
Karel Bartoš
Any more haystacks?
![Page 88: Understanding Old Malware Tricks to Find New Malware Families](https://reader034.vdocuments.site/reader034/viewer/2022052305/586764801a28ab61568b45db/html5/thumbnails/88.jpg)
The end.