Measurement and Classification of Humans and Bots in Internet Chat

Jhih-sin Jheng
2009/09/01

Machine Learning and Bioinformatics Laboratory

Reference

Measurement and Classification of Humans and Bots in Internet Chat
Steven Gianvecchio, Mengjun Xie, Zhenyu Wu, and Haining Wang
Department of Computer Science, The College of William and Mary
USENIX Security, 2008

Outline
  Background
  Measurement
  Classification System
  Experimental Evaluation
  Conclusion

Chat Bots vs. BotNets
  BotNets – networks of compromised machines
    some use chat systems (IRC) for C&C; others use P2P, HTTP, etc.
    abuse various systems
  Chat Bots – automated chat programs
    some are helpful, e.g., chat loggers
    can abuse chat systems and their users
      send spam, spread malicious software, mount phishing attacks
  Our focus is on the Yahoo! Chat system.

Measurement
  August-November 2007 – we collect data
  August 2007 – Yahoo! adds CAPTCHA
    very few chat bots
  October 2007 – bots are back

Measurement
  August and November 2007
    many chat bots
    1,440 hours of chat logs
    147 chat logs
    21 chat rooms

Measurement
  To create our dataset, we read the chat logs and labeled the chat users as human, bot, or ambiguous
  In total, we recognized 14 different types of chat bots
    different triggering mechanisms
    different text generation techniques

Types of Chat Bots
  Periodic Bots – send messages based on periodic timers
  Random Bots – send messages based on random timers
  Responder Bots – respond to messages of other users
  Replay Bots – replay messages of other users

Humans
  inter-message delay – evidence of a heavy tail
  message size – well fit by an Exponential distribution (λ=0.034)
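As a rough illustration of what "well fit by an Exponential distribution" involves, the sketch below estimates the rate λ of an Exponential distribution by maximum likelihood, which is simply one over the sample mean. The data here is synthetic (drawn with rate 0.034), not the chat-log data, and the function name `fit_exponential_rate` is a hypothetical helper.

```python
import random
import statistics


def fit_exponential_rate(sizes):
    """Maximum-likelihood estimate of an Exponential rate: 1 / sample mean."""
    mean = statistics.fmean(sizes)
    if mean <= 0:
        raise ValueError("sizes must have a positive mean")
    return 1.0 / mean


# Synthetic stand-in for human message sizes (assumption: not the real logs).
rng = random.Random(0)
synthetic_sizes = [rng.expovariate(0.034) for _ in range(20000)]

rate = fit_exponential_rate(synthetic_sizes)
print(f"estimated rate: {rate:.4f}, implied mean size: {1 / rate:.1f}")
```

A rate of 0.034 corresponds to a mean message size of about 1/0.034 ≈ 29.4.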

Periodic Bots
  inter-message delay – several clusters with high probabilities
  message size – messages built from templates approximate a normal distribution

Random Bots
  inter-message delay – Equilikely distribution at 40, 64, and 88; Uniform distribution over 45-125
  message size – messages selected from a small database
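The two random-timer behaviors can be mimicked with a few lines of sampling code. This is only a sketch: reading "Equilikely at 40, 64, and 88" as three equally likely delay values is an assumption, the slide does not state the time unit, and `random_bot_delay` is a hypothetical name.

```python
import random


def random_bot_delay(rng, mode):
    """Draw one inter-message delay for a hypothetical random-timer bot."""
    if mode == "equilikely":
        # Assumption: three discrete delay values, each equally likely.
        return rng.choice([40, 64, 88])
    if mode == "uniform":
        # Continuous delay anywhere in the 45-125 range.
        return rng.uniform(45, 125)
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(1)
equilikely_delays = [random_bot_delay(rng, "equilikely") for _ in range(1000)]
uniform_delays = [random_bot_delay(rng, "uniform") for _ in range(1000)]

print(sorted(set(equilikely_delays)))
print(min(uniform_delays), max(uniform_delays))
```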

Responder Bots
  inter-message delay – human-like timing
  message size – multiple templates of different lengths

Replay Bots
  inter-message delay – cluster with high probabilities (replay bots are periodic)
  message size – human-like size, well fit by an Exponential distribution (λ=0.028)

Classification System
  Entropy Classifier
    detects abnormal behavior
    based on message sizes and inter-message delays
    accurate but slow
  Machine Learning Classifier
    detects “learned” patterns
    based on message content
    fast but must be trained

Entropy Classifier
  Observation – chat bots are less complex than humans, and thus lower in entropy
    exploits the low entropy of chat bots
  Corrected Conditional Entropy Test (CCE)
    estimates higher-order entropy
  Entropy Test (EN)
    estimates first-order entropy
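The two tests can be sketched on a sequence of already-binned values (for example, inter-message delays mapped to bin indices). This is a minimal sketch, not the authors' implementation: it assumes the common formulation in which the corrected conditional entropy at pattern length m is the conditional entropy EN(m) − EN(m−1) plus a correction perc(m)·EN(1), where perc(m) is the fraction of length-m patterns that occur only once, and the final value is the minimum over m. The function names and the `max_m` parameter are hypothetical.

```python
import math
from collections import Counter


def pattern_counts(symbols, m):
    """Count every length-m window (pattern) in the symbol sequence."""
    return Counter(tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1))


def entropy_from_counts(counts):
    """Shannon entropy in bits of the empirical pattern distribution."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy(symbols, m=1):
    """EN: entropy of length-m patterns (m=1 gives first-order entropy)."""
    return entropy_from_counts(pattern_counts(symbols, m))


def corrected_conditional_entropy(symbols, max_m=5):
    """CCE: minimum over m of conditional entropy plus a sparsity correction."""
    en1 = entropy(symbols, 1)
    best, prev_en = math.inf, 0.0
    for m in range(1, min(max_m, len(symbols)) + 1):
        counts = pattern_counts(symbols, m)
        en_m = entropy_from_counts(counts)
        # Fraction of windows whose pattern was seen exactly once.
        perc = sum(1 for c in counts.values() if c == 1) / sum(counts.values())
        best = min(best, (en_m - prev_en) + perc * en1)
        prev_en = en_m
    return best


periodic = [0, 1, 2] * 100  # toy stand-in for a timer-driven bot
print(entropy(periodic), corrected_conditional_entropy(periodic))
```

On the toy period-three sequence, the first-order entropy is high (about log2(3) ≈ 1.585 bits) while the CCE is close to zero, which illustrates why a higher-order estimate can expose regular behavior that a first-order estimate misses.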

Machine Learning Classifier
  Observation – chat spam, like email spam, is a text classification problem
    exploits message content of chat bots
  CRM114
    a powerful text classification system

Hybrid Classification System
  entropy classifier builds and maintains the bot corpus
  machine learning classifier uses the bot and human corpora

[Figure: hybrid classification system. Components: INPUT, ENTROPY CLASSIFIER, MACHINE LEARNING CLASSIFIER, BOT CORPUS, HUMAN CORPUS. Outcomes: CLASSIFY AS CHAT BOT, CLASSIFY AS HUMAN.]
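One way the two classifiers might be combined is sketched below. The control flow is an assumption (the slide does not spell out the order): the fast content-based classifier is tried first against the current corpora, and the slower entropy classifier handles whatever it does not flag, adding newly detected bot messages to the bot corpus. The class `HybridClassifier` and the two toy stand-in functions are hypothetical and are not CRM114 or the authors' entropy tests.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HybridClassifier:
    # Hypothetical callables standing in for the two real classifiers.
    entropy_is_bot: Callable[[List[str]], bool]
    ml_is_bot: Callable[[List[str], List[str], List[str]], bool]
    bot_corpus: List[str] = field(default_factory=list)
    human_corpus: List[str] = field(default_factory=list)

    def classify(self, messages: List[str]) -> str:
        # Fast path (assumed order): content-based check against known bots.
        if self.bot_corpus and self.ml_is_bot(
            messages, self.bot_corpus, self.human_corpus
        ):
            return "bot"
        # Slow path: entropy check; detected bots extend the bot corpus.
        if self.entropy_is_bot(messages):
            self.bot_corpus.extend(messages)
            return "bot"
        return "human"


def toy_entropy_is_bot(messages):
    """Toy rule: three or more messages that all have the same length."""
    return len(messages) >= 3 and len({len(m) for m in messages}) == 1


def toy_ml_is_bot(messages, bot_corpus, human_corpus):
    """Toy rule: most messages appear verbatim in the bot corpus."""
    known = set(bot_corpus)
    return sum(m in known for m in messages) / len(messages) > 0.5


clf = HybridClassifier(
    entropy_is_bot=toy_entropy_is_bot,
    ml_is_bot=toy_ml_is_bot,
    human_corpus=["hi there", "how are you?"],
)
spam = ["visit my page"] * 3
print(clf.classify(spam))               # flagged by the entropy stand-in
print(clf.classify(["visit my page"]))  # now recognized by the content stand-in
```

In this arrangement the human corpus is supplied up front and left unchanged; only the bot corpus grows as the entropy stand-in flags new senders.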

Experimental Evaluation
  Types of Chat Bots
    Periodic Bots
    Random Bots
    Responder Bots
    Replay Bots
  Classifiers
    entropy classifier – 100 messages
    machine learning classifier – 25 messages

Experimental Evaluation
  Classification Tests
    Ent – entropy classifier
    SupML – fully-supervised ML classifier, trained on AUG BOTS
    SupMLre – fully-supervised ML classifier, retrained on NOV BOTS
    EntML – entropy-trained ML on AUG BOTS

          AUG BOTS                      NOV BOTS
test      periodic  random    respond   periodic  random    replay    human
          TP        TP        TP        TP        TP        TP        FP
EN(imd)   121/121   68/68     1/30      51/51     109/109   40/40     7/1713
CCE(imd)  121/121   49/68     4/30      51/51     109/109   40/40     11/1713
EN(ms)    92/121    7/68      8/30      46/51     34/109    0/40      7/1713
CCE(ms)   77/121    8/68      30/30     51/51     6/109     0/40      11/1713
OVERALL   121/121   68/68     30/30     51/51     109/109   40/40     17/1713

Entropy Classifier
  EN – entropy
  CCE – corrected conditional entropy
  (imd) – inter-message delay
  (ms) – message size


EN(imd) and CCE(imd)
  have problems against responder bots
  detect most other chat bots


EN(ms) and CCE(ms)
  have problems against random and replay bots
  detect most other chat bots


OVERALL
  detects all chat bots
  false positive rate is ~0.01
  uses 100 messages


Entropy and Machine Learning Classifiers
  Ent – entropy classifier (OVERALL results of the entropy tests)
  SupML – fully-supervised ML classifier, trained on AUG BOTS
  SupMLre – fully-supervised ML classifier, retrained on NOV BOTS
  EntML – entropy-trained ML on AUG BOTS

          AUG BOTS                      NOV BOTS
test      periodic  random    respond   periodic  random    replay    human
          TP        TP        TP        TP        TP        TP        FP
Ent       121/121   68/68     30/30     51/51     109/109   40/40     17/1713
SupML     121/121   68/68     30/30     14/51     104/109   1/40      0/1713
SupMLre   121/121   68/68     30/30     51/51     109/109   40/40     0/1713
EntML     121/121   68/68     30/30     51/51     109/109   40/40     1/1713

Ent
  same as the OVERALL results of the entropy classifier


SupML
  has problems against November bots
  needs to be retrained for new bots
SupMLre
  detects all bots


EntML
  false positive rate is ~0.0006 (1/1713); Ent is ~0.01 (17/1713)
  uses 25 messages
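The rates quoted on these slides can be rechecked directly from the counts in the results table. The short sketch below recomputes the human false positive rate for each test and the overall detection rate of SupML on the November bots; the dictionary layout and variable names are illustrative only.

```python
# (false positives, human users) per test, copied from the results table.
false_positives = {
    "Ent": (17, 1713),
    "SupML": (0, 1713),
    "SupMLre": (0, 1713),
    "EntML": (1, 1713),
}

fp_rates = {name: fp / total for name, (fp, total) in false_positives.items()}
for name, fp_rate in fp_rates.items():
    print(f"{name}: false positive rate {fp_rate:.4f}")

# SupML on NOV BOTS: (detected, total) for periodic, random, and replay bots.
supml_nov = [(14, 51), (104, 109), (1, 40)]
detected = sum(d for d, _ in supml_nov)
total_bots = sum(t for _, t in supml_nov)
supml_nov_rate = detected / total_bots
print(f"SupML on NOV BOTS: {detected}/{total_bots} = {supml_nov_rate:.3f}")
```

SupML, trained only on the August bots, detects 119 of the 200 November bots, which is the gap that retraining (SupMLre) or entropy-based training (EntML) closes.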

Conclusion
  Measurements
    overall, chat bots are less complex than humans
    some chat bots are more human-like
  Classification System
    exploits benefits of both classifiers
    quickly classifies known chat bots
    accurately classifies unknown chat bots

Thank you!
