
A Large-Scale Empirical Analysis of Chinese Web Passwords

Zhigong Li, Weili Han

Software School, Fudan University

Wenyuan Xu

Department of Electronic Engineering, Zhejiang University

Introduction

Two Problems

1. Do Chinese users choose better passwords?

2. How can we guess them?

Why Chinese Passwords?

There are over 600 million netizens in China.

http://www.cnnic.net.cn/hlwfzyj/hlwxzbg/hlwtjbg/201407/t20140721_47437.htm

Password Leakage

Over 100 million plaintext passwords

English

Chinese

Methodologies -- Two-stage analysis

Analysis: Characteristics, Patterns

Guessing

What Are the Most Popular Passwords?

Rank  Chinese             English
1     123456 (2.17%)      123456 (0.88%)
2     123456789 (0.65%)   12345 (0.24%)
3     111111 (0.59%)      123456789 (0.23%)
4     12345678 (0.39%)    password (0.18%)
5     000000 (0.34%)      iloveyou (0.15%)

Password Constitutions Vary

[Figure: occurrence percentage (log scale, 1E-6 to 0.1) of each character in Chinese and English passwords. The character axis is grouped into digits, letters (mostly), and symbols.]

PWD Strength Metrics -- alpha-work-factor

Alpha: probability of success

Alpha-work-factor: number of guesses needed to succeed with probability at least alpha, trying the most popular passwords first

[1] J. O. Pliam. On the incomparability of entropy and marginal guesswork in brute-force attacks. In Cryptology-INDOCRYPT 2000.

[2] J. Bonneau. The science of guessing: analyzing an anonymized corpus of 70 million passwords. In IEEE S&P 2012.

Example Dataset

Password    Share
passwords   40%
123456      30%
iloveyou    20%
123456789   10%

alpha = 0.4: alpha-work-factor = 1

alpha = 0.9: alpha-work-factor = 3
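The two values on this slide can be reproduced with a few lines of Python. This is only a minimal sketch of the metric as the slide states it (guesses are tried in decreasing order of popularity until the covered share reaches alpha). The function name `alpha_work_factor` and the use of raw counts out of 100 are assumptions made for illustration, not part of the original talk.

```python
from fractions import Fraction

def alpha_work_factor(counts, alpha):
    """Fewest guesses, most popular first, that cover at least a share alpha of accounts."""
    total = sum(counts.values())
    # Exact rational arithmetic avoids float rounding right at the threshold.
    target = Fraction(str(alpha)) * total
    covered = 0
    for guesses, count in enumerate(sorted(counts.values(), reverse=True), start=1):
        covered += count
        if covered >= target:
            return guesses
    raise ValueError("alpha exceeds 1 or the dataset is empty")

# The example dataset from the slide, expressed as counts out of 100 accounts.
example = {"passwords": 40, "123456": 30, "iloveyou": 20, "123456789": 10}

print(alpha_work_factor(example, 0.4))  # 1
print(alpha_work_factor(example, 0.9))  # 3
```

The result here is a plain guess count. The strength plot in the talk reports the metric in bits, and that conversion is not shown in this sketch.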

PWD Strength Result -- Similar Strength

[Figure: α-work-factor (bits, axis 0 to 25) versus success rate α (axis 0 to 0.5) for CSDN, Tianya, Duduniu, 7k7k, 178.com, Rockyou, and Yahoo.]

Patterns

PWD Patterns of Chinese -- Pinyins

你 好

nǐ hǎo
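One way to test whether a letter string could be Pinyin is to check that it splits completely into toneless Pinyin syllables. The sketch below does that with a small dynamic program. The syllable set is a tiny hypothetical subset chosen for illustration, and the talk does not say how Pinyins were actually matched, so treat this as an assumption rather than the authors' method.

```python
# Tiny illustrative subset; a real analysis would need the full syllable table.
SYLLABLES = frozenset({"wo", "ai", "ni", "hao", "li", "wang", "zhang", "tian", "ya"})
MAX_SYLLABLE_LEN = 6  # longest toneless syllables, such as "zhuang", have six letters

def is_pinyin(text, syllables=SYLLABLES):
    """Return True when text splits completely into syllables from the given set."""
    text = text.lower()
    # reachable[i] is True when text[:i] can be segmented into syllables.
    reachable = [True] + [False] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - MAX_SYLLABLE_LEN), end):
            if reachable[start] and text[start:end] in syllables:
                reachable[end] = True
                break
    return len(text) > 0 and reachable[-1]

print(is_pinyin("nihao"))     # True
print(is_pinyin("woaini"))    # True
print(is_pinyin("password"))  # False
```

A check like this over-matches short English words that happen to be valid syllable sequences, so any real classification would need a rule for resolving such overlaps.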

Pinyins/Words -- Letter only

Letter-Only Passwords

Dataset   Chinese Pinyins   English Words
CSDN      41.61% (5.15%)    15.59% (1.93%)
Duduniu   40.63% (4.15%)    10.39% (1.06%)
Tianya    33.28% (3.91%)    15.35% (1.80%)
7k7k      44.70% (4.97%)    10.04% (1.02%)
178.com   57.31% (5.25%)    2.20% (0.20%)
Rockyou   6.94% (2.99%)     25.47% (10.98%)
Yahoo     4.31% (1.46%)     34.92% (11.86%)

Pinyins/Words -- Mixed

Mixed Passwords

Dataset   Chinese Pinyins   English Words
CSDN      25.49% (10.68%)   7.97% (3.34%)
Duduniu   23.59% (5.78%)    6.05% (1.48%)
Tianya    25.17% (13.87%)   6.48% (3.57%)
7k7k      21.09% (5.84%)    7.02% (1.94%)
178.com   23.49% (9.97%)    4.58% (1.94%)
Rockyou   6.88% (2.61%)     28.11% (10.65%)
Yahoo     4.53% (2.59%)     27.99% (16.01%)

Special Passwords -- Love

Rank  Top Chinese Pinyins   Top English Words
1     woaini (1.47%)        password (1.28%)
2     li (1.06%)            iloveyou (0.98%)
3     wang (0.97%)          love (0.76%)
4     tianya (0.89%)        angel (0.59%)
5     zhang (0.84%)         monkey (0.45%)

Dates -- Extraction

a123456b

a1234567b

a12345678b

a123456789b

a1234b56789
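The example strings above suggest that dates are looked for in runs of consecutive digits. The sketch below is one possible reading, not the authors' stated rule: it assumes that only maximal digit runs of exactly six or eight digits are considered, that any valid calendar date counts, and that two-digit years are mapped to 19YY for validation. The names `LAYOUTS` and `extract_dates` are hypothetical.

```python
import re
from datetime import date

# Each layout maps to (year, month, day) slices within the digit run.
LAYOUTS = {
    "YYYYMMDD": (slice(0, 4), slice(4, 6), slice(6, 8)),
    "MMDDYYYY": (slice(4, 8), slice(0, 2), slice(2, 4)),
    "DDMMYYYY": (slice(4, 8), slice(2, 4), slice(0, 2)),
    "YYMMDD": (slice(0, 2), slice(2, 4), slice(4, 6)),
    "MMDDYY": (slice(4, 6), slice(0, 2), slice(2, 4)),
    "DDMMYY": (slice(4, 6), slice(2, 4), slice(0, 2)),
}

def matching_layouts(digits):
    """Names of the layouts under which the digit run is a valid calendar date."""
    names = []
    for name, (ys, ms, ds) in LAYOUTS.items():
        if len(name) != len(digits):
            continue  # layout names are exactly as long as the runs they describe
        year = int(digits[ys])
        if len(digits) == 6:
            year += 1900  # assumption: two-digit years are read as 19YY
        try:
            date(year, int(digits[ms]), int(digits[ds]))
        except ValueError:
            continue
        names.append(name)
    return names

def extract_dates(password):
    """Return (digit run, layouts) pairs for maximal 6- or 8-digit runs that parse as dates."""
    found = []
    # The lookarounds reject runs that are part of a longer digit sequence.
    for match in re.finditer(r"(?<!\d)(?:\d{6}|\d{8})(?!\d)", password):
        layouts = matching_layouts(match.group())
        if layouts:
            found.append((match.group(), layouts))
    return found

print(extract_dates("wang19900101"))  # [('19900101', ['YYYYMMDD'])]
print(extract_dates("a1234567b"))     # []
```

A run such as 010203 is valid under all three six-digit layouts, so a full analysis needs a rule for such ambiguous cases. This sketch simply reports every layout that fits.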

Dates -- Eight digits

Dataset   YYYYMMDD   DDMMYYYY   MMDDYYYY
CSDN      29.24%     0.43%      0.25%
Duduniu   36.26%     0.60%      0.35%
Tianya    28.87%     0.84%      0.28%
7k7k      32.41%     0.37%      0.18%
178.com   30.46%     0.19%      0.13%
Rockyou   2.64%      17.66%     7.70%
Yahoo     2.78%      11.17%     12.00%

Dates -- Six digits

Dataset   YYMMDD   DDMMYY   MMDDYY
CSDN      27.21%   1.24%    4.04%
Duduniu   23.93%   1.19%    3.05%
Tianya    17.84%   1.78%    2.97%
7k7k      24.34%   0.88%    2.63%
178.com   13.96%   1.30%    1.72%
Rockyou   5.63%    18.42%   21.90%
Yahoo     4.66%    7.77%    25.99%

Dates at the End

Dataset   Beginning   Middle   End
CSDN      21.68%      4.32%    74.00%
Duduniu   27.33%      4.75%    67.07%
Tianya    24.76%      1.36%    73.88%
7k7k      32.17%      2.70%    65.13%
178.com   22.30%      1.03%    76.67%
Rockyou   27.40%      3.91%    68.69%
Yahoo     22.66%      5.00%    72.34%

Guessing

Guessing based on Probabilistic Context-Free Grammar

Describe the structures of passwords using a set of rules.

Dictionary guessing

[1] Matt Weir et al. Password Cracking Using Probabilistic Context-Free Grammars. In IEEE Symposium on Security and Privacy, 2009.
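As a rough illustration of what "structures" means here, the sketch below maps each password to a string of letter (L), digit (D), and symbol (S) runs with their lengths, then estimates structure probabilities from a training list. The function names and the four training passwords are made up for the example, and ASCII-only letter and digit classes are an assumption.

```python
import re
import string
from collections import Counter

def base_structure(password):
    """Map a password to its run structure, for example 'hello123' to 'L5D3'."""
    parts = []
    for match in re.finditer(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", password):
        run = match.group()
        if run[0] in string.ascii_letters:
            kind = "L"
        elif run[0] in string.digits:
            kind = "D"
        else:
            kind = "S"
        parts.append(f"{kind}{len(run)}")
    return "".join(parts)

def structure_probabilities(training_passwords):
    """Relative frequency of each base structure in a training set."""
    counts = Counter(base_structure(p) for p in training_passwords)
    total = sum(counts.values())
    return {structure: count / total for structure, count in counts.items()}

# Hypothetical training passwords, for illustration only.
training = ["hello123", "world321", "admin123", "##abc123"]
print(structure_probabilities(training))  # {'L5D3': 0.75, 'S2L3D3': 0.25}
```

Training on a different leak changes these frequencies, which is why the choice of training set matters for the guessing results later in the talk.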

Example

Ln = n letters, Dn = n digits, Sn = n symbols

Start
  L5D3 (70%)
    L5123 (60%)
    L5321 (40%)
  S2L3D3 (30%)
    ##L3D3 (50%)
      ##L3123 (60%)
      ##L3321 (40%)
    $$L3D3 (50%)
      $$L3123 (60%)
      $$L3321 (40%)
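The example grammar can be expanded mechanically. The sketch below multiplies the rule probabilities along each path to rank the pre-terminal structures, then fills the letter slots from a word list. It assumes the 70%/30% split applies to the two base structures, 50%/50% to the two symbol strings, and 60%/40% to the two digit strings. For simplicity it sorts all pre-terminals up front instead of using the priority queue of the original PCFG work. All names and the sample word list are hypothetical.

```python
from itertools import product

# Rule probabilities as read from the example tree (see the assumptions above).
STRUCTURES = {("L5", "D3"): 0.7, ("S2", "L3", "D3"): 0.3}
TERMINALS = {
    "D3": {"123": 0.6, "321": 0.4},
    "S2": {"##": 0.5, "$$": 0.5},
}

def preterminals():
    """Expand digit and symbol slots; return (structure, parts, probability), most probable first."""
    expanded = []
    for structure, p_struct in STRUCTURES.items():
        # Letter slots keep their "Ln" label with probability 1 at this stage.
        options = [list(TERMINALS.get(slot, {slot: 1.0}).items()) for slot in structure]
        for combo in product(*options):
            prob = p_struct
            for _, p in combo:
                prob *= p
            expanded.append((structure, tuple(value for value, _ in combo), prob))
    expanded.sort(key=lambda item: item[2], reverse=True)
    return expanded

def guesses(words):
    """Yield guesses by filling each Ln slot with dictionary words of length n."""
    for structure, parts, _ in preterminals():
        pools = []
        for slot, part in zip(structure, parts):
            if slot.startswith("L"):
                pools.append([w for w in words if len(w) == int(slot[1:])])
            else:
                pools.append([part])
        for combo in product(*pools):
            yield "".join(combo)

for _, parts, prob in preterminals():
    print("".join(parts), round(prob, 2))

print(list(guesses(["hello", "abc"]))[:2])  # ['hello123', 'hello321']
```

The six pre-terminal probabilities sum to 1, which is a quick consistency check on the way the percentages were assigned to the branches.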

Guessing CSDN -- Training Sets

RockyouTS

MRockyouTS

DuduTS

RockyouDuduTS

Guessing CSDN -- Dictionaries

EDict: Dic-0294 and English-lower

CDict: 20,000 Most Popular Pinyins

20,000 Most Popular Six-digit Dates

20,000 Most Popular Eight-digit Dates

Dic-0294 can be downloaded from http://www.outpost9.com/files/WordLists.html

English-lower can be downloaded from http://download.openwall.net/pub/passwords/wordlists/

Guessing CSDN -- Result

[Figure: bar chart of the percentage of cracked passwords (axis 5% to 20%) for each training set and dictionary combination: RockyouTS, MRockyouTS, RockyouDuduTS, and DuduTS, each with CDict and EDict, plus RockyouTS and MRockyouTS with CDict+Date and EDict+Date. Successive builds of the slide highlight Pinyin, Dates, and Pinyin+Dates.]

Guessing CSDN -- Improvement

We increase the guessing efficiency by 34%.

Conclusion

• With similar strength, Chinese passwords contain more DIGIT-ONLY (>50%) passwords.

• Just like English words in English passwords, Chinese PINYINs (10%-15%) appear in Chinese passwords.

• Using Pinyin/Dates, we can improve the EFFICIENCY (34%) of guessing Chinese passwords.

Q&A
