
A Large-Scale Empirical Analysis of Chinese Web Passwords

Zhigong Li, Weili Han
Software School, Fudan University

Wenyuan Xu
Department of Electronic Engineering, Zhejiang University


Introduction

Two Problems

1. Do Chinese users choose better passwords?

2. How can we guess them?

Why Chinese Passwords?

There are over 600 million netizens in China.

http://www.cnnic.net.cn/hlwfzyj/hlwxzbg/hlwtjbg/201407/t20140721_47437.htm

Password Leakage

Over 100 million plaintext passwords, Chinese and English

Methodologies -- Two-stage analysis

Analysis: Characteristics, Patterns

Guessing

Characteristics

What are the Most Popular Passwords?

Rank   Chinese              English
1      123456 (2.17%)       123456 (0.88%)
2      123456789 (0.65%)    12345 (0.24%)
3      111111 (0.59%)       123456789 (0.23%)
4      12345678 (0.39%)     password (0.18%)
5      000000 (0.34%)       iloveyou (0.15%)

Password Constitutions Vary

[Figure: occurrence percentage (log scale, 1E-6 to 0.1) of each character in Chinese and English passwords. Characters on the horizontal axis fall into three groups: digits, letters (mostly), and symbols.]
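A per-character occurrence percentage of the kind plotted on this slide can be computed with a simple counter. The following is a minimal sketch, not the authors' tooling; the function name `char_percentages` and the toy input are assumptions made for illustration.

```python
from collections import Counter


def char_percentages(passwords):
    """Map each character to its share of all characters, most frequent first."""
    counts = Counter(ch for pw in passwords for ch in pw)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.most_common()}


# Toy input (hypothetical): four characters in total, 'a' appears twice.
print(char_percentages(["aab", "1"]))
```

Running the same function on a Chinese and an English dataset and plotting both results on a log scale would give a comparison like the one on the slide.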

PWD Strength Metrics -- alpha-work-factor

Alpha: probability of success

Alpha-work-factor: minimum number of guesses needed to succeed with probability alpha

[1] J. O. Pliam. On the incomparability of entropy and marginal guesswork in brute-force attacks, In Cryptology-INDOCRYPT 2000.

[2] J. Bonneau. The science of guessing: analyzing an anonymized corpus of 70 million passwords, In IEEE S&P 2012.

PWD Strength Metrics -- alpha-work-factor

Example Dataset

passwords    40%
123456       30%
iloveyou     20%
123456789    10%

alpha=0.4: alpha-work-factor = 1

alpha=0.9: alpha-work-factor = 3
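The two values in the example can be reproduced by guessing passwords in decreasing order of popularity and counting guesses until the covered share reaches alpha. This is a minimal sketch under that reading of the metric; the function name and the use of raw counts out of 100 are assumptions for illustration.

```python
from fractions import Fraction


def alpha_work_factor(counts, alpha):
    """Return how many guesses are needed to reach success probability alpha."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    total = sum(counts.values())
    # Exact rational arithmetic avoids float round-off when comparing shares.
    target = Fraction(str(alpha)) * total
    covered = 0
    # Guess the most popular passwords first.
    ordered = sorted(counts.values(), reverse=True)
    for guesses, count in enumerate(ordered, start=1):
        covered += count
        if covered >= target:
            return guesses
    raise ValueError("counts must contain at least one password")


# The example dataset from the slide, expressed as counts out of 100.
example = {"passwords": 40, "123456": 30, "iloveyou": 20, "123456789": 10}
print(alpha_work_factor(example, 0.4))  # 1
print(alpha_work_factor(example, 0.9))  # 3
```

With alpha=0.4 the single most popular password already covers 40% of accounts; with alpha=0.9 the top three cover 40% + 30% + 20% = 90%.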

PWD Strength Result -- Similar Strength

[Figure: α-work-factor (bit), from 0 to 25, plotted against success rate α, from 0 to 0.5, for CSDN, Tianya, Duduniu, 7k7k, 178.com, Rockyou, and Yahoo.]


Patterns

PWD Patterns of Chinese -- Pinyins

你 好

nǐ hǎo

Pinyins/Words -- Letter only

Letter-Only Passwords

Site       Chinese Pinyins     English Words
CSDN       41.61% (5.15%)      15.59% (1.93%)
Tianya     40.63% (4.15%)      10.39% (1.06%)
Duduniu    33.28% (3.91%)      15.35% (1.80%)
7k7k       44.70% (4.97%)      10.04% (1.02%)
178.com    57.31% (5.25%)      2.20% (0.20%)
Rockyou    6.94% (2.99%)       25.47% (10.98%)
Yahoo      4.31% (1.46%)       34.92% (11.86%)
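One way to decide whether a letter-only password is made of Pinyins is to try to split it into a sequence of Pinyin syllables. The sketch below uses dynamic programming over a tiny hand-picked syllable set; both the set and the function name are assumptions for illustration, and a real analysis would need the full syllable table. It is not necessarily the matching procedure used in this study.

```python
# Hypothetical, deliberately tiny syllable set; a full table has about 400 entries.
SYLLABLES = frozenset({"wo", "ai", "ni", "hao", "li", "wang", "zhang", "tian", "ya"})

MAX_SYLLABLE_LEN = 6  # longest Pinyin syllables, such as "zhuang", have six letters


def split_pinyin(text, syllables=SYLLABLES):
    """Return one split of text into syllables, or None if no split exists."""
    text = text.lower()
    # best[i] holds a syllable list covering text[:i], or None if unreachable.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - MAX_SYLLABLE_LEN), end):
            if best[start] is not None and text[start:end] in syllables:
                best[end] = best[start] + [text[start:end]]
                break
    return best[len(text)]


print(split_pinyin("woaini"))    # ['wo', 'ai', 'ni']
print(split_pinyin("password"))  # None
```

Some strings can be split in more than one way; this sketch returns only one split, which is enough for a yes/no classification.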

Pinyins/Words -- Mixed

Mixed Passwords

Site       Chinese Pinyins      English Words
CSDN       25.49% (10.68%)      7.97% (3.34%)
Tianya     23.59% (5.78%)       6.05% (1.48%)
Duduniu    25.17% (13.87%)      6.48% (3.57%)
7k7k       21.09% (5.84%)       7.02% (1.94%)
178.com    23.49% (9.97%)       4.58% (1.94%)
Rockyou    6.88% (2.61%)        28.11% (10.65%)
Yahoo      4.53% (2.59%)        27.99% (16.01%)

Special Passwords -- Love

Rank   Top Chinese Pinyins   Top English Words
1      woaini (1.47%)        password (1.28%)
2      li (1.06%)            iloveyou (0.98%)
3      wang (0.97%)          love (0.76%)
4      tianya (0.89%)        angel (0.59%)
5      zhang (0.84%)         monkey (0.45%)

woaini is the Pinyin of 我爱你 ("I love you").

Dates -- Extraction

Examples: a123456b, a1234567b, a12345678b, a123456789b, a1234b56789
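A possible extraction rule, sketched below, is to look only at digit runs that are exactly six or eight digits long and to accept a run when it parses as a valid calendar date in one of the formats discussed on the following slides. That rule is an assumption made for this sketch and may differ from the one the authors used; the function names are likewise hypothetical.

```python
import re
from datetime import datetime

# Candidate layouts by digit-run length, expressed as strptime patterns.
FORMATS = {
    8: {"YYYYMMDD": "%Y%m%d", "MMDDYYYY": "%m%d%Y", "DDMMYYYY": "%d%m%Y"},
    6: {"YYMMDD": "%y%m%d", "MMDDYY": "%m%d%y", "DDMMYY": "%d%m%y"},
}


def position(start, end, length):
    """Classify where a digit run sits inside the password."""
    if start == 0 and end == length:
        return "whole"
    if start == 0:
        return "beginning"
    if end == length:
        return "end"
    return "middle"


def date_candidates(password):
    """Return (digits, layout, position) for each digit run that parses as a date."""
    found = []
    for match in re.finditer(r"\d+", password):
        run = match.group()
        for name, pattern in FORMATS.get(len(run), {}).items():
            try:
                datetime.strptime(run, pattern)
            except ValueError:
                continue  # not a valid calendar date in this layout
            where = position(match.start(), match.end(), len(password))
            found.append((run, name, where))
    return found


print(date_candidates("a19900512b"))  # [('19900512', 'YYYYMMDD', 'middle')]
print(date_candidates("a1234567b"))   # []  (seven digits, so never examined)
```

A run such as 010203 is valid in several layouts at once, so a sketch like this reports every matching layout rather than choosing one.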

Dates -- Eight digits

Site       YYYYMMDD   MMDDYYYY   DDMMYYYY
CSDN       29.24%     0.25%      0.43%
Tianya     36.26%     0.35%      0.60%
Duduniu    28.87%     0.28%      0.84%
7k7k       32.41%     0.18%      0.37%
178.com    30.46%     0.13%      0.19%
Rockyou    2.64%      7.70%      17.66%
Yahoo      2.78%      12.00%     11.17%

Dates -- Six digits

Site       YYMMDD    MMDDYY    DDMMYY
CSDN       27.21%    4.04%     1.24%
Tianya     23.93%    3.05%     1.19%
Duduniu    17.84%    2.97%     1.78%
7k7k       24.34%    2.63%     0.88%
178.com    13.96%    1.72%     1.30%
Rockyou    5.63%     21.90%    18.42%
Yahoo      4.66%     25.99%    7.77%

Dates at the End

Site       Beginning   Middle   End
CSDN       21.68%      4.32%    74.00%
Tianya     27.33%      4.75%    67.07%
Duduniu    24.76%      1.36%    73.88%
7k7k       32.17%      2.70%    65.13%
178.com    22.30%      1.03%    76.67%
Rockyou    27.40%      3.91%    68.69%
Yahoo      22.66%      5.00%    72.34%


Guessing


Guessing based on Probabilistic Context-Free Grammar

Describe the structures of passwords using a set of rules.

Dictionary guessing

[1] Matt Weir et al. Password Cracking Using Probabilistic Context-Free Grammars, In IEEE Symposium on Security and Privacy, 2009.
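The first step of such a grammar is to turn each training password into a structure of letter (L), digit (D), and symbol (S) runs and to estimate how often each structure occurs. A minimal sketch of that step follows; the function names and the three training passwords are assumptions for illustration, not the authors' code or data.

```python
from collections import Counter
from itertools import groupby


def char_class(ch):
    """Classify one character as a letter (L), digit (D), or symbol (S)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def base_structure(password):
    """Collapse runs of the same class, e.g. 'hello123' becomes 'L5D3'."""
    runs = groupby(password, key=char_class)
    return "".join(f"{cls}{len(list(run))}" for cls, run in runs)


def structure_probabilities(passwords):
    """Estimate each structure's probability from its relative frequency."""
    counts = Counter(base_structure(pw) for pw in passwords)
    total = sum(counts.values())
    return {structure: n / total for structure, n in counts.items()}


print(base_structure("hello123"))  # L5D3
print(base_structure("##abc321"))  # S2L3D3
print(structure_probabilities(["hello123", "world321", "##abc123"]))
```

The letter runs are then filled from a dictionary at guessing time, which is where a Pinyin or date dictionary can be swapped in.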

Example

Ln = n letters, Dn = n digits, Sn = n symbols

[Figure: derivation tree with branch probabilities. Start expands to L5D3 or S2L3D3. S2L3D3 expands to ##L3D3 or **L3D3. Filling in the digits gives L5123 and L5321; ##L3123 and ##L3321; $$L3123 and $$L3321. Branch probabilities shown: 70%/30%, 60%/40%, 50%/50%, 60%/40%, 60%/40%.]
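The probability of a partially filled guess is the product of the probabilities along its path in the tree, and guesses are tried from most to least probable. The sketch below illustrates this with a small grammar shaped like the example; the probability values and their assignment to branches are assumed for illustration, and the letter segments are left unfilled as placeholders for dictionary words.

```python
import re
from itertools import product

# Hypothetical grammar: structure probabilities and fill-in probabilities.
STRUCTURES = {"L5D3": 0.7, "S2L3D3": 0.3}
FILLS = {
    "S2": {"##": 0.6, "$$": 0.4},
    "D3": {"123": 0.6, "321": 0.4},
}


def segments(structure):
    """Split a structure such as 'S2L3D3' into ['S2', 'L3', 'D3']."""
    return re.findall(r"[LDS]\d+", structure)


def ranked_guesses():
    """Return (partially filled guess, probability), most probable first."""
    results = []
    for structure, p_structure in STRUCTURES.items():
        options = []
        for seg in segments(structure):
            if seg in FILLS:
                options.append(list(FILLS[seg].items()))
            else:
                # Letter segments stay as placeholders for dictionary words.
                options.append([(seg, 1.0)])
        for combo in product(*options):
            text = "".join(part for part, _ in combo)
            prob = p_structure
            for _, p in combo:
                prob *= p
            results.append((text, prob))
    return sorted(results, key=lambda item: item[1], reverse=True)


for guess, prob in ranked_guesses():
    print(f"{guess}  {prob:.3f}")
```

Under these assumed values, L5123 comes first with probability 0.7 × 0.6 = 0.42, so every five-letter dictionary word followed by 123 is tried before any guess that starts with symbols.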

Guessing CSDN -- Training Sets

RockyouTS

MRockyouTS

DuduTS

RockyouDuduTS

Guessing CSDN -- Dictionaries

EDict: Dic-0294 and English-lower

CDict

20,000 Most Popular Pinyins

20,000 Most Popular Six-digit Dates

20,000 Most Popular Eight-digit Dates

Dic-0294 can be downloaded from http://www.outpost9.com/files/WordLists.html

English-lower can be downloaded from http://download.openwall.net/pub/passwords/wordlists/

Guessing CSDN -- Result

[Figure: bar chart of the percentage of cracked passwords (axis from 5.00% to 20.00%), grouped by training set. Bars: RockyouTS (EDict), RockyouTS (CDict), RockyouTS (EDict+Date), RockyouTS (CDict+Date), MRockyouTS (EDict), MRockyouTS (CDict), MRockyouTS (EDict+Date), MRockyouTS (CDict+Date), RockyouDuduTS (EDict), RockyouDuduTS (CDict), DuduTS (EDict), DuduTS (CDict).]

Pinyin

[Figure: the percentage-of-cracked-passwords bar chart from the Result slide, repeated.]

Dates

[Figure: the percentage-of-cracked-passwords bar chart from the Result slide, repeated.]

Pinyin+Dates

[Figure: the percentage-of-cracked-passwords bar chart from the Result slide, repeated.]

Guessing CSDN -- Improvement

We increase the guessing efficiency by 34%.

Conclusion

• With similar strength, Chinese passwords contain more DIGIT-ONLY (>50%) passwords.

• Just like English words in English passwords, Chinese PINYINs (10%-15%) appear in Chinese passwords.

• Using Pinyin/Dates, we can improve the EFFICIENCY (34%) of guessing Chinese passwords.


Q&A