A Large-Scale Empirical Analysis of Chinese Web Passwords
Zhigong Li, Weili Han
Software School, Fudan University
Wenyuan Xu
Department of Electronic Engineering, Zhejiang University
Introduction
Two Problems
1. Do Chinese users choose better passwords?
2. How can we guess them?
Why Chinese Passwords?
There are over 600 million netizens in China.
http://www.cnnic.net.cn/hlwfzyj/hlwxzbg/hlwtjbg/201407/t20140721_47437.htm
Password Leakage
Over 100 million plaintext passwords
English / Chinese
Methodologies -- Two-stage analysis
Analysis: Characteristics, Patterns
Guessing
Characteristics
What Are the Most Popular Passwords?
Rank   Chinese              English
1      123456 (2.17%)       123456 (0.88%)
2      123456789 (0.65%)    12345 (0.24%)
3      111111 (0.59%)       123456789 (0.23%)
4      12345678 (0.39%)     password (0.18%)
5      000000 (0.34%)       iloveyou (0.15%)
Password Constitutions Vary
[Figure: occurrence percentage (log scale, 1E-6 to 0.1) of each character in Chinese and English passwords; characters on the x-axis, grouped as Digits, Letters (mostly), and Symbols.]
PWD Strength Metrics -- alpha-work-factor
Alpha: probability of success
Alpha-work-factor: minimum number of guesses needed to succeed with probability at least alpha
[1] J. O. Pliam. On the incomparability of entropy and marginal guesswork in brute-force attacks. In Cryptology-INDOCRYPT 2000.
[2] J. Bonneau. The science of guessing: analyzing an anonymized corpus of 70 million passwords. In IEEE S&P 2012.
Example Dataset
passwords    40%
123456       30%
iloveyou     20%
123456789    10%
alpha = 0.4: alpha-work-factor = 1
alpha = 0.9: alpha-work-factor = 3
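The example dataset can be checked with a few lines of Python. This is a minimal sketch, assuming the alpha-work-factor is the smallest number of guesses, taken in decreasing order of popularity, whose cumulative probability reaches alpha. The function name and the small floating-point tolerance are illustrative choices, not taken from the slides.

```python
def alpha_work_factor(probabilities, alpha, tolerance=1e-12):
    """Smallest number of guesses whose cumulative probability reaches alpha.

    Guesses are made in decreasing order of password popularity.
    """
    covered = 0.0
    ranked = sorted(probabilities, reverse=True)
    for guesses, probability in enumerate(ranked, start=1):
        covered += probability
        # The tolerance absorbs floating-point error in the running sum.
        if covered >= alpha - tolerance:
            return guesses
    raise ValueError("alpha exceeds the total probability mass")


# Distribution copied from the example dataset on the slide.
example = {"passwords": 0.40, "123456": 0.30, "iloveyou": 0.20, "123456789": 0.10}

print(alpha_work_factor(example.values(), 0.4))  # 1
print(alpha_work_factor(example.values(), 0.9))  # 3
```

With alpha = 0.4 the single most popular password is enough; with alpha = 0.9 the attacker needs the top three, which together cover 90% of accounts.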
PWD Strength Result -- Similar Strength
[Figure: α-work-factor (bit) versus success rate α (0 to 0.5) for CSDN, Tianya, Duduniu, 7k7k, 178.com, Rockyou, and Yahoo.]
Patterns
PWD Patterns of Chinese -- Pinyins
你 好
nǐ hǎo
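The slides do not show how pinyin strings were recognized. One way to test whether a letter-only string reads as toneless pinyin is to try to split it completely into syllables. The sketch below assumes a tiny, hypothetical syllable set chosen only for illustration (a real table of Mandarin syllables would be far larger), and the function name is likewise an assumption.

```python
from functools import lru_cache

# Hypothetical, deliberately tiny syllable set for illustration only.
SYLLABLES = frozenset({"ni", "hao", "wo", "ai", "li", "wang", "zhang", "tian", "ya"})
MAX_SYLLABLE_LENGTH = max(len(s) for s in SYLLABLES)


def pinyin_segmentation(text):
    """Return one split of text into known syllables, or None if there is none."""
    text = text.lower()

    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return ()
        longest_end = min(len(text), start + MAX_SYLLABLE_LENGTH)
        for end in range(start + 1, longest_end + 1):
            if text[start:end] in SYLLABLES:
                rest = split_from(end)
                if rest is not None:
                    return (text[start:end],) + rest
        return None

    return split_from(0)


print(pinyin_segmentation("nihao"))     # ('ni', 'hao')
print(pinyin_segmentation("woaini"))    # ('wo', 'ai', 'ni')
print(pinyin_segmentation("password"))  # None
```

Memoizing on the start index keeps the search linear in the string length times the longest syllable, even when many prefixes match.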
Pinyins/Words -- Letter-Only Passwords
Dataset    Chinese Pinyins    English Words
CSDN       41.61% (5.15%)     15.59% (1.93%)
Duduniu    40.63% (4.15%)     10.39% (1.06%)
Tianya     33.28% (3.91%)     15.35% (1.80%)
7k7k       44.70% (4.97%)     10.04% (1.02%)
178.com    57.31% (5.25%)     2.20% (0.20%)
Rockyou    6.94% (2.99%)      25.47% (10.98%)
Yahoo      4.31% (1.46%)      34.92% (11.86%)
Pinyins/Words -- Mixed Passwords
Dataset    Chinese Pinyins    English Words
CSDN       25.49% (10.68%)    7.97% (3.34%)
Duduniu    23.59% (5.78%)     6.05% (1.48%)
Tianya     25.17% (13.87%)    6.48% (3.57%)
7k7k       21.09% (5.84%)     7.02% (1.94%)
178.com    23.49% (9.97%)     4.58% (1.94%)
Rockyou    6.88% (2.61%)      28.11% (10.65%)
Yahoo      4.53% (2.59%)      27.99% (16.01%)
Special Passwords -- Love
Rank   Top Chinese Pinyins   Top English Words
1      woaini (1.47%)        password (1.28%)
2      li (1.06%)            iloveyou (0.98%)
3      wang (0.97%)          love (0.76%)
4      tianya (0.89%)        angel (0.59%)
5      zhang (0.84%)         monkey (0.45%)
Dates -- Extraction
a123456b       √
a1234567b
a12345678b     √
a123456789b
a1234b56789
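The extraction examples suggest that only runs of exactly six or eight consecutive digits are treated as date candidates. The following is a minimal sketch under that assumption. The function names are illustrative, and the use of `datetime.strptime` to test calendar validity is a choice made here, not something the slides specify. A run such as 010203 is valid under several formats, and the sketch reports all of them instead of guessing how ambiguity was resolved.

```python
import re
from datetime import datetime

# Maximal digit runs only: the lookarounds prevent matching inside a longer run.
DIGIT_RUN = re.compile(r"(?<!\d)\d+(?!\d)")

# Candidate layouts per run length, expressed as strptime patterns.
FORMATS = {
    8: {"YYYYMMDD": "%Y%m%d", "MMDDYYYY": "%m%d%Y", "DDMMYYYY": "%d%m%Y"},
    6: {"YYMMDD": "%y%m%d", "MMDDYY": "%m%d%y", "DDMMYY": "%d%m%y"},
}


def digit_runs(password):
    """Return the maximal digit runs that are exactly six or eight digits long."""
    return [m.group() for m in DIGIT_RUN.finditer(password) if len(m.group()) in FORMATS]


def date_formats(run):
    """Return the names of all layouts under which the run is a valid calendar date."""
    matches = []
    for name, pattern in FORMATS.get(len(run), {}).items():
        try:
            datetime.strptime(run, pattern)
        except ValueError:
            continue  # not a valid date under this layout
        matches.append(name)
    return matches


for sample in ["a123456b", "a1234567b", "a12345678b", "a123456789b", "a1234b56789"]:
    print(sample, digit_runs(sample))

print(date_formats("19901231"))  # ['YYYYMMDD']
```

Only the first and third samples yield a run, which matches the two slide examples of six and eight digits. Whether a run is then counted as a date depends on the validity check in `date_formats`.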
Dates -- Eight digits
Dataset    YYYYMMDD   MMDDYYYY   DDMMYYYY
CSDN       29.24%     0.25%      0.43%
Duduniu    36.26%     0.35%      0.60%
Tianya     28.87%     0.28%      0.84%
7k7k       32.41%     0.18%      0.37%
178.com    30.46%     0.13%      0.19%
Rockyou    2.64%      7.70%      17.66%
Yahoo      2.78%      12.00%     11.17%
Dates -- Six digits
Dataset    YYMMDD    MMDDYY    DDMMYY
CSDN       27.21%    4.04%     1.24%
Duduniu    23.93%    3.05%     1.19%
Tianya     17.84%    2.97%     1.78%
7k7k       24.34%    2.63%     0.88%
178.com    13.96%    1.72%     1.30%
Rockyou    5.63%     21.90%    18.42%
Yahoo      4.66%     25.99%    7.77%
Dates at the End
Dataset    Beginning   Middle   End
CSDN       21.68%      4.32%    74.00%
Duduniu    27.33%      4.75%    67.07%
Tianya     24.76%      1.36%    73.88%
7k7k       32.17%      2.70%    65.13%
178.com    22.30%      1.03%    76.67%
Rockyou    27.40%      3.91%    68.69%
Yahoo      22.66%      5.00%    72.34%
Guessing
Guessing Based on Probabilistic Context-Free Grammar
Describe the structures of passwords using a set of rules.
Dictionary guessing
[1] Matt Weir et al. Password Cracking Using Probabilistic Context-Free Grammars. In IEEE Symposium on Security and Privacy, 2009.
Example
Start
  L5D3 (70%)
    L5123 (60%)
    L5321 (40%)
  S2L3D3 (30%)
    ##L3D3 (50%)
      ##L3123 (60%)
      ##L3321 (40%)
    **L3D3 (50%)
      $$L3123 (60%)
      $$L3321 (40%)
Ln = n letters, Dn = n digits, Sn = n symbols
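The Ln/Dn/Sn notation can be derived mechanically from a password. The sketch below shows one way to compute such base structures and estimate their probabilities from a training list. The function names and the four training passwords are hypothetical, and this covers only the structure-learning step, not the full guess generator of Weir et al.

```python
from collections import Counter
from itertools import groupby


def char_class(ch):
    """Map a character to L (letter), D (digit), or S (symbol)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def base_structure(password):
    """Collapse runs of same-class characters, e.g. 'hello123' -> 'L5D3'."""
    return "".join(
        f"{cls}{len(list(run))}" for cls, run in groupby(password, key=char_class)
    )


def structure_probabilities(training_passwords):
    """Estimate each base structure's probability as its relative frequency."""
    counts = Counter(base_structure(p) for p in training_passwords)
    total = sum(counts.values())
    return {structure: count / total for structure, count in counts.items()}


# Hypothetical training passwords, used only to exercise the functions.
training = ["hello123", "apple321", "tiger123", "##abc123"]
print(base_structure("##abc123"))         # S2L3D3
print(structure_probabilities(training))  # {'L5D3': 0.75, 'S2L3D3': 0.25}
```

In a full PCFG guesser, the probability of a guess is the product of the structure probability and the probabilities of the strings filled into each Dn and Sn slot, with Ln slots filled from a dictionary.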
Guessing CSDN
Training Sets
RockyouTS
MRockyouTS
DuduTS
RockyouDuduTS
Guessing CSDN -- Dictionaries
20,000 Most Popular Pinyins
20,000 Most Popular Six-digit Dates
20,000 Most Popular Eight-digit Dates
EDict: Dic-0294 and English-lower CDict
Dic-0294 can be downloaded from http://www.outpost9.com/files/WordLists.html
English-lower can be downloaded from http://download.openwall.net/pub/passwords/wordlists/
Guessing CSDN -- Result
[Figure: bar chart of the percentage of cracked passwords (axis 5.00% to 20.00%). Bars: RockyouTS and MRockyouTS, each with EDict, CDict, EDict+Date, and CDict+Date; RockyouDuduTS and DuduTS, each with EDict and CDict. Successive slides highlight the effect of Pinyin, Dates, and Pinyin+Dates.]
Guessing CSDN -- Improvement
We increase the guessing efficiency by 34%
Conclusion
• With similar strength, Chinese passwords contain more DIGIT-ONLY (>50%) passwords.
• Just like English words in English passwords, Chinese PINYINs (10%-15%) appear in Chinese passwords.
• Using Pinyin/Dates, we can improve the EFFICIENCY (34%) of guessing Chinese passwords.
Q&A