optimizing password composition policies jeremiah blocki saranga komanduri ariel procaccia or...

Optimizing Password Composition Policies

Jeremiah BlockiSaranga Komanduri

Ariel ProcacciaOr Sheffet

To appear at EC 2013

3

Password Composition Policy

password

Password Composition Policy

4

How Do Users Respond?

Password1

6

Predictable Responses

1. password2. 1234563. 123456784. abc1235. qwerty6. monkey7. letmein8. dragon9. 111111….25. password1

7

Previous Work

• Initial password composition policies designed without empirical data [BDP, 2006].

• User’s respond to password composition policies in predictable ways [KSKMBCCE, 2011]

• Trivial password choices vary widely across contexts [BX, 2012].

• No theoretical models of password composition policies.

8

Our Contributions

We initiate an algorithmic study of password composition policies.

Theoretical Model

Security Goal

Policy Structure

User Model

9

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments

10

Rankings ModelUser 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

Each User: Passwords P ordered by preference.n = 7 (number of users).

11

Rankings Model: Example 1User 1 User 2 User 3 User 4 User 5 User 6 User 7





… … … … … … …



Allowed Passwords All Passwords

𝑨=𝑷− { 𝑝𝑎𝑠𝑠𝑤𝑜𝑟𝑑 ′ }

12

Rankings Model: Example 1

Pr[111111 | A] = 3/7Pr[letmein | A] = 2/7Pr[123456 | A]=Pr[12345 | A]=1/7

User 1 User 2 User 3 User 4 User 5 User 6 User 7





… … … … … … …



13

Rankings Model: Example 2User 1 User 2 User 3 User 4 User 5 User 6 User 7





… … … … … … …



)}(|{ wNoNumberswPA

Allowed Passwords All Passwords

14

Warm-up

Fact: Let A’ A then for any w A’ Pr[w|A] ≤ Pr[w|A’]






… … … … … … …



Initially one person uses letmein as their password.

letmein

15

Warm-up

Fact: Let A’ A then for any w A’ Pr[w|A] ≤ Pr[w|A’]






… … … … … … …



Every user who used letmein before is still using the same password.

16

Outline

• User Model• Policy Structure– Positive Rules – Negative Rules – Singleton Rules

• Goal• Algorithms and Reductions• Experiments

17

Positive Rules

Rules R1,…,Rm P

R1 = {w | Length(w) 14}.

Active Rules: S {1,…,m}.

.

18

Positive Rules - Example

Rules R1,…,Rm P

R1 = {w | Length(w) 14}.


A{1}= {w | Length(w) 14}.

19

Negative Rules

Rules R1,…,Rm P

R1 = {w | Length(w) < 8}.


20

Negative Rules - Example

Rules R1,…,Rm P

R1 = {w | Length(w) < 8}.


A{1}= P - {w | Length(w) < 8}.

21

Singleton Rules

Rule Rw= {w} for each w P.

Can allow/ban any individual password.

Special Case of Positive Rules/Negative Rules.

22

Outline


23

Online Attack

password

Guess Limit: k-strikes policy

12345

12345

p(k, A) – probability of a successful untargeted attack given A.

25

p(k,A) - Example

p(1,A) = Pr[111111] = 3/7p(2,A) = p(1,A) + Pr[letmein] = 5/7p(3,A) = p(2,A) + Pr[123456]= 6/7






… … … … … … …



26

Goal: Optimize p(k,A)

Goal: Find a password composition policy S {1,…,m} which minimizes p(k,AS) for some k.

p(k, A) – Fraction of accounts an adversary can crack with k guesses per account given policy A.

p(1, A): minimum entropy of the password distribution resulting from policy A.

28

Outline


29

ResultsRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

This Talk: k=1

n1/3-approx is NP-Hard

Parameters: n, m, |P|

30

Negative Rules are Hard!

Theorem: Unless P = NP no polynomial time algorithm can even approximate p(1,AS) to a factor of n1/3- in the negative rules setting.

31

Reduction

Maximum Independent Set: g vertices e edges

Theorem [Hastad 1996]: NP-Hard to distinguish the following two cases (1) any independent set has size at most K = g or (2) the maximum independent set has size g1-.

32

Reduction (Preference Lists)Preference Lists: Type 1

W1 … W1

W2 … W2

… … …WK … WK

B1 … Bg

… … …

Observation: Unless we ban W1,…,WK we have p(1,AS) ≥ g/n

33

Reduction (Preference Lists)

Preference Lists: Type 2 (for each edge e = {u,v})(u,v,1) … (u,v,g)(v,u,1) … (v,u,g)

X … X… … …

Observation: If for any edge e = {u,v} we ban (u,v,1),…,(u,v,g) and (v,u,1),…,(v,u,g) then p(1,AS) ≥ g/n.

34

Reduction (Preference Lists)

Preference Lists: Type 3 (for each vertex v, i j [K])(v,i,j,1) … (v,i,j,g)(v,j,i,1) … (v,j,i,g)

X … X… … …

Observation: If we ban (v,i,j,1),…,(v,i,j,g) and (v,j,i,1),…,(v,j,i,g) then p(1,AS) ≥ g/n.

35

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4Preference Lists: Type 1

W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

36

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {u,x})

(u,x,1) … (u,x,g)

(x,u,1) … (x,u,g)

X … X

… … …

s

t

p(1,AS) ≥ g/n

37

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {u,s})

(u,s,1) … (u,s,g)

(s,u,1) … (s,u,g)

X … X

… … …

s

t

38

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {s,t})

(s,t,1) … (s,t,g)

(t,s,1) … (t,s,g)

X … X

… … …

s

t

39

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4


W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

p(1,AS) ≥ g/n

40

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4


W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

Rv,5

41

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=5

s

t

Rv,5

Preference Lists: Type 3 (for each vertex u, i j [K])

(v,2,5,1) … (v,2,5,g)

(v,5,2,1) … (v,5,2,g)

X … X

… … …

p(1,AS) ≥ g/n

42

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=4

s

t

Preference Lists: Type 3 (w, i=4, j=2)

(w,4,2,1) … (w,4,2,g)

(w,2,4,1) … (w,2,4,g)

X … X

… … …

44

ReductionIndependent Set of Size K? maxS [m] p(1,AS)Yes 1/n

No g/n where n = O(g3)

45


Constant k Large k




This Talk: k=1

P

Parameters: n, m, |P|

46

Key Difference: Positive vs. Negative

Let S w = {i | w Ri} (all rules Ri that contain w).

Negative Rules: Ban w - activate any rule in Sw.

Positive Rules: Ban w - deactivate all rules in Sw.

47

Positive Rules

Fact: Let S* {1,…m} denote the optimal solution, and let S S* then either

(1) p(1,AS) = p(1,AS*), or (S is optimal)

(2) S-Sw S*, where Pr[w|AS] = p(1,AS).

All rules Ri that contain the most popular word in AS.

48

Positive Rules

Fact: Let S* {1,…m} denote the optimal solution, and let S S* then either

(1) p(1,AS) = p(1,AS*), or (S is optimal)

(2) S-Sw S*, where Pr[w|AS] = p(1,AS).

Proof: Suppose for contradiction that w AS*, and observe that .

Therefore, . Contradiction!S*S AA

Si

iSi

i RR*

S*S*S AA|wA |PrPr,1 wp

49

Positive Rules Algorithm

Iterative Elimination: Initialize: S0 = {1,…,m}

Repeat: (Ban w - current most popular password)

Si+1 = Si – Sw

Claim: One of the Si’s must be the optimal solution!

50


Constant k Large k




This Talk: k=1

Question: What if we don’t have access to the full preference lists of each user? What if we don’t want to run in time n?

Parameters: n, m

51


Constant k Large k




This Talk: k=1

Sampling Algorithm: ε-approximation with probability 1-δ

Parameters: m, 1/ε, 1/δ

52

Sampling Algorithm

Theorem: There is an efficient algorithm that makes O(m log (m/𝛿)/𝜀2) queries and with probability at least 𝛿 outputs positive rules S ⊆ [m] s.t

p(1,AS) ≤ p(1,AS*)+𝜀.

Sample: q(A) returns w with probability P[w|A].

Idea: Run iterative elimination. In each round use sampling to estimate the probability of the most popular word.

53

Sampling Lemma

Lemma: Let s=100 log (m/)/2 denote the number of samples in each round, and let BADi denote the event that in iteration i, there exists a password w s.t.

(e.g., our probability estimate off by /2). ThenPr[i.BADi]≤

2

|Pr

s

sw w

iSA

# times w sampled

54

Sampling Lemma

Partition P into buckets.

2

Pr

iS

Aw 4

Pr2

iSAw

… …

12

Pr2

iSi iAw

B0 B1 Biw

Contains at mot 2i+1/ such passwords.

55

Sampling Lemma


s=100 log (m/)/2

… …B0 B1 Bi

2122PrPr

i

wS

ms

sAw

i

Chernoff Bounds:

Contains at most 2i+1/ such passwords.

w

56

Sampling Lemma


s=100 log (m/)/2

… …B0 B1 Bi

Contains at most 2i+1/ passwords.

Union Bound: 1

1

21 2

2

22Pr.Pr

i

i

i

wSi mms

sAwBw

i

57

Sampling Lemma


s=100 log (m/)/2

… …B0 B1 Bi

mm

BADi

ii

012

PrUnion Bound (buckets):

58

Sampling Lemma


s=100 log (m/)/2

… …B0 B1 Bi

mmBADi i.PrUnion Bound (rounds):

64

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments– RockYou Dataset– Rules– Results

65

RockYou Dataset

• RockYou password leak: 32 million plaintext passwords.

• No Preference Lists: Insufficient for our sampling algorithm.

• We test our algorithm under an additional assumption…

66

0.51

Normalized Probabilities

RockYou: initial distribution over P.

0.5

letmein (0.1)

PA

1letmein (0.2)

A

67

Normalized ProbabilitiesRankings Model

Constant k Large k




Normalized Probabilities Model

Constant k Large k

Singleton Rules P P


Negative Rules NP-Hard NP-Hard

72

Base Line Results

73

Results

74

Discussion

• Optimal solution was better under negative rules.

• However, sampled solutions were much better with positive rules.

• Interesting Directions:– Additional Rules?– Is the Normalized Probabilities Model reasonable?– General experiment in preference list model?

75

Open Questions

• Efficient approximation algorithm in negative rules setting with normalized probabilities assumption?

• Adversary with limited background knowledge about the user (e.g., age, gender, birthday).

optimizing password composition policies jeremiah blocki saranga komanduri ariel procaccia or...

Documents

rankings model user

prwa prwa user

t qwerty1 qwerty p

passwords p

trivial password choices

predictable passwords

passwordsall passwords

predictable responses