Lecture 5: Classical cryptography

Thomas Johansson (Lund University)

Post on 29-Dec-2019



Classical cryptography

Throughout history, secret writing became an established problem, and the area was named cryptology.

Until very recently, cryptology was primarily concerned with military and diplomatic applications. It distinguishes between two disciplines:

cryptography, which deals with the development of systems for secret writing,

cryptanalysis, which analyzes existing systems in order to break them.


Classical cryptography

The solutions to different cryptographic problems are referred to as cryptographic primitives.

The primitive studied here is the “symmetric encryption scheme”.


A Model of a Cryptosystem

Let P be a finite set which is called the alphabet.

We often use the English letters as our alphabet and number them as a = 0, b = 1, . . . , z = 25. This gives $P = \mathbb{Z}_{26}$ and $|P| = 26$.

Cryptographic convention: use the names Alice, Bob, Caesar, and Eve (Eve is considered to be the enemy).


A Model of a Cryptosystem

[Figure: The Shannon model for symmetric encryption. Labels: Alice, encryption, Eve, key source, secure channel, decryption, Bob; plaintext m, ciphertext c, key K.]


A Model of a Cryptosystem

The plaintext $m$ consists of the produced symbols in vector form, i.e.,

$$m = (m_1, m_2, \ldots, m_n).$$

The encryption function $E_k()$ calculates the ciphertext $c = E_k(m)$.

The key $k$ is taken from a set $K$ of possible keys. Each key describes a certain function $E_k : P^n \to C^{n'}$, where $C$ is the ciphertext alphabet.

Each function $E_k()$ must be invertible (injective).


A Model of a Cryptosystem

A decryption function $D_k()$ decrypts encrypted messages back to their original form:

$$D_k(E_k(m)) = m, \quad \forall m \in P^n. \qquad (1)$$

The set of transformations $\{E_k() \mid k \in K\}$ is known to the enemy, as well as the probability distributions for the selected key and the plaintext. This is Kerckhoffs's principle.


A Model of a Cryptosystem

Different kinds of attacks can be applied.

Ciphertext-only attacks. A ciphertext-only attack is an attack in which the enemy has access to a ciphertext $c$ and tries to recover either the plaintext $m$ or the secret key $k$.

Known plaintext, ...


The Caesar cipher

Caesar's original cipher shifted each letter 3 steps: the letter a became the letter d, b became e, etc.

A generalization is to shift not three but $k$ positions, where $k \in \{0, 1, \ldots, 25\}$ is the secret key. This cryptosystem is usually called the Caesar cipher.

$$E_k(m) = m + k \pmod{26}.$$

The decryption function is given by

$$D_k(c) = c - k \pmod{26}.$$


Cryptanalysis of the Caesar cipher

Exhaustive key search. Ciphertext: wklvlvdphvvdjhwrbrx.

 k  plaintext
 0  wklvlvdphvvdjhwrbrx
 1  vjkukucoguucigvqaqw
 2  uijtjtbnfttbhfupzpv
 3  thisisamessagetoyou
 4  sghrhrzldrrzfdsnxnt
 5  rfgqgqykcqqyecrmwms
 6  qefpfpxjbppxdbqlvlr
 7  pdeoeowiaoowcapkukq
 8  ocdndnvhznnvbzojtjp
 9  nbcmcmugymmuaynisio
10  mablbltfxlltzxmhrhn
11  lzakaksewkksywlgqgm
12  kyzjzjrdvjjrxvkfpfl
13  jxyiyiqcuiiqwujeoek
14  iwxhxhpbthhpvtidndj
15  hvwgwgoasggoushcmci
16  guvfvfnzrffntrgblbh
17  ftueuemyqeemsqfakag
18  estdtdlxpddlrpezjzf
19  drscsckwocckqodyiye
20  cqrbrbjvnbbjpncxhxd
21  bpqaqaiumaaiombwgwc
22  aopzpzhtlzzhnlavfvb
23  znoyoygskyygmkzueua
24  ymnxnxfrjxxfljytdtz
25  xlmwmweqiwwekixscsy
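The exhaustive search above is easy to reproduce. The following is a minimal Python sketch, assuming lowercase text without spaces; the function names `caesar_decrypt` and `exhaustive_search` are illustrative and not part of the lecture.

```python
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """Shift every letter k positions backwards: D_k(c) = c - k (mod 26)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in ciphertext)


def exhaustive_search(ciphertext: str) -> list[tuple[int, str]]:
    """Return the candidate plaintext for every key k = 0, ..., 25."""
    return [(k, caesar_decrypt(ciphertext, k)) for k in range(26)]


if __name__ == "__main__":
    # Prints the 26 candidates; a human (or a dictionary check) picks the readable one.
    for k, candidate in exhaustive_search("wklvlvdphvvdjhwrbrx"):
        print(f"{k:2d}  {candidate}")
```

With only 26 keys, the attacker simply reads off the one candidate that is meaningful text, here the row for k = 3.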


The simple substitution cipher

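$E_k()$ operates on individual characters, but now we consider an arbitrary permutation of the alphabet as the key.

The key space is the set of all permutations on $\mathbb{Z}_{26}$, written as

$$\{\pi_1, \pi_2, \ldots, \pi_{|K|}\}.$$

Encryption and decryption are given by

$$E_k(m) = \pi_k(m), \qquad (2)$$

$$D_k(c) = \pi_k^{-1}(c).$$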

(examples)
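As one possible illustration, the sketch below represents the key as a 26-letter string whose i-th letter is the image of the i-th alphabet letter. The helper names and the seeded key generator are assumptions made for this example; Python's `random` module is not suitable for generating real keys.

```python
import random
import string

ALPHABET = string.ascii_lowercase


def random_key(seed: int) -> str:
    """Return a permutation of the alphabet; position i holds pi(i).

    Illustration only: random.Random is not cryptographically secure.
    """
    letters = list(ALPHABET)
    random.Random(seed).shuffle(letters)
    return "".join(letters)


def substitute_encrypt(plaintext: str, key: str) -> str:
    """E_k(m) = pi_k(m), applied letter by letter."""
    return plaintext.translate(str.maketrans(ALPHABET, key))


def substitute_decrypt(ciphertext: str, key: str) -> str:
    """D_k(c) = pi_k^{-1}(c): swap the roles of the two alphabets."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))
```

Decryption needs no separate inverse table, because building the translation in the opposite direction applies the inverse permutation.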


Cryptanalysis of the simple substitution cipher

The number of different permutations on $\mathbb{Z}_{26}$ is $26! > 4 \cdot 10^{26}$. Too large for an exhaustive key search!

Instead, exploit the statistical nature of the plaintext source!


Cryptanalysis of the simple substitution cipher

r-grams: $(m_i, m_{i+1}, \ldots, m_{i+r-1})$

Single letters:

a 0.0804    j 0.0016    s 0.0654
b 0.0154    k 0.0067    t 0.0925
c 0.0306    l 0.0414    u 0.0271
d 0.0399    m 0.0253    v 0.0099
e 0.1251    n 0.0709    w 0.0192
f 0.0230    o 0.0760    x 0.0019
g 0.0196    p 0.0200    y 0.0173
h 0.0549    q 0.0011    z 0.0009
i 0.0726    r 0.0612


First approach: count the number of occurrences of the different letters in the ciphertext. Clearly, the most common letter in the ciphertext c is likely to correspond to the letter e in the plaintext.

Better: instead of 1-grams make use of 2-grams or 3-grams.

Our source is not memoryless. Large dependence between consecutive letters!

Example: If the source has generated the two letters wa, the next letter can be for example f, g, i, etc., but the otherwise so common letter e cannot occur.


The most common 2-grams and 3-grams and their corresponding probabilities.

2-grams:

th 0.0270    on 0.0154    ed 0.0111
he 0.0257    an 0.0152    te 0.0109
in 0.0194    en 0.0129    ti 0.0108
er 0.0180    at 0.0127    or 0.0108
re 0.0160    es 0.0115    st 0.0103

3-grams:

the 0.0215    for 0.0036    ere 0.0027
and 0.0060    tha 0.0032    con 0.0026
tio 0.0048    ter 0.0029    ted 0.0023
ati 0.0036    res 0.0027


Example

Assume that we have received the following ciphertext:

xsftbiwmqooxwdxssfmtxmaibcsfiisctfzsfmibcixiiczcawbxosbxamqiwqsaxjfqscixssaxnxiibxijcznlqlcmixnqemtfmiczmxiufinbqvwfixsoxwlxrfmtxmflvqzixmiafwuqjcznibclnwiczfqewvxifcmifmzqqlioqqmcibzccbxadfmxssnxoxrcmcawbclqxmcawqdisnuqesafigcibxiwbcoxwibcwfwiczqdibcgqnfmrxmwxwobqsqjcaibctfzsofibibcixiiczcawbxosobqoxwibcaxetbiczqdibclxfaobqbxacwuxvca

dzqlibcvfzxicwibcfmiczmdzqomca

Counting single letters, the most frequent ones are i (46 times), c (40 times), x (37 times), etc.

A frequency count on 3-grams is much more powerful. The most frequent 3-grams are ibc, occurring 11 times, and icz, occurring 7 times.
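Counts like these can be produced with a few lines of Python. This is a minimal sketch; the function name `ngram_counts` and the short sample string are assumptions for illustration (the sample is not the full ciphertext above).

```python
from collections import Counter


def ngram_counts(text: str, r: int) -> Counter:
    """Count all overlapping r-grams (m_i, ..., m_{i+r-1}) in text."""
    return Counter(text[i:i + r] for i in range(len(text) - r + 1))


if __name__ == "__main__":
    sample = "ibcixiiczcawbxosibc"  # hypothetical short excerpt for demonstration
    print(ngram_counts(sample, 1).most_common(3))
    print(ngram_counts(sample, 3).most_common(2))
```

Calling `ngram_counts(ciphertext, 1)` and `ngram_counts(ciphertext, 3)` on the full ciphertext gives the single-letter and 3-gram statistics used in the attack.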


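The most frequent ciphertext 3-gram, ibc, is likely to correspond to the most common plaintext 3-gram, the. This gives us $\pi(t) = i$, $\pi(h) = b$, and $\pi(e) = c$.

The second most frequent 3-gram, icz, then corresponds to a plaintext 3-gram of the form te*. The most common 3-gram of this form in the table is ter.

Conclusion: $\pi(r) = z$.

Start decrypting the ciphertext. The partial decryption below additionally takes $\pi(a) = x$, which is consistent with x being the third most frequent ciphertext letter:

a***ht*****a**a*****a**the**tt*e**r***thetattere**ha**ha***t****. . .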


The unknown last letter in the sequence *thetattere* is a d (tattered).This gives π(d) = a.

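This gives us some additional information, and if we continue to do a “partial” decryption we find later in the plaintext the sequence **a*theda**hter**the*. . . .

We guess that the word daughter is present, giving $\pi(u) = e$ and $\pi(g) = t$. The two unknown letters that follow, just before the, are then likely the word of, giving $\pi(o) = q$, $\pi(f) = d$.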
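The “partial” decryption used in this example can be sketched as follows. The function name `partial_decrypt` is an assumption; the dictionary holds the guesses in the inverse direction (ciphertext letter to plaintext letter), including x for a, which the starred text in the example implies.

```python
def partial_decrypt(ciphertext: str, known: dict[str, str]) -> str:
    """Replace each ciphertext letter by its known plaintext letter, else '*'.

    `known` maps ciphertext letters to plaintext letters, i.e. it is the
    inverse of the partially recovered permutation pi.
    """
    return "".join(known.get(ch, "*") for ch in ciphertext)


# Guesses so far: pi(t) = i, pi(h) = b, pi(e) = c, pi(r) = z, pi(a) = x.
known = {"i": "t", "b": "h", "c": "e", "z": "r", "x": "a"}

# First 15 letters of the example ciphertext.
print(partial_decrypt("xsftbiwmqooxwdx", known))  # a***ht*****a**a
```

Each new guess is added to `known` and the text is printed again, so that further words become recognizable.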


Polyalphabetic ciphers

Polyalphabetic ciphers operate differently on various portions of the plaintext.

Simplest case: a number of mono-alphabetic ciphers with different keys are used sequentially and then cyclically repeated.


The Vigenère cipher

Cyclically uses $t$ Caesar ciphers, where $t$ is called the period of the cipher.

The encryption function maps the plaintext $m = m_1, m_2, \ldots$ to the ciphertext $c = c_1, c_2, \ldots$ through

$$c = E_k(m_1, m_2, \ldots, m_t),\; E_k(m_{t+1}, m_{t+2}, \ldots, m_{2t}),\; \ldots,$$

where

$$E_k(m_1, m_2, \ldots, m_t) = (m_1 + k_1, m_2 + k_2, \ldots, m_t + k_t), \qquad (3)$$

with addition modulo 26, and the key $k$ consists of $t$ characters, $k = (k_1, k_2, \ldots, k_t)$.

The key $k$ is often chosen as a word (keyword).


Example

The plaintext youmustvisitmetonight is to be encrypted using a Vigenère cipher with period 4, where k = lucy.

y o u m u s t v i s i t m e t o ...
+ + + + + + + + + + + + + + + +
l u c y l u c y l u c y l u c y ...
j i w k f m v t t m k r x y v m ...

The resulting ciphertext is jiwkfmvttmkrxyvm. . ..
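The example can be checked with a short Python sketch. The function names are illustrative, and the code assumes lowercase letters a to z only.

```python
from itertools import cycle

A = ord("a")


def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Add the cyclically repeated keyword to the plaintext, letter by letter, mod 26."""
    return "".join(
        chr((ord(m) - A + ord(k) - A) % 26 + A)
        for m, k in zip(plaintext, cycle(keyword))
    )


def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    """Subtract the cyclically repeated keyword from the ciphertext, mod 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A)
        for c, k in zip(ciphertext, cycle(keyword))
    )


print(vigenere_encrypt("youmustvisitmeto", "lucy"))  # jiwkfmvttmkrxyvm
```

`itertools.cycle` repeats the keyword indefinitely, which mirrors the cyclic reuse of the t Caesar ciphers.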


Cryptanalysis of the Vigenère cipher

Cryptanalysis of a polyalphabetic substitution cipher is split into two different problems:

1. determine the period of the cipher,
2. reconstruct the t different substitution alphabets that have been used.


Kasiski’s method

Observation: repeated portions of plaintext encrypted with the same portion of the keyword result in identical ciphertext segments.

We therefore expect the number of characters between the beginnings of repeated ciphertext segments to be a multiple of the keyword length.

Compute the greatest common divisor of all such distances between identified repeated segments.


Example

Assume that we have the following ciphertext

vyckbygecgxukgftkmuzlvtjgcyibngcwhagtkntkaqughbfuvajkwdlrqrm

uzlvtjcebdfcgprtqlqirwxhuvsshjcebdfcgprtqlqwntkaqujtgyvyegxs

vvpiaspkftfwujyvxshrggequvajkwupqixedvadfwurtpbdcsjtmzgkfgxw

nflvkwrvyixvuvojxfevqxgljzqekgdccbtjgkwebuccmumzgncpdfgjqdyl

jvndeqccnwttgkgrthrimpvzyzrwtkorjadwanmgwp

The distance between the two occurrences of muzlvtj is 42. The other two distances are 96 and 24, respectively.

The gcd is 6.
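Kasiski's method can be sketched in Python as below. The function names and the default segment length of 3 are assumptions made for this illustration.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_distances(ciphertext: str, length: int = 3) -> list[int]:
    """Distances between consecutive occurrences of every repeated segment."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return [
        later - earlier
        for starts in positions.values()
        for earlier, later in zip(starts, starts[1:])
    ]


def kasiski_period(distances: list[int]) -> int:
    """Greatest common divisor of all distances (0 if there are none)."""
    return reduce(gcd, distances, 0)


print(kasiski_period([42, 96, 24]))  # 6, from the distances quoted in the example
```

Short segments can repeat by accident, and a single accidental distance can reduce the gcd to 1, so in practice one would restrict the search to longer segments or discard outliers before taking the gcd.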


Second part: Find the different Caesar ciphers

After $t$ has been determined, we know that $m_i, m_{i+t}, m_{i+2t}, \ldots$ all have been encrypted with the same substitution cipher.

Split the ciphertext characters into $t$ different multisets, each containing the letters encrypted by a mono-alphabetic substitution cipher.

The key of each mono-alphabetic substitution cipher can be determined through the statistics of 1-grams together with some trial and error.


Example cont'd

$S_1 = \{v, g, k, u, g, g, t, \ldots\}$,

$S_2 = \{y, e, g, z, c, c, k, \ldots\}$,

...

$S_6 = \{y, u, m, j, n, g, a, \ldots\}$.

Counting the different letters in $S_1$ we get

a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
1  0  5  3  1  0  10 1  0  2  1  0  0  1  1  1  4  2  0  1  9  4  0  0  1  0


Example cont'd

By cyclically shifting this frequency count and looking for the best match compared to the known distribution, we find it at $k_1 = 2$ (or $k_1 = $ c).

Doing the same procedure for all six multisets, we finally reach the key k = crypts, and the ciphertext can be decrypted as

the vigenere cipher using a relatively short period is insecure but by using . . ..
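One way to make “looking for the best match” concrete is to score every shift by the correlation between the shifted counts and the English letter distribution. This scoring rule is an assumption chosen for the sketch; the lecture does not specify which measure is used. The probabilities and the counts for $S_1$ are copied from the tables in this lecture.

```python
# English single-letter probabilities a..z, copied from the table in this lecture.
ENGLISH = [0.0804, 0.0154, 0.0306, 0.0399, 0.1251, 0.0230, 0.0196, 0.0549,
           0.0726, 0.0016, 0.0067, 0.0414, 0.0253, 0.0709, 0.0760, 0.0200,
           0.0011, 0.0612, 0.0654, 0.0925, 0.0271, 0.0099, 0.0192, 0.0019,
           0.0173, 0.0009]

# Letter counts a..z for the multiset S1, copied from the example.
S1_COUNTS = [1, 0, 5, 3, 1, 0, 10, 1, 0, 2, 1, 0, 0, 1, 1, 1, 4, 2, 0, 1,
             9, 4, 0, 0, 1, 0]


def shift_score(counts, k):
    """Correlation of the counts, shifted back by k, with the English distribution."""
    return sum(counts[(i + k) % 26] * ENGLISH[i] for i in range(26))


def best_shift(counts):
    """Return the key k in 0..25 whose shifted counts best match English."""
    return max(range(26), key=lambda k: shift_score(counts, k))


k1 = best_shift(S1_COUNTS)
print(k1, chr(ord("a") + k1))
```

For a correct shift the score divided by the number of letters should be close to the sum of squared English probabilities; for a wrong shift it is typically noticeably lower, which is what makes the maximum stand out.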


Index of coincidence

Let $p_i$ be the unknown probability of the character $i$ in the ciphertext, $i = a, b, \ldots, z$.

The measure of roughness, MR, measures the deviation of the distribution of ciphertext characters from a flat frequency distribution as follows:

$$\mathrm{MR} = \sum_{i=a}^{z} \left(p_i - \frac{1}{26}\right)^2 = \sum_{i=a}^{z} p_i^2 - \frac{1}{26}. \qquad (4)$$
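The second equality in (4) follows from expanding the square and using that the probabilities sum to one, $\sum_{i=a}^{z} p_i = 1$:

$$\sum_{i=a}^{z} \left(p_i - \frac{1}{26}\right)^2 = \sum_{i=a}^{z} p_i^2 - \frac{2}{26} \sum_{i=a}^{z} p_i + 26 \cdot \frac{1}{26^2} = \sum_{i=a}^{z} p_i^2 - \frac{2}{26} + \frac{1}{26} = \sum_{i=a}^{z} p_i^2 - \frac{1}{26}.$$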


Index of coincidence

The minimum value of MR is $\mathrm{MR}_{\min} = 0$, corresponding to a flat distribution.

The maximum value will occur when $\sum_{i=a}^{z} p_i^2$ is maximized, which in our case corresponds to a mono-alphabetic cipher:

$$\mathrm{MR}_{\max} = 0.0667 - 0.0385 = 0.0282. \qquad (5)$$

MR will vary with the period $t$, reaching $\mathrm{MR}_{\max}$ for $t = 1$ and $\mathrm{MR}_{\min}$ for $t = \infty$.


Index of coincidence

The MR cannot be computed directly, because the distribution (and $t$) is unknown, but it can be estimated.


IC

Let $f_i$ denote the number of appearances of letter $i$, $i = a, b, \ldots, z$, in a ciphertext of length $n$.

The IC is defined as the probability that two arbitrarily chosen characters from the given ciphertext are the same, i.e.,

$$\mathrm{IC} = \frac{\sum_{i=a}^{z} f_i (f_i - 1)}{n(n - 1)}. \qquad (6)$$

IC is an estimate of $\sum_{i=a}^{z} p_i^2$, and it will also provide an estimate of $\mathrm{MR} + 1/26$. We can express this as $E(\mathrm{IC}) = \mathrm{MR} + 1/26$, where $E(\mathrm{IC})$ is the expectation of (6).
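Equation (6) translates directly into code. The following is a minimal sketch; the function name `index_of_coincidence` is an assumption for this example.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """IC = sum_i f_i (f_i - 1) / (n (n - 1)), as in equation (6)."""
    n = len(text)
    if n < 2:
        raise ValueError("need at least two characters")
    freqs = Counter(text)
    return sum(f * (f - 1) for f in freqs.values()) / (n * (n - 1))
```

For the string aabb, for instance, the numerator is 2·1 + 2·1 = 4 and the denominator is 4·3 = 12, so the IC is 1/3.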


IC

t       1      2      3      4      5      7      ∞
E(IC)   0.066  0.052  0.047  0.044  0.042  0.041  0.038

Example: Calculating IC for the ciphertext in the previous example gives IC = 0.042, indicating a period of 5. As we know that the period is 6, ...


Ciphertext autocorrelation - a better way to find the period

With a given ciphertext $c = c_1, c_2, \ldots, c_n$, we count the number of occurrences of $c_i = c_{i+t^*}$ in the interval $i = 1, \ldots, n - t^*$, for different values of $t^*$. The lowest value of $t^*$ with a number of occurrences around $0.066n$ is with high probability the period $t$.

This can be improved if we divide the letters into $t^*$ multisets by selecting every $t^*$th letter and consider all possible pairs of letters within each multiset.
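The basic counting step can be sketched as follows. The function names are assumptions, and the table normalizes each count by n − t*, the number of compared positions, which is a small deviation from comparing the raw count against 0.066n.

```python
def coincidences(ciphertext: str, shift: int) -> int:
    """Number of positions i with c_i == c_{i+shift}."""
    return sum(a == b for a, b in zip(ciphertext, ciphertext[shift:]))


def autocorrelation_table(ciphertext: str, max_shift: int) -> dict[int, float]:
    """Fraction of coinciding positions for each candidate period 1..max_shift."""
    n = len(ciphertext)
    return {
        t: coincidences(ciphertext, t) / (n - t)
        for t in range(1, min(max_shift, n - 1) + 1)
    }
```

Candidate periods whose fraction is near 0.066 rather than near 0.038 stand out; the smallest such value is the natural guess for the period, since its multiples will also show a high fraction.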


Other important classical ciphers

The Vernam cipher (or the one-time pad):

A plaintext of fixed length $n$ is encrypted by a Vigenère cipher, where the period of the key is chosen to be $t = n$.

We prove that the Vernam cipher is totally secure (i.e., unbreakable) if the key is chosen uniformly at random among all possible values in $\mathbb{Z}_{26}^n$.


Other important classical ciphers

The simple transposition cipher: (or permutation cipher)

A transposition cipher with period $t$ involves grouping the plaintext into blocks of $t$ consecutive characters. The key is a permutation $\pi$ on the positions within the block ($t!$ possible keys).

The encryption function maps the plaintext $m = m_1, m_2, \ldots$ to the ciphertext $c = c_1, c_2, \ldots$ through

$$c = E_k(m_1, m_2, \ldots, m_t),\; E_k(m_{t+1}, m_{t+2}, \ldots, m_{2t}),\; \ldots,$$

where

$$E_k(m_1, m_2, \ldots, m_t) = m_{\pi(1)}, m_{\pi(2)}, \ldots, m_{\pi(t)}.$$


Other important classical ciphers

Decryption is done through the inverse permutation,

$$D_k(c_1, c_2, \ldots, c_t) = c_{\pi^{-1}(1)}, c_{\pi^{-1}(2)}, \ldots, c_{\pi^{-1}(t)}.$$

Let the key be $\pi = (25134)$, meaning that $\pi(1) = 2$, $\pi(2) = 5$, . . .. Then $E_k(m_1, m_2, \ldots, m_5) = (m_2, m_5, m_1, m_3, m_4)$.

Assume that m = findingthetreasurecanonlybedoneby .... Then $E_k(\text{findi}) = \text{iifnd}$, $E_k(\text{ngthe}) = \text{genth}$, etc., and the ciphertext is

c = iifndgenthrstearauecoynnl . . .

Decryption is done by the inverse permutation $\pi^{-1} = (31452)$, and $D_k(c_1, c_2, \ldots, c_5) = (c_3, c_1, c_4, c_5, c_2)$. Decrypting gives $D_k(\text{iifnd}) = \text{findi}$, etc.
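The worked example can be reproduced with the sketch below. The function names are illustrative; the permutation is written in the same one-line, 1-indexed notation as in the text, and the plaintext length is assumed to be a multiple of the period.

```python
def transpose_encrypt(plaintext: str, pi: tuple[int, ...]) -> str:
    """Per block: output position j receives plaintext position pi(j) (1-indexed)."""
    t = len(pi)
    if len(plaintext) % t != 0:
        raise ValueError("plaintext length must be a multiple of the period")
    return "".join(
        plaintext[start + p - 1]
        for start in range(0, len(plaintext), t)
        for p in pi
    )


def inverse_permutation(pi: tuple[int, ...]) -> tuple[int, ...]:
    """Return pi^{-1} in the same one-line, 1-indexed notation."""
    inv = [0] * len(pi)
    for j, p in enumerate(pi, start=1):
        inv[p - 1] = j
    return tuple(inv)


def transpose_decrypt(ciphertext: str, pi: tuple[int, ...]) -> str:
    """Decrypt by applying the same block map with the inverse permutation."""
    return transpose_encrypt(ciphertext, inverse_permutation(pi))


print(transpose_encrypt("findingthetreasurecanonly", (2, 5, 1, 3, 4)))
```

Decryption reuses the encryption routine, since applying the inverse permutation to each block undoes the original reordering.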


Other important classical ciphers - The Hill cipher

The Hill cipher acts on t-grams from $\mathbb{Z}_{26}$ through a key which is an invertible $t \times t$ matrix $A = [a_{ij}]$ over $\mathbb{Z}_{26}$.

The encryption function maps the plaintext $m = m_1, m_2, \ldots$ to the ciphertext $c = c_1, c_2, \ldots$ through $c = E_k(m_1, m_2, \ldots, m_t),\; E_k(m_{t+1}, m_{t+2}, \ldots, m_{2t}),\; \ldots$, where

$$E_k(m_1, m_2, \ldots, m_t) = (m_1, m_2, \ldots, m_t) A.$$

Decryption involves using $A^{-1}$.
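A minimal sketch for the case t = 2 is given below. The key matrix is a hypothetical example chosen only because its determinant, 9, is invertible modulo 26; the function names are likewise assumptions. The code requires Python 3.8 or later for the modular inverse via `pow`.

```python
A_OFFSET = ord("a")

# Hypothetical 2x2 key, for illustration only; det = 9 is invertible mod 26.
KEY = [[3, 3],
       [2, 5]]


def inverse_2x2_mod26(a):
    """Inverse of a 2x2 matrix over Z_26 via the adjugate formula."""
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % 26
    det_inv = pow(det, -1, 26)  # raises ValueError if the matrix is not invertible
    return [[(a[1][1] * det_inv) % 26, (-a[0][1] * det_inv) % 26],
            [(-a[1][0] * det_inv) % 26, (a[0][0] * det_inv) % 26]]


def hill_apply(text, a):
    """Multiply each 2-gram, as a row vector, by the matrix a modulo 26."""
    if len(text) % 2 != 0:
        raise ValueError("text length must be a multiple of t = 2")
    out = []
    for i in range(0, len(text), 2):
        m = [ord(ch) - A_OFFSET for ch in text[i:i + 2]]
        for col in range(2):
            out.append(chr((m[0] * a[0][col] + m[1] * a[1][col]) % 26 + A_OFFSET))
    return "".join(out)


ciphertext = hill_apply("help", KEY)                         # encrypt with A
recovered = hill_apply(ciphertext, inverse_2x2_mod26(KEY))   # decrypt with A^{-1}
print(ciphertext, recovered)
```

Encryption and decryption share one routine because both are a row vector times a matrix; only the matrix differs.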


Other important classical ciphers - Rotor-based machines

Rotor-based machines: Polyalphabetic substitution ciphers implemented by a class of rotor-based machines were the dominant cryptographic tool in World War II.

The most well-known rotor-based machine is the Enigma (used by the Germans in World War II).

Boris Hagelin also made several designs of rotor-based machines. One of them, called M-209, was extensively used by the US Army during the 1940s.
