CNS2010 lecture 3 :: cyphers
ELEC5616 computer and network security
Matt Barrie
cryptography
• Cryptography is the study of mathematical techniques related to the design of cyphers.
• It is one example of many mechanisms that make up security:
– Cryptography
– Signature / Pattern Matching
– Access Control
– Statistical Profiling
– Traffic Security
– Countermeasures
– Software Security
– Operating System Security
– Tamper Resistance
cryptography
• The fundamental application of cryptography is in facilitating secure communications over an insecure channel.
• How can Alice send a message to Bob over an insecure channel with Eve listening in?
• Eve is an active attacker and may tap, insert or modify messages in transit.
• How does one use cryptography to provide security properties such as:
– Authentication?
– Confidentiality?
– Integrity?
– Non-repudiation?
[Figure: Alice sends messages to Bob over an insecure channel while Eve listens in]
symmetric cyphers
• The traditional way of achieving this is through private key encryption.
• This is also known as symmetric encryption as the key used to encrypt and decrypt messages is the same.
• A symmetric cypher is one defined by the rule:
$D_k(E_k(m)) = m$
[Figure: message $m$ and secret key $k$ enter encryption $E$ to give $c = E_k(m)$; decryption $D$ with the same secret key $k$ returns the original message $m$]
communication with symmetric cyphers
• Alice and Bob share an encryption algorithm $E_k$, a decryption algorithm $D_k$, and a secret key $k$.
• Alice wants to send Bob a message $m$.
• The unencrypted message $m$ is known as either the plaintext or the cleartext.
• Alice encrypts $m$ by computing the cyphertext $c = E_k(m)$ and sends it to Bob.
• Bob decrypts $c$ by computing $D_k(c) = m$ to retrieve the original plaintext message $m$.
symmetric cryptosystem
• It is computationally hard to decrypt $c$ without the secret key $k$.
• The secret key $k$ is usually a large number (≥ 64 bits).
• The range of possible values of $k$ is called the key space $K$
– For 64-bit keys the key space would be $0 \ldots 2^{64} - 1$
• The range of possible messages is the message space $M$.
• A cryptosystem is a system consisting of an algorithm, plus all possible plaintexts, cyphertexts and keys.
[Figure: Alice sends $c = E_k(m)$ to Bob; both hold the secret key $k$; Eve does not know $k$ and cannot decrypt]
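To get a feel for the size of a 64-bit key space, the arithmetic can be sketched as below. The attacker speed `GUESSES_PER_SECOND` is an assumed figure chosen only for illustration; it does not come from the lecture.

```python
KEY_BITS = 64
keyspace = 2 ** KEY_BITS  # number of possible 64-bit keys

GUESSES_PER_SECOND = 10 ** 9        # assumed attacker speed (illustrative only)
SECONDS_PER_YEAR = 365 * 24 * 3600

# On average a brute-force search succeeds after trying half of the keys.
expected_years = (keyspace // 2) / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{keyspace} keys, about {expected_years:.0f} years on average")
```

Changing the assumed rate scales the estimate linearly, so a million such machines in parallel would cut the time by a factor of a million.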
types of symmetric cyphers
• Stream cyphers operate on a single bit (or byte) at a time.
• Block cyphers operate on blocks (groups of bits) of plaintext at a time.
[Figure: plaintext enters E (with key) to give cyphertext, which enters D (with key) to give the original plaintext]
cryptanalysis
• We always assume that attackers have
– Complete access to the communications channel
– Complete knowledge about the cryptosystem
• Secrecy must exist completely within the key
• There are five major attack models:
• Cyphertext-only attack (COA)
– Attacker only has access to the cyphertext
– Given: $c_1 = E_k(m_1), c_2 = E_k(m_2), \ldots, c_n = E_k(m_n)$
– Find any of: $m_1, m_2, \ldots, m_n$, $k$, or an algorithm to infer $m_{n+1}$ from $c_{n+1}$
cryptanalysis - attack models
• Known-plaintext attack (KPA)
– Attacker intercepts random plaintext / cyphertext pairs
– Given: $(m_1, c_1 = E_k(m_1)), \ldots, (m_n, c_n = E_k(m_n))$
– Find: either $k$ or an algorithm to infer $m_{n+1}$ from $c_{n+1}$
• Chosen-plaintext attack (CPA)
– Attacker chooses a message, $m_1$, and gets the cyphertext.
– Stronger than KPA (some cyphers resistant to KPA are not resistant to CPA)
– Given: $(m_1, c_1 = E_k(m_1)), \ldots, (m_n, c_n = E_k(m_n))$ where the attacker chooses the $m_i$
– Find: either $k$ or an algorithm to infer $m_{n+1}$ from $c_{n+1}$
cryptanalysis - attack models
• Chosen-cyphertext attack (CCA)
– Attacker specifies a cyphertext, $c_1$, and gets the plaintext.
– Given: $(c_1, m_1 = D_k(c_1)), \ldots, (c_n, m_n = D_k(c_n))$ where the attacker chooses the $c_i$
– Find: $k$
• Rubber-hose attack (RHA)
– The cryptanalyst breaks knuckles, blackmails, threatens or tortures someone until they cough up the key.
– Sometimes known as a purchase-key attack.
– Extremely powerful.
– Usually the easiest way to break a cryptosystem.
– Not permissible for Wargames.
attack examples
• Known-plaintext attack:
– An attacker knowing that source code is being encrypted
  • (the first bytes are most likely to be #include, copyright notices etc.)
– Famous break of the Japanese PURPLE cypher in WWII
  • A complex cypher used to protect high level communications
  • The Allies had already broken several of the Japanese diplomatic cyphers
  • PURPLE was used to protect communications, but the Japanese could only afford to build and deploy 12 cypher machines
  • They sent these to the twelve most important embassies
  • Some messages needed to be broadcast to all embassies
  • So some messages had to be sent using old cyphers the US had already broken!
• Chosen-plaintext attack:
– Feed intelligence to an ambassador with the goal that it is encrypted and sent home
classes of break
From most to least severe:
• Total break
– Attacker finds the secret key $k$ and hence can compute all $D_k(c)$
• Global deduction
– Attacker finds an alternate algorithm $A$ equivalent to $D_k(c)$ for all $c$, without finding $k$
• Local deduction (or instance deduction)
– Attacker finds the plaintext of one intercepted cyphertext
• Information deduction
– Attacker gains some information about the key or plaintext, e.g. a few bits or the meaning of a message.
attack metrics
• An algorithm is unconditionally secure, if no matter how much cyphertext an attacker has, there is not enough information to deduce the plaintext.
• Information security is a resource game; attacks are measured in terms of:
– Data requirements
  • how much data is necessary to succeed?
– Processing requirements (or work factor)
  • how much time is needed to perform the attack?
– Memory requirements
  • how much storage space is required?
– Computational cost
  • dollar-seconds
substitution cyphers
• Substitution cyphers are the oldest form of cypher
• The secret key consists of a table which maps letter substitutions between plaintext and cyphertext.
• Most famous is the Caesar cypher where each letter is shifted by 3 (modulo 26):
plaintext  abcdefghijklmnopqrstuvwxyz
cyphertext DEFGHIJKLMNOPQRSTUVWXYZABC
– “A” becomes “D”
– “B” becomes “E”
• Similar to this is ROT13 which shifts the plaintext 13 places, so encrypting twice results in the plaintext:
ROT13(ROT13(m)) = m
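A minimal sketch of a shift cypher covering both the Caesar cypher (shift 3) and ROT13 (shift 13). The function names are illustrative, and only lowercase a-z is shifted; everything else is passed through unchanged, which is an assumption of this sketch.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_encrypt(plaintext: str, shift: int) -> str:
    """Shift each lowercase letter forward by `shift` places, modulo 26."""
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave anything that is not a-z untouched
    return "".join(out)


def shift_decrypt(cyphertext: str, shift: int) -> str:
    """Undo shift_encrypt by shifting backwards by the same amount."""
    return shift_encrypt(cyphertext, -shift)


caesar = shift_encrypt("abc", 3)                               # Caesar: shift of 3
rot13_twice = shift_encrypt(shift_encrypt("attack", 13), 13)   # ROT13 applied twice
```

Because 13 + 13 = 26, applying ROT13 twice shifts every letter by a full alphabet, so the same function both encrypts and decrypts.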
substitution cyphers
• There are 26! (factorial) possible keys (~$4 \times 10^{26}$, large!)
• Monoalphabetic (single character) substitution cypher:
alphabet   abcdefghijklmnopqrstuvwxyz
key        XNYAHPOGZQWBTSFLRCVMUEKJDI
plaintext  THISCOURSEROCKSTHEBLOCK
cyphertext MGZVYFUCVHCFYWVMGHNBFYW
• Substitution cyphers are easy to break using frequency analysis of the letters (a cyphertext-only attack)
– single letters
– digraphs (groups of two letters)
– trigraphs (three letters)
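The substitution table can be applied with a translation table, and the single-letter counts that frequency analysis starts from are easy to collect. This is a sketch only; the helper names `substitute`, `unsubstitute` and `letter_counts` are illustrative.

```python
import math
import string
from collections import Counter

ALPHABET = string.ascii_uppercase
KEY = "XNYAHPOGZQWBTSFLRCVMUEKJDI"  # substitution table from the example


def substitute(plaintext: str, key: str = KEY) -> str:
    """Replace each letter A-Z by the key letter in the same position."""
    return plaintext.upper().translate(str.maketrans(ALPHABET, key))


def unsubstitute(cyphertext: str, key: str = KEY) -> str:
    """Invert the substitution by swapping the roles of alphabet and key."""
    return cyphertext.translate(str.maketrans(key, ALPHABET))


def letter_counts(text: str) -> Counter:
    """Count single letters, the starting point of frequency analysis."""
    return Counter(ch for ch in text if ch in ALPHABET)


cyphertext = substitute("THISCOURSEROCKSTHEBLOCK")
counts = letter_counts(cyphertext)
keyspace = math.factorial(26)  # number of possible substitution tables
```

The counts of the cyphertext letters are the counts of the plaintext letters under new names, which is exactly what frequency analysis exploits.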
substitution cyphers
• Substitution cyphers are easy to break using frequency analysis of the letters (a cyphertext-only attack)
Letter counts in an English dictionary (the matching cyphertext counts are left blank to be filled in):
Letter  Count   Letter  Count
a       18924   n       17432
b       3311    o       17520
c       7852    p       5150
d       9491    q       179
e       30625   r       15399
f       5176    s       16485
g       4480    t       20900
h       11092   u       5815
i       17080   v       2613
j       255     w       3899
k       1193    x       512
l       9990    y       3642
m       6152    z       188
[Figure: bar chart of letter counts (0 to 35000) for the letters a to z]
permutation cyphers
• Otherwise known as a transposition cypher
• The secret key is a random permutation, π
• Given a message $m = m_1 m_2 m_3 \ldots m_n$
• One can compute its encryption by $E_\pi(m) = m_{\pi(1)} m_{\pi(2)} \ldots m_{\pi(n)}$
• Suppose $\pi$ is
i      1 2 3 4 5 6
π(i)   4 3 1 5 2 6
• Then $E_\pi$(“crypto”) = “PYCTRO”
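The permutation rule can be sketched as follows, with π written as a 1-based list as in the example. The function names are illustrative, and the sketch assumes the message is exactly as long as the permutation.

```python
def permute_encrypt(message: str, pi: list[int]) -> str:
    """Output position i takes the input symbol at position pi[i] (1-based)."""
    if sorted(pi) != list(range(1, len(message) + 1)):
        raise ValueError("pi must be a permutation of 1..len(message)")
    return "".join(message[p - 1] for p in pi)


def permute_decrypt(cyphertext: str, pi: list[int]) -> str:
    """Put each cyphertext symbol back at the position it was taken from."""
    out = [""] * len(cyphertext)
    for i, p in enumerate(pi):
        out[p - 1] = cyphertext[i]
    return "".join(out)


PI = [4, 3, 1, 5, 2, 6]
c = permute_encrypt("crypto", PI)
```

A transposition only reorders symbols, so the letter frequencies of the cyphertext are identical to those of the plaintext.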
Vigenère cypher
• Originates in Rome in the sixteenth century
• A Vigenère cypher is a polyalphabetic substitution cypher (made up of multiple monoalphabetic substitution cyphers)
• The secret key is a word. Encryption is performed by adding the key to the plaintext modulo 26, repeating the key as often as needed:
plaintext  launchmissilesatlosangeles
key        cryptocryptocryptocryptocr
=====================================
cyphertext nrscvvozqhbzgjyiecurlvxzgj
• Note that punctuation and white space are removed (leaving them in would make the code easier to break)
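Adding the key modulo 26 can be sketched as below. The function name `vigenere` and its `decrypt` flag are illustrative, and the sketch assumes the text and key contain only lowercase a-z (punctuation and white space already removed).

```python
import string
from itertools import cycle

ALPHABET = string.ascii_lowercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract) the repeating key to the text, letter by letter, mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for ch, k in zip(text, cycle(key)):
        shift = sign * ALPHABET.index(k)  # a=0, b=1, ..., z=25
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


cyphertext = vigenere("launchmissilesatlosangeles", "crypto")
recovered = vigenere(cyphertext, "crypto", decrypt=True)
```

Each key letter selects one Caesar shift, so a key of length N amounts to N interleaved shift cyphers.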
breaking Vigenère cyphers
• The index of coincidence is a statistical measure of text, useful in distinguishing simple substitution cyphers from Vigenère cyphers.
• Intuitively it is a measure of the probability of collision of a symbol if a string is compared against a randomly shifted version of itself.
• The index of coincidence is measured by the following formula:
$$IC = \frac{\sum_i F_i (F_i - 1)}{N (N - 1)}$$
• $F_i$ = frequency (count) of symbol $i$ in the text, $N$ = length of the text
althoughthecipherisinscrutableandoftenunforgeabletoanyonewithoutthis
oftenunforgeabletoanyonewithoutthisalthoughthecipherisinscrutableand
4 of the 68 positions coincide: IC = 4/68 ≈ 6%
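The formula translates almost directly into code. This is a sketch with an illustrative function name; it ignores case and anything that is not a letter, which is an assumption of the sketch.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Sum of F(F-1) over all symbols, divided by N(N-1)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


ic_small = index_of_coincidence("aabb")  # (2*1 + 2*1) / (4*3)
```

The function only looks at how often each symbol occurs, not at which symbol it is, so renaming the letters consistently leaves the result unchanged.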
breaking Vigenère cyphers
• Using the standard frequencies with which individual letters appear in English text, the probability that a coincidence will occur is $\kappa_p = 0.0669$.
• If the text is random, and the different letters are chosen with equal probability, the probability of a coincidence is much smaller ($\kappa_r = 0.0385$).
• The most important property of the index of coincidence is that it is the same for text encrypted with a simple substitution cypher as for plain text (Why?).
index of coincidence by language
Language         Index of coincidence
Malay            0.085286
Dutch            0.079805
Japanese         0.077236
Hebrew           0.076844
German           0.076667
Spanish          0.076613
Arabic           0.075889
French           0.074604
Portuguese       0.074528
Finnish          0.073796
Italian          0.073294
Danish           0.070731
Norwegian        0.069428
Greek            0.069165
English          0.066895
Swedish          0.064489
Serbo-Croatian   0.064363
Russian          0.056074
Random           0.038461
example :: breaking Vigenère cyphers
Cyphertext:
TPCTY LVEOO GBVRC BTWXS IHDKD QIRVQ QUKWL TMNQO EKMLP AURKL VHIUX YJRNV QWJEK UEQVD IXPLU RKLVT QSLKI LWAZI JWXPL QRKIO PWFME XLLCP KDIKV EUXYX EAAQV MEKVN AVZRQ JGEMX LQUPM PCRLO IZPZZ FPONI AYPVQ RMVHC QZFLV IKGUK LRXER MDWVG RVMPQ PWLWT TIYEQ JAYMK XBPUK PZJBF WIRKS QJMSV FYKFP QLRXE OIPID IQQLI ICPFP LMVBR BUAMW KLLUM FLRXE CDQFV IKNWZ KUIXF BTIII CQZQM JQVUX UVZXL XMDAY IIOMP AZXEK VYIDC EGIDX NMQJQ ZQVMP FMESC EQGQD IDIJD MDXYI ACGES WSIFQ YIUMQ CBQSE EINBT CNSOM AUQLW BQVFL VALTS AJKLV JIZHJ MPVZQ XTLCQ ZFLDC ECVPW LRQQB TIVQV UWGPK LFTAF IKLXH BQVKL BGIEE KLFTA FCCEK FAQPR LEGID QVWMG MPMCC LNWDH DCPRQ DMKJX KTQXY LFFMZ SKXEA NMGVJ OQUYI CIPVQ NICMH GCZXF XEGUF LRXDQ LAAEM KVWFL VTFVK MYJIJ GBALV EOVPK PFZFP OWMEH KGAEM EXEGU AVEMK INAVZ RQJMQ HFMQT CEXTE RUMYI KSHPW IXYIT CGILV VBKVU WYSRN LIECO CQZUP ZJQWX YCJSR NCZXF XEGMP ICMSG ZYIFP LTLRV FQJKV QIEIJ KMEMW PBGCZ XFXEG MFSYM AGUQX VEZJU QXFHL VPKAZ PIHWD XYSRC ZFQPK LFBTC JTFTQ FMJKL QLXIR HJGQZ XFXEG TMRUS CWXDM XLQPM EWHYF ESQRD ILNWD HWSOV PKRRQ BUAMO VJLTB TCIMD JBQSL WKGAE WROBD ZURXQ VUWGP FYQQN FVFYY NMMRU SCVPK QVVZA KGXFJ COQZI VRBOQ QWRRA FMEXI SVCTX XYIJV PMXRJ CNQOX DCPQC XJFVF CUFLP WBTDM RK
Shift  Index
1      .028
2      .045
3      .034
4      .037
5      .042
6      .035
7      .070  <<< the key length is most likely 7 (closest to 6.6%)
8      .032
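A table of this kind can be produced by comparing the cyphertext against rotated copies of itself and counting matches. The sketch below uses illustrative function names and simply picks the shift with the highest match rate, which is a simplification of "closest to the expected value for the language"; the text is assumed to have spaces already removed.

```python
def coincidence_rate(text: str, shift: int) -> float:
    """Fraction of positions where text matches itself rotated by `shift`."""
    n = len(text)
    if n == 0:
        raise ValueError("text must not be empty")
    matches = sum(1 for i in range(n) if text[i] == text[(i + shift) % n])
    return matches / n


def shift_table(text: str, max_shift: int) -> dict[int, float]:
    """Coincidence rate for every shift from 1 to max_shift."""
    return {s: coincidence_rate(text, s) for s in range(1, max_shift + 1)}


def likely_key_length(text: str, max_shift: int) -> int:
    """Shift with the highest coincidence rate (a simple heuristic)."""
    table = shift_table(text, max_shift)
    return max(table, key=table.get)


toy = "abcabcabc"  # toy input with an obvious period of 3
toy_guess = likely_key_length(toy, 5)
```

On real cyphertext, multiples of the key length also score highly, so it is worth inspecting the whole table rather than trusting a single maximum.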
breaking Vigenère cyphers
• Once the key length N is known, we attack the N subtexts of the message independently by frequency analysis.
• Taking every N-th symbol gives a monoalphabetic substitution cypher (0, N, 2N, …; 1, N+1, 2N+1; … etc.).
• Note the index of coincidence varies by language, and can be domain-specific.
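Taking every N-th symbol is a one-line slice. The sketch below splits a cyphertext into its N subtexts and reports the most common letter of each as a starting point for frequency analysis; the function names are illustrative and each subtext is assumed to be non-empty.

```python
from collections import Counter


def split_subtexts(cyphertext: str, n: int) -> list[str]:
    """Subtext i holds symbols i, i+n, i+2n, ... of the cyphertext."""
    return [cyphertext[i::n] for i in range(n)]


def most_common_letters(subtexts: list[str]) -> list[str]:
    """Most frequent symbol of each subtext (assumes none is empty)."""
    return [Counter(s).most_common(1)[0][0] for s in subtexts]


parts = split_subtexts("abcdefg", 3)
```

Each subtext was encrypted with a single key letter, so it can be attacked like a plain shift cypher.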
rotor machines
• Rotor machines are mechanical devices that rotate varying sized disks in different ratios using gears.
• Popular during WWII e.g. German ENIGMA.
• Enigma used 3 or 4 replaceable rotors. On each rotor the alphabet was arranged in random order. The “key” is the choice and initial positions of the rotors.
rotor machines
• Suppose you wished to encrypt “B”. After pressing “B” the first rotor would select a permutation for “B”, say “C”. The second rotor would match “C” with its own permutation, say “G”, the third “G” to “H”, etc.
• After encrypting “B” as “C”, the first rotor would be advanced a notch, so that if “B” were typed again a different encryption would result (here “G”). Rotor 2 advances similarly after a full rotation of rotor 1, and so on.
• $26^3 = 17576$ possible permutations for each letter (with three rotors).
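The stepping behaviour can be sketched with a simplified three-rotor model. This is not the real ENIGMA: the sketch omits the reflector and plugboard, uses randomly generated wirings from an assumed seed, and all names are illustrative.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def make_rotors(seed: int, count: int = 3) -> list[list[int]]:
    """Random rotor wirings (hypothetical, not historical ENIGMA wirings)."""
    rng = random.Random(seed)
    rotors = []
    for _ in range(count):
        wiring = list(range(26))
        rng.shuffle(wiring)
        rotors.append(wiring)
    return rotors


def step(positions: list[int]) -> None:
    """Advance rotor 1 a notch; carry into the next rotor after a full turn."""
    for i in range(len(positions)):
        positions[i] = (positions[i] + 1) % 26
        if positions[i] != 0:
            break


def encrypt(text: str, rotors: list[list[int]], start: list[int]) -> str:
    positions = list(start)
    out = []
    for ch in text:
        x = ALPHABET.index(ch)
        for wiring, pos in zip(rotors, positions):
            x = wiring[(x + pos) % 26]  # each rotor applies its own permutation
        out.append(ALPHABET[x])
        step(positions)                 # the permutation changes after every letter
    return "".join(out)


def decrypt(text: str, rotors: list[list[int]], start: list[int]) -> str:
    positions = list(start)
    inverses = [[w.index(i) for i in range(26)] for w in rotors]
    out = []
    for ch in text:
        x = ALPHABET.index(ch)
        for inv, pos in zip(reversed(inverses), reversed(positions)):
            x = (inv[x] - pos) % 26     # undo the rotors in reverse order
        out.append(ALPHABET[x])
        step(positions)
    return "".join(out)


rotors = make_rotors(seed=1)
secret = encrypt("BBBB", rotors, [0, 0, 0])
```

Both sides must start from the same rotor choice and initial positions, which together play the role of the key.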
xor
• XOR (addition modulo 2) is commonly used to provide security in software programs (although it is extremely weak!).
• The message $m$ is XORed bitwise with a secret key $k$:
$c = m \oplus k$
$m = c \oplus k$
• XOR with a repeating key is a Vigenère cypher and easy to break:
– Determine the key length $N$ from the index of coincidence
– Shift the cyphertext by $N$ and XOR it with itself
– This removes the key ($c \oplus c' = m \oplus k \oplus m' \oplus k = m \oplus m'$)
– The result is the message XORed with a shifted version of itself
– Language is extremely redundant (English ~1.3 bits / byte)
– It is then easy to decrypt
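The key-removal step can be checked with a few lines of code. This sketch uses an illustrative message and key and assumes the key length N is already known.

```python
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def xor_with_shift(data: bytes, shift: int) -> bytes:
    """XOR data with a copy of itself shifted by `shift` bytes."""
    return bytes(a ^ b for a, b in zip(data, data[shift:]))


message = b"attack at dawn, attack at dusk"
key = b"key"
cyphertext = xor_repeating(message, key)

# Bytes N apart were XORed with the same key byte, so the key cancels out.
key_free = xor_with_shift(cyphertext, len(key))
```

What remains depends only on the message, so the redundancy of the language can be used to recover it.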
one time pad
• A one time pad is where we use a different substitution cypher for each letter of the plaintext.
• Encryption is xor (for bits) or addition modulo-26.
• Provided the secret key is truly random, the plaintext does not repeat and the pad is never used again, a one time pad is perfectly secure.
• Failure in any of these requirements results in no security.
• Strength comes from the fact that a truly random key added to plaintext results in truly random cyphertext.
• No amount of computing power can break an OTP, since every possible plaintext is equally likely.
• Problems: key distribution, key destruction, synchronisation.
• Used for ultra-secure low bandwidth communications.
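A minimal one time pad over bytes, using XOR. The messages are illustrative and `secrets.token_bytes` stands in for a truly random pad source, which is an assumption of this sketch.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"launch at noon"
pad = secrets.token_bytes(len(message))  # fresh random pad, to be used once
cyphertext = xor_bytes(message, pad)
recovered = xor_bytes(cyphertext, pad)

# Any other message of the same length is explained by some other pad,
# so the cyphertext alone does not single out a plaintext.
decoy = b"stay home now!"
decoy_pad = xor_bytes(cyphertext, decoy)
```

The `decoy_pad` lines illustrate why brute force does not help: trying every pad yields every possible message of that length.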
perfect secrecy
• Goal of cryptography is that cyphertext tells absolutely nothing about the plaintext.
• A cypher has perfect secrecy if, for all $m \in M$ and $c \in C$, the plaintext and cyphertext are statistically independent:
$\Pr[m_1 = m \mid c_1 = c] = \Pr[m_1 = m]$
(here $m_1$ is the transmitted message and $c_1$ the observed cyphertext)
• Assuming each transmitted message is equally likely, the probability that the transmitted message is $m$ is
$\Pr[m_1 = m] = |M|^{-1}$
• Now the probability that the transmitted message is $m$ given that the observed cyphertext is $c$ is:
$\Pr[m_1 = m \mid c_1 = c] = \frac{|\{k : E_k(m) = c,\ k \in K\}|}{|K|}$
perfect secrecy of OTP
• The key space must be at least as large as the set of plaintexts
– i.e. $|K| \ge |M|$
• For $M = C = \{0,1\}^n$:
– Any cypher with perfect secrecy satisfies $|K| \ge 2^n$
• The one time pad has perfect secrecy since
$M = C = K = \{0,1\}^n$
Thus $\Pr[m_1 = m] = 1 / 2^n$
and $\Pr[m_1 = m \mid c_1 = c] = 1 / 2^n$ for every message $m$ and cyphertext $c$
• Note: we require $k \in K$ to be as long as the message
– The conundrum: we have to securely communicate a key as long as the message in advance.
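For a tiny one time pad the perfect secrecy claim can be checked by brute-force enumeration. The sketch below assumes 2-bit messages and keys, both uniformly distributed; `posterior` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

N_BITS = 2
SPACE = list(range(2 ** N_BITS))  # messages, keys and cyphertexts as integers


def posterior(c: int) -> dict[int, Fraction]:
    """Pr[message = m | cyphertext = c] for uniform messages and keys."""
    joint = {m: 0 for m in SPACE}
    for m, k in product(SPACE, SPACE):
        if m ^ k == c:            # this (message, key) pair produces c
            joint[m] += 1
    total = sum(joint.values())
    return {m: Fraction(count, total) for m, count in joint.items()}


all_posteriors = {c: posterior(c) for c in SPACE}
```

For every cyphertext, each message is produced by exactly one key, so seeing the cyphertext leaves the attacker with the same 1/4 guess as before.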
attacks on the OTP
• A two-time pad has perfect insecurity!
– If $c_1 = m_1 \oplus k$ and $c_2 = m_2 \oplus k$
– then $c_1 \oplus c_2 = m_1 \oplus m_2$
• The OTP is highly malleable.
– An attacker can easily create a new cyphertext with a known relationship to the plaintext without decryption.
– Stream cyphers suffer from the same problem.
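Pad reuse can be demonstrated directly. The two messages below are illustrative, and `secrets.token_bytes` again stands in for the random pad.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"meet at the usual place"
m2 = b"bring the documents now"
pad = secrets.token_bytes(len(m1))

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)         # mistake: the same pad is used a second time

leak = xor_bytes(c1, c2)        # the pad cancels, leaving m1 XOR m2
guess_m2 = xor_bytes(leak, m1)  # an attacker who knows m1 recovers m2
```

Even without knowing either message, the XOR of two natural-language texts is redundant enough to be attacked.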
example :: OTP voting scheme
• Suppose the plaintext is a one bit vote $v \in \{0,1\}$
– Where $v = 0$ is a vote for Labor, $v = 1$ is a vote for Liberal
• Alice encrypts her vote using an OTP and sends it to Bob
$c = v \oplus k$ ($k \in \{0,1\}$ randomly chosen)
• Eve intercepts the cyphertext and sends it on with the bit flipped
$c' = c \oplus 1$
• Bob receives $c'$ and decrypts
vote $= c' \oplus k$
$= c \oplus 1 \oplus k$
$= v \oplus k \oplus 1 \oplus k$
$= \lnot v$ (the vote is reversed!)
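The vote-flipping attack can be checked exhaustively, since there are only two votes and two keys. The function names in this sketch are illustrative.

```python
def otp_bit(bit: int, key: int) -> int:
    """One-bit one time pad: encryption and decryption are both XOR."""
    return bit ^ key


def eve_flip(cyphertext: int) -> int:
    """Eve flips the bit in transit without knowing the key."""
    return cyphertext ^ 1


results = {}
for vote in (0, 1):
    for key in (0, 1):
        c = otp_bit(vote, key)
        tampered = eve_flip(c)
        results[(vote, key)] = otp_bit(tampered, key)  # what Bob decrypts
```

In all four cases Bob decrypts the opposite vote, even though Eve never learns the vote or the key. Confidentiality holds, but integrity does not.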
references
• Handbook of Applied Cryptography
– §1.4 - §1.5
– §7.1 - §7.3
• Stallings (3rd Ed)
– §2