language modeling and encryption on packet switched networks

44
Language Modeling and Encryption on Packet Switched Networks Kevin McCurley

Upload: shen

Post on 16-Jan-2016

65 views

Category:

Documents


0 download

DESCRIPTION

Language Modeling and Encryption on Packet Switched Networks. Kevin McCurley. A.A. Markov of St. Petersburg 1856-1922. Андрей Андреевич Марков. The science of cryptology. Devise a mathematical model of communication Devise a mathematical model of an adversary Construct a cryptosystem - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Language Modeling and Encryption on Packet Switched Networks

Language Modeling andEncryption on

Packet Switched Networks

Kevin McCurley

Page 2: Language Modeling and Encryption on Packet Switched Networks

A.A. Markovof St. Petersburg

1856-1922

Андрей Андреевич Марков

Page 3: Language Modeling and Encryption on Packet Switched Networks

The science of cryptology

1. Devise a mathematical model of communication

2. Devise a mathematical model of an adversary

3. Construct a cryptosystem4. Prove that the theoretical adversary cannot

break the cryptosystem (perhaps under assumptions)

Page 4: Language Modeling and Encryption on Packet Switched Networks

The application of cryptology

• Choose a construction for a cryptosystem

• Adapt it to your model of communication

• Watch it get broken by adversaries who don’t fit the model.

The choice of a model isat least as important as a proof.

Page 5: Language Modeling and Encryption on Packet Switched Networks

Security models

The real

world

Complexity- theoreticsecurity

Information theoreticsecurity

Page 6: Language Modeling and Encryption on Packet Switched Networks

Attacks that have fallen outside the scope of security models

• Electromagnetic radiation measurements

• Timing analysis

• Power analysis

• Fault analysis

• Acoustic attacks

• Cache attacks

Page 7: Language Modeling and Encryption on Packet Switched Networks

Micali and Reyzin“Physical Observable Security”

TCC 2004

• Extension of security models to include physical instantiation of computation.

• Better description of adversary results in a more robust model.

Page 8: Language Modeling and Encryption on Packet Switched Networks

Are these the only attacks?

• Micali & Reyzin address the physical act of computation

• What about the physical act of communication?

How do we even define communication?

Page 9: Language Modeling and Encryption on Packet Switched Networks

Claude E. Shannon

“The mathematical theory of communication” (1948)

“The communication theory of secrecy systems” (1949)

Page 10: Language Modeling and Encryption on Packet Switched Networks

Shannon’s linear model of communication (1948)

Informationsource

Transmitter

Noise

Receiver DestinationChannel

A discrete Markov process on a finite domain

Page 11: Language Modeling and Encryption on Packet Switched Networks

Information theoreticperfect secrecy

• Messages are drawn from an underlying known probability distribution

• An encryption system has perfect secrecy if P(ciphertext | plaintext) is independent of plaintext.

Page 12: Language Modeling and Encryption on Packet Switched Networks

The one time pad has perfect secrecy.

BUT: messages must all be the same length or else you lose perfect secrecy.

One time pad

Page 13: Language Modeling and Encryption on Packet Switched Networks

What about infinite message spaces?

Theorem. (Chor-Kushilevitz, 1989) It is impossible to construct an information-theoretically secure cryptosystem on a countably infinite message space.

Page 14: Language Modeling and Encryption on Packet Switched Networks

Theorem (Goldreich) A semantically secure encryption scheme cannot hide the length of the plaintext against polynomial time adversaries.

Are these just theoretical concerns? or do they show up in practice?

Is leaking the message length unavoidable?

Page 15: Language Modeling and Encryption on Packet Switched Networks

Example: communication to a stock broker with messages of “buy IBM” or “sell IBM”

Example: in a file system, sizes of many files may provide evidence that encrypted files are copies of known files

Example: in the military, voluminous communications may indicate a command or intelligence center with multimedia

What can be learned from the length of a message?

Page 16: Language Modeling and Encryption on Packet Switched Networks

In practical systems, bandwidth and storage are not free!

How can we deal with message lengths?

Pad the messages to be all the same length Keep talking at all times

Page 17: Language Modeling and Encryption on Packet Switched Networks

i3or uqpcs hrt nbqpdn 0xcae opx

How does one approach the question?is it encrypted or just gibberish?what language is it?what is the character set?who said it?when did they say it?how was it encrypted?

What is this the encryption of?

Page 18: Language Modeling and Encryption on Packet Switched Networks

A related problem:communication is segmented

• Human spoken language is broken into syllables, words, sentences.

• Human written communication is broken into paragraphs, chapters, articles, books.

• Movies are broken into frames.

• Internet communication is packet switched.

Page 19: Language Modeling and Encryption on Packet Switched Networks

# ####### ### ## ##### #### ###### ### ######## ######## #### #### #### ##### #### ## # #######

###### ########

An cryptanalyst would have a tremendous advantage in guessing the message!

I usually get my stuff from people who promised somebody else that they would keep it a secret.

Walter Winchell

Suppose I told you where the spaces were

Page 20: Language Modeling and Encryption on Packet Switched Networks

Word lengths in 3 translations of a Tolstoy novel

Page 21: Language Modeling and Encryption on Packet Switched Networks

Word length transitions

1 2 3 4 5 6 7 8S1

S2

S3

S4

S5

S6

S7

S8

1 2 3 4 5 6 7 8S1

S2

S3

S4

S5

S6

S7

S8

French

1 2 3 4 5 6 7 8

S1

S2

S3

S4

S5

S6

S7

S8

English

German

Page 22: Language Modeling and Encryption on Packet Switched Networks

Internet CommunicationsAre Packet Switched

Packets are formed and transmitted according to the “language of the application”

Complications arise from buffering, Nagle’s algorithm, etc.

Page 23: Language Modeling and Encryption on Packet Switched Networks

SYNACK

SYN-ACK

Banner

Login:

mab

password:

w

h

. . .

The “language” of ssh (simplified)

Page 24: Language Modeling and Encryption on Packet Switched Networks

Cryptanalytic attacks based on packet timings and size

• Keystroke timings (Song, Wagner, Tiang)

• Probable plaintext (Bellovin)

• Several traffic classification studies:– Moore and Zuev (2005) 95% accuracy with

supervised Bayesian analysis

Page 25: Language Modeling and Encryption on Packet Switched Networks

Another looming attack:voice over IP

• VoIP has high quality of service demands.

• VoIP packets are easy to recognize.

• VoIP supports “silence suppression” for bandwidth savings.

Page 26: Language Modeling and Encryption on Packet Switched Networks

IPSEC traffic padding (TFC)

• Omitted in early versions of IPSEC

• Added in recent versions of tunnel mode to obfuscate traffic patterns.

• No theoretical basis for security arguments.

• No guidance on how to generate dummy traffic.

Page 27: Language Modeling and Encryption on Packet Switched Networks

If a known packet is repeatedly encrypted,

then the padding distribution may be

recognizable, and can be subtracted.

Note: simple padding is NOT enough

Page 28: Language Modeling and Encryption on Packet Switched Networks

Do we have to packetize data?

Success of Internet depends on it Fairness in a shared medium Quality of service Buffering Error correction retransmission

Page 29: Language Modeling and Encryption on Packet Switched Networks

Can we afford to keep the Internet channels full?

• Everyone depends on everyone else using only what they need.

• Internet exists as a sparse graph:– O(n) total nodes to connect n endpoints– To support circuits, we would need a much

denser graph

• Alternatives in onion routing, mixes

2

n

Page 30: Language Modeling and Encryption on Packet Switched Networks

Shannon’s epic 1949 paper“Communication Theory of Secrecy Systems”

1. Concealment systems, that obscure the existence of communication.

2. Privacy systems, requiring special communication hardware.

3. Secrecy systems, utilizing mathematical transformations.

Shannon addressed only secrecy systems, calling concealment a “psychological problem”

Page 31: Language Modeling and Encryption on Packet Switched Networks

What is an appropriate definition of communication?

Page 32: Language Modeling and Encryption on Packet Switched Networks

Shannon’s communication model revisited

Informationsource

Transmitter

Noise

Receiver DestinationChannel

“Frequently the messages have meaning; that is they referto or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.”

Page 33: Language Modeling and Encryption on Packet Switched Networks

Deficiencies in Shannon’s model

• Communication should be bidirectional.

• Communication is a physical process with side effects.

• Communication is segmented.

• Communication has context.

Page 34: Language Modeling and Encryption on Packet Switched Networks

Temporal aspects of communication

Do nothing secretly; for time sees and

hears all things, and discloses all.

- Sophocles, 496-406 B.C.

Page 35: Language Modeling and Encryption on Packet Switched Networks

Two versions of Shannon’s paper

• Bell System Technical Journal (1948)– “A Mathematical Theory of Communication”

• University of Illinois Press (1949)– “The Mathematical Theory of Communication”– With a section by Warren Weaver: “Recent

Contributions to The Mathematical Theory of Communication”

Page 36: Language Modeling and Encryption on Packet Switched Networks

Weaver’s three levels of communication

Level A. How accurately can the symbols of communication be transmitted? (the technical problem)

Level B. How precisely do the transmitted symbols convey the desired meaning? (the semantic

problem)Level C. How effectively does the received

meaning effect conduct in the desired way? (the effectiveness problem)

Page 37: Language Modeling and Encryption on Packet Switched Networks

Definitions of communication• Shannon:

– Reproducing the output from a stochastic process either approximately or exactly.

• Lasswell’s definition:– Who says what to whom in what channel with what

effect

• The exchange of messages that change the a priori expectation of events.

• Griffin: the management of messages for the purpose of creating meaning

• Schramm: a purposeful effort to establish a commonness between a source and receiver

Page 38: Language Modeling and Encryption on Packet Switched Networks

The DIKW Hierarchy(from the 1980s)

• Data (bits or raw symbols)– ’k’,51771

• Information (symbols with relationships)– ‘e’ is more common than ‘z’

• Knowledge (useful information)– ‘e’ is energy; ‘e’ is a mathematical constant

• Wisdom (understanding of knowledge)– Extrapolative, non-deterministic process.

Page 39: Language Modeling and Encryption on Packet Switched Networks

DIKW Hierarchy(the origins)

• Where is the wisdom we have lost in knowledge?

• Where is the knowledge we have lost in information?

T.S. Eliot, 1934

“The Rock”

Page 40: Language Modeling and Encryption on Packet Switched Networks

Moving up the DIWK Hierarchy

• Data Information– Understanding structure

• Information Knowledge– Understanding patterns, context, relationships

• Knowledge Wisdom– Understanding principles, applications,

meaning, interpretation.

Page 41: Language Modeling and Encryption on Packet Switched Networks

At what layer does a cryptanalyst work?

• Data: what is the fifth symbol in this message?• Information: Are there more 1’s than 0’s in this

message? Is there ever a sequence 00010000000?• Knowledge: what language is being spoken in this

communication?• Wisdom: did the initiator just ask a question or

give a command? Should I expect a response? Who is in command?

Page 42: Language Modeling and Encryption on Packet Switched Networks

At what layer does the crypto designer work?

• Data: make sure that this bit is unrecognizable from a random bit.

• Information: make sure that they can’t estimate the distribution of bits accurately.

• Knowledge: make sure that the adversary cannot recover the encoded knowledge.

• Wisdom: make sure the cryptanalyst has no understanding of what they observe.

Page 43: Language Modeling and Encryption on Packet Switched Networks

Closing thoughts

• We need better models of communication in order to advance cryptology.

• We need better definitions of knowledge and wisdom in order to advance cryptology.

• Absolute security for internet communication is probably impossible.

Page 44: Language Modeling and Encryption on Packet Switched Networks

Security is mostly a superstition. ... Life is nothing if not a grand adventure.

- Helen Keller

http://mccurley.org/papers/traffic/ for updates