PART FOUR: Algorithmic aspects related to security in distributed systems


Suggested reading

Crittografia, P. Ferragina and F. Luccio, Ed. Bollati Boringhieri, € 16.


Roadmap

• Introduction
  • Computer security and network security
  • Attacks, mechanisms, services
• Cryptography
  • Private key cryptography
  • Public key cryptography


Computer security vs network security

Computer security: aims at protecting information within a computer

Network security: aims at protecting the exchange of information among computers


Network security

New paradigms
• Distributed information
• Access via distributed systems

New security issues
• Security in LANs against attacks by unfaithful employees
• Security in applications (e-mail, http, ftp, ...): essential for business and electronic commerce


Basic problems

Networks are insecure because:
• most communications occur in clear;
• often there is no server authentication, but only (and not always) user authentication;
• there are no physical point-to-point connections: data travel through shared lines and third-party routers.


Security: fundamental issues

• Attacks: actions that violate the security of information held by an organization.
• Security mechanisms: hardware and software tools designed to prevent and to counter security threats.
• Security services: services that ensure the security of information through the use of one or more security mechanisms.


Attacks to security


Security mechanisms

There is a large variety of security mechanisms, both hardware and software, almost all based on cryptographic techniques.

Cryptography (from the Greek kryptos, hidden, and graphein, writing) is the discipline that studies “secret writing”: “set of techniques that permit the construction of an encrypted text and the decryption of a cryptogram.” (Garzanti, 1972)

Security services: examples

Confidentiality: prevent the data sent by a person A to a person B from being understood by a third party C.

Authentication: verify the identity of who sends or receives data.

Integrity: be sure that the data received are identical to those sent.

Non-repudiation: prevent users who send data from later denying having sent them (digital signature).

Ensuring confidentiality and integrity


• Kerckhoffs’ principle: a cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

• It can be rephrased as Shannon’s maxim: the enemy knows the system!

Cryptography: a brief history

Cryptography has been used since antiquity to hide the content of written messages.

Cryptography experienced a tremendous development during the Second World War, when British mathematician Alan Turing formalized the theory needed to decrypt the German cryptosystem known as Enigma.

Modern cryptography

In 1948 C. Shannon published the paper that started what is now called Information Theory, and in 1949 he applied it to secrecy systems. The union of this new science with Probability Theory, Complexity Theory, and Number Theory gave rise to modern cryptography.

Definition: A cryptosystem is a quintuple (P,C,K,Cod,Dec), where:
• P: finite set of plaintexts
• C: finite set of ciphertexts
• K: set of possible encryption keys
• Cod: P x K → C: encryption function (injective, and hence invertible, for every fixed key)
• Dec: C x K → P: decryption function

Properties of a cryptosystem

If Cod and Dec use the same key to encrypt and to decrypt a given text, we talk about symmetric cryptosystems; otherwise, about asymmetric cryptosystems.

A cryptosystem is called perfect if plaintext and ciphertext are statistically independent.

Shannon has shown that a necessary condition for a cryptosystem to be perfect is that the key is at least as long as the text to encrypt.

A perfect cryptosystem

One-time pad (G. Vernam, AT&T, 1917):
1. Build a large random (and not pseudorandom) key, for example using a detector of cosmic rays.
2. Construct the ciphertext by a bitwise XOR between the plaintext and the key.
3. Never reuse the key (hence the name one-time pad).
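The three steps above can be sketched in a few lines of Python. This is only an illustration under a loud assumption: `secrets.token_bytes` draws from the operating system's cryptographic generator, which merely stands in for the truly random source (e.g., the cosmic ray detector) that the scheme requires. The function names `otp_encrypt` and `otp_decrypt` are hypothetical.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Return (key, ciphertext), using a fresh key as long as the plaintext."""
    # Assumption: the OS generator stands in for a truly random source.
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must have the same length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ciphertext = otp_encrypt(b"attack at dawn")
print(otp_decrypt(key, ciphertext))
```

Note that a new key is generated at every call, in keeping with the rule that a key is never reused.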

From perfection to reality ...

Instead of being perfect (i.e., provably secure but practically unusable), the cryptosystems in actual use are:
• Computationally secure: the cryptanalytic problem (namely, the decryption of a ciphertext without knowing the key) is computationally intractable.
• Probabilistically secure: cryptosystems that have been proved to be invulnerable, provided that certain probabilistically unlikely events do not occur.

All modern cryptosystems actually used belong to the class of computationally secure systems.

Symmetric key algorithms

Symmetric key: the two subjects (A and B) use the same key K to encrypt and to decrypt data.

The encryption algorithms are public, while the symmetric key must be secret, and so the main problem is the key exchange!

Symmetric key scenario

The problem of transmitting the key

Q: If you want to use a symmetric cipher to protect the data flow between two parties, how do you exchange the secret key?

A: You must use a secure channel of communication!

A first example of a symmetric key cipher: The Caesar cipher

Let us consider the Italian alphabet (21 letters), and let us construct a cipher that replaces each letter of the alphabet by the letter which is 3 positions forward (cyclically).

For example, the clear text “distributed algorithms” is encrypted in the cryptogram “gnvzuneazhg dolrunzmpv”.

However, like most ciphers based on transpositions and translations, it can be easily attacked by statistical approaches.
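A minimal sketch of this cipher, assuming the 21-letter Italian alphabet (without j, k, w, x, y) and leaving any other character untouched. The function name `caesar` is hypothetical; the plaintext used is “distributed algorithms”.

```python
ITALIAN_ALPHABET = "abcdefghilmnopqrstuvz"  # 21 letters


def caesar(text: str, shift: int, alphabet: str = ITALIAN_ALPHABET) -> str:
    """Shift every letter of `alphabet` by `shift` positions (cyclically)."""
    out = []
    for ch in text:
        pos = alphabet.find(ch)
        if pos == -1:
            out.append(ch)  # spaces and foreign letters are left untouched
        else:
            out.append(alphabet[(pos + shift) % len(alphabet)])
    return "".join(out)


cryptogram = caesar("distributed algorithms", 3)
print(cryptogram)              # gnvzuneazhg dolrunzmpv
print(caesar(cryptogram, -3))  # distributed algorithms
```

Decryption is simply the same function with the opposite shift.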

Statistical cryptanalysis

The plaintext is obtained by applying statistical techniques to the frequency of characters or substrings of the ciphertext.
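As a rough illustration of the idea, the sketch below guesses the shift of a Caesar-like cipher from the most frequent ciphertext letter. It rests on assumptions that are not in the slides: the cipher is a cyclic shift on the 21-letter Italian alphabet, and the most frequent plaintext letter is `e` (a plausible guess for Italian, passed as a parameter so that other hypotheses can be tried). All names are hypothetical.

```python
from collections import Counter

ITALIAN_ALPHABET = "abcdefghilmnopqrstuvz"


def shift_text(text, shift, alphabet=ITALIAN_ALPHABET):
    """Apply a cyclic shift to the letters of `alphabet`; keep the rest."""
    return "".join(
        alphabet[(alphabet.index(ch) + shift) % len(alphabet)] if ch in alphabet else ch
        for ch in text
    )


def guess_shift(ciphertext, assumed_top_letter="e", alphabet=ITALIAN_ALPHABET):
    """Guess the shift by matching the most frequent ciphertext letter
    with the letter assumed to be the most frequent in the plaintext."""
    counts = Counter(ch for ch in ciphertext if ch in alphabet)
    if not counts:
        raise ValueError("no letters to analyse")
    top_cipher_letter = counts.most_common(1)[0][0]
    return (alphabet.index(top_cipher_letter)
            - alphabet.index(assumed_top_letter)) % len(alphabet)


ciphertext = shift_text("le sere belle e serene", 3)
shift = guess_shift(ciphertext)
print(shift, shift_text(ciphertext, -shift))
```

On short texts the guess can easily fail; real attacks compare the whole frequency profile (and frequent substrings) rather than a single letter.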

The state-of-the-art in symmetric key encryption: Rijndael

Developed by Joan Daemen and Vincent Rijmen.

This algorithm won the selection for the Advanced Encryption Standard (AES) in 2000. Officially, the Rijndael method has become the standard for symmetric key encryption of the 21st century.

The cipher uses a key of variable length (128, 192, or 256 bits), and a network of "confusion of the message", in which multiple operations of transposition, substitution, and XOR on blocks of fixed length are performed.

Limits of symmetric key ciphers

Does a secure channel of communication to exchange the secret key actually exist in reality? And if it does exist, why use encryption at all?

In addition, for secure communication between n users, one must exchange a total of (n-1)*n/2 keys. For instance, with 100 users you will need 4950 keys!

Asymmetric key algorithms

Public/private key: each subject S has:
• its own public key Kpu(S), known to all;
• a private key Kpr(S), known only to S.

The requirements that a public key algorithm must satisfy are:
• data encoded with one key can be decrypted only with the other one;
• the private key should never be transmitted over the network;
• it must be very difficult to derive a key from the other one (in particular, the private key from the public key).

The various public key scenarios

First scenario: A encodes a message with the public key of B, which then decodes the message by using its own private key; in this way, confidentiality and integrity are guaranteed (only B can read the message).

Second scenario: A “signs” a message by encoding it with its own private key, and then sends it to B, which authenticates the message by using the public key of A; in this way, authenticity and non-repudiation are guaranteed (everybody can read the message, but only A can have signed it).

Third scenario: A “signs” a message by encoding it with its own private key, then re-encodes it with the public key of B, and finally sends it to B, which decodes it by using its own private key, and then authenticates it by using the public key of A; in this way, confidentiality, integrity, authenticity, and non-repudiation are guaranteed.

The birth of PKI systems

• Where do I find the public keys of my recipients?

• Creation of archives of public keys, the public key servers.

• But who guarantees the correspondence of public keys with the respective owners?

Birth of the Certification Authority (CA).

• At this point, who guarantees the validity of a certification authority?

Act of faith!

The mathematics of public key systems

It was introduced by Diffie and Hellman in 1976:

Definition: A function f is called one-way if for every x the computation of $y=f(x)$ is simple (i.e., it is in P), while the calculation of $x=f^{-1}(y)$ is computationally hard (i.e., it is NP-hard).

Definition: A one-way function is called trapdoor if the calculation of $x=f^{-1}(y)$ becomes easy once some additional (private) information is known.

... But unfortunately for them, they were not able to build a one-way trapdoor function!

The RSA algorithm

Designed in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman, the cipher was patented, and entered the public domain in 2000.

Basic idea: given two (very large) prime numbers p and q, it is easy to calculate the product $n = p \cdot q$, while it is very difficult to compute the factorization of n (although this problem is not known to be NP-hard).

The best factorization algorithms currently available (Quadratic Sieve, Elliptic Curve Method, Pollard’s Heuristic, etc.) all have a super-polynomial complexity, in the order of:

$$O\left(e^{\sqrt{\ln n \cdot \ln \ln n}}\right)$$

The RSA algorithm

To ensure security, it is necessary that p and q have at least 200 decimal digits. Indeed, in this way $n = p \cdot q$ is 400 digits long, namely it is in the order of $10^{400}$, and so:

$$e^{\sqrt{\ln n \cdot \ln \ln n}} \approx e^{79} \approx 10^{34}$$

which is computationally intractable.

Keys are typically 1024 bits long ($2^{1024} \approx 10^{308}$).

RSA is much slower than symmetric key algorithms, and so it is often applied to the transmission of small amounts of data, like the secret key of a symmetric key system.
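The order of magnitude of the factoring effort for a 400-digit n can be checked numerically. The sketch below assumes that the cost estimate is $e^{\sqrt{\ln n \cdot \ln \ln n}}$; the variable names are hypothetical.

```python
from math import log, sqrt

digits = 400                      # n is in the order of 10^400
ln_n = digits * log(10)           # ln(10^400), computed without building n
exponent = sqrt(ln_n * log(ln_n)) # exponent of e in the estimate
decimal_exponent = exponent / log(10)

print(f"e^{exponent:.1f} is about 10^{decimal_exponent:.1f}")
```

The exponent of e comes out slightly above 79, and the corresponding power of 10 slightly above 34, in agreement with the figures quoted for this key size.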

RSA at work: key generation

1. Choose two large primes p and q, and compute $n = p \cdot q$.
2. Compute the Euler totient function of n, i.e., the number of positive integers less than n and coprime with it:
   $\varphi(n) = \varphi(pq) = pq - [(q-1) + (p-1)] - 1 = pq - (p+q) + 1 = (p-1)(q-1) = \varphi(p) \cdot \varphi(q)$
   (since there are q-1 multiples of p less than n, and p-1 multiples of q less than n).
3. Choose a number $1 < e < \varphi(n)$ s.t. $\gcd(e, \varphi(n)) = 1$ (i.e., e and $\varphi(n)$ are coprime).
4. Compute d such that $e \cdot d \equiv 1 \pmod{\varphi(n)}$.
5. Define the public key as (e,n).
6. Define the private key as (d,n).

Recall: $x \equiv y \pmod{z}$ means that the remainder of the integer division of x by z and of y by z is the same, namely x mod z = y mod z (or, equivalently, there exists an integer k s.t. x = y + kz).
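The key generation steps can be sketched as follows. This is a toy illustration, not a secure implementation: the primes and the public exponent are passed in by the caller, their primality is not checked, and the hypothetical function name `rsa_keygen` is not taken from any library. The modular inverse is computed with Python's built-in `pow(e, -1, phi)`, available from Python 3.8.

```python
from math import gcd


def rsa_keygen(p: int, q: int, e: int):
    """Return ((e, n), (d, n)) for primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # Euler totient of n = p*q
    if not 1 < e < phi or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)              # d such that e*d = 1 (mod phi(n))
    return (e, n), (d, n)


public_key, private_key = rsa_keygen(3, 11, 3)
print(public_key, private_key)       # (3, 33) (7, 33)
```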

RSA at work

Sending an encrypted message x
1. The encryption function of A is $\mathrm{Cod}(x) := x^e \bmod n$ (with x<n), where (e,n) is the public key of the recipient B.
2. The decryption function of B is $\mathrm{Dec}(\mathrm{Cod}(x)) := \mathrm{Cod}(x)^d \bmod n = (x^e \bmod n)^d \bmod n$, where (d,n) is the private key of B.

“Signing” a message x
1. The encryption function of A is $\mathrm{Cod}(x) := x^d \bmod n$ (with x<n), where (d,n) is the private key of A.
2. The decryption function of B is $\mathrm{Dec}(\mathrm{Cod}(x)) := \mathrm{Cod}(x)^e \bmod n = (x^d \bmod n)^e \bmod n$, where (e,n) is the public key of A.

Public and private keys can be used interchangeably, i.e., Dec(Cod(x))=Cod(Dec(x)).
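Both uses of the keys reduce to the same modular exponentiation, as the sketch below shows. It assumes a toy key pair, public (3,33) and private (7,33), chosen only to keep the numbers small; `rsa_apply` is a hypothetical helper name.

```python
PUBLIC_KEY = (3, 33)   # (e, n), toy values for illustration only
PRIVATE_KEY = (7, 33)  # (d, n)


def rsa_apply(x: int, key) -> int:
    """Compute x^exponent mod n for key = (exponent, n), with 0 <= x < n."""
    exponent, n = key
    if not 0 <= x < n:
        raise ValueError("the block must satisfy 0 <= x < n")
    return pow(x, exponent, n)


x = 4
ciphertext = rsa_apply(x, PUBLIC_KEY)           # sender encrypts with the public key
recovered = rsa_apply(ciphertext, PRIVATE_KEY)  # recipient decrypts with the private key
signature = rsa_apply(x, PRIVATE_KEY)           # signer encodes with the private key
verified = rsa_apply(signature, PUBLIC_KEY)     # anyone checks with the public key
print(ciphertext, recovered, signature, verified)  # 31 4 16 4
```

The two round trips return the original x, which is the interchangeability property stated above.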

RSA at work: an example (1 of 2)

• B needs to choose its keys; then, it selects two large primes, for instance p=3 and q=11 (ehm, not very large, actually!).
• Then, n=33 and $\varphi(n) = 2 \cdot 10 = 20$.
• Then, B takes e=3, since 3 is coprime with 20: (3,33) is the public key of B.
• Then, B searches for d s.t. $3d \equiv 1 \pmod{20}$. Hence, from $3d = 1 + k \cdot 20$, by setting k=1, we have d=7: (7,33) is the private key of B.
• Now, to encrypt a message, a sender A divides it into blocks of bits whose maximum expressible value is less than n=33; then, a block P becomes $C := \mathrm{Cod}(P) = P^3 \bmod 33$.
• To decode C, B computes $P = C^7 \bmod 33$.
• In our example, since n=33, a block contains at most 5 bits ($2^5 < 33$); however, in practice, n is in the order of $2^{1024}$, and so blocks have a size of about 1024 bits, i.e., 128 ASCII characters (8 bits each).

RSA at work: an example (2 of 2)

To visualize the example, let us suppose that the 26 letters of the English alphabet are represented by using 5 bits, and so, since n=33, each block is made up of a single character.
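A sketch of this per-character encryption, with the keys (3,33) and (7,33) of the example. The numeric encoding A=1, ..., Z=26 is an assumption made here for illustration (any 5-bit encoding with values below 33 would do), as are the sample word and the helper names.

```python
PUBLIC_KEY = (3, 33)   # (e, n) of B
PRIVATE_KEY = (7, 33)  # (d, n) of B


def letters_to_blocks(word):
    """Map A..Z to 1..26 (an assumed encoding that fits in 5 bits)."""
    blocks = []
    for ch in word.upper():
        if not "A" <= ch <= "Z":
            raise ValueError("only the 26 English letters are supported")
        blocks.append(ord(ch) - ord("A") + 1)
    return blocks


def blocks_to_letters(blocks):
    """Inverse of letters_to_blocks."""
    return "".join(chr(b - 1 + ord("A")) for b in blocks)


def apply_key(blocks, key):
    """Raise every block to the key exponent, modulo n."""
    exponent, n = key
    return [pow(b, exponent, n) for b in blocks]


plain_blocks = letters_to_blocks("SUZANNE")
cipher_blocks = apply_key(plain_blocks, PUBLIC_KEY)
print(plain_blocks)   # [19, 21, 26, 1, 14, 14, 5]
print(cipher_blocks)  # [28, 21, 20, 1, 5, 5, 26]
print(blocks_to_letters(apply_key(cipher_blocks, PRIVATE_KEY)))  # SUZANNE
```

Since equal letters always produce equal blocks, such a character-by-character use is exposed to the statistical attacks seen for the Caesar cipher; real blocks are far larger.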

Computational Complexity of RSA

It can be shown that the keys (and thus p, q, e, d) can be generated in time polynomial in the length of their binary representation (which is logarithmic in their value).

In particular, e is usually chosen by taking a quite small prime number (e.g., e=3).

Instead, d is obtained by a (polynomial-time) extension of the Euclidean algorithm for computing the GCD (based on the fact that GCD(a,b)=GCD(b,a mod b)).
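One common way to write this extension is the iterative version below, which tracks the coefficients s and t such that s·a + t·b = GCD(a,b) while the remainders are computed. It is a sketch with hypothetical function names, not necessarily the variant the course has in mind.

```python
def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        # Same step as gcd(a, b) = gcd(b, a mod b), applied to all three pairs.
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t


def private_exponent(e: int, phi: int) -> int:
    """Return d with e*d = 1 (mod phi), assuming gcd(e, phi) = 1."""
    g, s, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi(n) are not coprime")
    return s % phi


print(private_exponent(3, 20))  # 7
```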

However, to find large prime numbers (i.e., p and q), probabilistic primality testing algorithms are used, since deterministic algorithms are too slow (they are polynomial, but with a degree in the order of 10).

Finally, note that the processes of encryption and decryption can be performed efficiently by successive exponentiation (so-called modular exponentiation).
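A minimal sketch of modular exponentiation by repeated squaring: the exponent is scanned bit by bit, so the number of multiplications is proportional to the number of bits of the exponent rather than to its value. The name `mod_exp` is hypothetical; Python's built-in `pow(base, exponent, modulus)` provides the same result.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by repeated squaring."""
    if exponent < 0 or modulus <= 0:
        raise ValueError("need exponent >= 0 and modulus > 0")
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # current bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus    # square for the next bit
        exponent >>= 1
    return result


print(mod_exp(4, 3, 33), mod_exp(31, 7, 33))  # 31 4
```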

Searching for p and q

Definition (Monte Carlo algorithm): A Monte Carlo "no-biased" algorithm is a randomized algorithm for solving a given decision problem, such that the answer "no" is always correct, while the answer "yes" may be incorrect with a bounded probability ε. Monte Carlo "yes-biased" algorithms are similarly defined.

The Miller&Rabin algorithm is a Monte Carlo "no-biased" algorithm to test the primality of a number n. Its time complexity is $O(\log^3 n)$, and its probability of error is $\varepsilon \le 1/4$ (i.e., a YES answer is correct with probability at least 3/4).

Miller&Rabin algorithm

It is based on the following property: for an odd integer n, and for some $2 \le y \le n-1$, let $n-1 = 2^w z$, with z odd (and so w is the max allowed exponent for 2), and let us define the following 2 predicates:
(P1): $\gcd(n,y) = 1$;
(P2): ($y^z \bmod n = 1$) OR (there exists $0 \le i \le w-1$ s.t. $y^{2^i z} \equiv -1 \pmod{n}$).

Theorem: If n is prime, then every such y satisfies both predicates, while if n is composite, then the number of integers between 1 and n-1 that satisfy both predicates is less than n/4.

We run MR(n) k times, testing each time the two predicates on a random integer less than n. If the algorithm answers "no", even only once, the number is definitely composite, but if it always answers "yes", then the probability that the number is composite is at most $4^{-k}$, and therefore the probability that the number is prime is:

$$P(\text{prime}) = 1 - P(\text{composite}) \ge 1 - 4^{-k}$$

(e.g., if k=100, then $P(\text{prime}) \ge 1 - 10^{-60} \approx 1$)

Miller&Rabin algorithm

Miller-Rabin(n)
1. Set n-1 = 2^s · r with r odd
2. For i=1 to k do
   2.1 choose randomly an integer t s.t. 2≤t≤n-2
   2.2 compute y = t^r mod n
   2.3 if y≠1 do
       2.3.1 j=1
       2.3.2 while ((j≤s-1) and (y≠n-1))
                 y := t^(2^j · r) mod n
                 j++
       2.3.3 if y≠n-1 return composite
3. Return prime (w.h.p. 1-4^(-k))
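The pseudocode can be turned into the following Python sketch. Two choices here are assumptions of this example rather than part of the slides: the default number of rounds `k=40`, and the fact that each new value of y is obtained by squaring the previous one (which yields the same values t^(2^j · r) mod n without recomputing the power from scratch). Small and even inputs are handled separately.

```python
import random


def miller_rabin(n: int, k: int = 40) -> bool:
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Step 1: write n - 1 = 2^s * r with r odd.
    s, r = 0, n - 1
    while r % 2 == 0:
        s += 1
        r //= 2
    # Step 2: k independent rounds with random bases.
    for _ in range(k):
        t = random.randint(2, n - 2)
        y = pow(t, r, n)
        if y in (1, n - 1):
            continue                 # this base does not prove n composite
        for _ in range(s - 1):
            y = pow(y, 2, n)         # y = t^(2^j * r) mod n
            if y == n - 1:
                break
        else:
            return False             # n - 1 never reached: n is composite
    return True                      # prime with high probability


print([n for n in range(50) if miller_rabin(n)])
```

A `False` answer is always correct, while a `True` answer on a composite number requires every one of the k random bases to be misleading.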

Is it easy to find prime numbers?

Despite the efficiency of the primality test, one may still wonder whether the primes are too "few", and therefore difficult to find (Riemann hypothesis!).

Gauss Theorem (prime numbers): Let π(n) be the distribution function of prime numbers, i.e., the number of primes less than n. Then the following is satisfied:

$$\lim_{n \to \infty} \frac{\pi(n)}{n / \ln n} = 1$$

So, if you search for a prime number of 100 digits, you should check "only" about $\ln(10^{100}) \approx 230$ consecutive numbers.
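The density of primes can be observed on small values. The sketch below counts the primes below n with a sieve of Eratosthenes (a technique chosen here for illustration, with the hypothetical name `prime_count`) and compares the count with n / ln n.

```python
from math import log


def prime_count(n: int) -> int:
    """Count the primes strictly less than n with a sieve of Eratosthenes."""
    if n < 3:
        return 0
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting from i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, n, i)))
    return sum(is_prime)


for exponent in range(2, 7):
    n = 10 ** exponent
    print(n, prime_count(n), round(n / log(n)))
```

The ratio between the two columns slowly approaches 1 as n grows; roughly one number out of ln n around n is prime.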
