
Cryptography on a Customized Network

Ricardo Martinho Ferreira Miranda

Thesis to obtain the Master of Science Degree in

Mathematics and Applications

Examination Committee

Chairperson: Prof. Maria Cristina Sales Viana Serodio Sernadas

Supervisor: Prof. Paulo Alexandre Carreira Mateus

Co-supervisor: Bruno Neto de Oliveira Tavares

Member of the Committee: Prof. Andre Nuno Carvalho Souto

November 2017

Acknowledgements

I want to thank both my supervisors, Paulo Mateus and Bruno Tavares, for all their support and guidance. I would also like to thank my dearest friend Sofia Brito, whose counsel and motivation were crucial in overcoming the most difficult moments.


Resumo

Building a secure network for use in real applications, where restrictions are imposed on the capabilities of the network's elements and on the transfer of information, requires a customized cryptographic analysis in order to protect the communications and to detect and minimize the system's vulnerabilities that could be exploited. In this document, a network under such conditions is presented; an optimal topological scheme is sought before the cryptographic components embedded in the network's communications and storage are chosen, and their security is subsequently analyzed. Among the alternatives scrutinized, only one is chosen as the solution, by comparison in terms of performance, security and adaptation to the imposed restrictions. This solution is implemented using the C and Java programming languages. The chosen encryption schemes and protocols are proven to be highly adequate options and their use in practice is advised. These results are valid only for this specific case study, since if any of the restrictions were changed a different and more appropriate solution would probably exist.

Keywords: ciphertext indistinguishability; block cipher mode of operation; semantic security; symmetric encryption system.


Abstract

Building a secure network for real-world applications, where constraints are strictly imposed on the capabilities of the network's elements and on the data flow, requires a customized cryptographic analysis in order to protect the communications and to detect and minimize the system's exploitable vulnerabilities. In this document a network under such conditions is presented and one is challenged with providing an optimal topological scheme prior to choosing the network's cryptographic components embedded in the communication and data storage protocols and subsequently analyzing their security. Among the scrutinized alternatives a single one is elected as the solution by a comparison in terms of performance, security and suitability under the enforced restrictions. This solution is implemented using the C and Java programming languages. The selected encryption schemes and protocols are proven to be highly reasonable options and their use in practice is advised. These results are only valid for this specific case study, for if any of the established constraints is lifted then an enhanced solution is most likely to emerge.

Keywords: ciphertext indistinguishability; block cipher mode of operation; semantic security; symmetric

cryptosystem.


Glossary

$I_n$ The set $\{k \in \mathbb{N} : 1 \le k \le n\}$.

$P(A)$ Probability of occurrence of event $A$.

$A^*$ The Kleene star of $A$.

$I$ The set of unique identifiers of gathering devices.

$O$ $f \in O(g) \Leftrightarrow \exists M \in \mathbb{R}^+ \, \exists x_0 \in \mathbb{R} \, \forall x \ge x_0 : |f(x)| \le M |g(x)|$.

bitstring An element of $\mathbb{Z}_2^*$.

byte A unit of data storage consisting of 1 octet.

kB 1 kB = 1024 bytes.

octet A sequence of 8 bits.


List of Abbreviations

$\lfloor x \rfloor$ Floor function of $x$, for some $x \in \mathbb{R}$.

$0^j$ The bitstring composed of $j$ '0's, for some $j \in \mathbb{N}$.

$1^j$ The bitstring composed of $j$ '1's, for some $j \in \mathbb{N}$.

$[w]_2$ Binary representation of the word $w$.

$\lceil x \rceil$ Ceiling function of $x$, for some $x \in \mathbb{R}$.

$w|^k$ Suffix of $w$ of length $k$, for some $k \in \mathbb{N}$.

$w|_k$ Prefix of $w$ of length $k$, for some $k \in \mathbb{N}$.

x ‖ y Concatenation of words x and y.

x \ y Difference of x and y.

$|w|_2$ The number of bits of the word $w$.

3DES Triple DES.

ACK Acknowledgement.

AES Advanced Encryption Standard.

BCMO block cipher mode of operation.

CA certificate authority.

CBC Cipher Block Chaining.

CCM Counter with CBC-MAC.

CFB Cipher Feedback.

CPU central processing unit.

CSPRNG cryptographically secure pseudo-random number generator.

CTR Counter.

CTR-H CTR mode with HMAC-256 checksum.

DAL Downstream Algorithm Lifecycle.

DB database.

DDL Downstream Data Lifecycle.

DES Data Encryption Standard.


EAP Extensible Authentication Protocol.

ECB Electronic Codebook.

ECC Elliptic Curve Cryptography.

FIFO First in first out.

GCM Galois Counter Mode.

GD Gathering device.

GDj gathering device with unique identifier j ∈ I.

GMAC Galois Message Authentication Code.

GTK Group Temporal Key.

HMAC hash-based message authentication code.

IEEE Institute of Electrical and Electronics Engineers.

IEEESA Institute of Electrical and Electronics Engineers Standards Association.

IND-CCA Indistinguishability under chosen-ciphertext attack.

IND-CPA Indistinguishability under chosen-plaintext attack.

IV initialization vector.

KDF key derivation function.

LAN local area network.

MAC message authentication code.

MIC message integrity code.

MiM Man-in-the-middle.

MnDM Mission and data manager.

MPP Middle-point party.

NIST National Institute of Standards and Technology.

PBKDF2 password-based key derivation function 2.

PCgF package ciphertext generator function.

PCuF package ciphertext unpacking function.

PMK Pairwise Master Key.

PMS pre-mission system.

POA Padding Oracle Attack.


PRF pseudo-random function.

PRNG pseudo-random number generator.

PSch packing scheme.

PSK Pre-shared key.

PTK Pairwise Transient Key.

RFC Request for Comments.

SEM-CPA Semantic security under chosen-plaintext attack.

SPN Substitution Permutation Network.

SSID Service Set Identifier.

UAL Upstream Algorithm Lifecycle.

UDL Upstream Data Lifecycle.

WLAN wireless local area network.

XOR exclusive-or operation.


List of Tables

4.1 Comparison between CTR and CFB features.

List of Figures

2.1 Encryption round of a SPN. Corresponds to the round function g from cryptosystem 4. It is used in all rounds except the last.
2.2 Network Layout.
2.3 Extensible Authentication Protocol (EAP).
2.4 WPA2 four-way handshake.
2.5 WPA2 group-key handshake.
2.6 Man in the middle attack. Eve is able to intercept the message and/or jam the communication channel at will.
3.1 General purpose and activity of the envisaged network.
3.2 General layout of the desired network.
3.3 Pre-deployment stage.
3.4 Topology of AP-based networks.
3.5 Topology of the ad-hoc network.
4.1 Key generation based on k users.
5.1 Pre-processing steps of the secret pass for the generation of the seed of the SHA-1 pseudo-random function.
5.2 Scatter plots of the average key generation time per number of gathering devices.
5.3 Upstream Algorithm Lifecycle.
5.4 Downstream Algorithm Lifecycle.
A.1 ECB mode encryption and decryption procedures using an arbitrary block cipher B.
A.2 CBC mode encryption and decryption procedures using an arbitrary block cipher B.
A.3 CFB mode encryption and decryption procedures using an arbitrary block cipher B.
A.4 CTR mode encryption and decryption procedures using an arbitrary block cipher B.
B.1 KeyGeneratorApp's initial screen.
B.2 KeyGeneratorApp's target choice screen.
B.3 KeyGeneratorApp's file details.
B.4 KeyGeneratorApp's key export final step.
B.5 KeyGeneratorApp's key checker example screen.
B.6 Pre-deployment stage secret information's revealment.
C.1 Message format $F_1^*$.
C.2 Message format $F_1$.

C.9 Message format $F_5^{**}$.

Contents

Resumo
Abstract
Glossary
List of Abbreviations
List of Tables
List of Figures
1 Introduction
1.1 Summary
2 Basic Concepts
2.1 Cryptanalysis
2.2 Modern Cryptography
2.2.1 Block Ciphers
2.2.1.1 Linear and Differential Cryptanalysis
2.2.1.2 DES and 3DES
2.2.1.3 AES
2.2.2 Block Cipher Modes of Operation
2.2.2.1 ECB
2.2.2.2 CBC
2.2.2.3 CFB
2.2.2.4 CTR
2.2.2.5 CCM
2.2.2.6 GCM
2.2.2.7 Padding
2.2.3 Asymmetric Cryptography
2.3 Cryptographic Hash Functions
2.3.1 SHA-256
2.3.2 HMAC
2.4 Randomness
2.4.1 Key Derivation
2.5 Communication Protocols in Wireless Networks
2.5.1 WEP
2.5.2 WPA/WPA2
2.5.2.1 Initial Authentication
2.5.2.2 4-way Handshake
2.5.2.3 Group-key Handshake
2.6 Known Attacks
2.6.1 Brute Force and Dictionary Attacks
2.6.2 Man In The Middle Attack
2.6.3 Birthday Attack
2.6.4 Replay Attack
2.6.5 Padding Oracle Attack
2.6.6 Stream Cipher Attacks
2.6.6.1 Key Reuse
2.6.6.2 Bit-flipping
2.6.7 Weaknesses of Block Cipher Modes of Operation
2.6.8 Side-Channel Attack
2.6.9 Attacks on AES
3 Network
3.1 The Problem
3.2 Details
3.3 Network Topology
3.4 Protocol
3.4.1 Setup
3.4.2 Communication Protocol
3.5 Message Formats
4 Security Analysis
4.1 Strengths and Weaknesses
4.1.1 Key Generation
4.1.2 Packing Schemes and Protocols
4.1.2.1 Semantic security
4.1.2.2 Encryption Schemes
4.1.3 Attacks
4.2 Possible Solutions
4.2.1 Chosen-plaintext attack
4.2.2 Chosen-ciphertext attack
5 Implementation Details
5.1 Key Generation
5.2 Data Processing
5.2.1 Upstream Algorithm Lifecycle
5.2.2 Downstream Algorithm Lifecycle
6 Results
6.1 Future Work
References
A Schemes of Block Cipher Modes of Operation
B User Manual: Key Generation Application
C Message Formats

Chapter 1

Introduction

The first attempts to secure information date back to Ancient Greece. Ever since, mankind has been continuously developing new methods for protecting secrets: while some create methods to secure information, others put a great deal of effort into discovering weaknesses in order to retrieve those secrets. An example that reflects cryptography's tremendous relevance in modern times is World War II. The victory of the Allies is considered to have been greatly influenced by their ability to eavesdrop on the enemy's communications after breaking the Enigma [1] cipher and, as such, this war propelled major advancements in the fields of cipher construction and cryptanalysis. The continuous demand for protecting information is the fuel that drives the evolution of computational security.

In the current work one is presented with a network composed of several types of devices with certain restrictions with respect to memory, space and power consumption. The aim is to choose the most suitable topology for the network and to create a security mechanism, to be included in the communication protocol, that ensures that the messages travelling through the network satisfy certain cryptographic properties. This work was developed for a real-world project in a business environment under the supervision of analysts and developers of the company GMVIS Skysoft, S.A., and therefore several limitations were imposed.

1.1 Summary

This dissertation is divided into six chapters and an appendix with three sections.

In chapter 2 the state-of-the-art concepts related to the developed work are addressed. In chapter 3 the problem is introduced, the options with regard to the network's topology and communication protocols are discussed, the general protocol for the upstream and downstream data lifecycles is presented and its underlying message formats are defined. Chapter 4 contains a security analysis of the network defined in chapter 3. In chapter 5 the details regarding the implementation are discussed and some parts of the code are presented and analyzed as pseudocode. Chapter 6 is the final chapter and contains a general overview of the results obtained and motivation for future work on the subject.

Appendix A contains the figures regarding the construction of the block cipher modes of operation presented in chapter 2, appendix B is a user manual for the developed user-interface application for key generation, and appendix C shows all the message formats introduced in chapter 3.

Chapter 2

Basic Concepts

This chapter provides an overview of several cryptographic concepts and algorithms that are applicable to the system considered in the next chapter, and references are provided for various topics that are considered relevant but outside the central scope of the text. The reader is assumed to be familiar with basic cryptographic theory. Section 2.1 addresses some security definitions for cryptosystems and the most common techniques used by adversaries, section 2.2 introduces modern cryptographic concepts with emphasis on private-key cryptosystems, section 2.3 contains the properties of cryptographic hash functions as well as a high-level description of SHA-256, section 2.4 describes the problem of generating random values, section 2.5 specifies the wireless communication protocols used in the problem at hand and the last section covers currently known attacks on some of the systems described throughout the chapter. Public-key cryptography is treated in less detail because this option was set aside in the security solution proposed in chapter 3.

The suite of algorithms for the key generation, encryption and decryption processes forms a cryptographic system (cryptosystem); such systems are usually implemented to give the user the ability to conceal classified information.

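Definition 2.0.1. A cryptosystem is a 5-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ such that:

• $\mathcal{P}$ is the set of all possible plaintexts.

• $\mathcal{C}$ is the set of all possible ciphertexts.

• $\mathcal{K}$ is the set of all possible keys, also denoted as the key space.

• $\mathcal{E} := \{E_k : \mathcal{P} \to \mathcal{C}\}_{k \in \mathcal{K}}$ is the family of all encryption functions.

• $\mathcal{D} := \{D_k : \mathcal{C} \to \mathcal{P}\}_{k \in \mathcal{K}}$ is the family of all decryption functions.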
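To make the five components of Definition 2.0.1 concrete, the following is a minimal Python sketch of a toy symmetric cryptosystem in which the plaintext, ciphertext and key sets are all the byte strings of one fixed length, and encryption and decryption are a bytewise XOR with the key. The names `BLOCK_LEN`, `keygen`, `encrypt` and `decrypt` are hypothetical and do not come from this work. The check that decrypting an encryption returns the original plaintext is the usual correctness requirement, which is assumed here even though the definition does not state it explicitly.

```python
import secrets

BLOCK_LEN = 16  # assumed fixed length (in bytes) of plaintexts, ciphertexts and keys


def keygen() -> bytes:
    """Pick a key uniformly at random from the key space K."""
    return secrets.token_bytes(BLOCK_LEN)


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """E_k : P -> C, bytewise XOR of the plaintext with the key."""
    if len(key) != BLOCK_LEN or len(plaintext) != BLOCK_LEN:
        raise ValueError("key and plaintext must be BLOCK_LEN bytes long")
    return bytes(p ^ k for p, k in zip(plaintext, key))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """D_k : C -> P, XOR is its own inverse so decryption reuses encryption."""
    return encrypt(key, ciphertext)


if __name__ == "__main__":
    k = keygen()
    x = b"attack at dawn!!"  # exactly 16 bytes
    assert decrypt(k, encrypt(k, x)) == x
```

Since the same key is used in both directions, this toy is a symmetric cryptosystem; it is meant only to illustrate the roles of the sets and function families, not to be used as an encryption scheme.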

Two types of cryptosystems can be defined: symmetric and asymmetric. In the former, the same key is used for encryption and decryption, while in the latter there are two distinct keys, one for encryption and one for decryption. Section 2.2.3 further discusses asymmetric cryptography, also known as public-key cryptography; for now this distinction suffices. The following section is oriented towards symmetric cryptography.


2.1 Cryptanalysis

It is possible to break¹ cryptographic systems without knowledge of the key(s) used or even of the algorithm itself. As the name suggests, cryptanalysis is the study of cryptosystems with the objective of finding flaws or weaknesses that allow unauthorized parties to gain information, without necessarily discovering the secret key. Cryptanalysis methods can be categorized based on the information available to the attacker. The most common ones are presented below:

Definition 2.1.1 (Ciphertext-only attack). A ciphertext-only attack is one where the adversary possesses only information regarding the ciphertext and tries to deduce either the corresponding plaintext or the key; in theory, no details about the plaintext itself are provided. In practice, however, the attacker usually does have access to useful information, such as the alphabet in which the plaintext is written.

Definition 2.1.2 (Known-plaintext attack). A known-plaintext attack focuses on finding the secret key (or key stream) of the cryptosystem at hand, given knowledge of both the ciphertext and its corresponding plaintext.

Definition 2.1.3 (Chosen-plaintext attack). In a Chosen-plaintext attack, the adversary has access to

an encryption oracle, which encrypts any plaintext given by the attacker and outputs the corresponding

ciphertext.

A cryptosystem is said to be secure against chosen-plaintext attacks if and only if an adversary who is able to choose any pair of plaintexts $x_0, x_1$, whose encryptions are $y_0, y_1$ respectively, cannot decide which of the following is true:

$$y_i = e_k(x_i) \quad \text{or} \quad y_i = e_k(x_{i+1}) \tag{2.1}$$

with probability non-negligibly greater than 1/2, where the index $i + 1$ is taken modulo 2.

Definition 2.1.4 (Chosen-ciphertext attack). In a Chosen-ciphertext attack, the adversary has access to

a decryption oracle, which decrypts any ciphertext given by the attacker and outputs the corresponding

plaintext.

The increasing complexity of cryptosystems over the years has demanded serious development of the methods of cryptanalysis. A system is only as secure as its resilience to the strongest possible attack, and every cryptosystem in use today is continuously targeted by attackers, who are constantly developing new methods to increase their success rate. So, how can one be certain that a cryptosystem will remain secure against certain types of attacks? The following definition captures such a property.

Definition 2.1.5 (Semantic Security). Let $C = (X, Y, K, \mathcal{E}, \mathcal{D})$ be a cryptosystem and $e_k \in \mathcal{E}$. A cryptosystem is said to be semantically secure if, given $y = e_k(x)$, then $V(A(y, l)) = V(B(l))$, where $A$ and $B$ are two polynomial-time bounded adversaries, $l = |y|$ and $V$ is the advantage of the adversary, which is defined by

$$V(a) = P(a \text{ chooses the wrong plaintext}) - P(a \text{ chooses the correct plaintext}) \tag{2.2}$$

for every adversary $a$, where $P(E)$ denotes the probability of occurrence of event $E$.

¹ By "break" one means being able to recover the corresponding plaintext for any given ciphertext.

Notwithstanding, it is possible that a cryptosystem is semantically secure against some types of

attacks while having some flaws with respect to its construction, which entail undesired properties that

can be used by an adversary to exploit vulnerabilities in the scheme at hand.

The upcoming definitions are helpful for analyzing the semantic security of cryptosystems. The following description is based on the results presented in [2] and uses similar notation.

Consider the following scenario: an adversary $A$ living in one of two worlds² (left world $L$ or right world $R$) is trying to break a cryptographic system $C$ and has access to an encryption oracle $O_e$. $A$ does not know which world he lives in, but the world $W$ is defined a priori and cannot be changed throughout the entire activity of $A$. The encryption oracle, given any two plaintexts $p_0, p_1$, always returns the ciphertext $e_k(p_b)$, where $e_k$ is the encryption function of $C$ for some $k \in K$ and $b \in \{0, 1\}$ is picked according to the following relation

$$b = \begin{cases} 0 & \text{if } W = L \\ 1 & \text{if } W = R \end{cases} \tag{2.3}$$

$O_e$ is known as an lr-oracle.

Definition 2.1.6 (IND-CPA). Let $C = (P, C, K, \mathcal{E}, \mathcal{D})$ be a cryptosystem with encryption and decryption functions $e_k$ and $d_k$, respectively, for some $k \in K$, let $A$ be an adversary and $O$ an encryption lr-oracle. Consider that $A$ is in possession of $X = (x_1, \ldots, x_n) \in P^n$ and also that $|x_i| = |x_j|, \forall i \neq j$. Indistinguishability under chosen-plaintext attack (IND-CPA) is a game defined as follows:

1. $A$ picks two messages $x'_0, x'_1 \in X$;

2. $A$ queries the oracle $O$ with $(x'_0, x'_1)$;

3. $O$ encrypts $x'_b$ yielding $e_k(x'_b)$, according to 2.3;

4. $O$ returns the encryption output $e_k(x'_b)$ to $A$;

($A$ can repeat steps 1 to 4 at will);

5. $A$ chooses $b' \in \{0, 1\}$;

6. $A$ wins if $b' = b$ and loses otherwise.
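The game above can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the class `LROracle`, the functions `xor_encrypt`, `adversary` and `play_game`, and the deterministic XOR cipher itself are hypothetical stand-ins, not schemes or code from this work. It shows why any deterministic encryption function loses the game: the adversary first obtains the encryption of a known message by submitting it on both sides, and then compares it with the answer to a genuine left-or-right query.

```python
import secrets


class LROracle:
    """Left-or-right encryption oracle; the hidden bit b plays the role of the world W."""

    def __init__(self, encrypt, key: bytes):
        self._encrypt = encrypt
        self._key = key
        self._b = secrets.randbelow(2)  # 0 for the left world, 1 for the right world

    def query(self, x0: bytes, x1: bytes) -> bytes:
        """Steps 2 to 4 of the game: return the encryption of x_b."""
        if len(x0) != len(x1):
            raise ValueError("both messages must have the same length")
        return self._encrypt(self._key, (x0, x1)[self._b])

    def adversary_wins(self, guess: int) -> bool:
        """Step 6 of the game: the adversary wins if the guess equals b."""
        return guess == self._b


def xor_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Deterministic toy cipher (assumption): XOR with a fixed, reused key."""
    return bytes(p ^ k for p, k in zip(plaintext, key))


def adversary(oracle: LROracle) -> int:
    """Exploit determinism: equal plaintexts always yield equal ciphertexts."""
    m0, m1 = bytes(16), bytes([0xFF]) * 16
    reference = oracle.query(m0, m0)  # encryption of m0 in either world
    challenge = oracle.query(m0, m1)  # encryption of m_b
    return 0 if challenge == reference else 1


def play_game() -> bool:
    oracle = LROracle(xor_encrypt, secrets.token_bytes(16))
    return oracle.adversary_wins(adversary(oracle))


if __name__ == "__main__":
    print(all(play_game() for _ in range(100)))
```

Because the two chosen messages differ and the cipher is deterministic, this adversary guesses the world correctly in every run, which is the opposite of what an IND-CPA secure scheme must guarantee.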

If $A$ is able to correctly choose $b'$ with probability non-negligibly greater than 1/2, then the system at hand is not semantically secure against chosen-plaintext attacks (see property 1). The answer of the encryption oracle to the adversary's query is called the challenge ciphertext and, analogously, the answer of the decryption oracle is called the challenge plaintext.

² A world can be seen as a binary state.

A stronger measure of security can be defined based on the previous definition. If the adversary

not only has access to an encryption oracle but also to a decryption oracle, then A is granted a serious

amount of resources that threaten the security of the cryptosystem at hand as this is the most critical

and undesirable type of attack to defend against.

Definition 2.1.7 (IND-CCA). Indistinguishability under chosen-ciphertext attack (IND-CCA) is a game

analogous to the one from Definition 2.1.6, but herein the adversary has access to two lr-oracles: an

encryption lr-oracle Oe and a decryption lr-oracle. This game has the additional requirement that the

adversary A may not query the decryption oracle with challenge ciphertexts.

Let $O_d$ be the decryption lr-oracle. The adversary is not allowed to query $O_d$ with any challenge ciphertext, since otherwise it would be trivial for $A$ to gain an advantage in the IND-CCA game: $A$ would immediately know the world $W$. The two properties that follow are desirable when analyzing the security level of any cryptographic system.

Property 1 (IND-CPA secure). A cryptosystem is said to be IND-CPA secure if a polynomial-time bounded adversary who plays the IND-CPA game cannot win with probability non-negligibly greater than 1/2.

Property 2 (IND-CCA secure). A cryptosystem is said to be IND-CCA secure if a polynomial-time bounded adversary who plays the IND-CCA game cannot win with probability non-negligibly greater than 1/2.

According to [3], IND-CCA ⊂ IND-CPA, thus any system that is IND-CCA secure is also IND-CPA secure. Nowadays, the minimum threshold for a cryptosystem to be considered acceptable with regard to its security level is to satisfy property 1.

2.2 Modern Cryptography

During World War II, Claude Shannon [4] contributed to the development of cryptography, especially with the results presented in a classified 1945 paper [5], which influenced the development of modern-day cryptography.

This section describes linear and differential cryptanalysis techniques, some of the most important

block ciphers with focus on AES and finally several block cipher modes of operation and their properties.

2.2.1 Block Ciphers

Block ciphers can be defined as iterated product ciphers. The notion of an iterated cipher is straightforward and its main components are a round function and a key schedule. As the name suggests, the cipher consists in performing several rounds (iterations) of a round function applied to a state and a round key, where the initial state is the plaintext, each subsequent state is the image of the previous state and a round key under the round function, and each round key is one of the elements of the output of the key schedule algorithm. Cryptosystem 1 formally illustrates this description.

Cryptosystem 1. Let $X = \Sigma_P^*$ and $Y = \Sigma_C^*$ be the sets of plaintext and ciphertext bitstrings, respectively, and $K$ the key set. Let $\mathrm{round} : S \times K_R \to S$ be the round function and $f : K \to K_R$ the key schedule, where $S \supseteq X \cup Y$ is the set of states and $K_R$ the set of round keys with $|K_R| = r \in \mathbb{N}$, such that

$$\begin{aligned} x &= s_0; \\ y &= s_r; \\ f(k) &= (k_1, \ldots, k_r); \\ \mathrm{round}(s_i, k_{i+1}) &= s_{i+1}; \end{aligned} \tag{2.4}$$

An iterated cipher is defined as follows:

$$\begin{aligned} e_k(x) &= \mathrm{round}(\ldots \mathrm{round}(\mathrm{round}(s_0, k_1), k_2) \ldots, k_r); \\ d_k(y) &= \mathrm{round}^{-1}(\ldots \mathrm{round}^{-1}(\mathrm{round}^{-1}(s_r, k_r), k_{r-1}) \ldots, k_1); \end{aligned} \tag{2.5}$$

where $\mathrm{round}^{-1}$ is well defined iff $\mathrm{round}(s, k)$ is injective for fixed $k$.
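The structure of Cryptosystem 1 can be sketched in Python as follows. This is a toy under explicit assumptions: states are 32-bit integers, the number of rounds `ROUNDS`, the SHA-256 based `key_schedule` and the XOR-then-rotate `round_fn` are all hypothetical choices made for illustration and offer no security (the round function is linear). The point is only that encryption applies the round function with the round keys in order, while decryption applies the inverse round function with the round keys in reverse order.

```python
import hashlib
from typing import List

ROUNDS = 4          # assumed number of rounds r
MASK = 0xFFFFFFFF   # states are 32-bit integers in this toy


def key_schedule(key: bytes) -> List[int]:
    """f(k) = (k_1, ..., k_r): one 32-bit round key per round (toy derivation)."""
    digest = hashlib.sha256(key).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") for i in range(ROUNDS)]


def rotl32(v: int, n: int) -> int:
    return ((v << n) | (v >> (32 - n))) & MASK


def rotr32(v: int, n: int) -> int:
    return ((v >> n) | (v << (32 - n))) & MASK


def round_fn(state: int, round_key: int) -> int:
    """round(s_i, k_{i+1}) = s_{i+1}; injective in the state for every fixed round key."""
    return rotl32(state ^ round_key, 7)


def round_inv(state: int, round_key: int) -> int:
    """Inverse of round_fn for the same round key."""
    return rotr32(state, 7) ^ round_key


def encrypt(key: bytes, x: int) -> int:
    state = x  # s_0 is the plaintext
    for rk in key_schedule(key):
        state = round_fn(state, rk)
    return state  # s_r is the ciphertext


def decrypt(key: bytes, y: int) -> int:
    state = y
    for rk in reversed(key_schedule(key)):
        state = round_inv(state, rk)
    return state


if __name__ == "__main__":
    assert decrypt(b"toy key", encrypt(b"toy key", 0xDEADBEEF)) == 0xDEADBEEF
```

A real block cipher replaces the linear round function used here with a combination of non-linear substitutions and permutations.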

For block ciphers and according to the initial statement of this section, round usually contains a

combination of S-Boxes and/or P-Boxes.

A common technique used to increase the security of a block cipher is called key whitening and consists in performing an exclusive-or operation (XOR) with the round key in the initial and the last rounds. Whitening contributes to increasing the hardness of a brute-force attack. For a block cipher to be considered robust, it must have good confusion and diffusion [5], otherwise it may be susceptible to simple statistical attacks, namely linear and differential cryptanalysis. Robust block ciphers are widely used in today's cryptographic algorithms, namely in cryptographic hash functions and pseudo-random number generators.

2.2.1.1 Linear and Differential Cryptanalysis

Linear and differential cryptanalysis are the most common and devious attacks known against block ciphers. Generally speaking, both focus on finding probabilistic linear relations and exploiting them in such a manner that it becomes feasible to perform either known-plaintext or chosen-plaintext attacks. Both techniques make use of the bias of a random variable, as opposed to the raw probability, as it expresses how far the probability of the variable taking the value 1 deviates from the value 1/2 expected of a uniformly random bit. For a Bernoulli distributed random variable ($X \sim \mathrm{Ber}(p)$), this quantity is defined by

$$\varepsilon(X) = p - \frac{1}{2} \tag{2.6}$$

Linear Cryptanalysis

Assume that, for large $n$, the attacker Victor (V) has access to $(x_1, y_1), \ldots, (x_n, y_n)$ such that $e_k(x_i) = y_i$ for fixed $k$, where $e_k$ is the encryption function of the block cipher at hand. Moreover, suppose that V is able to linearly relate subsets of the plaintext, ciphertext and key bits through a linear approximation of the form

$$x_\oplus[a_1, \ldots, a_j] \oplus y_\oplus[b_1, \ldots, b_l] = k_\oplus[c_1, \ldots, c_m] \tag{2.7}$$

where $a_1, \ldots, a_j, b_1, \ldots, b_l, c_1, \ldots, c_m$ are fixed bit indexes and $v_\oplus[d_1, \ldots, d_h]$ represents $v[d_1] \oplus \ldots \oplus v[d_h]$ such that $v[d_i] \in \{0, 1\}, \forall i \in I_h$, for fixed bit index $d_i$.

The attack consists in assigning a counter $C_k$, initialized to the same value, to each possible key $k \in K$. For every pair $(x_i, y_i)$ in V's possession, he evaluates equation 2.7 for each $k$, and $C_k$ is incremented each time the equation holds. At the end of the whole process, the key $k$ with the highest counter value is the key for which the bits $k[c_1], \ldots, k[c_m]$ are considered to be correct.

Considering T to be the random variable that represents the outcome of equation 2.7, the effectiveness of a linear cryptanalytic attack is proportional to |ε(T)| [6]. According to [7], the number n of plaintext-ciphertext pairs that V needs to know in order for the attack to succeed with high confidence is approximately c ε(T)^{−2}, where c ∈ R is usually small. Note that (ε(T) → 0) ⇒ (n → ∞) and n → 4c when ε(T) → ±1/2.
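As a small illustration of the bias in (2.6), the following sketch measures how often a linear relation between input and output bits of a toy 4-bit S-box holds. The S-box values, the bit masks and the constant c are assumptions chosen for illustration only; they are not taken from any cipher discussed in this text.

```python
# Toy 4-bit S-box (an illustrative assumption).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def parity(v):
    """XOR of all the bits of v."""
    return bin(v).count("1") & 1


def bias(in_mask, out_mask):
    """Bias of the relation x^[in_mask bits] = S(x)^[out_mask bits]."""
    hits = sum(parity(x & in_mask) == parity(SBOX[x] & out_mask)
               for x in range(16))
    return hits / 16 - 0.5          # epsilon = p - 1/2, as in (2.6)


def pairs_needed(eps, c=8):
    """Rough estimate c * eps^-2 of the known pairs required.

    The constant c is an assumed small value.
    """
    return c / eps ** 2


eps = bias(0x6, 0xB)                # relation x2 ^ x3 = y1 ^ y3 ^ y4
print(eps, pairs_needed(eps))
```

The smaller the bias found by the attacker, the more plaintext-ciphertext pairs the estimate demands, which is why a good S-box keeps every such bias close to zero.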

Differential Cryptanalysis

Differential cryptanalysis is very similar to the aforementioned procedure of linear cryptanalysis, with the exception that one does not try to find a linear relation between the plaintext, ciphertext and key bits, but instead looks for a linear approximation relating differences³ of the plaintext and ciphertext bits to key bits. That is, instead of having pairs (x, y), the attacker V is now able to choose x, x′ ∈ P and compute a differential: the pair (Δx, Δy), where Δx = x ⊕ x′, Δy = y ⊕ y′, e_k(x) = y and e_k(x′) = y′.

For each of the possible keys k, V checks if the linear approximation between the differentials holds

and if so, the counter of k is incremented. This process is very similar to the one of linear cryptanalysis,

as the key with highest counter will be the one for which the bits of the linear relation are most likely

correct.

2.2.1.2 DES and 3DES

In 1977 DES [8] was published as an official FIPS and although considered insecure nowadays, it

had a major influence in modern cryptography. It was designed as a Feistel cipher [9] (cryptosystem 2).

Cryptosystem 2 (Feistel cipher). Let S be the set of states such that every s ∈ S is written as s = (s^L, s^R), where |s^L| = |s^R| and s^L ‖ s^R = s, let K_R be the set of round keys, g : S × K_R → S the round function and f the function that (possibly) contains the non-linear operations of the block cipher. Then, the round function used in the encryption procedure is given by

g((s^L_{i−1}, s^R_{i−1}), k_i) = (s^R_{i−1}, s^L_{i−1} ⊕ f(s^R_{i−1}, k_i)) (2.8)

where i = 1, . . . , N, for some number of rounds N ∈ N defined by the key schedule. Moreover, the round function for the decryption procedure is given by

g^{−1}((s^L_i, s^R_i), k_i) = (s^R_i ⊕ f(s^L_i, k_i), s^L_i) (2.9)
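The round functions (2.8) and (2.9) can be sketched as follows. This is a minimal toy, not DES: the function f (a truncated SHA-256), the 64-bit block size and the list of round keys are assumptions made only to show that g^{−1} undoes g regardless of whether f itself is invertible.

```python
import hashlib

HALF = 4  # bytes per half-state (assumed toy size: 64-bit blocks)


def f(half, round_key):
    """Toy non-linear function (assumption): truncated SHA-256."""
    return hashlib.sha256(round_key + half).digest()[:HALF]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def g(state, round_key):
    """Encryption round (2.8): (L, R) -> (R, L xor f(R, k))."""
    left, right = state
    return right, xor(left, f(right, round_key))


def g_inv(state, round_key):
    """Decryption round (2.9): (L, R) -> (R xor f(L, k), L)."""
    left, right = state
    return xor(right, f(left, round_key)), left


def encrypt(block, round_keys):
    state = (block[:HALF], block[HALF:])
    for k in round_keys:
        state = g(state, k)
    return state[0] + state[1]


def decrypt(block, round_keys):
    state = (block[:HALF], block[HALF:])
    for k in reversed(round_keys):      # round keys in reverse order
        state = g_inv(state, k)
    return state[0] + state[1]


# Hypothetical key schedule output: four fixed round keys.
round_keys = [bytes([i]) * 4 for i in range(1, 5)]
ciphertext = encrypt(b"8 bytes!", round_keys)
assert decrypt(ciphertext, round_keys) == b"8 bytes!"
```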

In 1992, after having (re)discovered differential cryptanalysis, Shamir and Biham published the first theoretical attack on DES, although it was practically infeasible at the time due to its complexity. Later on, a practical attack was indeed discovered using linear cryptanalysis, and in the years that followed the complexity of the attacks on DES confirmed that the standard had become obsolete. A modification of DES or the design of a new algorithm had become of utmost importance.

3 By difference, one means an operation, usually XOR.

The abrupt growth of computational capability at the time made it clear to researchers that the 56-bit key size of DES was too small for the security demanded of the algorithm. In order to increase its security against brute-force attacks without changing its core procedure, several parties came up with a straightforward solution (presented in cryptosystem 3): instead of performing a single block encryption with a single key, each block of plaintext is subject to three successive block encryptions using three (possibly) distinct keys. Note that the main goal of the cryptographers at the time was to solve the problem without having to create a new algorithm, which would save time and money because there would be no need to replace all the hardware that had DES implemented.

Cryptosystem 3 (Triple DES). Let Ek and Dk be the encryption and decryption procedures for the DES

algorithm using the 56-bit key k. The encryption and decryption procedures for Triple DES are given,

respectively, by:

e_k(x) = E_{k_3}(E_{k_2}(E_{k_1}(x)))
d_k(y) = D_{k_1}(D_{k_2}(D_{k_3}(y))) (2.10)

where x ∈ P, y ∈ C and k = (k_1, k_2, k_3) is either a 168, 112 or 56-bit key, based on the keying option. Three options were available for the keys:

1. (k_1 = k_2 = k_3) ⇒ 56-bit key;

2. (k_1 = k_3 ≠ k_2) ⇒ 112-bit key;

3. (k_1 ≠ k_2 ∧ k_1 ≠ k_3 ∧ k_2 ≠ k_3) ⇒ 168-bit key.

3DES is still considered to be secure due to the impracticability of the currently known linear crypt-

analytic attacks that require an infeasible number of known plaintext-ciphertext pairs. However, the

previous statement is only true for 168-bit keys as options 1 and 2 have been considered deprecated.

2.2.1.3 AES

Aiming to replace the encryption standard in order to cope with modern security demands, NIST issued a public call for proposals for a new encryption standard named AES. Several proposals were submitted (21 to be precise) and, after being subject to a thorough security analysis, each of the five finalists was considered to be secure. The choice of the Rijndael cipher [10] as the algorithm for the AES was based on its performance, versatility, simplicity and implementation details. In 2002, AES was admitted as the official encryption standard. It is a symmetric cryptosystem based on an iterated block cipher and, unlike DES, the cryptosystem does not follow a Feistel network but instead an SPN [11], which is briefly described in cryptosystem 4 and whose round function is illustrated in Figure 2.1.

Cryptosystem 4 (Substitution Permutation Network). Let l, m ∈ Z+, let π_S : {0, 1}^l → {0, 1}^l be an S-Box and π_P : {1, . . . , lm} → {1, . . . , lm} be a P-Box. Consider K = {(k_1, . . . , k_{n+1}) : k_j ∈ {0, 1}^{lm}}, where n is the number of rounds, and P = C = {0, 1}^{lm}. Let S be the set of states and, for a state s_{i−1} and a round key k_i, write u_i = s_{i−1} ⊕ k_i and let u_i<j> denote the jth l-bit block of u_i, for j = 1, . . . , m. Consider two round functions: g : S × K → S, which is given by

g(s_{i−1}, k_i) = π_P(π_S(u_i<1>) ‖ π_S(u_i<2>) ‖ . . . ‖ π_S(u_i<m>)) (2.11)

where π_P acts on the bit indexes of its argument, and the round function f : S × K → S, given by

f(s_{i−1}, k_i, k_{i+1}) = (π_S(u_i<1>) ‖ π_S(u_i<2>) ‖ . . . ‖ π_S(u_i<m>)) ⊕ k_{i+1} (2.12)

The encryption procedure consists in applying the g function n − 1 times followed by the f function, where s_0 = x.

Note that in the above cryptosystem the P-Box [7] is not applied in the last round, thus allowing the algorithm to be used for decryption with appropriate modifications.

AES block size is of 128 bits and the standard specified three4 possible key sizes: 128, 192 and 256

bits. There is a trade-off on security and performance directly related to the size of the key, since the

number of rounds of the algorithm varies according to the key length. For 128, 192 and 256-bit keys, the

number of rounds is, respectively, 10, 12 and 14. Nevertheless, even for 128-bit keys, the currently known

attacks and the foreseen computational capability in a near future lead to the conclusion that AES is

secure as a block cipher, regardless of the key length chosen. That is the reason AES is embedded in the large majority of modern-day cryptographic schemes and protocols that need to provide secrecy.

A high-level description of the algorithm's main functions is given next, followed by the algorithm's pseudocode in algorithm 1.

High-level Description

The 2001 FIPS publication of AES [12] introduced some functions. The same names are used herein and their informal definitions are presented below:

• AddRoundKey: performs an XOR operation between the current state and the current round key;

• SubBytes: replaces each byte of the current state by its corresponding entry in a fixed lookup table;

• ShiftRows: shifts the bytes of each row of the state according to a (fixed for each row) permutation;

• MixColumns: multiplies each column of the state by a fixed polynomial p(x).
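Three of the four functions above are short enough to sketch directly. The code below is a simplified illustration, assuming the state and the round key are held as 4×4 lists of byte values indexed as state[row][column]; SubBytes is omitted because it only needs the 256-entry lookup table from the specification. The helper names are hypothetical and do not come from any particular library.

```python
def add_round_key(state, round_key):
    """XOR every state byte with the matching round-key byte."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


def shift_rows(state):
    """Rotate row r of the state to the left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def xtime(a):
    """Multiply a by x in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def mix_column(col):
    """Multiply one 4-byte column by the fixed AES polynomial."""
    a0, a1, a2, a3 = col

    def mul3(a):
        return xtime(a) ^ a

    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_column to each of the four columns of the state."""
    cols = [mix_column([state[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]
```

Since AddRoundKey is a plain XOR, applying it twice with the same round key returns the original state, which is what makes the key addition trivially invertible during decryption.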

4The Rijndael cipher was more versatile in the subject, as it allowed more key and block sizes (multiples of 32 bits between 128

and 256 for both cases).

[Figure 2.1: Encryption round of a SPN. Corresponds to the round function g from cryptosystem 4. It is used in all rounds except the last. The state s_{i−1} is XORed with the round key k_i, yielding u_i; each l-bit block u_i<j> goes through π_S, yielding v_i<j>; π_P is then applied to the bit indexes of v_i, yielding s_i.]

Algorithm 1 AES algorithm

procedure AES(K, M)    ▷ Encrypting M with K
    state ← M;
    (K_1, . . . , K_{N+1}) ← KeySchedule(K);
    AddRoundKey(K_1, state);
    for r = 1 to N do
        SubBytes(state, π_S);
        ShiftRows(state, π_P);
        if r ≤ N − 1 then
            MixColumns(state);
        end if
        AddRoundKey(K_{r+1}, state);
    end for
    return c ← state;
end procedure

2.2.2 Block Cipher Modes of Operation

Block ciphers are very useful in modern cryptography, but they are only able to encrypt or decrypt one block of data of fixed size. Block cipher modes of operation were created so that one is able to encrypt a piece of data of arbitrary length using a block cipher. The way these modes make use of the block cipher at hand is very relevant for the security of the cryptosystem, for one can introduce flaws in the cryptosystem through a bad usage of the block cipher, even if the latter is considered to be secure against all known attacks.

Throughout this section let B be an arbitrary block cipher with b-bit block size, let P be an m-bit plaintext, C an m′-bit ciphertext and let n be the number of blocks of the message at hand. Moreover, consider the following notation:

• E^B_k := B block cipher encryption function using key k;

• D^B_k := B block cipher decryption function using key k;

• p_i := ith b-bit block of the plaintext;

• c_i := ith b-bit block of the ciphertext;

• ‖_{i=1}^{n} x_i := x_1 ‖ x_2 ‖ · · · ‖ x_n, where ‖ is the concatenation operator.

2.2.2.1 ECB

ECB mode is the most straightforward mode of operation for block ciphers. Its encryption and de-

cryption functions are, respectively, the following:

e^{ECB}_k(P) = ‖_{i=1}^{n} c_i, where c_i = E^B_k(p_i), ∀ i ∈ I_n
d^{ECB}_k(C) = ‖_{i=1}^{n} p_i, where p_i = D^B_k(c_i), ∀ i ∈ I_n (2.13)

where m = m′ is a multiple of the block cipher’s block size. For this reason padding is advised and it is

discussed in section 2.2.2.7. A graphic representation of the encryption and decryption procedures for

the ECB mode is presented in figures A.1a and A.1b, respectively.

2.2.2.2 CBC

Unlike ECB mode, CBC is a widely used mode of operation and is often suited for authentication purposes due to its ripple effect. Its encryption and decryption procedures are as follows:

e^{CBC}_k(P, IV) = ‖_{i=1}^{n} c_i, where c_i = E^B_k(c_{i−1} ⊕ p_i), ∀ i ∈ I_n
d^{CBC}_k(C, IV) = ‖_{i=1}^{n} p_i, where p_i = c_{i−1} ⊕ D^B_k(c_i), ∀ i ∈ I_n (2.14)

and such that c_0 = IV. Figure A.2 contains a graphic representation of both cases.

Both encryption and decryption functions in (2.14) have an additional argument IV, which is a random⁵ b-bit initialization vector (IV) whose role is to contribute to the XOR of the first iteration. By construction, one can easily observe that CBC mode encryption cannot be computed in parallel, since each iteration depends on the previous ciphertext block; the decryption mechanism, on the other hand, can be parallelized, since each plaintext block p_i can be obtained deterministically provided knowledge of the tuple (k, c_{i−1}, c_i). Due to this chaining, CBC is susceptible to errors in transmission, mainly triggered by noise in the communication channel induced either by an adversary or by environmental conditions: a corrupted ciphertext block c_i garbles the recovered block p_i and flips the corresponding bits of p_{i+1}, while an error introduced before encryption propagates to every subsequent ciphertext block.

5 The predictability of the IV gives room for feasible attacks on the cryptosystem. It is further discussed in section 2.6.7.
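The ECB and CBC definitions in (2.13) and (2.14) can be sketched as follows. The block cipher used here is a toy 4-round Feistel construction built on SHA-256; it is an assumption standing in for E^B_k and D^B_k and offers no security. Inputs are assumed to be already padded to a multiple of the block size.

```python
import hashlib
import os

B = 16  # block size in bytes (assumed)


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, rnd, half):
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:B // 2]


def E(key, block):
    """Toy Feistel block cipher standing in for E^B_k (not secure)."""
    l, r = block[:B // 2], block[B // 2:]
    for rnd in range(4):
        l, r = r, xor(l, _f(key, rnd, r))
    return l + r


def D(key, block):
    """Inverse of E, standing in for D^B_k."""
    l, r = block[:B // 2], block[B // 2:]
    for rnd in reversed(range(4)):
        l, r = xor(r, _f(key, rnd, l)), l
    return l + r


def blocks(data):
    return [data[i:i + B] for i in range(0, len(data), B)]


def ecb_encrypt(key, p):
    return b"".join(E(key, pi) for pi in blocks(p))


def cbc_encrypt(key, p, iv):
    prev, out = iv, []
    for pi in blocks(p):
        prev = E(key, xor(prev, pi))      # c_i = E(c_{i-1} xor p_i)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, c, iv):
    cs = [iv] + blocks(c)
    # p_i = c_{i-1} xor D(c_i): each block is independent of the others
    return b"".join(xor(cs[i - 1], D(key, cs[i])) for i in range(1, len(cs)))


key, iv = b"toy key", os.urandom(B)
p = b"A" * (3 * B)                        # three identical plaintext blocks
ecb, cbc = ecb_encrypt(key, p), cbc_encrypt(key, p, iv)
# Number of distinct ciphertext blocks under each mode
print(len(set(blocks(ecb))), len(set(blocks(cbc))))
assert cbc_decrypt(key, cbc, iv) == p
```

Identical plaintext blocks yield identical ciphertext blocks under ECB, whereas the chaining in CBC hides the repetition; this is the main reason ECB is discouraged for structured data.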

2.2.2.3 CFB

CFB mode can be seen as a self-synchronizing (asynchronous) stream cipher [13], since its keystream depends on the previous ciphertext. Each plaintext block p_i is encrypted by applying an XOR operation with a keystream element k_i, yielding the corresponding ciphertext block c_i. Each element k_i of the keystream is generated according to

k_i = E^B_k(c_{i−1}), ∀ i ∈ N (2.15)

and for the first block (i = 1) one has c_0 = IV. This procedure is illustrated in Figure A.3a and can be interpreted as follows:

e^{CFB}_k(P) = ‖_{i=1}^{n} (p_i ⊕ k_i)
d^{CFB}_k(C) = ‖_{i=1}^{n} (c_i ⊕ k_i) (2.16)

where c_i = p_i ⊕ k_i for i = 1, . . . , n. As for the CFB decryption, it is important to note that the block cipher's encryption E^B_k is used instead of the block cipher's decryption D^B_k.

Since the keystream generator function depends on the previous ciphertext block, this mode cannot

be parallelized for encryption.
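A minimal sketch of (2.15) and (2.16) follows. Because CFB only ever calls the forward direction of the block cipher, a truncated SHA-256 is used here as a stand-in for E^B_k; that substitution, the 16-byte block size and the function names are assumptions for illustration.

```python
import hashlib

B = 16  # block size in bytes (assumed)


def E(key, block):
    """Stand-in for E^B_k (assumption): truncated SHA-256."""
    return hashlib.sha256(key + block).digest()[:B]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def cfb_encrypt(key, p, iv):
    prev, out = iv, []
    for i in range(0, len(p), B):
        keystream = E(key, prev)             # k_i = E(c_{i-1}), c_0 = IV
        prev = xor(p[i:i + B], keystream)    # c_i = p_i xor k_i
        out.append(prev)
    return b"".join(out)


def cfb_decrypt(key, c, iv):
    prev, out = iv, []
    for i in range(0, len(c), B):
        ci = c[i:i + B]
        out.append(xor(ci, E(key, prev)))    # p_i = c_i xor k_i
        prev = ci                            # next keystream uses c_i
    return b"".join(out)


iv = bytes(range(B))
c = cfb_encrypt(b"key", b"attack at dawn, attack at dusk!!", iv)
assert cfb_decrypt(b"key", c, iv) == b"attack at dawn, attack at dusk!!"
```

Note that decryption never needs the inverse of E, and that every keystream element in cfb_decrypt depends only on ciphertext that is already available.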

2.2.2.4 CTR

The original modes of operation published as FIPS in 1981 did not include CTR mode. Only in

2001 was it added as a standard mode of operation, the same year of the public announcement for the

consideration of AES as an effective block cipher.

The general idea behind this block cipher mode of operation (BCMO) is to maintain a counter throughout the operations. A b-bit value IV is chosen as the initial counter value and thereafter every counter is computed based on the previous one. The block cipher is used to encrypt the counter block and its output is XORed with the plaintext block, yielding the ciphertext block. This procedure is illustrated in Figure A.4 and, as one can easily observe, CTR mode can be seen as a synchronous stream cipher [13].

More formally,

e^{CTR}_k(P, IV) = ‖_{i=1}^{n} (E^B_k(t_i) ⊕ p_i), where t_1 = IV
d^{CTR}_k(C, IV) = ‖_{i=1}^{n} (E^B_k(t_i) ⊕ c_i), where t_1 = IV (2.17)

where E^B_k is the encryption procedure of the block cipher B, p_i and c_i are the ith plaintext and ciphertext blocks, respectively, and t_i is the ith counter block such that

t_i = ctr(t_{i−1}) (2.18)

where ctr : {0, 1}^b → {0, 1}^b is the counter function.

There are several possibilities for the behaviour of the aforementioned ctr function, but the NIST recommendation [14] goes for the Standard Incrementing Function, which is given by

ctr(x) = x|_{b−m} ‖ ((x|_m + 1) mod 2^m) (2.19)

where x is a b-bit word, x|_m represents the last m bits of x, x|_{b−m} represents the first b − m bits of x and m ∈ N, with m ≤ b, is the counter length.

In contrast with CBC mode, the IV used herein does not need to be random; it just needs to be unique for each encryption under the same key. In other words, CTR mode's security lies in the uniqueness of the pair (t_i, k) for all the encryptions performed. Therefore, there is an upper bound u_l for the length (in bits) of a message to be given as input to the CTR encryption scheme, which is given by

u_l = b × 2^m (2.20)

because if the number of blocks exceeds the cardinality of the set of possible counters (n > 2^m), then t_{2^m+i} = t_i, ∀ i ≤ 2^m, which cannot happen, otherwise CTR's security becomes compromised⁶.

CTR mode is very suitable for situations where the time complexity of the encryption algorithm is of the essence, as it can be fully parallelized. The only pre-processing needed for this mode is the computation of the counter blocks, which is done in O(n) time, where n is the length of the input data, because there are n/b blocks, where b is the fixed block length, and each increment is done in time O(1).
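The following sketch puts (2.17) to (2.20) together. As before, a truncated SHA-256 stands in for E^B_k (an assumption, valid here because CTR never needs D^B_k), and the block size b = 128 bits and counter length m = 32 bits are illustrative choices.

```python
import hashlib

B_BYTES = 16          # block size b = 128 bits (assumed)
M_BITS = 32           # counter length m, with m <= b (assumed)


def E(key, block):
    """Stand-in for E^B_k (assumption): truncated SHA-256."""
    return hashlib.sha256(key + block).digest()[:B_BYTES]


def ctr(x):
    """Standard incrementing function (2.19) on a b-bit integer x."""
    prefix = x >> M_BITS                       # first b - m bits, unchanged
    low = (x + 1) & ((1 << M_BITS) - 1)        # (last m bits + 1) mod 2^m
    return (prefix << M_BITS) | low


def ctr_crypt(key, data, iv):
    """Encryption and decryption are the same operation in CTR mode."""
    t, out = int.from_bytes(iv, "big"), bytearray()
    for i in range(0, len(data), B_BYTES):
        keystream = E(key, t.to_bytes(B_BYTES, "big"))
        out += bytes(a ^ b for a, b in zip(data[i:i + B_BYTES], keystream))
        t = ctr(t)                             # t_i = ctr(t_{i-1})
    return bytes(out)


# Upper bound (2.20) on the message length, in bits, for these parameters
MAX_BITS = 8 * B_BYTES * 2 ** M_BITS

iv = bytes(12) + bytes(4)                      # nonce part || counter part
c = ctr_crypt(b"key", b"counter mode needs no padding", iv)
assert ctr_crypt(b"key", c, iv) == b"counter mode needs no padding"
```

Since each keystream block depends only on the key and the counter value, all of them can be computed in advance or in parallel, which is the property highlighted above.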

2.2.2.5 CCM

The block cipher modes of operation discussed so far provide secrecy to the data at hand. Notwith-

standing, there are modes which apart from secrecy, also provide authentication. CCM is one of those

modes and combines the CTR mode with the CBC-MAC mode, the former for secrecy purposes and the

latter for authentication.

CBC-MAC is very similar to CBC mode's encryption (see Figure A.2a), with the exception that instead of returning C, the algorithm returns only the last block c_n, i.e.

CBC-MAC(x, k) = e^{CBC}_k(x)|_b (2.21)

CCM [15] mode interleaves the authentication and confidentiality steps, taking as input a 3-tuple

(N,H,P ) such that N is a nonce (number used only once) intended to be used as the IV by the CTR

mode of operation, H is the header, which is data to be authenticated but not encrypted and P is the

plaintext that is going to be subject not only to authentication but also to encryption. The algorithm has

several pre-requisites, namely all the operations are done using the same key k and there is a formatting

function that takes as input a 3-tuple (N,H,P ) as above and returns a sequence of bitstring blocks.

There are some situations in which pieces of data may be of public knowledge and therefore there is no need to encrypt them, as one would only be increasing the memory and computational overhead. CCM provides a solution to this potential problem, as it allows the authentication of non-encrypted data without extending the length of the ciphertext.

6 Section 2.6.7 further discusses this topic.

2.2.2.6 GCM

Galois Counter Mode (GCM) [16] is another mode of operation that provides both authentication and confidentiality. This combined mode makes use of an adapted version of CTR to encrypt the data, while integrity and authenticity are granted by the Galois mode of authentication. The latter is known as Galois Message Authentication Code (GMAC) and is based on a keyed hash function which, even though it does not qualify as a cryptographic hash function (see section 2.3), is well suited for the job. With this mode it is imperative that the pair (v, k) is never reused for any given input data, where v is the IV and k is the key. The uniqueness requirement on the IVs is necessary to grant the system immunity to malleability by the authentication mechanism.

2.2.2.7 Padding

For the ECB and CBC modes, the length of the plaintext must be a multiple of the block size, thus one must⁷ pad the plaintext prior to encryption. There are several padding techniques used by distinct types of algorithms.

Bit Padding

One of the most used padding techniques for BCMO is called bit padding and consists in appending a bit 1 to the end of the plaintext and filling the remaining r = n × b − (m + 1) bit positions with the bit value 0, such that the length of the nth block is b, where n = ⌊m/b⌋ + 1.

PKCS7

PKCS7 padding [18] is another widely used padding technique. It consists in computing how many bytes remain until the end of the last block, k = (n × b − m)/8 with n = ⌊m/b⌋ + 1, and padding the message with k bytes, each with value k. Note that if m is already a multiple of the block size, the padding must be performed either way, because the recipient of the message is always expecting a padded message; thus, a new block must be added to the end of the plaintext. Since the value k must fit in a single byte, the padding is bounded by 255 bytes, so this technique must not be used with block ciphers whose block size is 256 bytes (2048 bits) or more.
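Both padding techniques can be sketched on byte strings as follows. The sketch assumes byte-aligned messages, so the single 1 bit of bit padding becomes the byte 0x80; the function names are hypothetical.

```python
def bit_pad(msg: bytes, block: int) -> bytes:
    """Bit padding: append a 1 bit (byte 0x80), then 0 bits up to the block end."""
    padded = msg + b"\x80"
    return padded + b"\x00" * (-len(padded) % block)


def bit_unpad(padded: bytes) -> bytes:
    """Drop everything from the last 0x80 marker onwards."""
    return padded[:padded.rindex(b"\x80")]


def pkcs7_pad(msg: bytes, block: int) -> bytes:
    """PKCS7: append k bytes of value k, with 1 <= k <= block."""
    if not 1 <= block <= 255:
        raise ValueError("block size must fit in one byte")
    k = block - len(msg) % block     # a full extra block if already aligned
    return msg + bytes([k]) * k


def pkcs7_unpad(padded: bytes, block: int) -> bytes:
    """Check and remove PKCS7 padding."""
    if not padded or len(padded) % block:
        raise ValueError("invalid length")
    k = padded[-1]
    if not 1 <= k <= block or padded[-k:] != bytes([k]) * k:
        raise ValueError("invalid padding")
    return padded[:-k]


print(pkcs7_pad(b"abc", 8))          # three message bytes, five padding bytes
assert pkcs7_unpad(pkcs7_pad(b"12345678", 8), 8) == b"12345678"
assert bit_unpad(bit_pad(b"abc", 8)) == b"abc"
```

In both schemes a message whose length is already a multiple of the block size grows by one whole block, so that the recipient can always remove the padding unambiguously.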

7 Although out of the scope of this text, there are methods that avoid the use of padding for ECB and CBC. These are called ciphertext stealing methods [17] and allow the ciphertext to have exactly the same length as the plaintext, at the cost of increasing the complexity of the algorithm.

Usually, by padding a message, one gains the advantage of hiding the true length of the plaintext. However, if the padding is not executed properly, some vulnerabilities may arise and an adversary may be able to successfully exploit them. In particular, when using a padding scheme, the plaintext may become vulnerable to a Padding Oracle Attack (POA), which is explained in section 2.6.5. Therefore, one should be careful whenever applying a padding algorithm.

2.2.3 Asymmetric Cryptography

There are clearly some issues related to symmetric cryptographic systems, since all the users that can either encrypt or decrypt data must know the unique key k a priori. If it is physically impossible for the users to share this information and there is no secure channel to transmit the key, then they cannot use a symmetric key cryptographic system to exchange confidential information. The concept of asymmetric cryptography emerged to work around this problem.

A public-key cryptographic system is based on a key pair (k_pub, k_priv), where k_pub stands for the public key and k_priv for the private key, the first being known publicly and the latter known only by its owner. Herein, there is a slightly distinct mode of operation when compared to a symmetric cryptosystem, since the owner of the private key k_priv uses it only to decrypt received information, which has been encrypted with his public key k_pub and could have been sent by anyone. For example, when Alice wants to send Bob a message, she encrypts it with Bob's public key B_pub and then he decrypts it with his private key B_priv. The security of asymmetric cryptosystems relies upon the presumed intractability of certain mathematical problems.

The major shortcoming of asymmetric cryptographic systems is their high computational complexity,

when compared with symmetric cryptosystems.

2.3 Cryptographic Hash Functions

A hash function outputs a fixed-length message for any given input of arbitrary length. Therefore, it can be a very useful tool in modern cryptography, which has led many researchers to study the properties of such functions. For a given hash function to be considered a cryptographic hash function it must satisfy the following properties:

(i) Efficiency: The hash value must be fast to compute.

(ii) One-Way Function: It is infeasible to invert.

(iii) Avalanche Effect: A small change in the input of the hash function produces a very distinct output.

(iv) Collision Resistance: It is very hard to find two distinct inputs with the same image.⁸

8 Note that the complexity of a birthday attack, O(2^{m/2}) for an m-bit message digest, is an upper bound on the collision resistance that any hash function can achieve.


Several hashing algorithms were created and among the most popular families are Message Digest (MD) and Secure Hash Algorithm (SHA). The most widely used hash function from the first family is MD5, which was deprecated as soon as an attack was shown to succeed in a considerably short time frame. As for the second family, the most widely used function is SHA-1 [19], which is considered to be insecure due to a successful collision attack [20] published by Google in February of 2017. Several theoretical attacks had already been found and thus this cryptographic hash function was considered to be on the edge of failure, foreseeing that a collision would be found soon enough. Hence the creation of SHA-2 [19] was mandatory, a version with four variants (SHA-224, SHA-256, SHA-384 and SHA-512) that extends the set of possible hashes to a point where the presently known collision attacks become infeasible.

The most straightforward use one can give to a hash function is for integrity purposes, i.e., given a message m and a cryptographic hash function h, the computation of h(m) yields an n-bit digest that can be used to check the integrity of the message m. Due to the collision resistance property of cryptographic hash functions, it is most likely that h(m) ≠ h(m′) for any m ≠ m′. Thus, for any word y, if h(y) = h(m) one can consider with a high level of trust y to be equal to m. This capability of providing integrity to the messages at hand is illustrated in Example 2.3.1.

Example 2.3.1 (Integrity). Let h be a cryptographic hash function of public knowledge and consider that

Alice sends Bob a message x along with its digest h(x). Bob receives the pair (m, d), where m is the

message and d the message digest and wants to verify whether m is in fact x, the message that was

sent by Alice. Thus he computes h(m) and accepts the message as valid if and only if h(m) = d.

The reader should not be misled, for the example above is only successful in an unrealistic situation. Since h is of public knowledge, any adversary capable of interfering with the communications would be able to replace (m, d) with (m′, h(m′)), for some malicious message m′, and Bob would successfully conclude the integrity verification without noticing that the message had been tampered with. The use of cryptographic hash functions requires a deep understanding of the involved components in order to satisfy the desired properties and strengthen the system at hand against attacks. Most modern-day cryptographic algorithms make use of cryptographic hash functions, as their properties help prevent potential vulnerabilities.
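The weakness of Example 2.3.1 is easy to reproduce. The sketch below uses SHA-256 from Python's standard hashlib module as the public hash function h; the choice of SHA-256 and the helper names are assumptions made for illustration.

```python
import hashlib


def h(m: bytes) -> bytes:
    """Public cryptographic hash function (SHA-256 chosen as an assumption)."""
    return hashlib.sha256(m).digest()


def send(x: bytes):
    """Alice sends the message together with its digest."""
    return x, h(x)


def verify(m: bytes, d: bytes) -> bool:
    """Bob accepts the message if and only if h(m) = d."""
    return h(m) == d


def tamper(pair, m_prime: bytes):
    """An active adversary replaces (m, d) with (m', h(m'))."""
    return m_prime, h(m_prime)


honest = send(b"pay Bob 10")
forged = tamper(honest, b"pay Eve 1000")
print(verify(*honest), verify(*forged))   # both verifications succeed
```

The digest alone detects accidental corruption, but not deliberate substitution; a secret key has to be involved for Bob to tell the two cases apart.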

2.3.1 SHA-256

A member of the SHA family, SHA-256 is able to generate a 256-bit message digest of any message with binary length b satisfying 0 ≤ b < 2^64, for the padding scheme associated with the algorithm's construction requires b to be written as a 64-bit number. A high-level description of the steps of the SHA-256 algorithm [19] follows, for an arbitrary b-bit input message x:

1. Pad and parse x into x_1, . . . , x_n;

2. Initialize i = 1 and set the 32-bit hash values h_1, . . . , h_8 to the fixed initial constants given in the specification;

3. Build the message schedule m_i based on x_i;

4. Build working variables {v_k}_{k∈I_8}, each based on the value of h_k;

5. Update the values of the working variables using m_i and v_k, ∀ k ∈ I_8;

6. Update the hash values h_k, ∀ k ∈ I_8, using the values of the variables obtained in the previous step;

7. Compute i = i + 1 and if i ≤ n then go to step 3, otherwise return the value h = h_1 ‖ · · · ‖ h_8.
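Step 1 of the description above, and the reason for the 2^64 bound on the message length, can be sketched as follows. The padding rule (a 1 bit, then 0 bits, then the 64-bit message length, up to a multiple of 512 bits) follows the SHA-256 specification; the helper names are hypothetical and the message is assumed to be byte-aligned. The remaining steps are left to Python's standard hashlib module.

```python
import hashlib


def sha256_pad(x: bytes) -> bytes:
    """Pad x so that its length becomes a multiple of 512 bits (64 bytes)."""
    bit_len = 8 * len(x)
    if bit_len >= 2 ** 64:
        raise ValueError("message too long for a 64-bit length field")
    padded = x + b"\x80"                              # a single 1 bit
    padded += b"\x00" * ((56 - len(padded)) % 64)     # 0 bits
    return padded + bit_len.to_bytes(8, "big")        # 64-bit length of x


def parse(padded: bytes):
    """Split the padded message into the 512-bit blocks x_1, ..., x_n."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]


blocks = parse(sha256_pad(b"abc"))
digest = hashlib.sha256(b"abc").digest()
print(len(blocks), 8 * len(digest))    # number of blocks, digest size in bits
```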

2.3.2 HMAC

The use of cryptographic hash functions has been associated with authentication purposes since the

first definition of a hash-based message authentication code (HMAC) [21] in 1997.

Definition 2.3.1. Let m be a message, k an l-bit key and h a cryptographic hash function whose compression function's block size is n bits. The following function f defines the HMAC-h:

f(k, m) = h((k′ ⊕ opad) ‖ h((k′ ⊕ ipad) ‖ m)) (2.22)

where ipad and opad are fixed n-bit strings and k′ is the resulting n-bit key such that, for j = n − l:

k′ = k ‖ 0^j if l < n,
k′ = h(k), padded with zeros up to n bits, if l > n,
k′ = k otherwise. (2.23)

HMAC grants both integrity and authenticity to the input data, but while the first follows trivially from

the use of a cryptographic hash function, the latter requires the key k to be shared solely between the

two involved parties. Clearly, if there are more than two parties with access to k, the recipient of the

messages will never be able to authenticate the sender.
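Definition 2.3.1 can be checked against a reference implementation. The sketch below instantiates h with SHA-256 (an assumption; any hash function from the definition would do), uses the customary byte values 0x36 and 0x5C for ipad and opad, and compares the result with Python's standard hmac module.

```python
import hashlib
import hmac


def hmac_sha256(k: bytes, m: bytes) -> bytes:
    """HMAC built directly from (2.22) and (2.23) with h = SHA-256."""
    n = 64                                   # SHA-256 block size in bytes
    if len(k) > n:
        k = hashlib.sha256(k).digest()       # long keys are hashed first
    k_prime = k.ljust(n, b"\x00")            # zero-pad up to the block size
    k_ipad = bytes(b ^ 0x36 for b in k_prime)
    k_opad = bytes(b ^ 0x5C for b in k_prime)
    inner = hashlib.sha256(k_ipad + m).digest()
    return hashlib.sha256(k_opad + inner).digest()


tag = hmac_sha256(b"shared key", b"message")
reference = hmac.new(b"shared key", b"message", hashlib.sha256).digest()
assert hmac.compare_digest(tag, reference)
```

Unlike the bare digest of Example 2.3.1, this tag cannot be recomputed by an adversary who replaces the message, since doing so requires the key k.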

2.4 Randomness

This section introduces the concept of randomness and some useful definitions regarding this topic. Randomness is a desired property for several algorithms, as it is a measurement of uniqueness and unpredictability, very suitable for solving various problems in the field of cryptography. The highest known level of randomness is purely theoretical; the next best thing is extracted from physical phenomena, for instance the movement of electrons. Throughout the years, researchers have been trying to develop software algorithms that behave like a true random generator, but to no avail: true randomness is a property still out of reach of deterministic algorithms. This entails the well-known fact that hardware randomness is better than software randomness.

Definition 2.4.1. A random number generator (RNG) is an algorithm that generates an unpredictable⁹ sequence of values, i.e., if one uses an RNG to generate a sequence a_1 · · · a_n for a_i ∈ Σ, then a third party cannot guess a_i with probability non-negligibly greater than 1/|Σ|.

9 Infeasible to be computed by a polynomial time algorithm.

When the hardware at hand lacks an RNG and one needs to implement random behaviour in a certain algorithm, the only solution is to implement an RNG based on the entropy generated by features available in software. The problem is that no such mechanism provides true randomness: in computer programming, random number generators are in fact pseudo-random number generators (PRNGs), since the stream of values produced by these algorithms is only seemingly non-deterministic. The whole process requires an input value, called the seed, which makes the algorithm deterministic. So, whenever using a PRNG, it is essential that an adversary cannot feasibly obtain the seed used, which means that not all PRNGs are suitable for use in cryptographic algorithms. In order to be used in cryptographic primitives, a PRNG must satisfy two very important properties:

1. Given an initial segment of a sequence generated by the PRNG, say the first k bits of the sequence, it is infeasible to compute the (k + 1)th bit with probability of success non-negligibly greater than 1/2 (see next-bit test [22]).

2. It is infeasible to reconstruct the stream of numbers previously generated by the PRNG based on a known internal state of the algorithm.

A PRNG satisfying the above properties, and therefore suitable for cryptographic applications, is named a cryptographically secure pseudo-random number generator (CSPRNG). It is, however, extremely difficult to find a CSPRNG, since most PRNGs are either vulnerable to tailored statistical attacks or leak information upon the unveiling of some internal state. Many cryptographic algorithms are very sensitive to predictability, meaning that a CSPRNG is usually used in steps where randomness is of the essence, for instance in the generation of cryptographic keys or salts.
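The role of the seed can be illustrated with Python's standard library, used here only as an example environment: the random module is a general-purpose PRNG whose output is fully determined by its seed, while the secrets module draws from the randomness source of the operating system and is intended for keys and salts. The function names below are hypothetical.

```python
import random
import secrets


def prng_stream(seed: int, n: int):
    """General-purpose PRNG: the same seed always yields the same stream."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(n)]


def new_key(n_bytes: int = 16) -> bytes:
    """Key material drawn from the operating system's CSPRNG."""
    return secrets.token_bytes(n_bytes)


# Anyone who learns the seed can reproduce the whole stream.
assert prng_stream(2017, 4) == prng_stream(2017, 4)

key, salt = new_key(32), new_key(16)
print(len(key), len(salt))
```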

There is yet another definition [23] that needs to be addressed in order for the reader to understand the concepts described in section 2.4.1.

Definition 2.4.2 (PRF). A family of functions {F_k : X → Y}_{k∈{0,1}*} is a pseudo-random function (PRF) if, for a randomly chosen instance function F_k, its output is indistinguishable (for a polynomial-time algorithm) from the output of a random function R : X → Y, where X and Y are the domain and range sets of the functions of the family, respectively.

PRFs are applicable in a wide variety of constructions, as this indistinguishability property is highly desirable.

2.4.1 Key Derivation

Regardless of the level of security of an underlying cryptographic algorithm, if one is able to obtain the secret key used in the process then the algorithm becomes unreliable, with the danger of compromising all the data that has been and/or is to be processed by it. It is, however, possible for a cryptographic key associated with a cryptosystem to be compromised without compromising any of the prior messages encrypted under that cryptosystem; such systems are said to provide forward secrecy. Nevertheless, one always wants to prevent adversaries from discovering the envisaged keys.

Throughout the years researchers have been using CSPRNGs to create and refine algorithms for the generation of stronger cryptographic keys. These methods are called key derivation functions (KDFs) and output an enhanced key for a given input secret. The increased resistance to attacks of the resulting cryptographic keys makes them suitable for most real-world applications.

In 2000, RSA Laboratories published a specification [24] in which the PBKDF2 key derivation function was introduced; it became quite popular and is one of the most widely used nowadays. The password-based key derivation function 2 (PBKDF2) takes as input five parameters:

PBKDF2(PRF, pass, salt, iter, len) (2.24)

where PRF is a pseudo-random function, pass and salt are octet strings such that the former is the secret password and the latter the cryptographic salt, both to be used in the inherent PRF; iter is an integer value corresponding to the number of iterations of the PRF and len is the length, in octets, of the envisaged output key. The number of iterations is directly related to the level of security of the procedure, since each additional iteration increases the cost of testing a candidate password. The steps of the PBKDF2 algorithm are given in detail in [25].
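The five parameters of (2.24) map directly onto the pbkdf2_hmac function of Python's standard hashlib module. The sketch below assumes HMAC-SHA-256 as the PRF and illustrative values for the password, salt and iteration count; it also recomputes the first output block by hand, iterating the PRF and XORing the intermediate values, to show what the iteration count controls.

```python
import hashlib
import hmac


def pbkdf2_first_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First output block of PBKDF2 with HMAC-SHA-256 as the PRF."""
    # U_1 = PRF(pass, salt || INT(1)); U_j = PRF(pass, U_{j-1})
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    t = u
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        t = bytes(a ^ b for a, b in zip(t, u))   # T_1 = U_1 xor ... xor U_iter
    return t


# PBKDF2(PRF, pass, salt, iter, len) with illustrative parameter values
key = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000, dklen=32)
assert key == pbkdf2_first_block(b"password", b"salt", 1000)
```

Because the requested length (32 octets) equals the output size of the PRF, a single block suffices here; longer keys are obtained by concatenating further blocks computed with INT(2), INT(3) and so on.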

There are several known KDFs but, among the most secure of its kind, PBKDF2 is considered to be the best suited for use in real-world applications, for it is the one with the best performance [26].

2.5 Communication Protocols in Wireless Networks

The Institute of Electrical and Electronics Engineers Standards Association (IEEE-SA) is an association that develops standards for several technological fields, namely telecommunications and information technology. It has been developing standards for over a century and among the published works is a family of network protocols for parties trying to connect to a local area network (LAN) or wireless local area network (WLAN), specified by the set S = {IEEE 802.1X : X is a unique identifier for the standard}. The most relevant elements of S are going to be discussed, as one of them (WPA2) is considered the most suitable protocol for wireless communication and is used nowadays throughout the world to provide personal or corporate devices with indirect access to the Internet without a cable connection.

2.5.1 WEP

The standard IEEE 802.11 ∈ S [27] contains the description of WEP [28], an algorithm intended to provide data secrecy and integrity to wireless networks, such that the level of security granted would be equivalent to that of a wired network. WEP was proved insecure mainly because its IV space is too small for busy networks, given that k is usually fixed in practice¹⁰ (recall that a stream cipher is vulnerable to key reuse attacks). After the break of WEP's security was published, automated tools were developed to recover the key used in the algorithm, and nowadays WEP encryption can be broken in less than a minute.

¹⁰ A personal computer is usually connected to a router acting as an AP and the password for the router is fixed, unless the user changes it manually.

2.5.2 WPA/WPA2

IEEE 802.11i [29], an amendment to IEEE 802.11, was put into effect because of WEP's exploitable flaws. The standard includes two new security protocols for communicating over a wireless channel, WPA and WPA2, which were intended to replace WEP. Both provide authentication either by a PSK or by an EAP, the latter requiring an authentication server. WPA's encryption process differs from WEP's, so that the former does not suffer from the same fragilities as its predecessor. Nevertheless, WPA was deprecated in 2012 due to its vulnerability to a message integrity code (MIC) recovery attack [30]. WPA was mainly created as an interim measure for hardware that could not support WPA2, the most recent version of WPA, which includes a CCM-AES-based encryption mode named CCMP [29] as a replacement for the TKIP [27] encryption and grants data confidentiality, authentication and access control.

The WPA2 communication protocol is composed of three main stages:

1. Initial authentication;

2. 4-way handshake;

3. Group-key handshake.

There is an entity called Authenticator whose role is to authenticate the parties that intend to join

the network. If WPA2-PSK is in effect then it solely communicates with the client, but for EAP mode it

acts as an intermediate point between each supplicant and the authentication server. This is the general

layout of wireless networks that make use of the WPA2 communication protocol and it is illustrated in

Figure 2.2.

[Figure: nodes Client, Authenticator and Server.]

Figure 2.2: Network Layout.

As mentioned above, there are two distinct modes for the WPA2 protocol: WPA2-PSK and WPA2-EAP. These two modes only differ in the first stage of the protocol, the initial authentication step, whose objective is to derive the PMK, an intermediate key that is used in the 4-way handshake to derive the PTK and GTK. These two unique-per-session keys contain sub-keys that are necessary for encrypting and decrypting the data flow between the client and the authenticator. The group-key handshake is used for updating the GTK so that the authenticator can securely distribute it to all the authenticated clients in the network; this key is used by the clients to decrypt multicast or broadcast data sent by the authenticator.

2.5.2.1 Initial Authentication

The PSK mode of the WPA2 protocol has the advantage of being faster than EAP because it does not need to go through an initial authentication step. In fact, the authenticator has a pre-defined password pass and an SSID ssid (usually the name of the network) and computes PBKDF2(HMAC-SHA-1, pass, ssid, 4096, 32) in order to build the 256-bit PSK. The client must follow the same procedure, but in order to do so it must possess the (private) pass and the (public) ssid. This option is usually chosen in personal networks where each client trusts any other client that is able to successfully authenticate itself and connect to the network. On the other hand, there may be situations where a client does not completely trust the other clients that may connect to the network. Consider, for instance, a corporate wireless network in which a router authenticates several employees who dislike each other; there must be a way to prevent each of them from tampering with data that is not intended for them. EAP mode, illustrated in simplified form in Figure 2.3, must be used in those situations, as its initial authentication step provides pairwise authentication by deriving a PMK for each client.

[Figure: message flow between Client, Authenticator and Server, with labels: Request; Response; Accept Client; Confirm acceptance; Authentication protocol between Client and Server (Client and Authenticator derive PMK).]

Figure 2.3: Extensible Authentication Protocol (EAP).

Independently of the chosen mode for WPA2, at the end of the initial authentication step both the

client and the authenticator possess the PMK, which is the PSK for the case of WPA2-PSK.

2.5.2.2 4-way Handshake

After the initial authentication step, the authenticator will confirm that the client possesses the correct

PMK by asking for the decryption of certain data. Moreover, the GTK is also transmitted to the client.

The 4-way handshake is depicted in Figure 2.4 and comprises the following steps:

1. Client and Authenticator each generate a nonce, nonce1 and nonce2, respectively;

2. Authenticator sends nonce2 to Client;

3. Client derives the PTK such that

$$\mathrm{PTK} = \mathrm{PRF}(gen) \quad\text{and}\quad gen = \mathrm{PMK} \,\Vert\, nonce_1 \,\Vert\, nonce_2 \,\Vert\, \mathrm{MAC}^{1}_{\mathrm{ADDRESS}} \,\Vert\, \mathrm{MAC}^{2}_{\mathrm{ADDRESS}} \tag{2.25}$$

where PRF is a pseudo-random function and $\mathrm{MAC}^{1}_{\mathrm{ADDRESS}}$ and $\mathrm{MAC}^{2}_{\mathrm{ADDRESS}}$ are the MAC addresses of the client and authenticator, respectively.

4. Client sends nonce1 and a MIC of that nonce to Authenticator.

5. Authenticator derives the PTK and generates the GTK.

6. Authenticator encrypts the GTK with PTK (eGTK) and computes a MIC of the encrypted GTK (mGTK) and sends the pair (eGTK, mGTK) to Client.

7. Client decrypts eGTK and sends an acknowledgement to Authenticator consisting of a MIC of the decrypted GTK.

[Figure: message flow between Client and Authenticator, with labels: Generate nonce1; Generate nonce2; nonce2; Derive PTK; (nonce1, MIC); Derive PTK and generate GTK; (eGTK, mGTK); Decrypt eGTK; Acknowledgement.]

Figure 2.4: WPA2 four-way handshake.

After performing the four-way handshake, both client and authenticator have the PTK and GTK with-

out ever disclosing these two keys and each of them knows that the other party also possesses the

same keys.
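Steps 1 to 5 of the handshake, preceded by the PSK derivation of WPA2-PSK, can be sketched as follows. This is only a sketch under stated assumptions: the PRF of (2.25) is replaced by HMAC-SHA-384 keyed with the PMK, the MIC is an HMAC keyed with the first 16 octets of the PTK, and the passphrases, SSID and MAC addresses are hypothetical. None of these choices reproduces the exact constructions of IEEE 802.11i, and steps 6 and 7 are omitted because they require a block cipher.

```python
import hashlib
import hmac
import os

def psk(passphrase: bytes, ssid: bytes) -> bytes:
    # PBKDF2(HMAC-SHA-1, pass, ssid, 4096) with a 32-octet output.
    return hashlib.pbkdf2_hmac("sha1", passphrase, ssid, 4096, dklen=32)

def derive_ptk(pmk: bytes, nonce1: bytes, nonce2: bytes, mac1: bytes, mac2: bytes) -> bytes:
    # Stand-in for PTK = PRF(PMK || nonce1 || nonce2 || MAC1 || MAC2):
    # HMAC-SHA-384 keyed with the PMK (an assumption, not the 802.11 PRF).
    return hmac.new(pmk, nonce1 + nonce2 + mac1 + mac2, hashlib.sha384).digest()

def mic(ptk: bytes, data: bytes) -> bytes:
    # MIC keyed with the first 16 octets of the PTK (illustrative sub-key split).
    return hmac.new(ptk[:16], data, hashlib.sha256).digest()[:16]

def handshake(client_pmk: bytes, auth_pmk: bytes) -> bool:
    client_mac = bytes.fromhex("020000000001")   # hypothetical addresses
    auth_mac = bytes.fromhex("020000000002")
    nonce1, nonce2 = os.urandom(32), os.urandom(32)            # step 1
    # step 2: nonce2 travels in the clear to the client
    ptk_client = derive_ptk(client_pmk, nonce1, nonce2, client_mac, auth_mac)  # step 3
    message = (nonce1, mic(ptk_client, nonce1))                # step 4
    ptk_auth = derive_ptk(auth_pmk, message[0], nonce2, client_mac, auth_mac)  # step 5
    # The authenticator accepts only if the MIC matches under its own PTK.
    return hmac.compare_digest(mic(ptk_auth, message[0]), message[1])

pmk = psk(b"example passphrase", b"ExampleNet")
accepted = handshake(pmk, pmk)
rejected = handshake(psk(b"wrong guess", b"ExampleNet"), pmk)
print(accepted, rejected)
```

A client holding the wrong password derives a different PMK, hence a different PTK, and its MIC fails verification without the PMK ever being transmitted.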

2.5.2.3 Group-key Handshake

The GTK needs to be updated every time a client disconnects from the AP (authenticator) or, as a security measure, upon the expiry of a timer. The group-key handshake is a two-way handshake, depicted in Figure 2.5, and comprises the following steps:

1. Authenticator updates GTK;

2. Authenticator encrypts GTK with PTK (eGTK) and generates a MIC (mGTK);

3. Authenticator sends the pair (eGTK,mGTK) to Client;

4. Client decrypts eGTK and sends an acknowledgement message to Authenticator consisting of a MIC

of the decrypted GTK.

[Figure: message flow between Authenticator and Client, with labels: Encrypt GTK and generate MIC; (eGTK, mGTK); Verify MIC and decrypt GTK; Acknowledgement reply.]

Figure 2.5: WPA2 group-key handshake.

2.6 Known Attacks

Cryptographic systems are only as trustworthy as their robustness to attacks: a cryptosystem that has survived a large number of distinct attacks is considered reliable for practical use, while systems that have not been subjected to such tests do not inspire the same confidence.

2.6.1 Brute Force and Dictionary Attacks

Given a certain cryptographic algorithm, a brute force attack consists in trying all possible inputs to the algorithm and checking whether each input leads to a desired output. For example, a hacker might try to break into a third party's personal computer by trying all possible passwords, one at a time.

A dictionary attack is no more than a brute force attack that narrows the space of possible words by only considering specific words, or subsets of those words, based on some alphabet. Dictionary attacks have proven to be very effective against some cryptographic systems, especially for password cracking, and are one of the main threats to passwords that are dictionary-based.

Countermeasures

In order to prevent brute force attacks, the number of words that can be written with alphabet Σ must be as high as possible without compromising the computational capability of the system at hand. As for dictionary attacks, not only must the previous requirement be met, but the passwords or secret keys must also be long enough and as random as possible.
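The effect of the alphabet size and of the word length on the search space, as well as the difference between the two attacks, can be sketched as follows. The use of an unsalted SHA-256 digest as the check for a "desired output", the toy alphabet and the word list are assumptions chosen only to keep the example small.

```python
import hashlib
import itertools

def keyspace(alphabet_size: int, max_len: int) -> int:
    # Number of non-empty words of length at most max_len over the alphabet.
    return sum(alphabet_size ** n for n in range(1, max_len + 1))

def brute_force(target: bytes, alphabet: str, max_len: int):
    # Try every word over the alphabet, shortest first.
    for n in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=n):
            word = "".join(combo)
            if hashlib.sha256(word.encode()).digest() == target:
                return word
    return None

def dictionary_attack(target: bytes, words):
    # Try only the words of a given list.
    for word in words:
        if hashlib.sha256(word.encode()).digest() == target:
            return word
    return None

target = hashlib.sha256(b"cab").digest()
found_by_brute_force = brute_force(target, "abc", 3)
found_by_dictionary = dictionary_attack(target, ["abc", "bca", "cab"])

toy_space = keyspace(3, 3)          # 3 + 9 + 27 words
lowercase_8 = keyspace(26, 8)       # lowercase letters, up to 8 symbols
printable_12 = keyspace(94, 12)     # printable ASCII, up to 12 symbols
print(found_by_brute_force, found_by_dictionary, toy_space, printable_12 // lowercase_8)
```

The dictionary attack succeeds after at most three trials here, whereas the brute force attack may have to walk through the whole toy space; enlarging the alphabet and the length makes the latter grow exponentially.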

2.6.2 Man In The Middle Attack

In a man-in-the-middle (MiM) attack the adversary is able not only to eavesdrop on the communications, but also to actively participate in the exchange of messages, in such a way that the other parties are not aware of the adversary's true identity.

Example 2.6.1. Consider the scenario represented in Figure 2.6: Alice and Bob are communicating through a non-secure channel C, which is being eavesdropped by Eve, a third party to whom none of the messages sent in the channel are addressed. Since C is not secure, Eve can listen to the communication and may be able to impersonate either Alice or Bob (or even both of them, in a worst case scenario). Alice may send a message M intended for Bob, but Eve is able to intercept the message M, replace it with a malicious message M′ and send M′ to Bob, who has no idea that the original message was M instead of M′.

A MiM attack can even be performed by an adversary who does not gain any additional information on the ciphertext and whose sole purpose is to disrupt all the communications by jamming the channel(s) with junk data, therefore preventing the reception of any message by the targeted end user. Nevertheless, most MiM attackers intend to extract information by acting as the other end party for each of the communicating entities.

[Figure: Alice, Eve and Bob; arrows labelled M and M′ pass through Eve.]

Figure 2.6: Man in the middle attack. Eve is able to intercept the message and/or jam the communication channel at will.

Countermeasures

It is very hard to detect every intrusion in a wireless network, especially one where the elements of the network are under restrictions of power consumption and activity extent. The most common procedure for preventing an active MiM attack is to always verify the integrity and authenticity of each received message. Even though there are intrusion detection systems for signalling such undesired interference in wireless networks and methods to grant extra layers of security, such as VPN connections, their consideration is out of the scope of this text, for their usage is not suitable to the discussed problem due to the constraints imposed on some parties.

2.6.3 Birthday Attack

Categorized into the set of collision attacks, a birthday attack is no more than a brute force attack

where the attacker has some useful probabilistic insight that reduces the set of possible outputs for the

same bit security, making it more efficient than a simple brute force.

Problem 2.6.1 (Birthday Problem). Given a room with n people, what is the probability that k of those

people have the same birthday?

Let Pk(n, d) be the value of the probability that holds the answer to the Birthday Problem, where d is

the number of possible values for each element, i.e., d = 365 for this specific case. Birthday attacks for

cryptographic hash functions are based on Problem 2.6.1 with k = 2. In fact, the probability that any two

out of n people have the same birthday is given by

$$P_2(n, 365) = 1 - \frac{365!}{(365 - n)! \times 365^{n}}$$

which follows trivially from a probabilistic analysis of Problem 2.6.1.

Consider an arbitrary hash function $h : X \to Y$ such that $\forall y \in Y : |y|_2 = b$, where $|y|_2$ is the number of bits of $y$. One can adapt the Birthday Problem to ask the following question: "Given $n$ randomly chosen inputs $\{x_1, \dots, x_n\} =: X_n$ such that $x_i \in X$ for all $1 \le i \le n$, what is the probability that $h(x_{i_1}) = h(x_{i_2})$ for some distinct $x_{i_1}, x_{i_2} \in X_n$?". The answer to the previous question is simply the value of $P_2(n, 2^b)$, and from [7, 31] one can conclude that $P_2(n, 2^b) \approx 1 - e^{-n^2/2^{b+1}}$, which is no more than the probability of finding a collision for a cryptographic hash function whose outputs are b-bit words. Let $m_b$ be the number of distinct outputs of $h$ that must be computed for $P_2(m_b, 2^b) \ge 0.5$ to hold. Solving the approximation for $n$ gives $m_b \approx \sqrt{2 \ln 2} \cdot 2^{b/2} \approx 1.18 \cdot 2^{b/2}$, so $m_b$ is of the order of $2^{b/2}$; this value indicates how many outputs of $h$ must be computed before a collision is expected to occur and is usually referred to as the birthday bound.

This probabilistic approach entails a reduction of every cryptographic hash function's bit security and, in order to prevent these types of attacks, one has to ensure that it is computationally infeasible to compute $2^{b/2}$ distinct outputs, for hash functions that return b-bit digests.
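The two probabilities above and the birthday bound can be checked numerically with a short sketch. The collision search uses SHA-256 truncated to 16 bits as a stand-in for a b-bit hash function; this truncation and the counter-based choice of inputs are assumptions made only so that a collision is found quickly.

```python
import hashlib
import math

def p2_exact(n: int, d: int) -> float:
    # 1 - d! / ((d - n)! * d^n), evaluated as a running product.
    if n > d:
        return 1.0
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (d - i) / d
    return 1.0 - p_distinct

def p2_approx(n: int, b: int) -> float:
    # 1 - exp(-n^2 / 2^(b+1)) for a hash function with b-bit outputs.
    return 1.0 - math.exp(-(n ** 2) / 2 ** (b + 1))

def truncated_hash(msg: bytes, bits: int) -> int:
    # Toy b-bit hash: the leading bits of SHA-256 (illustrative only).
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") >> (256 - bits)

def find_collision(bits: int):
    # Hash successive inputs until two of them share the same output.
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        tag = truncated_hash(msg, bits)
        if tag in seen:
            return seen[tag], msg, counter + 1
        seen[tag] = msg
        counter += 1

birthday_room = p2_exact(23, 365)      # classic case: 23 people
at_bound = p2_approx(2 ** 16, 32)      # n = 2^(b/2) for b = 32
first, second, tries = find_collision(16)
print(round(birthday_room, 4), round(at_bound, 4), tries)
```

For b = 16 the search is expected to stop after a few hundred trials, far fewer than the 2^16 possible outputs, which is the reduction in bit security discussed above.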

2.6.4 Replay Attack

A replay attack is a special case of a MiM attack. Here an adversary is able to gain additional information by eavesdropping and saving a transmitted message, or part of a message, and re-transmitting it later, either in another protocol or in another run of the same communication protocol that was eavesdropped.

Countermeasures

There are several possible procedures to prevent a system from being vulnerable to a replay attack. In general, for a cryptographic system to be resistant to this type of attack, each communication session must be uniquely identified, which can be achieved by granting each message a session identifier.

Another method for preventing replay attacks consists in using timestamps. Suppose that Bob has a clock from which he periodically broadcasts its current time t, together with a message authentication code (MAC), for authentication purposes. Whenever Alice wants to send Bob a message x, she encrypts x into y with some cipher and then generates a guess $timestamp_{t'}$ of Bob's real current time t′, based on t, which is then authenticated with a MAC. Upon receiving the whole package at time t″, Bob only accepts the message for further checking of its authentication and integrity if $t'' - timestamp_{t'} < \varepsilon$, for some ε > 0. Note that if Eve wants to replay a message she is able to do so for as long as $t'' - timestamp_{t'} < \varepsilon$, i.e., this procedure is not completely reliable for replay attack protection: if either ε is not small enough or an attacker can replay quickly enough regardless of the value of ε, then the cryptographic scheme is compromised.
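The timestamp check and its limitation can be sketched as follows. The packet layout (an 8-octet timestamp followed by the ciphertext and an HMAC-SHA-256 tag), the shared key and the value of ε are hypothetical choices; times are passed explicitly so that the example is deterministic.

```python
import hashlib
import hmac
import struct

def send(key: bytes, y: bytes, timestamp: float) -> bytes:
    # Alice: timestamp || ciphertext, authenticated with a MAC.
    body = struct.pack(">d", timestamp) + y
    return body + hmac.new(key, body, hashlib.sha256).digest()

def accept(key: bytes, packet: bytes, now: float, eps: float) -> bool:
    # Bob: freshness window first, then authentication and integrity.
    body, tag = packet[:-32], packet[-32:]
    (timestamp,) = struct.unpack(">d", body[:8])
    if not now - timestamp < eps:
        return False
    return hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest())

key = b"key shared by Alice and Bob"
packet = send(key, b"ciphertext y", timestamp=100.0)

first_delivery = accept(key, packet, now=100.4, eps=2.0)
quick_replay = accept(key, packet, now=101.5, eps=2.0)   # still inside the window
late_replay = accept(key, packet, now=103.0, eps=2.0)    # outside the window
refreshed = struct.pack(">d", 102.9) + packet[8:]        # Eve rewrites the timestamp
forged_replay = accept(key, refreshed, now=103.0, eps=2.0)
print(first_delivery, quick_replay, late_replay, forged_replay)
```

The quick replay is accepted, which is exactly the weakness noted above, while the late replay and the replay with a rewritten timestamp are rejected.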

2.6.5 Padding Oracle Attack

To perform a padding oracle attack one must have access to a padding oracle¹¹ O. Among the BCMO discussed in section 2.2.2, the modes that require padding (ECB and CBC) are vulnerable to a padding oracle attack, but while this attack does not completely break ECB, for CBC it is lethal due to its encryption and decryption mechanisms. Nevertheless, note that if padding is used in either CFB, OFB or CTR, then these encryption schemes also become vulnerable to this type of attack, provided that no authentication layer is associated with them.

A padding oracle attack on CBC encryption is exemplified and discussed throughout this section. Consider the following situation:

• Alice (A) and Bob (B) want to communicate through a communication channel and share a common secret s, which they decide to use as the key for a block cipher of their choice, whose block size is b, together with CBC mode to provide secrecy to their messages.

• A and B agree on padding the messages as in [32].

• Victor (V), an adversary, has access to the communication channel and is able to perform a MiM attack. He also has access to a padding oracle that answers whether an encrypted message y is correctly padded.

¹¹ A padding oracle is a system that gives a binary answer to the question: "Is this message properly padded?".

Consider that A encrypts a message x and sends the encrypted result y along with the IV through the communication channel, which is intercepted by V. Assume w.l.o.g. that n = 2, i.e., $y = y_1 \Vert y_2$, where

$$y_i = \Big\Vert_{j=1}^{b} y_{ij},\ \forall i \in I_2 \quad\text{and}\quad O(y) = \text{true}. \tag{2.26}$$

Recall CBC decryption:

$$d^{CBC}_s(C) = \Big\Vert_{i=1}^{n} P_i$$

where $P_i = d^b_s(C_i) \oplus C_{i-1}$ and $C_0 = IV$.

V now decides to change the last byte of $C_1$ in the following way: $C^*_{1b} = C_{1b} \oplus z_b \oplus \texttt{0x01}$, where $z_b$ is a guess for the last byte of $P_2$ and 0xh is the hexadecimal representation of a byte such that h ∈ {00, 01, . . . , FE, FF}. After having replaced $C_{1b}$ with $C^*_{1b}$, V has $C^* = C^*_1 \Vert C_2$, where $C^*_1 = C_{11} \Vert \dots \Vert C_{1(b-1)} \Vert C^*_{1b}$. Then V makes use of the oracle in order to know whether $C^*$ is properly padded by calling $O(C^*)$. Note that

$$d^b_s(C^*_1) \oplus IV = P^*_{11} \Vert \dots \Vert P^*_{1b} \ne P_1 \tag{2.27}$$

due to the avalanche effect of the block cipher; on the other hand,

$$d^b_s(C_2) \oplus C^*_1 = P_{21} \Vert \dots \Vert P_{2(b-1)} \Vert P^*_{2b} \tag{2.28}$$

meaning that only the last byte of $P_2$ is changed while the remaining b − 1 bytes have not been altered. Let $d_b$ correspond to the last byte of $d^b_s(C_2)$. Since $P^*_{2b} = C_{1b} \oplus z_b \oplus \texttt{0x01} \oplus d_b$ and $C_{1b} \oplus d_b = P_{2b}$, then $P^*_{2b} = P_{2b} \oplus z_b \oplus \texttt{0x01}$ by the associativity and commutativity of the XOR operation.

At this point, there are two possible cases:

1. P was not padded prior to encryption.

2. P was padded prior to encryption.

In case 1, the conclusion is straightforward:

(a) $(z_b = P_{2b}) \Rightarrow (P^*_{2b} = \texttt{0x01})$, which corresponds to correct padding for PKCS7 and in this case $O(C^*) = \text{true}$. This means that V has found the last byte of $P_2$.

(b) $(z_b \ne P_{2b}) \Rightarrow (P^*_{2b} \ne \texttt{0x01})$, meaning that $O(C^*) = \text{false}$, so V chooses a fresh¹² guess $z_b$ and repeats the process.

Thus $O(C^*) = \text{true}$ if and only if $z_b = P_{2b}$.

In case 2, situation (b) is still valid, but $O(C^*) = \text{true} \not\Rightarrow z_b = P_{2b}$, because now $P_2 = x_1 \Vert \dots \Vert x_r \Vert x'_1 \Vert \dots \Vert x'_k$ where $r + k = b$, $k \in (\mathbb{Z}_{256} \setminus \{0\})$ and $x'_i$ represent the padded bytes such that $x'_i = k$, $\forall i \in I_k$. For k > 1 there are two possible values of $z_b$ that entail an acceptance of the modified word by the oracle, which are:

(2a) $(z_b = P_{2b}) \Rightarrow (P^*_{2b} = \texttt{0x01})$;

(2b) $(z_b = t) \Rightarrow (P^*_{2b} = k)$, for some $t \in (\mathbb{Z}_{256} \setminus \{P_{2b}\})$;

Therefore, in order for V to determine which of these two acceptances corresponds to the true last byte of $P_2$, V modifies the second to last byte of $C_1$ by flipping an arbitrary positive number of bits, which will definitely yield a value distinct from the previous one. After doing so, V runs the previous procedure for the iterated guess $z_b$ until he finds $z_b \in \mathbb{Z}_{256}$ such that $O(C^*) = \text{true}$, in which case V is sure to have found the value of $P_{2b}$.

After discovering the last byte $P_{2b}$, V proceeds in trying to find the next (second to last) byte of $P_2$, i.e., $P_{2(b-1)}$. The same arguments can be applied to this situation, as follows:

$$C^*_{1b} = C_{1b} \oplus P_{2b} \oplus \texttt{0x02}, \qquad C^*_{1(b-1)} = C_{1(b-1)} \oplus z_{(b-1)} \oplus \texttt{0x02} \tag{2.29}$$

thus $C^*_1 = C_{11} \Vert \dots \Vert C_{1(b-2)} \Vert C^*_{1(b-1)} \Vert C^*_{1b}$ and V keeps making calls to the oracle $O(C^*)$ until it accepts the input, in which case V has found the second to last byte of $P_2$.

Note that in this case $d^{CBC}_s(C^*) = P^*_1 \Vert P^*_2$ where $P^*_1 \ne P_1$ due to the avalanche effect but $P^*_2 = P_{21} \Vert \dots \Vert P_{2(b-2)} \Vert P^*_{2(b-1)} \Vert P'_{2b}$, where $P'_{2b} = \texttt{0x02}$ is fixed independently of the guess $z_{(b-1)}$ and, on the other hand, $P^*_{2(b-1)} = C_{1(b-1)} \oplus z_{(b-1)} \oplus \texttt{0x02} \oplus d_{(b-1)}$, where $d_{(b-1)}$ is the second to last byte of $d^b_s(C_2)$. Again, the same arguments 1 and 2 can be applied to $P^*_{2(b-1)}$ and the attacker only needs 511 attempts in a worst case scenario in order to find the correct byte $P_{2(b-1)}$. Algorithm 2 contains the pseudocode for the whole procedure and one can easily see that V is able to recover the whole plaintext P in time O(mb), where m is the size of the ciphertext and b the block size. Since these values are usually not large, the algorithm runs efficiently. Lastly, note that the algorithm does not take into account case 2 but one can easily adapt it for this situation.
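The whole attack, including the extra query that settles case 2, can be sketched in Python. Since the standard library offers no block cipher, the sketch uses a toy 4-round Feistel network built on SHA-256 as a stand-in for the block cipher; this cipher, the 16-byte block size, the key and the message are assumptions for illustration only and the toy cipher must not be regarded as secure. The oracle reveals nothing but the validity of the padding.

```python
import hashlib
import os

B = 16  # block size b in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:B // 2]

def enc_block(key: bytes, blk: bytes) -> bytes:
    # Toy 4-round Feistel network: a stand-in block cipher, NOT secure.
    l, r = blk[:B // 2], blk[B // 2:]
    for rnd in range(4):
        l, r = r, xor(l, f(key, rnd, r))
    return l + r

def dec_block(key: bytes, blk: bytes) -> bytes:
    l, r = blk[:B // 2], blk[B // 2:]
    for rnd in reversed(range(4)):
        l, r = xor(r, f(key, rnd, l)), l
    return l + r

def cbc_encrypt(key: bytes, iv: bytes, m: bytes) -> bytes:
    k = B - len(m) % B                       # PKCS#7: append k bytes of value k
    padded, prev, out = m + bytes([k]) * k, iv, b""
    for i in range(0, len(padded), B):
        prev = enc_block(key, xor(padded[i:i + B], prev))
        out += prev
    return out

def make_oracle(key: bytes):
    def oracle(prev: bytes, cur: bytes) -> bool:
        # Answers only: "is d(cur) XOR prev properly padded?"
        p = xor(dec_block(key, cur), prev)
        return 1 <= p[-1] <= B and p.endswith(bytes([p[-1]]) * p[-1])
    return oracle

def recover_block(oracle, prev: bytes, cur: bytes) -> bytes:
    plain = bytearray(B)
    for j in range(B - 1, -1, -1):           # last byte first
        x = B - j                            # padding value being forced
        for z in range(256):                 # guess z for plaintext byte j
            forged = bytearray(prev)
            for k in range(j + 1, B):
                forged[k] = prev[k] ^ plain[k] ^ x
            forged[j] = prev[j] ^ z ^ x
            if not oracle(bytes(forged), cur):
                continue
            if j > 0:                        # case 2: exclude an accidental longer padding
                forged[j - 1] ^= 0xFF
                if not oracle(bytes(forged), cur):
                    continue
            plain[j] = z
            break
    return bytes(plain)

def attack(oracle, iv: bytes, c: bytes) -> bytes:
    blocks = [iv] + [c[i:i + B] for i in range(0, len(c), B)]
    padded = b"".join(recover_block(oracle, blocks[i - 1], blocks[i])
                      for i in range(1, len(blocks)))
    return padded[:-padded[-1]]              # strip the recovered padding

key, iv = os.urandom(16), os.urandom(B)
secret = b"attack at dawn, bring coffee"
recovered = attack(make_oracle(key), iv, cbc_encrypt(key, iv, secret))
print(recovered)
```

The chosen message is padded with four bytes of value 4, so the last byte of the final block exercises case 2: the first accepted guess is a false positive that the second oracle query discards.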

2.6.6 Stream Cipher Attacks

The underlying security of stream ciphers is based on their good usage, as an adversary can take advantage if certain precautions are not taken. In general, stream ciphers are considered to be very secure, provided that one does not reuse the key and runs an authenticity check on every encrypted message.

¹² By fresh, one means a value in the set of possible values that has not yet been chosen.

Algorithm 2 Padding Oracle Attack on CBC encryption

procedure POA(C, O)    ▷ Discovering P without s; C0 = IV
    Initialize zero valued array P with length nb bytes;
    for i = n to 1 do
        x ← 1;
        for j = b to 1 do
            C′ ← C(i−1);    ▷ fresh copy of the preceding block
            for k = j + 1 to b do
                C′k ← C′k ⊕ Pik ⊕ x;
            end for
            z ← −1;
            A ← false;
            while A == false do
                z ← z + 1;
                C′j ← C(i−1)j ⊕ z ⊕ x;
                A ← Ask oracle O if C′ ‖ Ci is properly padded;
            end while
            Pij ← z;
            x ← x + 1;
        end for
    end for
end procedure

Let S be a stream cipher with encryption and decryption functions ek and dk, for a given key k ∈ K

and assume that Eve is an adversary.

2.6.6.1 Key Reuse

Suppose that Eve is able to perform a MiM attack and let m1 and m2 be two messages such that m1 ≠ m2 and assume w.l.o.g. that |m1| = |m2| = l. Since S is a stream cipher, there is a keystream generator function g which produces the keystream k1k2 · · · based on some internal state. Let k′ be the substring k1 . . . kl such that y1 = ek(m1) = m1 ⊕ k′ and y2 = ek(m2) = m2 ⊕ k′. Upon intercepting y1 and y2, Eve is able to compute y1 ⊕ y2 = m1 ⊕ m2 due to the associative, commutative and self-inverse properties of the XOR operator.

Statistical analysis can now be applied to recover m1 and m2 with a high degree of confidence. Let Σ be the alphabet at hand, let $m_3 := m_1 \oplus m_2 = m_{31} \dots m_{3l}$ such that $m_{3i} \in \Sigma\ \forall\, 1 \le i \le l$, let $X_i$ be a random variable representing the value of the ith element of an arbitrary plaintext x and consider $P(X_i = m_{3i}) = p_i$, $\forall i \in \{1, \dots, l\}$. The set

$$C_i = \{(a, b) : a \oplus b = m_{3i} \wedge a, b \in \Sigma\} \tag{2.30}$$

contains the (possibly many) pairs whose XOR yields the intended ith element of $m_3$. Assuming that the probability distribution of the alphabet elements is not homogeneous, a simple approach is to choose the pair $(a, b) \in C_i$ that satisfies

$$\max_{(a,b) \in C_i} P(X_i = a)\, P(X_i = b) \tag{2.31}$$

By applying this method, Eve is able to recover the plaintexts with high confidence without knowing the secret key used in the stream cipher's encryption procedure. More complex probabilistic relations may be required to increase the degree of confidence in the choice of the pairs (a, b) that satisfy equation 2.31, for all 1 ≤ i ≤ l.
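The key reuse leak and the pair selection of (2.31) can be sketched as follows. The keystream generator is SHAKE-256 seeded with the key, used here only as a stand-in for g; the messages and the three-symbol frequency table are hypothetical.

```python
import hashlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, m: bytes) -> bytes:
    # Stand-in stream cipher: keystream k' taken from SHAKE-256(key).
    return xor(m, hashlib.shake_256(key).digest(len(m)))

def best_pair(m3i: int, freq: dict) -> tuple:
    # Pair (a, b) of C_i maximising P(X_i = a) * P(X_i = b), as in (2.31).
    candidates = [(a, a ^ m3i) for a in freq if (a ^ m3i) in freq]
    return max(candidates, key=lambda ab: freq[ab[0]] * freq[ab[1]])

key = b"reused key"
m1 = b"transfer 100 EUR"
m2 = b"meet me at 9 pm."
y1, y2 = encrypt(key, m1), encrypt(key, m2)

m3 = xor(y1, y2)                   # the keystream cancels out
recovered_m2 = xor(m3, m1)         # any known or guessed m1 reveals m2

freq = {ord("e"): 0.5, ord("t"): 0.3, ord(" "): 0.2}   # toy distribution
pair = best_pair(ord("e") ^ ord(" "), freq)
print(m3 == xor(m1, m2), recovered_m2, pair)
```

Note that (a, b) and (b, a) always have the same score, so the method identifies the two symbols but not which message each belongs to; context from neighbouring positions is needed to settle that.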

Countermeasures

The only countermeasure for this situation is to never use a key more than once. For ciphers that

include an IV as part of their input, the pair (IV, k) can be seen as the general key to the cryptosystem

and the key k may be used more than once, as long as the initialization vector IV does not repeat, which

is done in practice by randomly choosing an IV out of the set of possible IVs. Given the generally high

cardinality of the latter, CSPRNGs are the most common choice in order to maximize the underlying

algorithm’s performance.

2.6.6.2 Bit-flipping

Suppose that Alice and Bob communicate through a communication channel C on which Eve is able to perform a MiM attack. Moreover, assume that Alice wants to send Bob a message m such that $[m]_2 = m_1 \dots m_n$ and that $\exists\, i, j \in \mathbb{N} : 1 \le i \le j \le n$ for which Eve knows $m_i \dots m_j$. Alice encrypts m using S encryption and sends it to Bob through C. Eve, upon intercepting the message $y = m \oplus k'$, where k′ is the most significant n-bit substring of the keystream produced by the keystream generator function, computes

$$y' := y \oplus (m_E \oplus v) \tag{2.32}$$

where $[m_E]_2 = 0^{i-1} m_i \dots m_j 0^{n-j}$ and $[v]_2 = 0^{i-1} v_i \dots v_j 0^{n-j}$, such that the bitstring $v_i \dots v_j$ is an evil bitstring chosen by Eve. After performing the operation in (2.32) she sends the resulting message through C to Bob, who upon receiving y′ decrypts it as follows

$$\begin{aligned} d_k(y') &= k' \oplus y' \\ &= k' \oplus (m \oplus k') \oplus (m_E \oplus v) \\ &= (m \oplus m_E) \oplus v \\ &= \tilde{m} \oplus v \end{aligned} \tag{2.33}$$

where $[\tilde{m}]_2 = m_1 \dots m_{i-1} 0^{j-i+1} m_{j+1} \dots m_n$, thus

$$[\tilde{m} \oplus v]_2 = m_1 \dots m_{i-1} v_i \dots v_j m_{j+1} \dots m_n \tag{2.34}$$

Note that Eve does not know the secret key k shared between Alice and Bob, but her knowledge of bits of the message m being sent makes it possible to alter them at will, without Bob noticing.
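Equations (2.32) to (2.34) can be reproduced at byte level with a short sketch. The stream cipher is again a SHAKE-256 keystream used as a stand-in for S, and the message, the known fragment and its position are hypothetical.

```python
import hashlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def keystream(key: bytes, n: int) -> bytes:
    # Stand-in keystream generator (assumption): SHAKE-256 of the key.
    return hashlib.shake_256(key).digest(n)

key = b"secret shared by Alice and Bob"
m = b"PAY 0100 EUR TO EVE"
y = xor(m, keystream(key, len(m)))            # Alice sends y = m xor k'

# Eve knows the fragment "0100" at offset 4 and wants "9999" instead.
offset, known, evil = 4, b"0100", b"9999"
tail = len(m) - offset - len(known)
m_E = bytes(offset) + known + bytes(tail)     # zeros outside the known fragment
v = bytes(offset) + evil + bytes(tail)        # zeros outside the evil fragment
y_forged = xor(y, xor(m_E, v))                # equation (2.32), no key needed

received = xor(y_forged, keystream(key, len(m)))   # Bob decrypts
print(received)
```

Bob obtains a well-formed message in which only the targeted fragment differs, exactly as stated in (2.34).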

Countermeasures

At first glance one might think that an integrity check would suffice to prevent this attack, but since Eve is assumed to know bits of the message being sent (possibly the whole message), producing a simple digest of the message is not enough: if she knows the entire message m she can easily compute h(m) for any cryptographic hash function h, and likewise the digest of her altered message. Therefore an authentication tag is needed in this situation, given that Eve will not be able to silently tamper with the message without causing a mismatch between the tag and the corrupted message.
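The difference between an unkeyed digest and an authentication tag can be sketched as follows. SHA-256 and HMAC-SHA-256 are illustrative choices, and the key and messages are hypothetical; the check is shown on the decrypted messages for simplicity.

```python
import hashlib
import hmac

def verify_digest(message: bytes, digest: bytes) -> bool:
    return hmac.compare_digest(hashlib.sha256(message).digest(), digest)

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(hmac.new(key, message, hashlib.sha256).digest(), tag)

mac_key = b"authentication key unknown to Eve"
original = b"PAY 0100 EUR TO EVE"
forged = b"PAY 9999 EUR TO EVE"

# Unkeyed digest: Eve simply recomputes it for her forged message.
digest_by_eve = hashlib.sha256(forged).digest()
digest_check = verify_digest(forged, digest_by_eve)

# Keyed tag: without mac_key Eve can only reuse the tag produced by Alice.
tag_by_alice = hmac.new(mac_key, original, hashlib.sha256).digest()
tag_check = verify_tag(mac_key, forged, tag_by_alice)
print(digest_check, tag_check)
```

The forged message passes the digest check but fails the tag check, since producing a valid tag for it would require the authentication key.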

2.6.7 Weaknesses of Block Cipher Modes of Operation

Even if a block cipher is considered to be secure and there is seemingly no way to break the cryp-

tosystem by an analysis on the cipher itself, one can make use of that block cipher repeatedly in order

to encrypt or decrypt messages of size larger than the input block size and do so in such a way that

compromises the security of the whole system. This subsection discusses the advantages and disad-

vantages of some of the BCMOs.

ECB

ECB mode is considered to be the least secure block cipher mode of operation, as it is not semantically secure. An adversary can indeed gain information about the plaintext based solely on the ciphertext, since a given plaintext block is always encrypted to the same ciphertext block.

CBC

Recall the CBC block encryption function present in Figure A.2 and according to equation 2.14 for an

n-block message p = p1 . . . pn. Allowing the IV to be predicted by an adversary gives room to a feasible

chosen-plaintext attack on the cryptosystem at hand, where the adversary can efficiently recover any

previously sent message.

Assume that Alice and Bob communicate through a channel C and that Eve is an adversary eavesdropping on C with access to an encryption oracle O, such that $O(p) = e^{CBC}_k(p)$ for any b-bit block p. Moreover, the oracle has an intrinsic IV generator function (equal to Alice's) that produces a new initialization vector for each call. Now, Alice intends to send a word $m = m_1 \dots m_n$ to Bob and, in order to do so, she computes an initialization vector IV and encrypts m as in equation 2.14, yielding the encrypted message y. Then she sends the pair (y, IV) through C so that Bob is able to decrypt the message. Consider that Eve is able to predict the next IV used by Alice (and therefore by the oracle as well); upon intercepting (y, IV), she can recover m according to Algorithm 3 by applying n calls to the procedure PREDICT($y_i$, $y_{i-1}$) given below, where $y = y_1 y_2 \dots y_n$ and $|y_i|_2 = b$ $\forall i \in \mathbb{N} : 1 \le i \le n$:

procedure PREDICT(yi, yi−1)    ▷ yi, yi−1 are b-bit values
    Initialize a b-bit value y′;
    while y′ ≠ yi do
        Predict the initialization vector used in the next encryption: IVp;
        Guess b-bit value m′;
        Compute M := yi−1 ⊕ IVp ⊕ m′;
        Call oracle: y′ ← O(M)
    end while
    return m′;
end procedure

The capability of Eve to predict the IV successfully is the key to the feasibility of the attack. Indeed, suppose that it is highly probable that Eve is not able to correctly predict the IV, i.e., her guess $IV_p$ is such that $P(IV_p \ne IV_{new}) \to 1$, where $IV_{new}$ is the new random IV generated by the oracle O. Note that for any b-bit block x, the query O(x) returns $E^B_k(IV_{new} \oplus x)$. By taking a guess m′, Eve wants to compute M such that $m' \oplus IV_{old} = IV_p \oplus M \Rightarrow M = m' \oplus IV_{old} \oplus IV_p$, where $IV_{old}$ is either the IV of the pair (y, IV), when Eve is trying to find the first plaintext block, or the value of $y_{i-1}$, when Eve is trying to find the ith plaintext block. Then, by calling the oracle with input M, the following holds:

$$\begin{aligned} O(M) &= E^B_k(IV_{new} \oplus M) \\ &= E^B_k(IV_{new} \oplus m' \oplus IV_{old} \oplus IV_p) \end{aligned} \tag{2.35}$$

however, since $[IV_{new} \oplus IV_p]_2 \ne 0^b$, even if equation 2.35 yields the intended ciphertext block $y_i$ one cannot conclude that m′ is the original plaintext block $m_i$, because it only implies that $IV_{new} \oplus m' \oplus IV_p = m_i$.

On the other hand, if Eve is able to correctly predict the IV used in the next encryption, then

$$\begin{aligned} O(M) &= E^B_k(IV_{new} \oplus M) \\ &= E^B_k(IV_{new} \oplus m' \oplus IV_{old} \oplus IV_p) \\ &= E^B_k(m' \oplus IV_{old}) \end{aligned} \tag{2.36}$$

where the last equality holds because $IV_p = IV_{new}$. Lastly, note that $E^B_k(m' \oplus IV_{old}) = y_i \Rightarrow m' = m_i$.

Algorithm 3 CBC Predictable IV attack

procedure PREDICTATTACK(y, IV)    ▷ Discovering x : eCBCk(x, IV) = y
    Split y into n blocks y1, . . . , yn;
    Initialize empty bitstring x;
    for i = 1 to n do
        a ← PREDICT(yi, yi−1);    ▷ y0 = IV
        x ← x ‖ a;
    end for
    return x;
end procedure
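The procedure PREDICT can be sketched for a single-block message. The sketch relies on several assumptions: the block cipher is a toy 4-round Feistel network built on SHA-256 (not secure), the IV is a plain counter so that Eve predicts it by adding one to the last IV she saw, and Eve guesses from a short list of candidate plaintexts rather than from all b-bit values.

```python
import hashlib
import os

B = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, blk: bytes) -> bytes:
    # Toy 4-round Feistel network standing in for the block cipher (NOT secure).
    l, r = blk[:8], blk[8:]
    for rnd in range(4):
        l, r = r, xor(l, hashlib.sha256(key + bytes([rnd]) + r).digest()[:8])
    return l + r

class Encryptor:
    """Single-block CBC encryption whose IV is a counter, hence predictable."""

    def __init__(self, key: bytes):
        self.key, self.counter = key, 0

    def __call__(self, p: bytes):
        iv = self.counter.to_bytes(B, "big")
        self.counter += 1
        return iv, E(self.key, xor(iv, p))    # O(p) = E_k(IV_new xor p)

def predict(oracle, y_i: bytes, y_prev: bytes, last_iv: bytes, guesses):
    for m_guess in guesses:
        # Predicted IV of the next call: last observed IV plus one.
        iv_p = (int.from_bytes(last_iv, "big") + 1).to_bytes(B, "big")
        M = xor(xor(y_prev, iv_p), m_guess)   # M := y_{i-1} xor IV_p xor m'
        last_iv, y = oracle(M)
        if y == y_i:
            return m_guess
    return None

oracle = Encryptor(os.urandom(16))
iv, y = oracle(b"vote: candidate2")           # Alice's message, observed by Eve
candidates = [b"vote: candidate1", b"vote: candidate2", b"vote: candidate3"]
recovered = predict(oracle, y, iv, iv, candidates)
print(recovered)
```

Because the predicted IV cancels the one used by the oracle, each query tests one candidate against the intercepted block, as in (2.36); with a randomly drawn IV the comparison would carry no information.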

Apart from IV predictability, CBC mode is also susceptible to POA: provided the absence of cipher-

text stealing methods, every message must be padded prior to encryption. Algorithm 2 describes the

procedure for attacking CBC given a padding oracle.

CTR

Let ti, pi and ci be the ith counter block, plaintext block and ciphertext block, respectively. Due to

CTR’s construction, changing the last byte of ci results in changing only that last byte of pi and the same

attack using a padding oracle for CBC can be herein applied.

Let x be an m-bit message and y an n-bit message, with n < m. Then the following holds:

x⊕ y = x|n ⊕ y

where x|n represents x truncated to its first n bits. This observation makes it clear that there is no need

for padding messages that are encrypted via CTR and the resulting ciphertext will have exactly the same

length as the original plaintext, as the XOR operation is performed bitwise.

As already mentioned in section 2.2.2.4, the pair $(t_i, k)$ needs to be unique for all $i \in \mathbb{N}$, otherwise CTR mode's security is compromised. Consider the following scenario: using CTR mode with an arbitrary block cipher B and the standard incrementing function, Alice encrypts two messages p and m (of the same length, w.l.o.g.) such that the nonces chosen for each encryption, $nonce_1$ and $nonce_2$, satisfy

$$nonce_1 + i = nonce_2 + j \tag{2.37}$$

for some i, j ≤ n, where n is the number of b-bit blocks, yielding the ciphertexts w and z such that

$$\begin{aligned} w &= e^{CTR}_k(p, nonce_1) \\ z &= e^{CTR}_k(m, nonce_2) \end{aligned} \tag{2.38}$$

that are available to Eve. Given the nonce equality, the following holds

$$E^B_k(nonce_1 + i) = E^B_k(nonce_2 + j) \tag{2.39}$$

meaning that Eve, who is in possession of the ciphertexts w and z, is able to compute

$$\begin{aligned} w_i \oplus z_j &= (E^B_k(nonce_1 + i) \oplus p_i) \oplus (E^B_k(nonce_2 + j) \oplus m_j) \\ &= p_i \oplus m_j \end{aligned} \tag{2.40}$$

where the last equality follows from (2.39) and from the associativity, commutativity and self-inverse properties of XOR. Now, a statistical analysis technique would be the most straightforward approach to find both $p_i$ and $m_j$.
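The counter overlap of (2.37) to (2.40) can be sketched as follows. CTR mode only uses the forward direction of the block cipher, so the sketch replaces it with HMAC-SHA-256 truncated to one block; this stand-in, the key, the nonces and the messages are assumptions for illustration.

```python
import hashlib
import hmac

B = 16  # block size in bytes

def E(key: bytes, counter: int) -> bytes:
    # Stand-in for the block cipher applied to a counter block (assumption).
    return hmac.new(key, counter.to_bytes(B, "big"), hashlib.sha256).digest()[:B]

def ctr_encrypt(key: bytes, nonce: int, msg: bytes) -> bytes:
    out = b""
    for i in range(1, -(-len(msg) // B) + 1):     # blocks i = 1, 2, ...
        chunk = msg[(i - 1) * B:i * B]
        # zip truncates the keystream block, so no padding is needed.
        out += bytes(a ^ b for a, b in zip(chunk, E(key, nonce + i)))
    return out

def block(data: bytes, i: int) -> bytes:
    return data[(i - 1) * B:i * B]

key = b"k" * 16
p = b"first block of p" + b"second blk of p."
m = b"first block of m" + b"second blk of m."
w = ctr_encrypt(key, 1000, p)      # nonce1 = 1000
z = ctr_encrypt(key, 1001, m)      # nonce2 = 1001, so nonce1 + 2 = nonce2 + 1

leak = bytes(a ^ b for a, b in zip(block(w, 2), block(z, 1)))
expected = bytes(a ^ b for a, b in zip(block(p, 2), block(m, 1)))
print(leak == expected, len(ctr_encrypt(key, 5, b"short")))
```

With i = 2 and j = 1 the two keystream blocks coincide, so Eve obtains the XOR of two plaintext blocks without the key, and the situation reduces to the key reuse attack on stream ciphers.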

CFB

Generally speaking, CFB suffers from the same fragilities as CTR mode: the IV must be unique and it may be susceptible to a POA. Furthermore, the construction of each block of CFB encryption is fully dependent on the previous one, hence there is no way of parallelizing the process.

2.6.8 Side-Channel Attack

Whenever cryptographic systems are embedded within devices that are physically exposed in such a way that third parties can extract information from their electromagnetic field, temperature, sound, energy consumption, or any kind of variation of a physical quantity, one says that they are vulnerable to side-channel attacks.

Countermeasures

The countermeasures for side-channel attacks can be categorized into two main activity clusters:

1. Prevent the leak of information;

2. Remove or smoothen the relation between secret data and environmental changes.

Both of these actions demand a considerable amount of resources, especially access to and knowledge of the hardware development, hence it is usually not easy to prevent side-channel threats.

2.6.9 Attacks on AES

Since it was published as a standard in 2001, AES has been the target of continual break attempts. It remains unbroken in terms of direct security, i.e., there is no known efficient practical attack on the cipher itself. AES overcomes the weaknesses of DES that were exploited by differential cryptanalysis, but the development of integral cryptanalysis [33], which is based on sets of chosen plaintexts sharing a common fixed part rather than on XOR differences, gave rise to the first attack on this robust standard other than the brute-force approach. The latter is simply infeasible for any of the possible key lengths. The first theoretical key recovery attack on AES [34] was published in 2011 and was approximately four times faster than a brute-force attack. Even with the improvements made to this attack since then, it is not yet possible to carry it out in practice due to its time complexity.

There are many possible ways to break a cipher, and some of the most devious attacks do not target the cipher directly but instead work around it, trying to gain information leaked by the behaviour of external components. The type of attack that deviates the most from the cryptographic features of the encryption scheme is the side-channel attack. In 2016 a very efficient attack of this kind was presented [35] that relies on aspects of the central processing unit (CPU)'s cache memory and can break AES in less than a minute. Notwithstanding, most modern-day CPUs are already resilient to this category of time-based side-channel attacks.


Chapter 3

Network

The present technological advancements entail an increasing complexity in computational security. Given a network under certain restrictions on its elements' autonomy, capacity and connectivity, the problem arises of transmitting data with integrity, authenticity and non-repudiation using present-day cryptographic standards.

The goal of this chapter is to provide a topological solution together with a communication protocol for a specific network. The problem being studied is addressed in section 3.1, followed by a description of the components and of the restrictions imposed on the network in section 3.2. Then, some possible solutions for the network's topology are compared and a choice is made in section 3.3, and section 3.4 contains a step-by-step description of the communication protocol under the chosen option. Lastly, in section 3.5 some concepts are introduced for the global characterization of the network's inherent encryption and decryption mechanisms.

3.1 The Problem

The main purpose of this work is to develop an optimal solution for the topological and cryptographic components of a restricted network. Basically, given a set of network requirements, which are constraints either on the network's elements and their connections or on the capabilities of the communication channels, one is to choose the encryption and authentication schemes and analyze their level of security against state-of-the-art properties and definitions. The goal of these schemes is to provide the data with cryptographic properties that strengthen the resilience of the data stored in the network elements' memory against possible threats. The security layer of any of the protocols used for the transmission of data between any two network parties is also a relevant subject of study, for it will determine the level of security of the communications.

Consider the scenario depicted in figure 3.1. Certain measurable elements from the environment are processed into digital data by a specific type of device, which stores the information after processing. Then, the data is to be transmitted to a secure database, where it is stored and used as required. The problem is to come up with a secure means of transmitting the data from the device to the database, given restrictions on the device's lifetime, autonomy, processing capacity and memory space. Thus, a network for the transmission of the data is to be constructed, which consists of distinct clusters such that each is composed of a fixed number of elements, the elements of the same cluster share a set of features, and the connections between clusters are constrained by some pre-defined rules. The parties' features range from computational power and available cryptographic algorithms (with their respective keys) to the assigned mission of the network element. More specifically, the purpose of the network is to gather real-time data and transmit it to a secure database while granting the collected evidence confidentiality, integrity, authenticity and non-repudiation properties.

[Figure 3.1: General purpose and activity of the envisaged network. Diagram labels: Measurable element; Detects activity; Device; Processes and stores information; Data transmission; Database; Data storage.]

3.2 Details

Let the network be composed of three main components:

• Gathering devices (GDs): field-deployable parties that gather the raw data (at a maximum rate of 256kB/s), process it and subsequently send it to an authorized party via an asynchronous channel. The length of each generated message is a multiple of a defined minimum length $l_1$ and is bounded above by 256 octets. These elements are restricted with respect to memory (256kB RAM) and autonomy, as their energy source is a non-rechargeable battery, and they remain in the same geographic location throughout their lifetime.

• Middle-point party (MPP): a gateway party that is located near the deployed gathering devices in order to wirelessly receive the data and/or send command messages. It may also possess a serial connectivity option for later transmitting the sensitive data physically to an authorized party.

• Mission and data manager (MnDM): a device located at headquarters that receives the data from the middle-point party, makes the necessary verifications and stores it in a secure centralized database. It is also capable of generating and sending command messages, whose length is a multiple of a pre-defined minimum length $l_2$.

Figure 3.2 summarizes the interactions between the abovementioned components. Note that the GDs do not communicate directly with the MnDM and vice-versa. The data flow from the GDs to the MPP and subsequently to the MnDM is called the Upstream Data Lifecycle (UDL), and the flow in the inverse direction is called the Downstream Data Lifecycle (DDL).

[Figure 3.2: General layout of the desired network. Diagram labels: Data collection; Gathering Devices; Data transmission; Middle-Point Party; Data transmission; Mission and Data Manager; Data storage; Upstream Data Lifecycle; Downstream Data Lifecycle.]

The description of the network entities introduces the term command message. These are messages with a pre-defined format whose contents are intended to give an instruction to another network party; they can be generated by the MPP and the MnDM. It is important to note that the (binary) length of the messages generated by the GDs and that of the command messages herein introduced are multiples of values $l_1$ and $l_2$, respectively, for some $l_1, l_2 \in \mathbb{N}$ with $l_1 \le 2048 \wedge l_2 \le 2048$. With respect to the gathering process, having a message of length $L = m \times l$ is equivalent to having $m$ messages of length $l$. That is, for every message $x$ and command message $m$

$$\exists L_1, L_2 \in \mathbb{N} : (L_1 = m_1 l_1 \wedge L_2 = m_2 l_2) \wedge (|x| = L_1 \wedge |m| = L_2) \tag{3.1}$$

for some $m_1, m_2 \in \mathbb{N}$ such that $m_i \le 2^{11}/l_i$, $\forall i \in I_2$.
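The length constraint in (3.1) can be illustrated with a short sketch. The function names `unit_count` and `split_units` are hypothetical, lengths are counted in bits, and the sketch additionally assumes that the unit length is a whole number of octets so that messages can be handled as byte strings.

```python
MAX_BITS = 2 ** 11  # 2048 bits, i.e. 256 octets

def unit_count(length_bits: int, unit_bits: int) -> int:
    """Return m such that length_bits == m * unit_bits, with m <= 2**11 / unit_bits."""
    if not 1 <= unit_bits <= MAX_BITS:
        raise ValueError("unit length must lie between 1 and 2048 bits")
    if length_bits <= 0 or length_bits % unit_bits != 0:
        raise ValueError("length must be a positive multiple of the unit length")
    m = length_bits // unit_bits
    if m * unit_bits > MAX_BITS:
        raise ValueError("message exceeds 2048 bits")
    return m

def split_units(message: bytes, unit_bits: int) -> list:
    """Split a message of length L = m * l into m messages of length l."""
    if unit_bits % 8 != 0:
        raise ValueError("this sketch assumes a whole number of octets per unit")
    m = unit_count(8 * len(message), unit_bits)
    step = unit_bits // 8
    return [message[i * step:(i + 1) * step] for i in range(m)]

# A 256-octet message with l = 512 bits corresponds to four 64-octet messages.
parts = split_units(bytes(256), 512)
```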

For secrecy, authentication and non-repudiation purposes some cryptographic algorithms are going to be used, which require cryptographic keys. Prior to deployment there must be a setup stage, in which the required keys are generated, transmitted to the envisaged target and stored in solid memory. These keys are generated inside secure headquarters, called the pre-mission system (PMS), which will be discussed further on. Figure 3.3 contains a simple diagram that depicts the whole step: at the PMS, a family of keys $K = (K_1, K_2, K_3)$ is generated such that $K_2$ and $K_3$ are the sets of keys transmitted to the MPP and the MnDM, respectively, and $K_1 = \bigcup_{j=1}^{n} K_1^j$, where $n$ is the total number of GDs and $K_1^j$ is the set of keys transmitted to $GD_j$, $\forall j \in \mathbb{N} : 1 \le j \le n$. When the setup stage is concluded, the devices meet the requirements for setting up the network.

[Figure 3.3: Pre-deployment stage. The Pre-Mission System generates the key family $K$; $K_1^j$ is sent to $GD_j$, $K_2$ to the MPP and $K_3$ to the MnDM; each device stores its keys and is then ready for the deployment stage.]


Once deployed, the GDs can be in one of two states: active mode or sleep mode. When the devices are in active mode they keep gathering evidence according to their data collecting schedule and send the processed data to the envisaged end party via an asynchronous communication channel [36] according to the data flow schedule. The sleep mode, in turn, is a low power consumption state in which the devices perform no activity other than periodically searching for a connection to an asynchronous communication channel. The GDs' actions must rely on an activity schedule, which can be one of the components affected by the command messages that either the MPP or the MnDM sends to the GDs.

Due to the restrictions imposed on the gathering devices, there must be a device within range that

generates the WLAN on which the devices share information. There are two options for this situation,

which are discussed in detail in the next section:

1. The network is generated by a field-deployed AP;

2. The network is generated by the MPP directly.

When in active mode, the gathering devices are intended to collect and internally store relevant data around the clock, but the transmission of this information to another party does not need to be performed at the same time, meaning that the WLAN can be created periodically and, during that timespan, the transmission of data should be prioritized. For this reason, the data stored in the solid memory of the gathering devices is expected to be encrypted. The GDs are connected to the MPP via Wi-Fi, which is a security vulnerability: since the messages are transmitted as radio signals, an attacker may be able to eavesdrop on the communication channel and attempt to break the cryptosystem. For these reasons, WPA2-PSK has been chosen as the protocol for the communication between the GDs and the MPP, since WPA2 is the most robust option for Wi-Fi communication channels. The choice of PSK over EAP follows from two facts: firstly, the gathering devices do not need to hide information from one another, and secondly, the EAP mode of the WPA2 protocol requires more computations for the initial authentication step and therefore increases the energy consumption when compared with PSK. As for the Internet communication protocol between the MPP and the MnDM, it has been decided that TCP [37] should be used for the Transport Layer along with IP for the Internet Layer.

3.3 Network Topology

There are several possibilities to be considered for the network's topology, and their suitability depends on which purpose is given the most weight, since prioritizing one purpose tends to make the remaining options cumbersome. In what follows, some proposals are presented and their practicality is discussed.

AP-based network

This approach considers that the gathering devices solely communicate with the access point. There

are two possibilities for this infrastructure mode, both represented in Figure 3.4.


[Figure 3.4: Topology of AP-based networks. (a) Including a deployed router on the field (nodes: GD1, GD2, GD3, access point, MPP, MnDM, DB; links: Wi-Fi, Wi-Fi, Internet). (b) Middle-point party performs the AP role (nodes: GD1, GD2, GD3, MPP, MnDM, DB; links: Wi-Fi, Internet).]

Proposal A: The first choice is represented in Figure 3.4a and considers the deployment of a fixed AP on the field, which would be able to maintain the WLAN in effect continuously. The main advantage of this option lies in the reduction of the gathering devices' energy consumption, since the encryption and subsequent internal storing of the gathered data would be performed by the AP itself, i.e., the GD would only need to spend energy on the communication protocol and not on the storing mechanism. However, the AP would not only require a high amount of energy to keep running, but would also be very difficult to hide due to its dimensions, which would allow an adversary to easily detect it on the field and eventually destroy it or try to crack the communications. Since the AP is required to be a centralized element of the network, these two strong arguments lead to this possible solution being set aside.

Proposal B: The second possible option is represented in Figure 3.4b for a total of 3 gathering

devices. The arrowed edge represents the uni-directional data flow between the MnDM and the DB,

whilst the simple edges represent a bi-directional data flow. It features the MPP as the access point of

the network. Since the MPP is considered to be a versatile element in the sense that it is not deployed

on the field, this option is considered to be very suitable for the features at hand.

So far, the topology presented in proposal A has been set aside, leaving B as the only viable option. Nevertheless, another potentially reliable solution is discussed next.


Ad-hoc network

Proposal C: Considering the case where the gathering devices may communicate with one another, one is presented with an ad-hoc network (Figure 3.5). The dashed lines in that figure represent a possible communication, i.e., upon agreeing on a certain frequency for the communication channel, the devices can either broadcast a message or send it to a number of targets of their choice. An advantage of this option over the previously mentioned AP-based case is the network's scalability and self-management.

This layout is indeed a strong option for the topology of the network, since it may allow the GDs to never broadcast and therefore save energy. More explicitly, each GD can communicate with a finite chosen number of other GDs and/or the MPP. This would imply that the GD would have previously configured targets, allowing low energy consumption communications. However, the more possible connections there are, the more keys one needs to store in each GD, due to the required secrecy layer on the data stored within the gathering devices. Moreover, end-to-end transmission in an ad-hoc network is usually slower, given that in the absence of broadcast the message has to be relayed from one party to the next until it arrives at the desired target. This drawback may have a relevant impact on the system at hand, not due to the time needed for the MPP to gather the data, but due to the energy spent by the GDs in the transmission process (recall that the GDs must save as much energy as possible for an extended autonomy on the field).

[Figure 3.5: Topology of the ad-hoc network. Nodes: GD1, GD2, GD3, MPP, MnDM, DB; links: Wi-Fi, Internet.]

In this case, the use of Elliptic Curve Cryptography (ECC) would be very useful on the grounds that memory savings (as a result of the smaller key sizes), lower computational complexity in both the encryption and authentication processes and a reduction of the GDs' power consumption are all desirable features, given the restrictions at hand. However, as opposed to standardized cryptosystems, it is not yet common to find embedded systems with ECC implemented in hardware, which is why proposal B has been chosen over C in practice.


3.4 Protocol

This section describes the protocol that specifies the key generation stage and the data processing

that occurs on the network’s data lifecycle in both directions, i.e., UDL and DDL.

Recall that the pre-deployment stage corresponds to Figure 3.3 and the network's topology is as presented in Figure 3.4b. Let $n$ be the number of gathering devices and $GD_i$ be the GD at hand for some fixed index $i \in I$. Moreover, consider the following notation:

• $(k_A^{GMP})_i$: The 128-bit key shared between $GD_i$ and the MPP, used by the AES cipher.

• $k_A^{MPM}$: The 128-bit key shared between the MPP and the MnDM, used by the AES cipher.

• $(k_H^{GMP})_i$: The 256-bit key shared between $GD_i$ and the MPP, used in the HMAC-SHA-256 algorithm.

• $(k_H^{GMD})_i$: The 256-bit key shared between $GD_i$ and the MnDM, used in the HMAC-SHA-256 algorithm.

• $k_H^{MPM}$: The 256-bit key shared between the MPP and the MnDM, used in the HMAC-SHA-256 algorithm.

This notation is used throughout the entire text.

3.4.1 Setup

The setup stage must take place in a secure location and be performed by trusted users, given that

herein all the necessary keys are generated and inserted into the corresponding targets. There are three

types of keys that need to be generated and distributed: keys for encryption, keys for authentication and

a single key for the Wi-Fi communication protocol.

The key generation protocol occurs at PMS and comprises the following steps:

1. The user inputs $(n, pass)$, a tuple consisting of the number of gathering devices that are going to be deployed and a password, respectively;

2. The key generation algorithm is applied with input $(n, pass)$ and outputs $K = (K_1, K_2, K_3)$, as defined in section 3.2;

3. Export the keys to the envisaged devices according to the following distribution:

• $K_1^j = \{(k_A^{GMP})_j, (k_H^{GMP})_j, (k_H^{GMD})_j\}$;

• $K_2 = \{k_A^{MPM}, k_H^{MPM}\} \cup \{(k_A^{GMP})_j\}_{j \in I}$;

• $K_3 = K_1 \cup K_2$;

4. The keys in $K_1^j$ are inserted into $GD_j$, $\forall j \in \mathbb{N} : 1 \le j \le n$;

5. The keys in $K_2$ are inserted into the MPP;

6. The keys in $K_3$ are inserted into the MnDM.

After all the steps are concluded the devices are ready for the deployment stage, in which the GDs are distributed among the desired initial locations $l_0^i$, $\forall i \in I$. Each device $GD_i$ will remain in $l_0^i$ for its entire lifetime, without any key schedule algorithm to update the keys.
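The distribution of the key family can be sketched as follows. The key generation algorithm itself is not specified in this section, so the derivation below is purely an assumption made for illustration: a master secret is stretched from the password with PBKDF2 and each key is derived from it with a labelled HMAC. The names `generate_key_family`, `derive` and the salt value are hypothetical; only the key sizes and the contents of $K_1^j$, $K_2$ and $K_3$ follow the list above.

```python
import hashlib
import hmac

def derive(master: bytes, label: str, size: int) -> bytes:
    """Hypothetical per-key derivation: HMAC of a label, truncated to the key size."""
    return hmac.new(master, label.encode(), hashlib.sha256).digest()[:size]

def generate_key_family(n: int, password: str, salt: bytes = b"pms-demo-salt"):
    # Assumed derivation of a master secret from the password (not the thesis' algorithm).
    master = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000, dklen=32)
    K1 = {
        j: {
            "k_A_GMP": derive(master, f"A/GMP/{j}", 16),   # 128-bit AES key
            "k_H_GMP": derive(master, f"H/GMP/{j}", 32),   # 256-bit HMAC key
            "k_H_GMD": derive(master, f"H/GMD/{j}", 32),   # 256-bit HMAC key
        }
        for j in range(1, n + 1)
    }
    K2 = {
        "k_A_MPM": derive(master, "A/MPM", 16),
        "k_H_MPM": derive(master, "H/MPM", 32),
        "k_A_GMP": {j: K1[j]["k_A_GMP"] for j in K1},      # as listed in step 3
    }
    K3 = {"K1": K1, "K2": K2}                               # K3 = K1 ∪ K2
    return K1, K2, K3

K1, K2, K3 = generate_key_family(3, "correct horse battery staple")
```

In a real deployment the keys would more likely be drawn from a cryptographically secure random source; the deterministic derivation is used here only so that the sketch is reproducible.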

3.4.2 Communication Protocol

In this section, only the steps of the protocol are described. The utility of each of the components of

the ciphertexts is explained in chapter 4.

Subsequent to the setup stage, the GDs are deployed into the field of action and start gathering the

data to be sent to the MPP. This data is encrypted and saved in the GD solid memory, waiting to be

sent through the communication channel to the MPP, via Wi-Fi and using the WPA2-PSK protocol. The

communication protocol of the whole network encompasses the two directions of the data flow: UDL

and DDL.

Consider the devices to be already deployed in the field. Let $GD_i$ be one of the deployed gathering devices for some $i \in \{1, \ldots, n\}$, let $f_{id} \in \mathbb{Z}_{256}$ be a unique identifier of $GD_i$ stored in its solid memory and consider the following abbreviations:

• $e_1^{IV}$ ≡ Encryption mode AES-CTR-128 with key $(k_A^{GMP})_i$ using $IV$ as the initialization vector.

• $d_1^{IV}$ ≡ Decryption mode AES-CTR-128 with key $(k_A^{GMP})_i$ using $IV$ as the initialization vector.

• $e_2^{IV}$ ≡ Encryption mode AES-CTR-128 with key $k_A^{MPM}$ using $IV$ as the initialization vector.

• $d_2^{IV}$ ≡ Decryption mode AES-CTR-128 with key $k_A^{MPM}$ using $IV$ as the initialization vector.

• $h_1$ ≡ HMAC-SHA-256 with key $(k_H^{GMP})_i$.

• $h_2$ ≡ HMAC-SHA-256 with key $(k_H^{GMD})_i$.

• $h_3$ ≡ HMAC-SHA-256 with key $k_H^{MPM}$.

Upstream Data Lifecycle

The data is gathered by GDi, encrypted and stored in solid memory. Then, subject to the wireless

channel’s communication protocol, it is sent to the MPP where its integrity and authenticity are verified

and another layer of security is applied prior to being saved in the solid memory of the MPP. Lastly,

the package is sent from the MPP to the MnDM either via Internet or serial connection and if all the

verifications succeed at the MnDM then the plain data is sent to a secure database (DB).

The operations that are carried out in each device will now be listed.

Gathering Devices

1. $GD_i$ transforms the gathered analog raw data into digital data $D$ and assigns to it a 4-octet message identifier $m_{id}$;

2. Compute $h_1(D)$;

3. Compute $h_2(D)$;

4. Compute $inner\_pack := h_1(D) \,\|\, h_2(D) \,\|\, D$;

5. Generate a 16-octet initialization vector $IV_1$;

6. Perform an encryption: $e_1^{IV_1}(inner\_pack)$;

7. Build the final package $Pack_1 := f_{id} \,\|\, m_{id} \,\|\, IV_1 \,\|\, e_1^{IV_1}(inner\_pack)$;

8. Store $Pack_1$ in solid memory.
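Steps 1 to 8 can be sketched as follows. This is only an illustration under stated assumptions: the Python standard library has no AES, so `ctr_xor` uses HMAC-SHA-256 truncated to 16 octets as a stand-in for the AES forward function (CTR mode never needs the inverse direction), and the function names, the demo keys and the use of `os.urandom` for $IV_1$ are hypothetical choices rather than the thesis' implementation.

```python
import hashlib
import hmac
import os

def mac(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA-256 digest (32 octets)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """CTR keystream XOR; the keyed PRF below stands in for AES (assumption)."""
    start = int.from_bytes(iv, "big")
    out = bytearray()
    for t in range(0, len(data), 16):
        ctr = ((start + t // 16) % 2**128).to_bytes(16, "big")
        out += bytes(a ^ b for a, b in zip(data[t:t + 16], mac(key, ctr)[:16]))
    return bytes(out)

def build_pack1(fid, mid, data, k_a_gmp, k_h_gmp, k_h_gmd, iv1=None):
    inner_pack = mac(k_h_gmp, data) + mac(k_h_gmd, data) + data   # steps 2-4
    iv1 = os.urandom(16) if iv1 is None else iv1                  # step 5
    ciphertext = ctr_xor(k_a_gmp, iv1, inner_pack)                # step 6
    # step 7: 1-octet f_id, 4-octet m_id, 16-octet IV_1, then the ciphertext
    return bytes([fid]) + mid.to_bytes(4, "big") + iv1 + ciphertext

k_a, k_h1, k_h2 = b"A" * 16, b"B" * 32, b"C" * 32   # demo keys only
D = b"sensor reading"
pack1 = build_pack1(7, 1, D, k_a, k_h1, k_h2, iv1=bytes(16))
```

Because CTR mode is length-preserving, the package is exactly 1 + 4 + 16 + 64 octets longer than the data $D$.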

The MPP gets in range and starts listening to incoming requests. GDi attempts to connect to the WLAN

hosted by the MPP and as soon as the connection is established, the messages that the GDi had stored

in memory are sent, subject to the WPA2-PSK protocol.

Middle-Point Party

9. $Pack_1$ is parsed into its main components: $f_{id}$, $m_{id}$, $IV_1$ and $e_1^{IV_1}(inner\_pack)$;

10. Decrypt $e_1^{IV_1}(inner\_pack)$, i.e., compute $d_1^{IV_1}(e_1^{IV_1}(inner\_pack))$ in order to obtain $inner\_pack$;

11. Parse $inner\_pack$ into its main components: $h_1(D)$, $h_2(D)$ and $D$, such that $h_1(D) = inner\_pack|_{256}$, $h_2(D) = (inner\_pack \setminus inner\_pack|_{256})|_{256}$ and $D = inner\_pack \setminus inner\_pack|_{512}$;

12. Verify the message's integrity, i.e., compute $h_1(D)'$ and check whether $h_1(D)' = h_1(D)$, where $h_1(D)'$ is a new instance of the function $h_1$ applied to the data $D$ found in the decrypted package. If the verification is unsuccessful, then consider the message at hand as compromised and abort at this step by clearing all the memory associated with it;

13. Send the 32-bit message identifier $m_{id}$ back to $GD_i$ as an acknowledgement related to the message at hand;

14. Compute $Pack_2 := f_{id} \,\|\, IV_1 \,\|\, h_1(D) \,\|\, h_2(D) \,\|\, e_1^{IV_1}(inner\_pack)$, where $h_1(D)$ and $h_2(D)$ are extracted from step 11;

15. Generate a 16-octet initialization vector $IV_2$;

16. Encrypt the previously built package: $e_2^{IV_2}(Pack_2)$;

17. Generate a digest of the encrypted data: $h_3(e_2^{IV_2}(Pack_2))$;

18. Build the final package $Pack_3 := h_3(e_2^{IV_2}(Pack_2)) \,\|\, IV_2 \,\|\, e_2^{IV_2}(Pack_2)$;

19. Store $Pack_3$ in solid memory.

Regarding step 13, the message acknowledgement is subject to the WPA2-PSK protocol and thus is protected while travelling through the network. Upon receiving the acknowledgement, $GD_i$ will trust that the message has been successfully delivered to the intended party.
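The processing carried out by the MPP in steps 9 to 19 can be sketched as below. As before, this is an illustration under assumptions: `ctr_xor` uses a keyed HMAC-SHA-256 block as a stand-in for AES in CTR mode, the function `mpp_process` and the demo keys are hypothetical, and the returned message identifier plays the role of the acknowledgement of step 13.

```python
import hashlib
import hmac
import os

def mac(key, data):
    """HMAC-SHA-256 digest (32 octets)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def ctr_xor(key, iv, data):
    """CTR keystream XOR; the keyed PRF below stands in for AES (assumption)."""
    start = int.from_bytes(iv, "big")
    out = bytearray()
    for t in range(0, len(data), 16):
        ctr = ((start + t // 16) % 2**128).to_bytes(16, "big")
        out += bytes(a ^ b for a, b in zip(data[t:t + 16], mac(key, ctr)[:16]))
    return bytes(out)

def mpp_process(pack1, k_a_gmp, k_h_gmp, k_a_mpm, k_h_mpm, iv2=None):
    fid, mid, iv1, ct1 = pack1[:1], pack1[1:5], pack1[5:21], pack1[21:]  # step 9
    inner = ctr_xor(k_a_gmp, iv1, ct1)                                    # step 10
    h1, h2, data = inner[:32], inner[32:64], inner[64:]                   # step 11
    if not hmac.compare_digest(mac(k_h_gmp, data), h1):                   # step 12
        return None                                                       # compromised
    pack2 = fid + iv1 + h1 + h2 + ct1                                     # step 14
    iv2 = os.urandom(16) if iv2 is None else iv2                          # step 15
    ct2 = ctr_xor(k_a_mpm, iv2, pack2)                                    # step 16
    pack3 = mac(k_h_mpm, ct2) + iv2 + ct2                                 # steps 17-18
    return mid, pack3                                                     # mid: step 13 ack

# Demo keys and a Pack1 built the way the gathering device would build it.
k_a, k_h1, k_h2 = b"A" * 16, b"B" * 32, b"C" * 32
k_a_mpm, k_h_mpm = b"D" * 16, b"E" * 32
D, iv1 = b"sensor reading", bytes(16)
inner_pack = mac(k_h1, D) + mac(k_h2, D) + D
pack1 = bytes([7]) + (1).to_bytes(4, "big") + iv1 + ctr_xor(k_a, iv1, inner_pack)

ack, pack3 = mpp_process(pack1, k_a, k_h1, k_a_mpm, k_h_mpm)
tampered = pack1[:-1] + bytes([pack1[-1] ^ 1])   # one flipped ciphertext bit
```

Note that the MPP can only check $h_1$; the digest $h_2$ is carried along unverified for the MnDM.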


After the data has been gathered the MPP closes the WLAN and connects to the MnDM via Internet,

transmitting all the recently stored data subject to the TCP/IP protocol.

Mission and Data Manager

20. Parse $Pack_3$ into its main components: $h_3(e_2^{IV_2}(Pack_2))$, $IV_2$ and $e_2^{IV_2}(Pack_2)$, where

$h_3(e_2^{IV_2}(Pack_2)) = Pack_3|_{256}$;

$IV_2 = (Pack_3 \setminus Pack_3|_{256})|_{128}$;

$e_2^{IV_2}(Pack_2) = Pack_3 \setminus Pack_3|_{384}$;

21. Verify the integrity of the encrypted data by computing a new HMAC instance $h_3(e_2^{IV_2}(Pack_2))'$ and checking whether $h_3(e_2^{IV_2}(Pack_2))' = h_3(e_2^{IV_2}(Pack_2))$. If successful, proceed to the next step, otherwise consider the message as compromised and abort at this step;

22. Perform the decryption $d_2^{IV_2}(e_2^{IV_2}(Pack_2))$ in order to obtain $Pack_2$;

23. Parse $Pack_2$ into its main components:

$f_{id} = Pack_2|_{8}$;

$IV_1 = (Pack_2|_{136})|_{128}$;

$h_1(D) = (Pack_2|_{392})|_{256}$;

$h_2(D) = (Pack_2|_{648})|_{256}$;

$e_1^{IV_1}(inner\_pack) = Pack_2 \setminus Pack_2|_{648}$;

24. Perform the decryption $d_1^{IV_1}(e_1^{IV_1}(inner\_pack))$ in order to obtain $inner\_pack$;

25. Parse $inner\_pack$ into its main components: $h_1(D)^*$, $h_2(D)^*$ and $D^*$;

26. Check whether $h_i(D) = h_i(D)^*$, $\forall i \in I_2$. If successful, then proceed to the integrity check in the next step. Otherwise, consider this message to be incorrect and abort the execution at this step;

27. Compute two new HMAC instances of the data $D$, $h_1(D)'$ and $h_2(D)'$, and check whether $h_i(D)' = h_i(D)$, $\forall i \in I_2$. If successful, send $Pack_3$ to the DB, for it can be assumed with a high level of trust that the message has not been tampered with during the whole course. Otherwise, consider the message as compromised and abort the execution.
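The verification chain of steps 20 to 27 can be sketched as follows. The field boundaries follow the composition of $Pack_2$ given in step 14 (1, 16, 32 and 32 octets before the inner ciphertext). As in the other sketches of this chapter, the AES block cipher is replaced by a keyed HMAC-SHA-256 stand-in, and the function name `mndm_verify` and the demo keys are hypothetical.

```python
import hashlib
import hmac

def mac(key, data):
    """HMAC-SHA-256 digest (32 octets)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def ctr_xor(key, iv, data):
    """CTR keystream XOR; the keyed PRF below stands in for AES (assumption)."""
    start = int.from_bytes(iv, "big")
    out = bytearray()
    for t in range(0, len(data), 16):
        ctr = ((start + t // 16) % 2**128).to_bytes(16, "big")
        out += bytes(a ^ b for a, b in zip(data[t:t + 16], mac(key, ctr)[:16]))
    return bytes(out)

def mndm_verify(pack3, k_a_gmp, k_h_gmp, k_h_gmd, k_a_mpm, k_h_mpm):
    tag3, iv2, ct2 = pack3[:32], pack3[32:48], pack3[48:]          # step 20
    if not hmac.compare_digest(mac(k_h_mpm, ct2), tag3):           # step 21
        return None
    pack2 = ctr_xor(k_a_mpm, iv2, ct2)                             # step 22
    fid, iv1 = pack2[:1], pack2[1:17]                              # step 23
    h1, h2, ct1 = pack2[17:49], pack2[49:81], pack2[81:]
    inner = ctr_xor(k_a_gmp, iv1, ct1)                             # step 24
    h1_star, h2_star, data = inner[:32], inner[32:64], inner[64:]  # step 25
    if (h1, h2) != (h1_star, h2_star):                             # step 26
        return None
    ok = (hmac.compare_digest(mac(k_h_gmp, data), h1)              # step 27
          and hmac.compare_digest(mac(k_h_gmd, data), h2))
    return (fid[0], data) if ok else None

# Demo keys and a Pack3 assembled the way the GD and the MPP would build it.
k_a, k_h1, k_h2 = b"A" * 16, b"B" * 32, b"C" * 32
k_a_mpm, k_h_mpm = b"D" * 16, b"E" * 32
D, iv1, iv2 = b"sensor reading", bytes(16), b"\x01" * 16
h1, h2 = mac(k_h1, D), mac(k_h2, D)
ct1 = ctr_xor(k_a, iv1, h1 + h2 + D)
ct2 = ctr_xor(k_a_mpm, iv2, bytes([7]) + iv1 + h1 + h2 + ct1)
pack3 = mac(k_h_mpm, ct2) + iv2 + ct2

result = mndm_verify(pack3, k_a, k_h1, k_h2, k_a_mpm, k_h_mpm)
corrupted = pack3[:50] + bytes([pack3[50] ^ 1]) + pack3[51:]
```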

Downstream Data Lifecycle

Both the MnDM and the MPP can generate messages, usually called command messages, whose

purpose is to give an instruction to another network element; for instance they can order a GD to change

its data gathering time frame. The format of the command message is pre-defined and varies according

to the type of inherent command.

All messages generated and sent by the MnDM fall into the category of command messages, and one can distinguish two clusters: the commands intended for the MPP and the commands intended for the GDs. Either way, the messages originating at the MnDM are encrypted prior to being stored in the MnDM's solid memory. The data is then sent via Internet to the MPP and subject to a verification process upon arrival, after which it is either encrypted and saved in the MPP's solid memory while waiting to be sent to the envisaged GD via Wi-Fi, or read and applied on the fly. Moreover, the MPP can also generate local instructions intended for $GD_i$, thus any command message that leaves the MPP via Wi-Fi must be flagged according to its sender. When the command message reaches the target $GD_i$, its authenticity and integrity are verified and it is saved in a queue while waiting to be read and applied by the internal command manager protocol.

The list of steps that are carried out in each device is now presented.

Mission and Data Manager

1. Generate a message $M$;

2. If $M$ is a command for a GD then generate an HMAC of the message, $h_2(M)$, and proceed to step 4. Otherwise proceed to step 3;

3. Consider $inner\_pack := M$ and $f_{id}$ a zero-valued 8-bit identifier. Proceed to step 5;

4. Prepend the HMAC to the message, $inner\_pack := h_2(M) \,\|\, M$, and choose a GD identifier $f_{id} \in (\mathbb{Z}_2)^8 : f_{id} \neq 0$. Proceed to step 5;

5. Generate a pseudo-random 16-octet initialization vector $IV_1$;

6. Encrypt $inner\_pack$ by computing $e_1^{IV_1}(inner\_pack)$;

7. Generate an HMAC of the encrypted package: $h_3(e_1^{IV_1}(inner\_pack))$;

8. Build the package $Pack_0 = h_3(e_1^{IV_1}(inner\_pack)) \,\|\, f_{id} \,\|\, IV_1 \,\|\, e_1^{IV_1}(inner\_pack)$. If the command is not intended for a GD, then the device's identifier field will hold an all-zero 8-bit array, which flags that the recipient of the message is the MPP;

9. Generate a pseudo-random 16-octet initialization vector $IV_2$;

10. Perform the encryption $e_2^{IV_2}(Pack_0)$;

11. Build $Pack_1 = IV_2 \,\|\, e_2^{IV_2}(Pack_0)$;

12. Store $Pack_1$ in solid memory.
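The construction of the downstream package at the MnDM can be sketched as follows. The sketch treats step 9 as the generation of $IV_2$, which is an assumption inferred from the use of $IV_2$ in steps 10 and 11. AES is again replaced by a keyed HMAC-SHA-256 stand-in inside `ctr_xor`, and `build_downstream` and the demo keys are hypothetical names and values.

```python
import hashlib
import hmac
import os

def mac(key, data):
    """HMAC-SHA-256 digest (32 octets)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def ctr_xor(key, iv, data):
    """CTR keystream XOR; the keyed PRF below stands in for AES (assumption)."""
    start = int.from_bytes(iv, "big")
    out = bytearray()
    for t in range(0, len(data), 16):
        ctr = ((start + t // 16) % 2**128).to_bytes(16, "big")
        out += bytes(a ^ b for a, b in zip(data[t:t + 16], mac(key, ctr)[:16]))
    return bytes(out)

def build_downstream(message, fid, k_a_gmp, k_h_gmd, k_a_mpm, k_h_mpm,
                     iv1=None, iv2=None):
    # Steps 2-4: commands for a GD (fid != 0) carry h2(M); commands for the MPP do not.
    inner_pack = (mac(k_h_gmd, message) + message) if fid != 0 else message
    iv1 = os.urandom(16) if iv1 is None else iv1                  # step 5
    ct1 = ctr_xor(k_a_gmp, iv1, inner_pack)                       # step 6
    pack0 = mac(k_h_mpm, ct1) + bytes([fid]) + iv1 + ct1          # steps 7-8
    iv2 = os.urandom(16) if iv2 is None else iv2                  # step 9 (assumed)
    return iv2 + ctr_xor(k_a_mpm, iv2, pack0)                     # steps 10-11

k_a_gmp, k_h_gmd = b"A" * 16, b"C" * 32   # demo keys only
k_a_mpm, k_h_mpm = b"D" * 16, b"E" * 32
M = b"SLEEP 3600"
pack1 = build_downstream(M, 3, k_a_gmp, k_h_gmd, k_a_mpm, k_h_mpm)
pack1_mpp = build_downstream(M, 0, k_a_gmp, k_h_gmd, k_a_mpm, k_h_mpm)
```

The only difference between the two packages is the 32-octet digest $h_2(M)$ and the value of the identifier octet.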

Pack1 is now sent via Internet to the MPP subject to the TCP/IP protocol.

Middle-Point Party

Steps 13 to 27 (block of execution A) represent the phase of the protocol where the MPP receives the MnDM's command message, performs the necessary verifications and processes it accordingly, whereas steps 28 to 36 (block of execution B) describe the case where the MPP generates the command message to be sent directly to $GD_i$. The two blocks need not be executed synchronously; for instance, the MnDM may send a message to the MPP while the latter is processing its own command. Nevertheless, at the end of either execution block (A or B) the protocol proceeds to step 37.


13. Parse $Pack_1$ into its main components:

$IV_2 = Pack_1|_{128}$;

$e_2^{IV_2}(Pack_0) = Pack_1 \setminus Pack_1|_{128}$;

14. Perform the decryption $d_2^{IV_2}(e_2^{IV_2}(Pack_0))$ in order to obtain $Pack_0$;

15. Parse $Pack_0$ into its main components:

$h_3(e_1^{IV_1}(inner\_pack))$;

$f_{id}$;

$IV_1$;

$e_1^{IV_1}(inner\_pack)$;

16. Compute a new HMAC instance $h_3(e_1^{IV_1}(inner\_pack))^*$;

17. If $h_3(e_1^{IV_1}(inner\_pack))^* \neq h_3(e_1^{IV_1}(inner\_pack))$ then assume the message to be compromised, discard it and abort the execution. Otherwise continue;

18. Perform the decryption $d_1^{IV_1}(e_1^{IV_1}(inner\_pack))$ in order to obtain $inner\_pack$;

19. If $f_{id} = 0$ then apply the corresponding command and finish the execution at this step, otherwise continue, noting that in this case $inner\_pack := h_2(M) \,\|\, M$;

20. Initialize a single bit $flag \in \mathbb{Z}_2 : flag = 1$;

21. Parse $inner\_pack$ and generate an HMAC of the message: $h_1(M)$;

22. Build $inner\_pack_1 := flag \,\|\, h_1(M) \,\|\, h_2(M) \,\|\, M$;

23. Generate a 16-octet initialization vector $IV_3$;

24. Perform the encryption $e_1^{IV_3}(inner\_pack_1)$;

25. Generate a unique message identifier $m_{id}^*$;

26. Build the final package $Pack_2 := m_{id}^* \,\|\, IV_3 \,\|\, e_1^{IV_3}(inner\_pack_1)$;

27. Store $Pack_2$ in solid memory;

As already stated, the following steps 28 to 36 relate to the case in which the command message $M$ is generated at the MPP instead of the MnDM. This block of execution does not necessarily follow from the previous one (steps 13 to 27) and may be triggered either by a user interaction on the MPP or by a scheduled command.

28. Generate a command message $M$;

29. Compute an HMAC of $M$: $h_1(M)$;

30. Initialize a single bit $flag \in \mathbb{Z}_2 : flag = 0$;

31. Build $inner\_pack_2 := flag \,\|\, h_1(M) \,\|\, M$;

32. Generate a 16-octet initialization vector $IV_4$;

33. Perform the encryption $e_1^{IV_4}(inner\_pack_2)$;

34. Generate a unique message identifier $m_{id}^{**}$;

35. Build $Pack_2 := m_{id}^{**} \,\|\, IV_4 \,\|\, e_1^{IV_4}(inner\_pack_2)$;

36. Store $Pack_2$ in solid memory;

At the end of either block of execution A or block of execution B, the package $Pack_2$ is sent to $GD_i$ via Wi-Fi, subject to the WPA2-PSK protocol.

Gathering Devices


37. Parse $Pack_2$ into its main components:

$m_{id} = Pack_2|_{32}$;

$IV = (Pack_2|_{160})|_{128}$;

$e_1^{IV}(inner\_pack) = Pack_2 \setminus Pack_2|_{160}$;

38. Perform the decryption $d_1^{IV}(e_1^{IV}(inner\_pack))$ in order to obtain $inner\_pack$;

39. Read $flag = inner\_pack|_{1}$. If $flag = 0$ go to step 40, if $flag = 1$ go to step 44, otherwise abort the execution and consider the message as corrupted;

40. Parse $inner\_pack$ into its main components:

$h_1(M) = (inner\_pack|_{257})|_{256}$;

$M = inner\_pack \setminus inner\_pack|_{257}$;

41. Compute a new HMAC of $M$: $h_1(M)'$;

42. If $h_1(M)' \neq h_1(M)$ consider the message to have been tampered with and abort the execution, otherwise continue;

43. Store $M$ in the commands' FIFO queue, waiting to be applied as soon as possible. Successfully exit the downstream data protocol after applying the envisaged command.

44. Parse $inner\_pack$ into its main components:

$h_1(M) = (inner\_pack|_{257})|_{256}$;

$h_2(M) = (inner\_pack|_{513})|_{256}$;

$M = inner\_pack \setminus inner\_pack|_{513}$;

45. Compute two new HMAC instances of $M$: $h_1(M)'$ and $h_2(M)'$;

46. If $h_i(M)' \neq h_i(M)$ for some $i \in \{1, 2\}$ then consider the message to have been tampered with and abort the execution, otherwise acknowledge the integrity of $M$ and continue;

47. Store $M$ in the commands' FIFO queue, waiting to be applied as soon as possible. Successfully exit the downstream data protocol after applying the envisaged command.
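Because the flag occupies a single bit, the decrypted $inner\_pack$ is not octet-aligned, which makes the parsing of steps 39 to 47 slightly unusual. A minimal sketch of that parsing follows; it assumes the package has already been decrypted and represents it as a string of '0' and '1' characters purely for readability. The helper names, the demo keys and the sample command are hypothetical.

```python
import hashlib
import hmac
from collections import deque

def mac(key, data):
    """HMAC-SHA-256 digest (32 octets)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def to_bits(data: bytes) -> str:
    return "".join(f"{octet:08b}" for octet in data)

def from_bits(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def gd_check_command(inner_bits, k_h_gmp, k_h_gmd):
    """Return the command M if its digests verify, otherwise None."""
    flag = inner_bits[:1]                                    # step 39
    if flag == "0":                                          # generated by the MPP
        h1 = from_bits(inner_bits[1:257])                    # step 40
        message = from_bits(inner_bits[257:])
        ok = hmac.compare_digest(mac(k_h_gmp, message), h1)  # steps 41-42
    elif flag == "1":                                        # generated by the MnDM
        h1 = from_bits(inner_bits[1:257])                    # step 44
        h2 = from_bits(inner_bits[257:513])
        message = from_bits(inner_bits[513:])
        ok = (hmac.compare_digest(mac(k_h_gmp, message), h1)     # steps 45-46
              and hmac.compare_digest(mac(k_h_gmd, message), h2))
    else:
        return None                                          # corrupted package
    return message if ok else None

k_h1, k_h2 = b"B" * 32, b"C" * 32   # demo keys only
M = b"SLEEP 3600"
from_mpp = "0" + to_bits(mac(k_h1, M)) + to_bits(M)
from_mndm = "1" + to_bits(mac(k_h1, M)) + to_bits(mac(k_h2, M)) + to_bits(M)

commands = deque()                                           # FIFO queue, steps 43 and 47
for bits in (from_mpp, from_mndm):
    checked = gd_check_command(bits, k_h1, k_h2)
    if checked is not None:
        commands.append(checked)

forged = from_mndm[:-1] + ("1" if from_mndm[-1] == "0" else "0")
```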

3.5 Message Formats

This section aims to identify the distinct message formats built in the protocol description of section

3.4.2 and formally define the network’s packing and unpacking mechanisms. The five distinct mes-

sage formats comprised in the protocol are visually presented in Appendix C according to the following

specification:

F1 : encrypted by a GD and decrypted by the MPP.

F2 : encrypted by the MPP and decrypted by a GD (plaintext generated by the MPP).

F3 : encrypted by the MPP and decrypted by a GD (plaintext originally generated by the MnDM).

F4 : encrypted by the MPP and decrypted by the MnDM.

F5 : encrypted by the MnDM and decrypted by the MPP.

This notation is used hereinafter. All the abovementioned message formats are built on two other categories of message formats, $F_i^*$ and $F_i^{**}$, which are, respectively, the formats corresponding to the outer and inner layers of encryption within $F_i$ for every $i \in I_5$. These can also be consulted in Appendix C. The following definitions are useful for the upcoming discussion.

Definition 3.5.1 (Confidential plaintext). The raw data gathered by the GDs as well as the command messages are called confidential plaintext.

The previous definition highlights the piece of information within the packages that is of utmost importance and is meant to be transmitted to and read by the desired parties. The figures in Appendix C may clarify the subject.

The only cryptosystem involved in the processing of the packages is the AES cipher. Since it is a
128-bit block cipher and the data to be encrypted is not necessarily 128 bits long, a BCMO is required
and, as already stated, the CTR mode of operation was chosen. Let G be the ciphertext generator
operator such that

$$G(C, M, v, k, x) = y \qquad (3.2)$$

where y is the result of encrypting x via block cipher C using mode of operation M with initialization
vector v (if applicable) and key k. Analogously, $G^{-1}$ is the inverse operator and returns the
corresponding plaintext:

$$G^{-1}(C, M, v, k, y) = x \qquad (3.3)$$
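As a rough illustration of the operator G and its inverse, the following Python sketch implements a counter-mode construction. The Python standard library has no AES, so the block function `toy_block` is a SHA-256-based stand-in and is an assumption of this sketch, not the cipher used in this work. The names `toy_block`, `G` and `G_inv` are hypothetical, and only M = "CTR" is handled.

```python
import hashlib

BLOCK = 16  # block size in bytes (128 bits, as for AES)

def toy_block(k: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher C under key k. This is NOT AES (assumption)."""
    return hashlib.sha256(k + block).digest()[:BLOCK]

def G(C, M, v: bytes, k: bytes, x: bytes) -> bytes:
    """Encrypt x with block function C in mode M, using a 16-byte IV v and key k."""
    if M != "CTR":
        raise NotImplementedError("only the CTR mode is sketched here")
    start = int.from_bytes(v, "big")
    out = bytearray()
    for i in range(0, len(x), BLOCK):
        # Counter block = IV + block index, reduced modulo 2^128.
        ctr = ((start + i // BLOCK) % (1 << 128)).to_bytes(BLOCK, "big")
        keystream = C(k, ctr)
        # XOR the (possibly shorter, final) chunk with the keystream.
        out.extend(a ^ b for a, b in zip(x[i:i + BLOCK], keystream))
    return bytes(out)

def G_inv(C, M, v: bytes, k: bytes, y: bytes) -> bytes:
    """In CTR mode decryption is the same keystream XOR as encryption."""
    return G(C, M, v, k, y)
```

Because the keystream is simply XORed with the data, the output has exactly the length of the input, which is why no padding is needed for plaintexts that are not a multiple of 128 bits.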


Consider the set of all words with format Fi,

Si = {y ∈ Σ∗ : y is of format Fi} (3.4)

Moreover consider a family of functions

$$E_i : I \times \Sigma^* \times \Sigma^* \times K \times P \to S_i \qquad (3.5)$$

such that $E_i(j, a, b, k, x)$ outputs an element of $S_i$ for some confidential plaintext x, key k,
GD identifier j and global parameters a and b. For instance, for the format $F_1$ the following
expression is satisfied:

$$E_1(j, m, v, k, x) = j \,\|\, m \,\|\, v \,\|\, G(\mathrm{AES}, \mathrm{CTR}, v, k, h_1(x) \,\|\, h_2(x) \,\|\, x) \qquad (3.6)$$

where the HMAC functions $h_1$ and $h_2$ are defined as in section 3.4. This function is called the
package ciphertext generator function (PCgF) and will hereinafter be addressed accordingly. For a fixed
identifier j, the function

$$E^j_i : \Sigma^* \times \Sigma^* \times K \times P \to S_i \qquad (3.7)$$

is an instance of the family of PCgFs with the same expression.

Definition 3.5.2 (Package ciphertext). Let x be a confidential plaintext, j ∈ I, k a key and a, b ∈ Σ∗

two parameters of choice. The data resulting from the computation of Ei(j, a, b, k, x) is designated as

package ciphertext.

The inverse function for the PCgF is defined by

$$D_i : I \times K \times S_i \to P \qquad (3.8)$$

where K is the set of keys and P is the set of all confidential plaintexts, and is called the package
ciphertext unpacking function (PCuF). This means that for every z ∈ P:

$$D^j_i(k, E^j_i(a, b, k, z)) = z \qquad (3.9)$$

for every j ∈ I, where

$$D^j_i : K \times S_i \to P \qquad (3.10)$$

is an instance of the family $D_i$ of PCuFs.
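The pair formed by a PCgF and its PCuF can be sketched for format F1 as follows. This is a minimal illustration under several assumptions: SHA-256 stands in for the AES block function, HMAC-SHA-256 is used for h1 and h2, the header widths `DID_LEN` and `MID_LEN` are hypothetical, and the unpacking side verifies only the HMAC whose key it is given. The function names `E1` and `D1` mirror the notation of expressions 3.6 and 3.9 but are not taken from the actual implementation.

```python
import hashlib
import hmac

IV_LEN, TAG_LEN = 16, 32   # 128-bit IV, 256-bit HMAC tags
DID_LEN, MID_LEN = 1, 4    # header field widths: assumptions of this sketch

def ctr_xor(v: bytes, k: bytes, data: bytes) -> bytes:
    """Toy counter-mode keystream XOR; SHA-256 stands in for AES (assumption)."""
    start = int.from_bytes(v, "big")
    out = bytearray()
    for i in range(0, len(data), 16):
        ctr = ((start + i // 16) % (1 << 128)).to_bytes(16, "big")
        ks = hashlib.sha256(k + ctr).digest()[:16]
        out.extend(a ^ b for a, b in zip(data[i:i + 16], ks))
    return bytes(out)

def tag(key: bytes, x: bytes) -> bytes:
    """Keyed hash of x (HMAC-SHA-256 is an assumption of this sketch)."""
    return hmac.new(key, x, hashlib.sha256).digest()

def E1(j: int, m: int, v: bytes, k: bytes, kh1: bytes, kh2: bytes, x: bytes) -> bytes:
    """Build j || m || v || CTR(v, k, h1(x) || h2(x) || x)."""
    inner = tag(kh1, x) + tag(kh2, x) + x
    header = j.to_bytes(DID_LEN, "big") + m.to_bytes(MID_LEN, "big")
    return header + v + ctr_xor(v, k, inner)

def D1(k: bytes, kh1: bytes, y: bytes) -> bytes:
    """Unpack y, check h1(x) and return x; raise ValueError on mismatch."""
    hdr = DID_LEN + MID_LEN
    v = y[hdr:hdr + IV_LEN]
    inner = ctr_xor(v, k, y[hdr + IV_LEN:])
    h1, x = inner[:TAG_LEN], inner[2 * TAG_LEN:]
    if not hmac.compare_digest(h1, tag(kh1, x)):
        raise ValueError("integrity check failed")
    return x
```

With matching keys, `D1(k, kh1, E1(j, m, v, k, kh1, kh2, x))` returns `x`, which is the round-trip property stated in expression 3.9.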

The following definition is based on the previously presented ones and will be useful to the security

analysis presented in chapter 4.

Definition 3.5.3 (Packing Scheme). A packing scheme (PSch) is defined by a 3-tuple (E,D,K) where

E is a family of PCgF, D is a family of PCuF and K is the key set.

Let $P^i_S$ be the PSch associated with message format $F_i$ such that, for all $i \in I_5$,

$$P^i_S = (E_i, D_i, K) \quad \text{and} \quad E_i = \bigcup_{j=1}^{n} E^j_i \quad \text{and} \quad D_i = \bigcup_{j=1}^{n} D^j_i \qquad (3.11)$$

where $E^j_i$ and $D^j_i$ are as in expressions 3.7 and 3.10, respectively.

For every $i \in I_5$, the packing schemes $P^i_S$ define the distinct message formats and their security
will be thoroughly studied in section 4.1.

Consider x to be an n-bit confidential plaintext generated by a given GD. The construction of the CTR
mode of operation imposes restrictions on the pair (v, k), where v stands for the initialization vector
and k for the key. Even though the initialization vector v is not required to be unpredictable by an
adversary, it is mandatory that the pair (v, k) does not repeat for the same block of plaintext. Thus,
there is an upper bound on the length of the message to be encrypted using the CTR mode: the number of
blocks of the message must not be greater than $2^m$, where m is the block size of the block cipher at
hand, in bits⁴. Since AES is a 128-bit block cipher, the bit-length n of the plaintext must satisfy

$$n \le 128 \times 2^{128} = 2^{135} \qquad (3.12)$$

which is an extremely large number and does not restrain the set of possible plaintexts in practice.
However, since the RAM of the GDs is assumed to be upper bounded by 256 kB, the plaintext messages’
length is bounded by this value, i.e.,

$$n < 2^{18} - a_i \qquad (3.13)$$

where $a_i$ represents the binary length of the data that was added in order to build the message with
format $F_i$, that is

$$a_i = |E^j_i(u, v, x)|_2 - n, \quad \forall j \in I,\ i \in I_5 \qquad (3.14)$$

where u and v are the required external variables for the construction of the package $F_i$. Moreover,
note that for fixed i, $|E^j_i(u, v, x)|_2 = |E^k_i(u, v, x)|_2$ for all $j, k \in I$. The strict
inequality is due to the usage of some of the RAM by internal processes of the system at hand, which
are needed for it to properly execute certain required background tasks.
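The arithmetic behind the two bounds can be checked with a few lines of Python. This is only a sanity check of the figures quoted above; the function name `ctr_max_bits` is hypothetical. Note that $2^{18}$ is the 256 kB capacity counted in bytes, while the same capacity counted in bits is $2^{21}$, the figure used in section 4.1.2.1.

```python
def ctr_max_bits(block_bits: int) -> int:
    """Upper bound on the message length in bits: at most 2^m blocks of m bits each."""
    return block_bits * 2 ** block_bits

# Bound (3.12) for a 128-bit block cipher such as AES.
aes_bound = ctr_max_bits(128)

# RAM capacity of a GD: 256 kB expressed in bytes and in bits.
ram_bytes = 256 * 1024
ram_bits = 8 * ram_bytes
```

Here `aes_bound` equals `2**135`, `ram_bytes` equals `2**18` and `ram_bits` equals `2**21`, so the RAM bound is the binding one by many orders of magnitude.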

⁴ Recall that there are $2^m$ distinct values for a bitstring of length m.


Chapter 4

Security Analysis

In real-world projects where computer security is of the essence, there are always limitations directly
caused by one or more of many factors, such as available funding or ethical restrictions. This means
that one is not provided with unlimited resources in practice, and choosing the best possible scenario
comes both as an unavoidable consequence and as an arduous task. In general there is an inverse relation
between security and performance, but each system must be analyzed individually, as its features may
entail that this relation does not hold, in which case an optimal solution in terms of security may
imply a greater amount of computational resources.

This chapter aims to analyze, with respect to security, the cryptographic properties considered most
important for the packing schemes described in section 3.5. The strengths and weaknesses of the
proposed message formats and protocols are scrutinized, followed by suggested solutions for the
observed flaws.

4.1 Strengths and Weaknesses

The network considered in section 3.3 is not perfectly secure, as one would expect, and there are
some ingrained fragilities induced by the chosen topology or by the communication protocol described in
section 3.4. This section discusses some of those flaws and key aspects related to the key generation
stage and the packing schemes built within the scope of the communication protocol.

4.1.1 Key Generation

In the key generation stage, as described in section 3.4.1, all the required cryptographic keys are
generated. In order to increase the resilience of each key against key-recovery attacks, it must be
generated as randomly as possible. Based on the results presented in [26], PBKDF2 is a good option for
the generation of the keys, namely with HMAC-SHA-1, because it is the keyed-hash function with the best
performance and provides enough security [38], even though the underlying hash function is not strong
with respect to collision resistance [20].

The password serves both as the seed for the pseudo-random generator that constructs the salt byte


array as well as the password passed as argument to the PBKDF2 algorithm. Ideally, one would not want
the key generation process to depend solely on a single password input by the user, because this
clearly lowers the security level of the process: breaking the key generator comes down to finding a
single password and replicating the process. Nevertheless, this approach was the one agreed upon
because of its simplicity and the high level of trust placed in the user U operating the MnDM. One very
straightforward way to increase the security level of this very important stage of the protocol would
be for U to provide two passwords: one to be used in the construction of the seed for the pseudo-random
algorithm and the other to be used as the password for the PBKDF2 algorithm. However, both these
approaches require complete trust in the user who is generating the keys; if U is ill-intentioned, then
the whole network becomes compromised. The answer to this problem lies in the Two-Person Concept, a
mechanism based on the following requirement: to launch a nuclear missile, there must be two distinct
individuals, each possessing a distinct key that is not known by the other party, inserting their
credentials in the launching computer at the same time. By adapting this concept, one could build a
similar behaviour for the generation of keys in the PMS, where a pre-processing stage would take place
to construct a master key out of the keys of each of the k chosen parties, for some k > 1 (e.g., by
means of an XOR operation). This key would then be used by the key generation process and no party
could ever single-handedly replicate the construction of the whole key set and attack the system.
Figure 4.1 briefly depicts this procedure for k = 3.

[Figure: User1, User2 and User3 each provide a secret (secret1, secret2 and secret3, respectively) to
the Key Constructor Algorithm, which outputs the master key that is fed to the Key Generation Algorithm.]

Figure 4.1: Key generation based on k users
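The k-party construction of Figure 4.1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implemented key generator: the shares are taken to be byte strings of equal length, the salt is passed in explicitly instead of being produced by a password-seeded generator, and the iteration count and output length are placeholder values. The function names `master_key` and `derive_key` are hypothetical; `hashlib.pbkdf2_hmac` is the standard-library PBKDF2 routine.

```python
import hashlib
from functools import reduce

def master_key(shares) -> bytes:
    """XOR the equal-length secrets of k > 1 parties into a single master key."""
    shares = list(shares)
    if len(shares) < 2 or len({len(s) for s in shares}) != 1:
        raise ValueError("need at least two secrets of equal length")
    return reduce(lambda a, b: bytes(p ^ q for p, q in zip(a, b)), shares)

def derive_key(master: bytes, salt: bytes, iterations: int = 10_000, length: int = 16) -> bytes:
    """Derive a key with PBKDF2 and HMAC-SHA-1, the keyed hash named in the text.

    The iteration count and the 16-byte output are assumptions of this sketch.
    """
    return hashlib.pbkdf2_hmac("sha1", master, salt, iterations, dklen=length)
```

If each share is chosen uniformly at random, the XOR of any k − 1 shares reveals nothing about the master key, so no single party can replay the derivation alone.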

One can also exploit flaws embedded in the distribution of the keys. Theoretically speaking, one

does not even question the integrity of the MnDM but in practice all situations must be considered. Let

Tom be a malicious user monitoring both the sent data and the results received at the MnDM. Since

he is in possession of all the keys, he is able to generate a corrupted message M and produce a

package E4(M) to be inserted into the DB, whose invalid authenticity is not traceable by anyone that

attempts to make the verifications. That is, using the keys $(k^{GMP}_A)_j$, $k^{MPM}_A$, $(k^{GMP}_H)_j$, $(k^{GMD}_H)_j$ and $k^{MPM}_H$, for

some 1 ≤ j ≤ n, Tom would be able to produce a package P with format F4, holding M as its core

message and such that any party who would attempt to trace the message would conclude that P is a

package sent by the MPP to the MnDM whose principal components originated in GDj without being

tampered with in the process. A possible solution for this issue would be to not provide the MnDM with


the keys $(k^{GMP}_H)_i$ for all 1 ≤ i ≤ n. This way, Tom would still be able to read messages from, and
send messages to, the MPP, but he would not be able to produce a malicious package P and send it to the
DB without being flagged as compromised by a trusted authority. Apart from the aforestated problem, the
current key distribution does not give the MPP full control of the network, on the grounds that this
device does not hold any of the keys $(k^{GMD}_H)_i$ for 1 ≤ i ≤ n. Even if the MPP becomes compromised,
it does not possess all the necessary knowledge to trick the GDs or the MnDM into accepting corrupted
messages. Nevertheless, the possible compromise of any of the active elements of the network is a
subject of utmost concern.

4.1.2 Packing Schemes and Protocols

Apart from the IVs and HMACs there are two elements that may occur in the packages’ headers: the
gathering device identifier $d_{id}$ and the message identifier $m_{id}$. The former is intended to
provide the MPP with a means of knowing which key of the set
$K_1 = \{(k^{GMP}_A)_i : 1 \le i \le n \wedge i \in \mathbb{N}\}$ was used in the encryption, so that the
same key is used in the decryption, and/or of knowing to which GD the message must be sent, while the
latter is helpful in the GD’s memory management. According to the protocol described in section 3.4.2,
upon receiving a message, the MPP checks its integrity and authenticity and, if this verification
succeeds, an acknowledgment message containing $m_{id}$ is sent back to the envisaged GD. When the
latter receives the ACK, it can release the memory associated with the message having identifier
$m_{id}$, using a look-up table for instance.

Even though neither $d_{id}$ nor $m_{id}$ is immune to tampering or unintentional errors, the following

proposition holds.

Proposition 4.1.1. It is infeasible for an adversary to perform replay attacks or trick the MPP into as-

suming the gathering material is located elsewhere.

Proof. Let i ≠ j and let $F_j$ and $F_i$ be two gathering devices with identifiers $d^j_{id}$ and
$d^i_{id}$, located at positions X and Y, respectively. Consider that an adversary wants to carry out
malicious activities at location X, which would be detected by $F_j$. If he is aware of the presence of
$F_j$, he might attempt to tamper with the reports on the location by changing $d^j_{id}$ to
$d^i_{id}$; this way the information transmitted to the MPP would be that the malicious activity is in
effect at location Y instead of X. However, changing the GD identifier value to $d^i_{id}$ will result
in the MPP calling the PCuF of $P^1_S$ using the keys shared with $F_i$, meaning that the resulting
decrypted text would be distinct from the original plaintext, due to the injectivity of AES. Thus such
an attack becomes infeasible, since the adversary has a very thin margin of success. The resistance to
replay attacks follows directly from the WPA2 protocol [29].

Hereinafter, for simplicity, let $E^j_1(x)$ denote the PCgF with the global parameters a and b omitted,
for which the keys used in the encryption scheme are associated with $GD_j$.

Proposition 4.1.2. PSch $P^i_S$ grants secrecy, integrity and authenticity to the confidential
plaintext, for every $i \in I_5$.


Proof. Let A be an adversary and j ∈ I a fixed identifier representing the operational gathering device
$GD_j$. For some unknown confidential plaintext x suppose that A is in possession of $y := E^j_i(x)$,
the associated package ciphertext.

• i = 1: The secrecy of the plaintext follows directly from the secrecy property of the AES-CTR
encryption scheme; A cannot obtain x simply with the knowledge of y because the former is encrypted
with AES-CTR using the key $(k^{GMP}_A)_j \in K_1$, which is known only to the MPP and $GD_j$. Based on
figure C.2, recall that
$E^j_1(x) = d_{id} \,\|\, m_{id} \,\|\, IV \,\|\, G(\mathrm{AES}, \mathrm{CTR}, IV, (k^{GMP}_A)_j, w)$,
where $w = h_1(x) \,\|\, h_2(x) \,\|\, x$. Clearly the ciphertext itself carries no integrity or
authenticity protection, and the malleability of the AES-CTR encryption scheme gives the attacker an
opportunity to tamper with it. If A interferes with the ciphertext, then after decryption one ends up
with the word $w^* = h_1(x)^* \,\|\, h_2(x)^* \,\|\, x^*$, due to the properties of AES-CTR. It is very
unlikely that $h_i(x)^* = h_i(x^*)$ for all $i \in I_2$ because of the avalanche property of
cryptographic hash functions, which means that the adversary has a negligible probability of corrupting
an encrypted message without the plaintext’s integrity check detecting it. The integrity and
authenticity follow from the fact that the HMAC keys $(k^{GMP}_H)_j$ and $(k^{GMD}_H)_j$ are uniquely
distributed among the pairs ($GD_j$, MPP) and ($GD_j$, MnDM), respectively.

• The same arguments can be applied analogously for i = 2, . . . , 5.

The previous proposition expresses that the data going through the UDL is authenticated and not
tampered with in the process. However, once the data has reached the DB, how can an umpire prove to a
third party that the data is legitimate? The following proposition addresses this question.

Proposition 4.1.3 (Non-repudiation). The data stored in the DB is granted the non-repudiation property,
provided complete trust in the user of the PMS.

Proof. Let U be the user of the PMS, J the umpire, V the entity asking for the verification and y an
arbitrary message stored in the DB. In order for J to show V that y corresponds to a package ciphertext
of some message that originated at $GD_j$, he requires from U a simulation of the key generation stage.
Because U is the only one with access to the password used in that process, V, having access to the
correct PCuF f, just needs to execute f with the keys that were assigned to $GD_j$. If the output is a
valid message then all the verifications of the unpacking procedure succeeded and J has proved to V
that y is legitimate, as well as its secret contents.

Even though an adversary is not able to directly find the exact plaintext from its corresponding
ciphertext, this does not mean that the cryptosystem provides full security for the message’s secrecy.
It could be the case that A would gain information on the plaintext if the IV is reused¹, since the key
remains static for the device’s entire lifetime. At the GDs the IVs are generated through the standard
incrementing function, which means that the leak of information would only occur if the threshold of
the state space had been reached. Given that the IVs are fixed-sized words of 16 octets (128 bits), the
total number of distinct IVs is $2^{128} - 1$, which is a very large number.

¹ See section 2.6.6.1 for the key-reuse attack.

Every GD is limited to processing at most 256 kB of data per second and each plaintext’s length is
upper bounded by 256 octets; therefore the maximum number of data packages that a GD can process per
second is 1024. Continuously gathering data at such a rate, a GD would exhaust the space of possible
message identifiers ($m_{id}$) in around 48 days. Thus, the recommended lifetime for a GD under these
conditions is 1 month on average, which means that the danger of reusing a key under the current
encryption scheme is negligible. This conclusion confirms that the absence of a key scheduling
algorithm for the GDs was a good option, for it would be a waste of energy to perform the computations
to update the keys when there is virtually no threat from a brute-force attack.
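The figures in the previous paragraph can be reproduced with a short calculation. The sketch below assumes a 32-bit message identifier, which is the width that yields roughly 48 days at 1024 packages per second; the actual width of the field is not restated in this section, so this value should be read as an assumption. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def packages_per_second(rate_bytes: int = 256 * 1024, max_plaintext_bytes: int = 256) -> int:
    """Maximum number of packages per second at the stated processing rate."""
    return rate_bytes // max_plaintext_bytes

def days_to_exhaust(space_size: int, rate: int) -> float:
    """Days needed to consume `space_size` distinct values at `rate` values per second."""
    return space_size / (rate * SECONDS_PER_DAY)

rate = packages_per_second()
mid_days = days_to_exhaust(2 ** 32, rate)       # 32-bit identifier: an assumption
iv_days = days_to_exhaust(2 ** 128 - 1, rate)   # 128-bit IV space from the text
```

Under these assumptions the identifier space is the first to run out, after a little over 48 days, whereas the IV space would last for a number of days far beyond any device lifetime.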

4.1.2.1 Semantic security

The leak of partial information of the plaintext from the ciphertext is an undesired property that must

be seen as a very dangerous threat when exploited by a capable adversary. This is the notion of

semantic security introduced in chapter 2. The following proposition is very important for it expresses

the level of security of the message formats inherent to the packing schemes PiS ∀i ∈ I5.

Proposition 4.1.4. For every $i \in I_5$, the PSch $P^i_S$ is semantically secure against
chosen-plaintext attacks.

Proof. The fields $d_{id}$ and $m_{id}$ within message format $F_1$ are independent of the plaintext,
and so A cannot infer any relation with the associated plaintext. As has already been stated, the IV
must be exposed as plaintext for the security of the AES-CTR encryption scheme, and it does not leak any
information for the adversary to exploit. Therefore, for i = 1, . . . , 5 the semantic security of
$P^i_S$ follows from the IND-CPA security of AES-CTR proved in [2] and from the equivalence between
IND-CPA and SEM-CPA proved in [39].

Informally, and assuming every confidential plaintext to be equally sized, this means that the package
ciphertext does not reveal any information whatsoever about the confidential plaintext. That is, for
any two adversaries A and B who are given two confidential plaintexts $x_0$ and $x_1$, where
$(x_0 \neq x_1) \wedge (|x_0| = |x_1|)$, and such that B is also given the package ciphertext
$E^j_i(x)$ for any $i \in I_5$ and j ∈ I, B has no advantage over A when trying to discover relevant
information about $x_b$, for all $b \in \mathbb{Z}_2$.

However, it so happens that the confidential plaintexts are not all of equal length and no padding
method is used in the packing schemes. As a consequence, POAs are infeasible and the process is more
efficient, but the ciphertext reveals the plaintext’s length to the attacker. This unfortunate leak of
information makes the cryptosystem vulnerable to chosen-plaintext attacks in which the adversary is able
to pick plaintexts of non-equal length: an adversary with knowledge of two plaintexts $p_0$ and $p_1$
such that $|p_0| \neq |p_1|$ would be able to decide which of them corresponds to the oracle’s answer
$c_b$, for some random b ∈ {0, 1}, with probability deviating from 1/2, given that
$|c_b| = h + |p_b|$, where h is the message header’s length.
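The length leak can be made concrete with a small simulation of the distinguishing game. In this sketch the ciphertext is modelled by random bytes of length h + |p_b|, which is all the adversary needs to observe; the header length `HEADER_LEN` and the function names `oracle` and `guess` are assumptions introduced for illustration only.

```python
import os
import secrets

HEADER_LEN = 85  # header length h in bytes; an assumption of this sketch

def oracle(p0: bytes, p1: bytes):
    """Pick a hidden bit b and return it with a blob shaped like the ciphertext of p_b.

    CTR encryption preserves length, so random bytes of the same length are a
    sufficient model for what the adversary sees.
    """
    b = secrets.randbelow(2)
    pb = (p0, p1)[b]
    return b, os.urandom(HEADER_LEN) + os.urandom(len(pb))

def guess(p0: bytes, p1: bytes, c: bytes) -> int:
    """Adversary's strategy: compare the body length of c with |p0| and |p1|."""
    if len(p0) == len(p1):
        return secrets.randbelow(2)  # equal lengths: no better than a coin flip
    return 0 if len(c) - HEADER_LEN == len(p0) else 1
```

When the two plaintexts differ in length the guess is always correct, which is why the indistinguishability argument only applies to equal-length plaintexts.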


Let A be the event corresponding to the storage in memory of package ciphertexts² whose corresponding
confidential plaintexts have lengths $L_j$ for $j \in I_k$ according to equation 3.1, and let B be the
event analogous to A but for which the confidential plaintexts are all partitioned into equal-length
subwords of size l prior to being ciphered and stored. Let ∆ represent the difference between the memory
payoff for k messages in event A and the memory payoff associated with event B. Then

$$\Delta = h \sum_{j=1}^{k} (m_j - 1) \qquad (4.1)$$

The memory restriction imposed on the GDs is very unforgiving, as shown in the following example
analysis of a worst-case situation. First, note that the PSchs associated with the GDs are $P^i_S$ for
i = 1, 2, 3 and let $h = \max_{i \in I_3} h_i$, where $h_i = |y|_2 - |x|_2$ for any $y \in S_i$ and
$x \in \Sigma^*$. Furthermore, consider a worst-case scenario where l = 1 byte and $L_j = 2^{11}$ for
all $j \in I_k$, thus $m_j = 2^8$ for all $j \in I_k$. This means that $\Delta = (2^8 - 1)kh$. The GDs’
total allocated memory for storing the messages equals $2^{21}$ bits according to section 3.2. If event
A is executed instead of B, then the maximum number of messages that the GD at hand can store at a time
in its solid memory is

$$k = \frac{2^{21}}{2^{11}} = 2^{10} \qquad (4.2)$$

On the other hand, on the occurrence of event B, one has

$$2^{21} = \Delta + k\,2^{11} \Rightarrow k \simeq 11.95 \qquad (4.3)$$

which means that the maximum number of messages that the GD can store in this case is 11, under the
same restrictions. This example highlights the relevance of memory optimization within the GD.
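The worst-case computation of equations 4.2 and 4.3 can be replayed numerically. The header length h is not given in this section; the value of 680 bits used below is back-solved from the reported k ≈ 11.95 and should be treated as an assumption of this sketch. The function names are hypothetical.

```python
MEMORY_BITS = 2 ** 21   # storage allocated to messages, in bits
L_BITS = 2 ** 11        # length of every confidential plaintext, in bits
SUBWORD_BITS = 8        # partition size l = 1 byte
HEADER_BITS = 680       # header length h; back-solved assumption, not a stated value

def delta(h: int, m_values) -> int:
    """Extra header cost of partitioning: h * sum(m_j - 1), as in equation 4.1."""
    return h * sum(m - 1 for m in m_values)

def max_messages_event_a(memory: int, length: int) -> int:
    """Event A: unpartitioned messages, header cost ignored as in equation 4.2."""
    return memory // length

def max_messages_event_b(memory: int, length: int, l: int, h: int) -> float:
    """Event B: solve memory = delta + k * length for k, as in equation 4.3."""
    m = length // l
    return memory / ((m - 1) * h + length)

k_a = max_messages_event_a(MEMORY_BITS, L_BITS)
k_b = max_messages_event_b(MEMORY_BITS, L_BITS, SUBWORD_BITS, HEADER_BITS)
```

With these inputs `k_a` is 1024 and `k_b` is just under 12, so only 11 whole messages fit once every one-byte subword carries its own header.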

Next, an analysis with respect to a stronger type of attack is performed: the adversary has access not
only to an encryption oracle but also to a decryption oracle. This is considered to be the strongest
level of semantic security [40].

Proposition 4.1.5. For every $i \in I_5$, PSch $P^i_S$ is not semantically secure against
chosen-ciphertext attacks.

Proof. Let A be an adversary playing the IND-CCA game [2] with the words $w_0 = 0^n$ and $w_1 = 1^n$ for
some n ∈ N and, w.l.o.g., fix j ∈ I. Let $O_e$ and $O_d$ be the oracles with access to the PCgF and PCuF
of $P^1_S$, respectively. The following strategy grants A a non-negligible IND-CCA advantage:

1. A queries $O_e$ with $(w_0, w_1)$;

2. $O_e$ encrypts $w_b$ into $y := E^j_1(w_b)$, for $b \in \mathbb{Z}_2$, and returns it to A;

3. A flips the last bit of y and obtains y′;

4. A queries $O_d$ with y′;

5. $O_d$ returns $w'_b$;

6. If $w'_b = 0^{n-1}1$ then A chooses b′ = 0. If $w'_b = 1^{n-1}0$ then A chooses b′ = 1.

² In this case it is irrelevant which PCgF was used to generate the package ciphertext.

The last step is only feasible due to the error propagation properties of the CTR mode of operation
[14]. Thus, A can tell to which confidential plaintext the package ciphertext belongs with probability
far from 1/2, meaning that $P^1_S$ is not IND-CCA secure. The result follows from the equivalence
between semantic security and ciphertext indistinguishability in [3].
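The strategy above can be simulated on the counter-mode layer alone. The sketch below is a simplification under stated assumptions: SHA-256 stands in for the AES block function, the header and HMAC fields of the package are omitted, and the decryption oracle returns the decrypted word as in step 5 of the strategy. The names `ctr_xor` and `play_game` are hypothetical.

```python
import hashlib
import secrets

def ctr_xor(v: bytes, k: bytes, data: bytes) -> bytes:
    """Toy counter-mode keystream XOR; SHA-256 stands in for AES (assumption)."""
    start = int.from_bytes(v, "big")
    out = bytearray()
    for i in range(0, len(data), 16):
        ctr = ((start + i // 16) % (1 << 128)).to_bytes(16, "big")
        ks = hashlib.sha256(k + ctr).digest()[:16]
        out.extend(a ^ b for a, b in zip(data[i:i + 16], ks))
    return bytes(out)

def play_game(n_bytes: int = 8):
    """Run one round of the game; return the hidden bit b and the adversary's guess."""
    k, v = secrets.token_bytes(16), secrets.token_bytes(16)
    w0, w1 = bytes(n_bytes), b"\xff" * n_bytes       # the words 0^n and 1^n
    b = secrets.randbelow(2)
    y = ctr_xor(v, k, (w0, w1)[b])                   # steps 1-2: encryption oracle
    y_flipped = y[:-1] + bytes([y[-1] ^ 1])          # step 3: flip the last bit
    w_prime = ctr_xor(v, k, y_flipped)               # steps 4-5: decryption oracle
    guess = 0 if w_prime == w0[:-1] + b"\x01" else 1  # step 6
    return b, guess
```

Since a flipped ciphertext bit flips exactly the same plaintext bit, the guess matches b in every round, which corresponds to an advantage of 1/2 in this simplified setting.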

For i = 2, 3 and 5 the proof is analogous, with a single remark for the case i = 5, in which the

associated PCgF makes a double call to the ciphertext generator operator G whose cryptosystem and

BCMO arguments are AES and CTR, respectively (see figures C.7 to C.11). In this case, there are two

layers of encryption on the confidential plaintext but the result is the same as in the previous cases due

to the direct error transmission property of CTR. That is, flipping the last bit of the package ciphertext

will entail that the last bit of the decryption of the outer layer of CTR encryption is also flipped, which will

imply the last bit of the plaintext to be wrong after the final decryption, leading to the same outcome.

For i = 4 one faces an encrypt-then-MAC procedure, which is the only layout that could possibly be
IND-CCA secure. However, the fact that the IV is not covered by the HMAC entails that $P^4_S$ is also
IND-CCA insecure. In fact, note that if an adversary has access to a decryption oracle and the HMAC
does not include the IV, then the adversary can change the value of the IV at will in order to obtain
the keystream for the new IVs. Hence A will be able to decrypt any message that was encrypted using any
of these new IVs.

The previous result states that an adversary with (temporary) access to a decryption oracle may gain
enough knowledge to perform a partial or, in a worst-case scenario, complete break of the cryptosystem.
This could be prevented if an encrypt-then-MAC mechanism were adopted and the header of the message
were covered by the HMAC.
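A possible shape for such a mechanism is sketched below: the tag is computed over the header, the IV and the ciphertext, and it is verified before anything is decrypted. This is an illustrative sketch and not one of the packing schemes defined in this work; SHA-256 again stands in for the AES block function, HMAC-SHA-256 is assumed for the tag, and the names `seal` and `open_package` are hypothetical.

```python
import hashlib
import hmac

TAG_LEN, IV_LEN = 32, 16

def ctr_xor(v: bytes, k: bytes, data: bytes) -> bytes:
    """Toy counter-mode keystream XOR; SHA-256 stands in for AES (assumption)."""
    start = int.from_bytes(v, "big")
    out = bytearray()
    for i in range(0, len(data), 16):
        ctr = ((start + i // 16) % (1 << 128)).to_bytes(16, "big")
        ks = hashlib.sha256(k + ctr).digest()[:16]
        out.extend(a ^ b for a, b in zip(data[i:i + 16], ks))
    return bytes(out)

def seal(header: bytes, v: bytes, k_enc: bytes, k_mac: bytes, x: bytes) -> bytes:
    """Encrypt x, then authenticate header || IV || ciphertext."""
    c = ctr_xor(v, k_enc, x)
    t = hmac.new(k_mac, header + v + c, hashlib.sha256).digest()
    return header + v + c + t

def open_package(header_len: int, k_enc: bytes, k_mac: bytes, package: bytes) -> bytes:
    """Verify the tag over everything that precedes it, and only then decrypt."""
    body, t = package[:-TAG_LEN], package[-TAG_LEN:]
    expected = hmac.new(k_mac, body, hashlib.sha256).digest()
    if not hmac.compare_digest(t, expected):
        raise ValueError("rejected: header, IV or ciphertext was modified")
    v = body[header_len:header_len + IV_LEN]
    return ctr_xor(v, k_enc, body[header_len + IV_LEN:])
```

With this layout both the bit-flipping strategy and the IV substitution described in the proof are rejected before decryption, at the cost of one extra tag per package.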

4.1.2.2 Encryption Schemes

Now that the details of the chosen packing schemes have been presented, it is time to discuss whether
the encryption schemes involved in their construction were the right choices for the job. In order to
make use of asymmetric cryptosystems in the development of digital signatures there is the need for a
certificate authority (CA) whose role is to evaluate the authenticity of the messages travelling
throughout the nodes of the network. The need for such an entity immediately rules out this option,
because this trusted entity would have to provide all the required certificates on the fly and, as
discussed in chapter 3, the chosen network topology does not include any permanent party on the field
other than the GD. RSA is the only asymmetric cryptosystem implemented in the device’s hardware and
there is currently no middleware developed for calling this mechanism from software applications.
Moreover, the memory overhead entailed by the usage of asymmetric cryptographic systems and the
increased key size make these types of systems unsuitable under the current restrictions, either for
authentication or privacy purposes, when compared with methods based on cryptographic hash functions.
Hence the choice of symmetric methods is considered to be the best option.


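| Feature             | CTR                                | CFB                                                                                    |
|---------------------|------------------------------------|----------------------------------------------------------------------------------------|
| Encryption          | Parallelizable                     | Non-parallelizable                                                                     |
| Decryption          | Parallelizable                     | Parallelizable                                                                         |
| Transmission errors | Only the wrong bits are affected   | Affects the wrong bits in the current block; completely destroys the following blocks |
| IV                  | May be predictable, must be unique | May be predictable, must be unique                                                     |

Table 4.1: Comparison between CTR and CFB features.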

AES was the cipher chosen to provide secrecy to the confidential messages. The fact that it is a
long-established standard published by NIST, a very fast algorithm and hardware-implemented in the GDs
makes it the most suitable choice for the case. CTR was the mode of operation chosen to deal with
messages of variable length, and it turns the encryption scheme into a stream cipher³. Since ECB is not
semantically secure, choosing CTR over ECB was the right decision. As for the CBC mode, it requires an
unpredictable IV because it is susceptible to a predictable IV attack, as described in algorithm 3,
while CTR only requires the IV to be unique, a relevant fact for the decision made because it improves
the execution time of the encryption procedures in devices for which power consumption is critical.
Table 4.1 specifies some features of CTR and CFB in order to discuss their applicability to the
presented problem and, as one can observe, the critical consequences of transmission errors and the
non-parallelizable encryption are facts that lead to the rejection of CFB in favour of CTR.

As discussed so far, the CTR mode of operation is the best option among the classical block cipher
modes of operation ECB, CBC and CFB under the constraints at hand. The fact that the CTR mode with
HMAC-256 checksum (CTR-H) was chosen over CCM and GCM is not a very straightforward outcome, since the
latter are two authenticated modes which have been well studied and perform better, especially in terms
of memory optimization; for CTR to achieve the same level of security as the previously mentioned
authenticated modes it will definitely perform poorly with respect to memory optimization due to the
added length of the MAC. Notwithstanding, it is in fact faster to execute the CTR-H mode of operation
than either of the other two modes because the former is implemented in the chosen device’s hardware,
as opposed to the latter; CCM is in fact a very slow mode due to its double block encryption
procedures. Also, note that in this case, because of the device’s energy consumption, it is preferable
to choose an encryption algorithm with lower execution time rather than lower memory usage: there is an
upper bound on the data gathering rate and the memory associated with the gathered messages is released
upon confirmation of their arrival at the recipient, meaning that the available 256 kB suffices for the
storage of the messages. Moreover, the deployment of the devices onto the field of action requires not
only financial but also human resources, hence the durability of the devices is of utmost importance.
Another strong argument for the choice of CTR mode alongside an HMAC is the fact that it makes use of
an extra key (the authenticity and privacy keys are distinct) when compared with the authenticated modes

³ See section 2.6.6 for attacks on stream ciphers.


of operation GCM and CCM, which use a single key for the whole process. The devices’ short lifetime
entails a system that is highly resilient against brute-force attacks and is the feature that justifies
the absence of a key scheduling algorithm.

With this in mind, the choice of CTR-H over all the other presented modes is considered to be suitable
for the situation, provided that the authentication mechanism is used well, i.e., in a way that
prevents an attacker from exploiting weaknesses of the encryption scheme at hand, such as malleability.

4.1.3 Attacks

This subsection discusses some of the aspects that may be exploitable by an adversary in practice,
under the assumption that all the restrictions imposed on the network and its elements hold. In
practice, the GDs are deployed at a fixed location in order to keep on gathering data. It is then
possible for an adversary to physically tamper with the devices, which is why the data is encrypted in
solid memory in the first place. Let A be a polynomial-time bounded adversary with physical access to
the GDs. If A tries to read the memory with the ambition of directly retrieving the secret data
gathered by the GDs, then he/she will struggle to do so because the data is encrypted using the PSchs
$P^i_S$ defined in equation 3.11 and, as seen in section 4.1, these are semantically secure except
against chosen-ciphertext attacks. Moreover, the adversary will not be able to retrieve the keys from
memory because these are stored in a memory location of restricted access [41]. Now assume that the GDs
are deployed at time 0 and that at time t there is an event E which will directly or indirectly provide
information to be gathered by some GD. The latter will get to know this information upon collecting the
data within the time interval T = [t, t + ε[, where ε > 0 is the duration of the intelligence leak of
the event E. Suppose also that A is aware of both E and ε. Then, the adversary just needs to interfere
with the envisaged GDs during the time interval T in order to prevent them from collecting any of the
relevant data. Thus, for an intelligent adversary who is able to physically interfere with the GDs,
this technique is less likely to be spotted by a tamper detection mechanism, while allowing A to
optimize his/her energy consumption on the attack.

Another attack that could be performed is the reading of volatile memory [42] by a capable adversary.

No message is ever stored as plaintext in solid memory, but both before encryption and after decryp-

tion, the confidential plaintext is automatically stored in volatile memory, even if for a short time frame.

Nonetheless, this time frame may suffice for the adversary to harvest the secret information.

Assuming that A cannot physically harm or interfere with the GDs, a straightforward attack would be
for the adversary to perform a MiM attack known as a wireless denial-of-service attack, which consists
in jamming the communication channel with junk data, making the exchange of data between the GDs and
the MPP impossible, provided that he finds the correct frequency. This method can be seen as a last
resort for an adversary who is unable to partially or completely break the cryptosystem, for it would
be in his interest to stealthily eavesdrop on the communications in order to eventually tamper with the
messages or acquire information and change his strategy accordingly.


4.2 Possible Solutions

Some suggested solutions for the problems discussed in the previous section are now presented.

It is important to note that these are just suggestions that aim to improve the selected choices. Even
though they may seem better from a theoretical point of view, in practice these approaches may turn out
to be unfit for a variety of reasons. The reader should notice that the possible solutions to the issues
related to the key generation stage have already been discussed in section 4.1.1.

A reasonable approach to prevent an attack where the adversary takes advantage of the physical
exposure of the device is to choose a device whose hardware provides a tamper detection mechanism
and memory management [43] such that, in case of memory compromise, it clears all the memory
associated with the confidential data. This is a last resort and leads to the disablement of the device
at hand. With respect to the jamming attack on the wireless communication channel, there is no way
to prevent it from happening. The only solution would be to wire the connection, where the adversary
would be unable to jam the channel without being targeted by the MPP.

4.2.1 Chosen-plaintext attack

Let x1, . . . , xk be confidential plaintexts for some k ∈ N and y1, . . . , yk their corresponding package
ciphertexts, respectively. Recall that, for every i ∈ I5, PiS is not secure against chosen-plaintext attacks in
which the adversary is able to pick plaintexts of distinct length. It is known from equation 3.1 that the
length Lj of confidential plaintext xj is a multiple of a minimum length l, i.e.,

$$L_j = l\,m_j \quad \forall j \in I_k \tag{4.4}$$

Furthermore, let h be the length of the header of a package ciphertext, which is assumed to be fixed for
any PSch, for simplicity. If one does not consider the header's overhead, then it is equivalent
to partition each message of length Lj into mj messages of length l, and the system would become
resistant to variable-length chosen-plaintext attacks. This is a purely theoretical solution to the problem
because the assumption of the absence of the header does not hold in practice. The only way to
overcome this issue is to define all confidential plaintexts to have the same length.
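As a minimal sketch of the fixed-length requirement, the following Python fragment pads every confidential plaintext to one common length before it is packed. The 2-octet length prefix, the zero padding, the value of FIXED_LEN and the function names are assumptions made for illustration only; they are not part of the packing schemes defined in this work.

```python
import struct

FIXED_LEN = 64  # hypothetical common plaintext length, in octets


def pad_plaintext(x: bytes, fixed_len: int = FIXED_LEN) -> bytes:
    """Return exactly fixed_len octets: 2-octet length, x, then zero padding."""
    if len(x) > fixed_len - 2:
        raise ValueError("plaintext does not fit in the fixed-length block")
    header = struct.pack(">H", len(x))  # big-endian length of the real data
    return header + x + b"\x00" * (fixed_len - 2 - len(x))


def unpad_plaintext(block: bytes) -> bytes:
    """Recover the original plaintext from a padded block."""
    (length,) = struct.unpack(">H", block[:2])
    return block[2:2 + length]
```

With such a convention every padded plaintext has the same length, so the length of the resulting package ciphertext no longer depends on the message chosen by the adversary.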

4.2.2 Chosen-ciphertext attack

Proposition 4.1.5 refers to the fragility of the PSchs PiS with respect to chosen-ciphertext attacks.
This result holds under the assumption that the adversary A has access to a decryption oracle.
However, in practice there is no feasible way for A to be granted such a resource. The only way A
would be able to succeed would be to impersonate an element of the network, use the correct[4] PCgF,
send the result to the end party and then extract the plaintext from its volatile memory at the
moment of decryption. This approach is infeasible in practice because no element of the network can
be impersonated by a polynomial-time bounded adversary, as shown in proposition 4.1.2.

[4] By correct one means the correct function using the correct keys.


Chapter 5

Implementation Details

This chapter covers the majority of the developed code by illustrating some pieces of pseudocode
and, in some cases, analysing their complexity. Two mock-up applications were created:
one is related to the key generation step and the other to the data flow of the discussed
network. Appendix B contains a user guide for the first one. The application concerning the
communication is not presented in this text due to the confidential nature of its execution requirements.
The code for the GDs has been developed in the C programming language, whilst all the remaining code
is written in Java. Clearly, low-level programming allows a more versatile working environment,
but this advantage is offset by the burden of manual performance and memory optimization.

5.1 Key Generation

As described in section 3.4.1 all the keys are generated in the PMS. This process is triggered by a

user-input on the key generation program specified by Algorithm 5. The latter depends on Algorithm 4,

which is the atomic procedure for securely generating keys of a given size.

The PSK for the WPA2 protocol is generated as in section 2.5.2:

PBKDF2(HMAC-SHA-1, wipass, ssid, 4096, 32) (5.1)

where ssid is the wireless network’s SSID and wipass is the password shared by the authorized parties,

i.e., the GD and the MPP. The primitives within these two devices used for the WPA2 secured wireless

network require the SSID and password as input, since the generation of the PMK occurs internally.

Hence, for the generation of the Wi-Fi pre-shared key it suffices for the tuple (ssid, wipass) to be
provided. Therefore the trusted user starts by inputting the SSID and password for the WLAN as well as the
number of GDs and a seed. Informally, the seed can be seen as a password that makes
the pseudo-random key generation process behave deterministically, meaning that the knowledge of
that seed may serve as proof of the authenticity of the keys and validate any future report, if needed.
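For illustration, expression 5.1 can be evaluated with Python's standard library as sketched below. The function name wpa2_psk and the UTF-8 encoding of the two strings are assumptions of this sketch; in the deployed system the PSK is derived internally by the devices' WPA2 primitives.

```python
import hashlib


def wpa2_psk(wipass: str, ssid: str) -> bytes:
    """PBKDF2(HMAC-SHA-1, wipass, ssid, 4096, 32): the 32-octet WPA2 pre-shared key."""
    return hashlib.pbkdf2_hmac(
        "sha1",                   # underlying PRF is HMAC-SHA-1
        wipass.encode("utf-8"),   # password shared by the GD and the MPP
        ssid.encode("utf-8"),     # the SSID plays the role of the salt
        4096,                     # iteration count
        32,                       # derived key length in octets
    )
```

The same (ssid, wipass) pair always yields the same key, which is why it suffices to provide this tuple to both devices.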

Let $\Sigma_K = \mathbb{Z}_{256}$ be the key alphabet and consider $A = \{k : k \in \Sigma_K^{16}\}$ and
$H = \{k : k \in \Sigma_K^{32}\}$ to be the sets of keys generated for encryption and authentication,
respectively. The key generation process for the elements of A and H is very similar, with the exception
of the key length. According to [44], the length of the keys in H must be at least equal to the length of
the output of the hash function used, i.e., 32 octets.

Algorithm 4 contains a high-level representation (in pseudocode) of the generation step for the
encryption and authentication keys. The main layout of the algorithm is as follows: a customized process is
applied to the input secret in order to output a value that will be used as a seed for the cryptographic hash
function SHA-1. This function is used as a PRF in order to produce the salt used in the PBKDF2 function
that generates the key. Two important methods of the algorithm are explained next.

InitRand(pass) is a customized deterministic procedure that makes use of the user-input pass ∈ Σ∗
in order to seed a CSPRNG (namely the SHA-1 function) which produces the salt used in the
password-based key derivation function. The general idea behind it is to process the input in order to increase
the password complexity in such a way that an adversary who is able to obtain the password chosen by
the user does not immediately know the seed used in the generation of the salt. Let $n = \lfloor l_2/32 \rfloor$ be the
number of 32-bit blocks of pass, where $l_2$ is the binary length of pass, let $p_i$ be the binary representation
of the ith block of 32 bits for all $i \in I_n$ and let $p_{n+1}$ be the last block with k bits, for some 0 ≤ k < 32. It
is useful to convert each of the blocks $p_i$ into its decimal representation $x_i$ ∈ Z256 to perform the
necessary addition operations. The seed s of the SHA-1 pseudo-random function is computed out of
two sub-seeds $s_1$ and $s_2$. The first is given by:

$$s_1 = \sum_{i=1}^{n} x_i + x^{*}_{n+1} \tag{5.2}$$

where $x^{*}_{n+1}$ is the decimal representation of $p_{n+1} \,\Vert\, 0^m$ for m = 32 − k whenever k > 0, or $x^{*}_{n+1} = 0$ if
k = 0. Figure 5.1 describes the aforementioned method.

[Figure 5.1: Pre-processing steps of the secret pass for the generation of the seed of the SHA-1
pseudo-random function. The character string pass is converted into a binary string, which is split into
the 32-bit blocks p1, p2, . . . , pn and a last block pn+1 of k bits (0 < k < 32) that is padded to the 32-bit
block pn+1 ‖ 0^m; the decimal values x1, x2, . . . , xn and x∗n+1 are then added to obtain s1.]
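The computation of the first sub-seed (equation 5.2) can be sketched in Python as follows. The sketch assumes that each character of pass is encoded with 8 bits (ASCII) and that blocks are read most significant bit first; both choices, as well as the function name sub_seed_s1, are illustrative assumptions and not taken from the implementation.

```python
def sub_seed_s1(passwd: str) -> int:
    """Sum the 32-bit blocks of pass, zero-padding the last partial block."""
    data = passwd.encode("ascii")                # assumption: 8 bits per character
    bits = "".join(f"{byte:08b}" for byte in data)
    n, k = divmod(len(bits), 32)                 # n full blocks, k leftover bits
    total = sum(int(bits[32 * i:32 * (i + 1)], 2) for i in range(n))
    if k > 0:
        last = bits[32 * n:] + "0" * (32 - k)    # p_{n+1} || 0^m with m = 32 - k
        total += int(last, 2)
    return total
```

For a 5-character password, for instance, there is one full block and a last block of k = 8 bits padded with m = 24 zeros.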

As for the computation of the sub-seed $s_2$, let l be the length of pass with respect to elements of Σ
and consider

$$X = \Bigl(\mathop{\big\Vert}_{i=1}^{n} x_i\Bigr) \mathbin{\Vert} x^{*}_{n+1} \tag{5.3}$$

Clearly, X ∈ Z is a value whose number of digits is greater than or equal to l because in the worst case
each $x_i$ contains a single digit, which means that one can compute the second sub-seed as $s_2 = X|_l$,
and the value used for seeding the SHA-1 cryptographic hash function is given by

$$s = s_1 + s_2 \tag{5.4}$$

GenerateSalt() is a procedure that calls the previously seeded SHA-1 CSPRNG in order to produce
a value composed of 16 octets. The value returned by this method is used as salt for the PBKDF2
function.

Algorithm 4 Key generation for a fixed length

1: procedure KEYGEN(n, pass, len) ▷ Generating n keys with len bits under the password pass

2: keystream← null;

3: InitRand(pass);

4: for r = 1 to n do

5: salt← GenerateSalt();

6: key ← PBKDF2(hmac-sha-1, pass, salt, 10000, len);

7: keystream← keystream ‖ key;

8: end for

9: return keystream;

10: end procedure
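A minimal Python sketch of algorithm 4 is given below. The PBKDF2 call mirrors step 6 of the pseudocode, whereas the class Sha1SaltGenerator and the seed used in init_rand are hypothetical stand-ins: the actual InitRand derives the seed s = s1 + s2 as described above, and the internal construction of the SHA-1 based generator is not specified in this text. The sketch returns the keys as a list rather than as the concatenated keystream.

```python
import hashlib
from typing import List


class Sha1SaltGenerator:
    """Hypothetical stand-in for the seeded SHA-1 generator behind GenerateSalt()."""

    def __init__(self, seed: int) -> None:
        self._state = hashlib.sha1(str(seed).encode("ascii")).digest()
        self._counter = 0

    def generate_salt(self) -> bytes:
        """Return a fresh 16-octet salt derived from the state and a counter."""
        self._counter += 1
        block = hashlib.sha1(self._state + self._counter.to_bytes(8, "big")).digest()
        return block[:16]


def init_rand(passwd: str) -> Sha1SaltGenerator:
    # Placeholder seed (assumption): the real InitRand computes s = s1 + s2.
    seed = int.from_bytes(passwd.encode("utf-8"), "big")
    return Sha1SaltGenerator(seed)


def keygen(n: int, passwd: str, length_bits: int, iterations: int = 10000) -> List[bytes]:
    """Generate n keys of length_bits bits under the password passwd."""
    rng = init_rand(passwd)
    keys = []
    for _ in range(n):
        salt = rng.generate_salt()
        key = hashlib.pbkdf2_hmac(
            "sha1", passwd.encode("utf-8"), salt, iterations, length_bits // 8
        )
        keys.append(key)
    return keys
```

Because every step is deterministic, running keygen twice with the same arguments reproduces the same keys, which matches the role of the seed as proof of authenticity of the keys.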

After calling GLOBALKEYGEN(n, pass) there are 3n + 2 keys in stack memory, out of which the first
n + 1 are designated to be used by the AES algorithm and the remaining 2n + 1 to be used in the
HMAC-SHA-256 algorithm. Let k1, . . . , k3n+2 be all the keys generated, ordered as they were generated, i.e.,
k1 was the first and k3n+2 the last. Then, the following correspondences have been defined:

• (kGMPA )i := ki for 1 ≤ i ≤ n.

• kMPMA := kn+1.

• (kGMPH )i := kj ∀i,j : (n + 2) ≤ j ≤ (2n + 1) and i = j − (n + 1).

• (kGMDH )i := kj ∀i,j : (2n + 2) ≤ j ≤ (3n + 1) and i = j − (2n + 1).

• kMPMH := k3n+2.

Algorithm 5 Key generation algorithm

1: procedure GLOBALKEYGEN(n, pass) ▷ Generating the 3n + 2 keys of the network under the password pass

2: a← KEYGEN(n+ 1, pass, 128);

3: b← KEYGEN(2n+ 1, pass, 256);

4: return a ‖ b;

5: end procedure
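The correspondence between the ordered keys k1, . . . , k3n+2 and their roles can be sketched as a simple slicing operation. The sketch reads the fourth group of keys as the GD-MnDM authentication keys (kGMDH) used in algorithms 12 and 14, which is an assumption; the dictionary labels and the function name assign_keys are likewise illustrative.

```python
from typing import Dict, List


def assign_keys(keys: List[bytes], n: int) -> Dict[str, object]:
    """Split the 3n + 2 generated keys (in generation order) by role."""
    if len(keys) != 3 * n + 2:
        raise ValueError("expected exactly 3n + 2 keys")
    return {
        "k_GMP_A": keys[0:n],                  # k_1 .. k_n: AES keys, GD <-> MPP
        "k_MPM_A": keys[n],                    # k_{n+1}: AES key, MPP <-> MnDM
        "k_GMP_H": keys[n + 1:2 * n + 1],      # k_{n+2} .. k_{2n+1}: HMAC keys, GD <-> MPP
        "k_GMD_H": keys[2 * n + 1:3 * n + 1],  # k_{2n+2} .. k_{3n+1}: HMAC keys, GD <-> MnDM
        "k_MPM_H": keys[3 * n + 1],            # k_{3n+2}: HMAC key, MPP <-> MnDM
    }
```

Since Python lists are 0-indexed, key kj of the text is stored at position j − 1.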


Considering n to be the total number of GDs, the time complexity of algorithm 4 is O(n). In fact,
note that InitRand(pass) ∈ O(m) for m = |pass|. However, the restriction imposed on the password
(8 ≤ |pass| ≤ 63) entails an upper bound on the run time of this procedure, independently of the chosen
password. Thus, under this constraint, the time of InitRand is upper bounded by a
constant, i.e., O(1). The password-based key derivation function PBKDF2 is called with a fixed iteration
count and output length, so it runs in time O(1); therefore the n-step cycle runs in time O(n). The plots
presented in figures 5.2a and 5.2b show how the time spent in the key generation process grows with the
total number of GDs. The y-axis contains the average time for the generation of the key set and the x-axis
the number of GDs; each point in figures 5.2a and 5.2b is the average of a set of 100 and 20 values,
respectively. The first plot shows the results obtained for a real-world situation, which would become
impractical if the number of deployed devices exceeded the maximum presented magnitude (100),
and the second shows the results for a theoretical large number of GDs (5000).

[Figure 5.2: Scatter plots of the average key generation time per number of gathering devices:
(a) up to 100 GDs; (b) up to 5000 GDs.]

Note that there is a slight variation in the average key generation time when the number of
GDs is near 85, depicted in figure 5.2a. This variation can be explained by the usage of the CPU by
some external processes (e.g. operating system processes). Both scatter plots are in agreement with
the stated time complexity for the generation of the keys. Of course, these plots do not prove the
asymptotically linear time complexity but give an insight into what happens in practice.

5.2 Data Processing

In order to make use of the keys generated according to the previous section, all the elements of the
network must agree on algorithms that pack and unpack the data into each of the formats Fi, i ∈ I5,
while making the required verifications. These can be seen as implementations based on the protocol
described in section 3.4.2.

Let I = {k ∈ Z : k is a GD identifier} and consider the function valid : Z → Z2 given by

$$\mathrm{valid}(f) = \begin{cases} 1 & \text{if } f \in I \\ 0 & \text{otherwise} \end{cases} \tag{5.5}$$

Clearly |I| equals the total number of deployed GDs, meaning that there are only |I| values
in the domain of valid for which this function returns 1. This function is very useful in the upcoming
discussion because it determines whether a given 4-octet field represents a valid gathering device's
identifier; the associated message should be discarded whenever it returns 0. Moreover, consider
CTR-ENC^C_v(k, x) to be the result of the encryption of x with the block cipher mode of operation CTR
using the block cipher C, initialization vector v and key k, where the initial blocks given as input to C are
built according to equation 2.18. Analogously, CTR-DEC^C_v(k, y) represents the decryption operation on y.
Lastly, let HMAC-SHA-256(k, m) be the result of the HMAC-SHA-256 function applied to m using k as key,
in accordance with equation 2.22.
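The two helper functions that do not depend on the block cipher can be sketched with Python's standard library as shown below. Reading the 4-octet field as a big-endian unsigned integer and passing the set of identifiers I as an argument are assumptions of this sketch.

```python
import hashlib
import hmac
import struct
from typing import FrozenSet


def valid(field: bytes, identifiers: FrozenSet[int]) -> int:
    """Return 1 if the 4-octet field encodes a known GD identifier, 0 otherwise."""
    if len(field) != 4:
        return 0
    (value,) = struct.unpack(">I", field)  # assumption: big-endian encoding
    return 1 if value in identifiers else 0


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA-256(k, m): a 32-octet authentication tag."""
    return hmac.new(key, message, hashlib.sha256).digest()
```

Storing the identifiers in a hash-based set makes the membership test run in constant time on average, in line with the constant-time claim made for valid.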

5.2.1 Upstream Algorithm Lifecycle

The algorithms that represent the implementation of the data processing throughout the UDL are
now illustrated and discussed. After gathering the relevant data, the GDs proceed to process it
as described in algorithm 6.

GenIvGd takes as input n ∈ N and generates an n-bit array according to the standard incrementing
function, starting with a hardcoded value for the first initialization vector. That is, whenever a
new message is processed by the GD, the new IV $v_{new}$ is computed as follows:

$$[v_{new}]_2 = [v_{old}]_2 + 1 \tag{5.6}$$

This procedure is done in time O(m), where m is the size of the initialization vector. Since the latter
is always fixed independently of the size of the confidential plaintext given as input to the procedure
PACKGD, the GenIvGd method runs in constant time, i.e., O(1).
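Equation 5.6 can be sketched as an increment over the big-endian integer represented by the IV. The wrap-around to zero once all bits are set, the byte order and the function name next_iv are assumptions of this sketch and are not stated in the text.

```python
def next_iv(v_old: bytes) -> bytes:
    """Return v_old + 1, interpreted as a fixed-width big-endian integer."""
    size = len(v_old)
    # Wrap around modulo 2^(8 * size) so that the width never changes (assumption).
    value = (int.from_bytes(v_old, "big") + 1) % (1 << (8 * size))
    return value.to_bytes(size, "big")
```

For a 128-bit IV the counter only repeats after 2^128 messages, so the uniqueness of the IVs under a given key is preserved for any realistic number of messages.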

PACKGD runs in time O(n) where n is the size of its input, because the methods HMAC-SHA-256 and

CTR-ENC run in time O(n), as well as the concatenation operator.


Algorithm 6 GD’s data packing algorithm

Input: x ▷ Confidential plaintext
Output: Ej1(x) ▷ For some j ∈ I
1: procedure PACKGD(x) ▷ Pack the data into format F1
2: h1 ← HMAC-SHA-256((kGMPH )j , x);
3: h2 ← HMAC-SHA-256((kGMDH )j , x);
4: inner ← h1 ‖ h2 ‖ x;
5: v ← GenIvGd(128);
6: enc ← CTR-ENC^AES_v((kGMPA )j , inner);
7: return fid ‖ mid ‖ v ‖ enc;
8: end procedure

Upon taking as input a confidential plaintext, the algorithm PACKGD transforms it into an element
of S1 using the keys that are stored in the GD's solid memory with restricted access. After the data
has been packed it is sent to the MPP and, upon reception, the algorithm PROCESSMPPGD
(illustrated in Algorithm 10) is triggered. The latter makes a call to each of the procedures described in
algorithms 7 to 9.

Algorithm 7 MPP’s data unpacking algorithm: sender GD

1: procedure UNPACKMPPGD(y) ▷ y ∈ S1
2: t ← Parse(y, F1); ▷ t is a 4-tuple
3: if !CheckMsgId(t[1]) or valid(t[0]) == 0 then
4: return error;
5: else
6: p ← CTR-DEC^AES_{t[2]}((kGMPA )j , t[3]); ▷ The key (kGMPA )j follows from t[0] in time O(1)

7: end if

8: return (t, p);

9: end procedure

The first procedure called by Algorithm 10 is UNPACKMPPGD, which receives as input y ∈ S1 and

returns a 2-tuple (t, p) such that t is the 4-tuple for which each element corresponds to the parsed

principal components of y (represented in step 7 of algorithm 6) and p is the result of the decryption of

the body1 of y.

Parse(x, f) is a function that parses the main components of x according to format f . The first of

the input parameters x is the piece of data to be parsed and f is the message format to be considered,

i.e., f ∈ F , where F = {Fi}i∈I5 ∪ {F∗i }i∈I5 . It returns a tuple such that the jth element of the tuple

corresponds to the jth element according to the format f . The time complexity for the parsing of the

word x is linear in the size of the input, that is O(n) where n is the length of x.
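A minimal sketch of Parse for the format F1 (fid ‖ mid ‖ v ‖ enc, as returned in step 7 of algorithm 6) is given below. The 4-octet identifier and the 128-bit IV follow from the text, whereas the 4-octet width of the message identifier mid is a hypothetical value chosen for illustration, as is the function name parse_f1.

```python
from typing import Tuple

FID_LEN = 4   # 4-octet GD identifier
MID_LEN = 4   # hypothetical width of the message identifier
IV_LEN = 16   # 128-bit initialization vector


def parse_f1(y: bytes) -> Tuple[bytes, bytes, bytes, bytes]:
    """Split y into the 4-tuple (fid, mid, v, enc) by fixed offsets."""
    header = FID_LEN + MID_LEN + IV_LEN
    if len(y) < header:
        raise ValueError("input too short for format F1")
    fid = y[:FID_LEN]
    mid = y[FID_LEN:FID_LEN + MID_LEN]
    v = y[FID_LEN + MID_LEN:header]
    enc = y[header:]  # everything after the header is the encrypted body
    return fid, mid, v, enc
```

The slices copy each octet of the input once, which is consistent with the linear time complexity stated for Parse.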

[1] By 'body' one means the package apart from its header.

CheckMsgId(m) is a function that looks for the value m in a look-up table: if the value is
found it returns false, otherwise it returns true. This table contains all the identifiers of the messages
already received and serves a memory management purpose, especially useful at the GDs for memory
optimization. The time complexity for the search of a value in the table is O(1) on average.

Moreover, valid is also executed in constant time. Therefore, algorithm 7 runs in time O(n).
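The look-up behaviour of CheckMsgId can be sketched with a hash-based set, which gives the O(1) average search time mentioned above. The class name MessageIdTable and the separate record method are hypothetical; the text does not state at which point a received identifier is added to the table.

```python
class MessageIdTable:
    """Table of the identifiers of the messages already received."""

    def __init__(self) -> None:
        self._seen = set()

    def check_msg_id(self, m: bytes) -> bool:
        """Return False if m is already in the table, True otherwise."""
        return m not in self._seen

    def record(self, m: bytes) -> None:
        """Remember m so that a later replay of the same message is rejected."""
        self._seen.add(m)
```

A caller would typically invoke check_msg_id first and record the identifier only after the message has passed all remaining verifications.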

Algorithm 8 Authenticity and integrity verification algorithm

Input: (h, x, k)
Output: true or false ▷ Success or failure, respectively
1: procedure VERIFY(h, x, k) ▷ Verifies whether h is an HMAC of x using key k
2: h∗ ← HMAC-SHA-256(k, x);
3: if h∗ == h then
4: return true;
5: else
6: return false;
7: end if
8: end procedure
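A Python sketch of VERIFY is shown below. It returns True when the recomputed tag matches h, which is what the callers in algorithms 10 and 11 expect. The use of hmac.compare_digest, a comparison that does not leak the position of the first differing octet through timing, is an addition of this sketch and is not claimed to be part of the original implementation.

```python
import hashlib
import hmac


def verify(h: bytes, x: bytes, k: bytes) -> bool:
    """Return True if h is the HMAC-SHA-256 tag of x under key k."""
    expected = hmac.new(k, x, hashlib.sha256).digest()
    return hmac.compare_digest(expected, h)  # constant-time comparison
```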

Upon performing the required decryption and splitting the package into its main components, it is
necessary for the MPP to call VERIFY in order to make the required authenticity and integrity checks.
If this verification succeeds then it can proceed with building the final message Ej4(x), where x is the
confidential plaintext sent by the GD with identifier j ∈ I. To this end, a call to PACKMPPMNDM is made.

Algorithm 9 MPP’s data packing algorithm: recipient MnDM

1: procedure PACKMPPMNDM(z) ▷ Desired recipient: MnDM
2: v ← PrGenIv(128);
3: e ← CTR-ENC^AES_v(kMPMA , z);
4: h ← HMAC-SHA-256(kMPMH , e);

5: return h ‖ v ‖ e;

6: end procedure

PrGenIv(n) is a method that uses the function SHA-1 as a PRNG to construct a unique[2] n-bit
value and its execution time is linear in the size of the input. PACKMPPMNDM calls this method in order
to generate the IV used in the AES-CTR encryption. As one can easily observe, PACKMPPMNDM ∈ O(n),
where n is the size of the input.

Thus, the whole process of unpacking and verifying the data that reaches the MPP and is sent from

a GD is performed in time O(n), where n is the size of the data given as input to PROCESSMPPGD.

[2] Up to a negligible probability of collision.

Algorithm 10 MPP’s data processing algorithm: recipient MnDM

Input: y ▷ y ∈ S1
Output: z ▷ z ∈ S4
1: procedure PROCESSMPPGD(y)
2: (t, p) ← UNPACKMPPGD(y);
3: u ← Parse(p, F∗1 );
4: if VERIFY(u[0], u[2], (kGMPH )j) then
5: r ← t[0] ‖ t[2] ‖ u[0] ‖ u[1] ‖ t[3];
6: return PACKMPPMNDM(r);

7: else

8: return error;

9: end if

10: end procedure

Subsequent to the construction of the package, its delivery to the envisaged recipient takes place.

Upon the arrival of the message, the MnDM executes the steps specified in Algorithm 13 in order to

process and extract the confidential plaintext. This procedure comprises two methods, UNPACKMNDM
and VERIFYMNDM, specified in algorithms 11 and 12, respectively.

Algorithm 11 MnDM’s data unpacking algorithm

Input: y ▷ y ∈ S4
Output: z ▷ z ∈ S∗4
1: procedure UNPACKMNDM(y)
2: u ← Parse(y, F4);
3: if VERIFY(u[0], u[2], kMPMH ) then
4: return CTR-DEC^AES_{u[1]}(kMPMA , u[2]);

5: else

6: return error;

7: end if

8: end procedure

In the UNPACKMNDM method, first the data is unpacked into its principal components (see figure

C.8), then the authenticity and integrity verifications take place and lastly a decryption is performed to

retrieve the plain data within the outer layer of encryption, with format F∗4 (see figure C.7). This procedure

runs in time O(n).
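The interplay between algorithms 9 and 11 (outer layer h ‖ v ‖ e, with the HMAC computed over the ciphertext) can be sketched as a round trip. Python's standard library offers no AES, so this sketch replaces AES-CTR with a keystream built from HMAC-SHA-256 over the IV and a block counter; this substitute, the use of os.urandom in place of PrGenIv and all function names are assumptions made only to keep the example self-contained and must not be read as the actual implementation.

```python
import hashlib
import hmac
import os
from typing import Optional

TAG_LEN = 32  # HMAC-SHA-256 output, in octets
IV_LEN = 16   # 128-bit IV


def _keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR: XOR data with HMAC-SHA-256(key, iv || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, iv + counter, hashlib.sha256).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def pack_mpp_mndm(z: bytes, k_enc: bytes, k_mac: bytes) -> bytes:
    """Build h || v || e as in algorithm 9."""
    v = os.urandom(IV_LEN)                       # stand-in for PrGenIv(128)
    e = _keystream_xor(k_enc, v, z)
    h = hmac.new(k_mac, e, hashlib.sha256).digest()
    return h + v + e


def unpack_mndm(y: bytes, k_enc: bytes, k_mac: bytes) -> Optional[bytes]:
    """Verify the tag, then decrypt, as in algorithm 11; None signals an error."""
    if len(y) < TAG_LEN + IV_LEN:
        return None
    h, v, e = y[:TAG_LEN], y[TAG_LEN:TAG_LEN + IV_LEN], y[TAG_LEN + IV_LEN:]
    expected = hmac.new(k_mac, e, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, h):
        return None
    return _keystream_xor(k_enc, v, e)           # the XOR keystream is its own inverse
```

Note that the tag is checked before any decryption takes place, so a tampered package is discarded without its contents ever being decrypted.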


Algorithm 12 MnDM’s data verification algorithm

Input: y ▷ y ∈ S∗4
Output: t ▷ 3-tuple corresponding to the decrypted part of F∗4
1: procedure VERIFYMNDM(y)
2: u ← Parse(y, F∗4 );
3: d ← CTR-DEC^AES_{u[1]}((kGMPA )j , u[4]);
4: t ← Parse(d, F∗1 );
5: if t[0] ≠ u[2] or t[1] ≠ u[3] or !VERIFY(t[0], t[2], (kGMPH )j) or !VERIFY(t[1], t[2], (kGMDH )j) then

6: return error

7: else

8: return t;

9: end if

10: end procedure

The VERIFYMNDM procedure is expected to be given an input with format F∗4 ; any other input will
result in an error being returned and the input data will be discarded. This function implements the
behaviour presented in steps 23 to 27 of the UDL in section 3.4.2. Moreover, VERIFYMNDM ∈ O(n).

Algorithm 13 MnDM’s data processing algorithm

Input: y ▷ y ∈ S4
Output: x ▷ Confidential plaintext
1: procedure PROCESSMNDM(y)
2: o ← UNPACKMNDM(y);
3: t ← VERIFYMNDM(o);
4: return t[2];

5: end procedure

As the name states, PROCESSMNDM combines algorithms 11 and 12 in order to process the
data that arrives at the mission and data manager entity; it runs in time O(n). Figure 5.3 contains a
high-level visualization of the interaction between the specified algorithms used in the UDL, which has
therefore been named the Upstream Algorithm Lifecycle (UAL).

[Figure 5.3: Upstream Algorithm Lifecycle. PACKGD, followed by PROCESSMPPGD (UNPACKMPPGD,
VERIFY, PACKMPPMNDM), followed by PROCESSMNDM (UNPACKMNDM, VERIFYMNDM).]


5.2.2 Downstream Algorithm Lifecycle

In this section the algorithms that represent the implementation of the data processing throughout

the DDL are described.

The MnDM generates a command message, processes it according to algorithm 14 and sends it to the
envisaged recipient (MPP).

Algorithm 14 MnDM’s data packing algorithm

Input: x ▷ Confidential plaintext
Output: y ▷ y ∈ S5
1: procedure PACKMNDM(x)
2: h1 ← HMAC-SHA-256((kGMDH )j , x);
3: inner ← h1 ‖ x;
4: v1 ← PrGenIv(128);
5: e ← CTR-ENC^AES_{v1}((kGMPA )j , inner);
6: h2 ← HMAC-SHA-256(kMPMH , e);
7: v2 ← PrGenIv(128);
8: outer ← h2 ‖ fid ‖ v1 ‖ e;
9: r ← CTR-ENC^AES_{v2}(kMPMA , outer);

10: return v2 ‖ r;

11: end procedure

PACKMNDM treats the input x as the confidential plaintext and builds Ej5(x). This algorithm runs in
time O(n), where n is the size of x, due to the time complexity of all the methods involved, which are
described in section 5.2.1.

Algorithm 15 MPP’s data unpacking algorithm: sender MnDM

Input: y ▷ y ∈ S5
Output: p or error ▷ p ∈ S∗∗5
1: procedure UNPACKMPPMNDM(y)
2: t ← Parse(y, F5);
3: d ← CTR-DEC^AES_{t[0]}(kMPMA , t[1]);
4: f ← Parse(d, F∗5 );
5: if !VERIFY(f [0], f [3], kMPMH ) or valid(f [1]) == 0 then
6: return error;
7: else
8: return CTR-DEC^AES_{f[2]}((kGMPA )j , f [3]);

9: end if

10: end procedure

Upon receiving the data, the MPP will make a call to the first procedure represented in algorithm
17. PROCESSMPPMNDM, which runs in time O(n), calls two methods (presented in algorithms 15 and 16),
both with time complexity O(n), where n is the size of the input. Algorithm 15 contains the pseudocode
associated with the steps required for unpacking the data with format F5 into the plaintext with format
F∗∗5 and algorithm 16 contains the packing method for the MPP, which takes as input a 2-tuple and packs
the first entry of that tuple into format F2 or F3, depending on the second element of the input.

Algorithm 16 MPP's data packing algorithm: recipient GD
Input: (x, flag)
Output: y ▷ y ∈ S2 or y ∈ S3, depending on flag
1: procedure PACKMPPGD(x, flag)
2: if flag == 1 then
3: t ← Parse(x, F∗∗5 );
4: h1 ← HMAC-SHA-256((kGMPH )j , t[1]);
5: else
6: h1 ← HMAC-SHA-256((kGMPH )j , x);
7: end if
8: inner ← flag ‖ h1 ‖ x;
9: v ← PrGenIv(128);
10: e ← CTR-ENC^AES_v((kGMPA )j , inner);
11: mid++; ▷ Global value
12: return mid ‖ v ‖ e;

13: end procedure

However, it could also happen that the MPP generates the command message x itself, instead of
acting solely as a communication bridge between the GD and MnDM. In this situation, the procedure
PROCESSMPP described in algorithm 17 is called; it runs in time O(n) and returns a package ciphertext
belonging to the set S2.

Algorithm 17 MPP’s data processing algorithm

Input: x

Output: y ▷ y ∈ S2 or y ∈ S3, depending on the procedure called
1: procedure PROCESSMPPMNDM(x) ▷ Sender MnDM
2: p ← UNPACKMPPMNDM(x);
3: return PACKMPPGD(p, 1); ▷ y ∈ S3
4: end procedure
1: procedure PROCESSMPP(x) ▷ MPP generates the confidential plaintext x
2: return PACKMPPGD(x, 0); ▷ y ∈ S2

3: end procedure

Regardless of the method used by the MPP to process the data, it will send it to the envisaged GD
and, upon reaching the recipient GDi, the procedure described in algorithm 20 is triggered. PROCESSGD
runs in time O(n), where n is the size of the input data that was received through the asynchronous
communication channel established between the MPP and the GD. This method calls the procedures
described in algorithms 18 and 19.

Algorithm 18 GD’s data unpacking algorithm

Input: y

Output: d or error ▷ d ∈ S∗2 or d ∈ S∗3 , depending on y
1: procedure UNPACKGD(y)
2: t ← Parse(y, F2); ▷ Parsing with format F3 would have the same effect
3: if !CheckMsgId(t[0]) then
4: return error;
5: end if
6: return CTR-DEC^AES_{t[1]}((kGMPA )j , t[2]);

7: end procedure

UNPACKGD parses the input with format F2 into its main components, decrypts the ciphered compo-

nent and returns the obtained plaintext; it runs in time O(n). Note that parsing for F3 achieves the same

result since these two formats only differ in the inner format layers F∗2 and F∗3 .

Algorithm 19 GD’s verification algorithm

Input: d

Output: x or error ▷ Confidential plaintext
1: procedure VERIFYGD(d)
2: if msb(d) == 1 then
3: t ← Parse(d, F∗3 );
4: if VERIFY(t[1], t[3], (kGMPH )j) and VERIFY(t[2], t[3], (kGMDH )j) then
5: return t[3];
6: end if
7: else
8: t ← Parse(d, F∗2 );
9: if VERIFY(t[1], t[2], (kGMPH )j) then
10: return t[2];
11: end if
12: end if
13: return error;
14: end procedure

After the ciphered contents are revealed, the system proceeds to the required verification. To this
end, the procedure VERIFYGD described in algorithm 19 is called; it runs in time O(n).
msb(d) returns the most significant bit of the word d and is performed in constant time (O(1)) because
it only needs to extract the first bit in memory from the required field.
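The msb operation can be sketched as a single shift on the first octet of the word. Treating the first octet as the most significant one (big-endian layout) and the function name msb applied to a bytes object are assumptions of this sketch.

```python
def msb(d: bytes) -> int:
    """Return the most significant bit of the first octet of d (0 or 1)."""
    if not d:
        raise ValueError("empty input has no most significant bit")
    return (d[0] >> 7) & 1
```

Only the first octet is inspected, regardless of the length of d, which is why the operation takes constant time.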


Algorithm 20 Gathering device's data processing algorithm
Input: y
Output: x or error ▷ Confidential plaintext

1: procedure PROCESSGD(y)

2: d← UNPACKGD(y);

3: return VERIFYGD(d);

4: end procedure

Figure 5.4 contains a high-level visualization of the interaction between the specified algorithms used
in the DDL, which has therefore been named the Downstream Algorithm Lifecycle (DAL).

[Figure 5.4: Downstream Algorithm Lifecycle. PACKMNDM, followed by PROCESSMPPMNDM
(UNPACKMPPMNDM, PACKMPPGD), followed by PROCESSGD (UNPACKGD, VERIFYGD).]


Chapter 6

Results

With this work one can easily observe that what is better in theory may not always be more suitable

for the specific practical case at hand, where the real constraints must be thoroughly taken into account.

The study, decisions and analysis of this specific network were performed under the supervision of
analysts and developers of the company GMVIS Skysoft, S.A.

The key generation process is very reliable in the sense that it is not only performed within secured
headquarters but is also very efficient regarding security and time. More specifically, it is a
linear-time process with respect to the number of gathering devices that are to be deployed.

The selected network topology is considered to be the one that best suits the practical needs of
the mission, whilst in theory an ad-hoc network might have a better performance when combined with
elliptic curves [45].

The set of chosen packing schemes is considered to be a robust and secure option for the case,
but it would achieve a higher level of security by adopting an encrypt-then-MAC method of encryption
and authentication, in addition to including the header in the input to the HMAC; this approach would
make the system IND-CCA secure. However, even though it would strengthen the theoretical
level of security, it would have no impact in practice because the variable size of the plaintexts induces an
inexorable fragility. As for the encryption scheme, from a theoretical point of view AES-GCM should be
preferred over AES-CTR-HMAC in order to grant authenticity, integrity and privacy to the plaintexts.
AES-GCM was ruled out simply because it is not implemented in the hardware of this particular type of
devices. Had any other devices with distinct characteristics been chosen, the outcome would
certainly differ from the one presented. All packing schemes are vulnerable to chosen-ciphertext attacks,
which is a fact of some concern because an attacker with access to a decryption oracle might be able to
break the system, even if just partially. There is virtually no way of preventing an adversary from performing
a lunchtime attack [40] when the devices are in sleep mode.

All the data processing methods are linear in the size of the input. Note that the size of each of the
inputs to the data processing algorithms described in section 5.2 depends on the size of the
confidential plaintext. It is indeed the only dependency, since the size of the confidential plaintext is the only
variable term when computing the size of each of the message formats. Given the GDs' memory
limitation, the size of the confidential plaintexts generated by these elements has an upper bound according
to equation 3.13, which implies that the running time of the previously mentioned data processing
algorithms is also upper bounded due to this constraint, for their time complexity is O(n). Thus, in practice,
these algorithms are time-efficient.

6.1 Future Work

One possible improvement to the scale of the given network would be to allow the parallel activity
of more than one MPP. This would require more keys to be generated, not only for privacy and integrity
purposes on the data, but also so that all the MPPs are uniquely recognizable by the network parties (that
is, to provide an authentication mechanism).

Another subject with good prospects is the hardware improvement of the devices such that their

capabilities allow more efficient and secure packing schemes. By efficient one means both in terms of

time and space complexity. A good example is to implement in hardware a randomized primitive that

makes use of analogue entropy sources in order to obtain fairly randomized values. This feature would

be extremely useful for the IV scheduler within the GDs and, in the event of increasing the devices’

battery lifetime, it would also be very fruitful for the development and maintenance of a key scheduler

algorithm.

In addition, a potential improvement would be to implement in hardware standardized authenticated
modes of operation such as GCM. Adopting AES-GCM instead of AES-CTR-HMAC would optimize
the system's memory usage and therefore allow the GDs to store more messages as well as
shorten the GDs' sleep mode time-frame.


References

[1] G. Bertrand. Enigma: ou, La plus grande enigme de la guerre 1939-1945. Plon, 1973.

[2] Bellare, Mihir and Rogaway, Phillip. Course Notes: Introduction to Modern Cryptography. University

of California, San Diego.

[3] Yodai Watanabe, Junji Shikata, and Hideki Imai. Equivalence between Semantic Security and

Indistinguishability against Chosen Ciphertext Attacks. RIKEN Brain Science Institute, 2003.

[4] Claude Shannon. https://en.wikipedia.org/wiki/Claude_Shannon.

[5] Shannon, Claude. Communication Theory of Secrecy Systems. 1949.

[6] Matsui, Mitsuru. Linear Cryptanalysis Method for DES Cipher. Computer and Information Systems

Laboratory.

[7] Douglas Stinson. Cryptography: Theory and Practice, Third Edition. CRC/C&H, 3rd edition, 2005.

[8] National Institute of Standards and Technology. FIPS PUB 46-3: Data Encryption Standard (DES).

National Institute of Standards and Technology, Gaithersburg, MD, USA, October 1999. Super-

sedes FIPS PUB 46-2 1993 December 30.

[9] Michael Luby and Charles Rackoff. How to construct pseudorandom permutations from pseudo-

random functions. SIAM Journal on Computing, 17(2):373–386, 1988.

[10] Rijmen, Vincent Daemen, Joan. AES Proposal: Rijndael. April 2003.

[11] Douglas Stinson. Substitution-permutation networks. In Cryptography: Theory and Practice, Third
Edition, pages 74–79. CRC/C&H, 2005.

[12] National Institute of Standards and Technology. FIPS PUB 197: Advanced Encryption Standard

(AES). National Institute of Standards and Technology, Gaithersburg, MD, USA, November 2001.

[13] M.J.B. Robshaw. Stream ciphers. Technical report, RSA Data Security, Inc.
ftp://ftp.rsasecurity.com/pub/pdfs/tr701.pdf.

[14] National Institute of Standards and Technology. Recommendation for Block Cipher Modes of Op-

eration. National Institute of Standards and Technology, Gaithersburg, MD, USA, 2001.


[15] Dworkin, Morris. Recommendation for Block Cipher Modes of Operation: The CCM Mode for

Authentication and Confidentiality. National Institute of Standards and Technology, Gaithersburg,

MD, USA, May 2014.

[16] Dworkin, Morris. Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode

(GCM) and GMAC. National Institute of Standards and Technology, Gaithersburg, MD, USA,

November 2007.

[17] Phillip Rogaway, Mark Wooding, and Haibin Zhang. The Security of Ciphertext Stealing, pages

180–195. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.

[18] R. Housley. Cryptographic Message Syntax (CMS). STD 70, RFC Editor, September 2009. http://www.rfc-editor.org/rfc/rfc5652.txt.

[19] National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. National

Institute of Standards and Technology, Gaithersburg, MD, USA, April 1995. Supersedes FIPS PUB

180-3 2012 March 6.

[20] Stevens, Marc and Bursztein, Elie and Albertini, Ange and Markov, Yaric. The first collision for full

SHA-1. 2017.

[21] Hugo Krawczyk, Mihir Bellare, and Ran Canetti. HMAC: Keyed-hashing for message authentication.

RFC 2104, RFC Editor, February 1997. http://www.rfc-editor.org/rfc/rfc2104.txt.

[22] Andrew Chi-Chih Yao. Theory and applications of trapdoor functions. In 23rd IEEE Symposium on

Foundations of Computer Science, 1982.

[23] Oded Goldreich. Pseudorandom functions. In Foundations of Cryptography: Volume 1, pages

106–113, New York, NY, USA, 2006. Cambridge University Press.

[24] B. Kaliski. PKCS #5: Password-Based Cryptography Specification Version 2.0. RFC 2898, RFC

Editor, September 2000. http://www.rfc-editor.org/rfc/rfc2898.txt.

[25] B. Kaliski. PBKDF2. In PKCS #5: Password-Based Cryptography Specification Version 2.0, pages

9–11. RFC Editor, 2000.

[26] Ertaul, Levent and Kaur, Manpreet and Gudise, V. A. K. R. Implementation and Performance Analysis of PBKDF2, Bcrypt, Scrypt Algorithms. http://www.mcs.csueastbay.edu/~lertaul/PBKDFBCRYPTCAMREADYICWN16.pdf.

[27] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications.

IEEE Std. 802.11i-2004, August 1999.

[28] National Institute of Standards and Technology. NIST Special Publication 800-48: Guide to Securing Legacy

IEEE 802.11 Wireless Networks. National Institute of Standards and Technology, Gaithersburg,

MD, USA, July 2008.


[29] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications

Amendment 6: Medium Access Control (MAC) Security Enhancements. IEEE Std. 802.11i-2004,

July 2004.

[30] Tews, Erik and Beck, Martin. Practical attacks against WEP and WPA. November 2008.

[31] Kohno, Tadayoshi and Bellare, Mihir. Hash Function Balance and its Impact on Birthday Attacks. May 2004.

[32] Burt Kaliski. PKCS #7: Cryptographic message syntax version 1.5. RFC 2315, RFC Editor, March

1998. http://www.rfc-editor.org/rfc/rfc2315.txt.

[33] Lars Knudsen and David Wagner. Integral Cryptanalysis, pages 112–127. Springer Berlin Heidelberg, Berlin, Heidelberg, 2002.

[34] Andrey Bogdanov, Dmitry Khovratovich, and Christian Rechberger. Biclique Cryptanalysis of the

Full AES. August 2011.

[35] A. C, R. P. Giri, and B. Menezes. Highly efficient algorithms for AES key retrieval in cache access attacks. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pages 261–275, March 2016.

[36] Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent Systems: Specification.

Springer New York, 2012.

[37] Jon Postel. Transmission control protocol. STD 7, RFC Editor, September 1981. http://www.rfc-editor.org/rfc/rfc793.txt.

[38] Bellare, Mihir. New Proofs for NMAC and HMAC: Security without Collision-Resistance. June 2006.

[39] A.C.A. Nascimento and P. Barreto. Information Theoretic Security: 9th International Conference,

ICITS 2016, Tacoma, WA, USA, August 9-12, 2016, Revised Selected Papers. Lecture Notes in

Computer Science. Springer International Publishing, 2016.

[40] P. Rogaway, D. Pointcheval, A. Desai, and M. Bellare. Relations Among Notions of Security for

Public-Key Encryption Schemes. June 2001.

[41] Douglas Cook. Measuring memory protection. In Proceedings of the 3rd International Conference

on Software Engineering, ICSE ’78, pages 281–287, Piscataway, NJ, USA, 1978. IEEE Press.

[42] Kristine Amari. Techniques and Tools for Recovering and Analyzing Data from Volatile Memory.

March 2009.

[43] Grand, Joe. Practical Secure Hardware Design for Embedded Systems. In Proceedings of the

2004 Embedded Systems Conference. CMP Media, April 2004.

[44] Krawczyk, Hugo and Bellare, Mihir and Canetti, Ran. RFC 2104: HMAC: Keyed-Hashing for Message Authentication. 1997.

[45] Douglas Stinson. Elliptic curves. In Cryptography: Theory and Practice, Third Edition, pages 254–266. CRC/C&H, 2005.

[46] The Java Tutorials. https://docs.oracle.com/javase/tutorial, 1995.

[47] The ASCII Character Set. http://ee.hawaii.edu/~tep/EE160/Book/chap4/subsection2.1.1.1.html, August 1994.


Appendix A

Schemes of Block Cipher Modes of Operation

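(a) ECB encryption: the input blocks p1, p2, ..., pn of the plaintext P are each encrypted with E^B_k, giving the output blocks c1, c2, ..., cn of the ciphertext C.

(b) ECB decryption: the input blocks c1, c2, ..., cn of the ciphertext C are each decrypted with D^B_k, giving the output blocks p1, p2, ..., pn of the plaintext P.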

Figure A.1: ECB mode encryption and decryption procedures using an arbitrary block cipher B.
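
The block-by-block independence of ECB can be observed with a few lines of Java. The sketch below is only an illustration under stated assumptions: the class name EcbDemo, the all-zero demo key and the choice of AES as the block cipher B are hypothetical and are not taken from the implementation described in this thesis.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.Arrays;

final class EcbDemo {
    static final int BLOCK = 16; // AES block size in bytes

    /** Encrypts whole blocks with AES in ECB mode (no padding). */
    static byte[] ecbEncrypt(byte[] key, byte[] plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
            return cipher.doFinal(plaintext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true when ciphertext blocks i and j (0-based) are identical. */
    static boolean sameBlock(byte[] c, int i, int j) {
        return Arrays.equals(
            Arrays.copyOfRange(c, i * BLOCK, (i + 1) * BLOCK),
            Arrays.copyOfRange(c, j * BLOCK, (j + 1) * BLOCK));
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];      // all-zero demo key, not for real use
        byte[] p = new byte[3 * BLOCK]; // p1 = p2 = all-zero block
        p[2 * BLOCK] = 1;               // p3 differs from p1 in one byte
        byte[] c = ecbEncrypt(key, p);
        System.out.println("c1 == c2: " + sameBlock(c, 0, 1)); // true
        System.out.println("c1 == c3: " + sameBlock(c, 0, 2)); // false
    }
}
```

Equal plaintext blocks yield equal ciphertext blocks, which is the structural leak that the chained and counter-based modes in the following figures avoid.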


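(a) CBC encryption: starting from the IV, the plaintext blocks p1, p2, ..., pn of P pass through a chain of E^B_k applications that outputs the ciphertext blocks c1, c2, ..., cn of C.

(b) CBC decryption: the ciphertext blocks c1, c2, ..., cn of C pass through D^B_k and, together with the IV, give back the plaintext blocks p1, p2, ..., pn of P.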

Figure A.2: CBC mode encryption and decryption procedures using an arbitrary block cipher B.
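
For reference, the chaining depicted in Figure A.2 corresponds to the standard CBC equations, written here with the symbols of the figure (c_0 is an auxiliary name introduced only for this summary):

```latex
\begin{align*}
c_0 &= IV, \\
c_i &= E^B_k(p_i \oplus c_{i-1}), \qquad i = 1, \dots, n, \\
p_i &= D^B_k(c_i) \oplus c_{i-1}, \qquad i = 1, \dots, n.
\end{align*}
```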


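(a) CFB encryption: E^B_k, started from the IV, produces the blocks k1, k2, ..., kn that turn the plaintext blocks p1, p2, ..., pn into the ciphertext blocks c1, c2, ..., cn.

(b) CFB decryption: the same function E^B_k, started from the IV, recovers the plaintext blocks p1, p2, ..., pn of P from the ciphertext blocks c1, c2, ..., cn of C.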

Figure A.3: CFB mode encryption and decryption procedures using an arbitrary block cipher B.
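
Assuming full-block feedback (an assumption, since the segment size is not visible in the figure), the quantities k_1, ..., k_n of Figure A.3 can be summarized as follows:

```latex
\begin{align*}
k_1 &= E^B_k(IV), \qquad k_i = E^B_k(c_{i-1}) \quad (i = 2, \dots, n), \\
c_i &= p_i \oplus k_i, \\
p_i &= c_i \oplus k_i.
\end{align*}
```

Both directions use only the encryption function E^B_k, which is why no D^B_k box appears in the decryption panel.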


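(a) CTR encryption: the counter blocks t1 (labelled IV), t2, ..., tn are each encrypted with E^B_k and combined with the plaintext blocks p1, p2, ..., pn to give the ciphertext blocks c1, c2, ..., cn.

(b) CTR decryption: the counter blocks t1 (labelled IV), t2, ..., tn are each encrypted with E^B_k and combined with the ciphertext blocks c1, c2, ..., cn to give back the plaintext blocks p1, p2, ..., pn.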

Figure A.4: CTR mode encryption and decryption procedures using an arbitrary block cipher B.
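
The symmetry between the two panels of Figure A.4 can be checked with the standard Java cryptography API. This is a minimal sketch, not the thesis implementation: the class name CtrDemo and the all-zero key and IV are hypothetical demo values, and AES stands in for the arbitrary block cipher B.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

final class CtrDemo {
    /** Applies AES-CTR under (key, iv); the same call encrypts and decrypts. */
    static byte[] ctr(byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE,
                        new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // demo values only; never reuse a (key, IV) pair
        byte[] iv = new byte[16];
        byte[] p = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] c = ctr(key, iv, p);    // encryption
        byte[] back = ctr(key, iv, c); // decryption is the same operation
        System.out.println(c.length);                                  // 5
        System.out.println(new String(back, StandardCharsets.UTF_8)); // hello
    }
}
```

The ciphertext has the same length as the plaintext because CTR turns the block cipher into a stream cipher, so no padding is required.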


Appendix B

User Manual: Key Generation Application

An application was developed to give the reader a close look at how the keys are generated in practice. KeyGeneratorApp is the executable JAR file [46] that contains the mock-up program for key generation. The application can upload the keys either directly into a serial-connected device or into an encrypted file with a predefined data format. The first case is outside the scope of this text, so only the second is discussed, step by step.

When KeyGeneratorApp is run, a window similar to Figure B.1a appears, in which the user fills in the required fields. Figure B.1b shows a suggested set of values, which is used from here on, since these fields determine the keystream that is produced. Upon clicking the “Generate Keys” button, the program generates the whole keystream and keeps it in volatile memory while waiting for the next command. A pop-up window similar to Figure B.1c should then appear; clicking “Yes” lets the program proceed.


Figure B.1: KeyGeneratorApp’s initial screen.

The interface should now change to the screen shown in Figure B.2a, which offers several options. The “Generate another key set” button goes back to the key generation step, so that the current keystream is overwritten by a new one based on new inputs chosen by the user; this option is advised if the user wants to change any of the previous inputs. The same figure shows three options for the destination target, i.e., where the keys are uploaded. The first two depend on a serial-connected device and, as mentioned above, are not discussed in this section, so the only remaining option in this situation is “Encrypted File”. Figure B.2b details the chosen sequence: the keys held in the program’s memory for the GD with identifier did = 6 are going to be exported to a file.


Figure B.2: KeyGeneratorApp’s target choice screen.

Upon clicking “Next”, a window similar to Figure B.3a should appear. The user can now choose the name and path in the file system of the (encrypted) file that will hold the keys, together with the password. This password is the input of the password-based key derivation function, whose output is the key used by the AES cipher in the CTR mode of operation. The length n of the password must satisfy 8 ≤ n ≤ 63. The values shown in the figure are merely illustrative; the same values can nevertheless be reused, apart from the file path, which must be chosen according to the user’s local file system. After the fields are filled in, the “Generate File” button creates the .enc file in the desired location, containing the intended keys encrypted under the chosen password.
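
The password-to-file step described above can be sketched as follows. This is a hedged illustration rather than the code of KeyGeneratorApp: the text only fixes a password-based key derivation function, AES in CTR mode and the bound 8 ≤ n ≤ 63, so the class name KeyFileSketch, the choice of PBKDF2 with HMAC-SHA-256, the iteration count, the salt and key sizes and the salt-IV-ciphertext layout are all assumptions made for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

final class KeyFileSketch {
    static final int SALT_LEN = 16;       // assumed salt size in bytes
    static final int IV_LEN = 16;         // AES block size in bytes
    static final int ITERATIONS = 10_000; // assumed PBKDF2 iteration count
    static final int KEY_BITS = 128;      // assumed AES key size

    /** Enforces the bound 8 <= n <= 63 stated in the text. */
    static void checkPassword(char[] password) {
        if (password.length < 8 || password.length > 63) {
            throw new IllegalArgumentException("password length must satisfy 8 <= n <= 63");
        }
    }

    /** Derives the AES key from the password (PBKDF2 is an assumption). */
    static SecretKeySpec deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                     .generateSecret(spec).getEncoded();
        return new SecretKeySpec(raw, "AES");
    }

    /** Returns salt || IV || AES-CTR(keys); this layout is hypothetical. */
    static byte[] encrypt(char[] password, byte[] keys) {
        checkPassword(password);
        try {
            SecureRandom rng = new SecureRandom();
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            rng.nextBytes(salt);
            rng.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt), new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(keys);
            byte[] out = new byte[SALT_LEN + IV_LEN + body.length];
            System.arraycopy(salt, 0, out, 0, SALT_LEN);
            System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
            System.arraycopy(body, 0, out, SALT_LEN + IV_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(): reads salt and IV, re-derives the key, decrypts. */
    static byte[] decrypt(char[] password, byte[] file) {
        checkPassword(password);
        try {
            byte[] salt = Arrays.copyOfRange(file, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(file, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt), new IvParameterSpec(iv));
            return cipher.doFinal(file, SALT_LEN + IV_LEN, file.length - SALT_LEN - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] keys = {1, 2, 3, 4, 5}; // placeholder for the exported keys
        byte[] file = encrypt("fileEncPass".toCharArray(), keys);
        System.out.println(Arrays.equals(keys, decrypt("fileEncPass".toCharArray(), file))); // true
    }
}
```

Writing the returned byte array to the chosen path would give a file analogous to GD6Keys.enc; decrypting with a different password yields unrelated bytes rather than an error, since CTR mode alone carries no integrity check.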


Figure B.3: KeyGeneratorApp’s file details.

Once all these steps are completed, the “Finish” button is enabled; clicking it brings up the screen depicted in Figure B.4. Here, the user has several options:

• Communication Application Test: starts the communication mock-up application for the message interaction between the GD and the MPP. This application is outside the scope of this text, since running it requires additional resources that only the developers possess;

• Key Checker Application Test: decrypts the keys stored in a previously encrypted file in the file system and exports them into a file with extension .txt;

• Export to another target: goes back to the selection of the destination target for the current key set (Figure B.2);

• Generate another key set: resets the program by clearing the memory associated with the current keystream and goes back to the initial screen (Figure B.1);

• Exit: safely exits the program.


Figure B.4: KeyGeneratorApp’s key export final step.

When “Key Checker Application Test” is selected as the next step, a window similar to Figure B.5a appears. In the upper right corner, the “Show Help” button drops down a description of the behaviour of the program. The user can now choose one of the previously created files in the file system and fill the password field with the password that was used to encrypt that file. The type of file being decrypted must also be chosen, since the program needs it to parse the file’s contents. The parameters chosen throughout this guide are the following:

• WLAN SSID: MISSION2801WINET;

• WLAN password: secretpassword;

• Seed: myrandomseed;

• Number of GD: 73;

• Target file type: GD;

• ID: 6;

• File name: GD6Keys.enc;

• File path: C:\Users\Ricardo\Desktop;

• Password for encrypting the file: fileEncPass.


Figure B.5: KeyGeneratorApp’s key checker example screen.


Figure B.6: Secret information revealed at the pre-deployment stage.

The options chosen for this case should therefore match Figure B.5a. The file with extension .txt is created in the same location of the file system as the file with extension .enc. Figure B.6 illustrates the contents of the GD6Keys.txt file for the parameters listed above; the reader can open it with any tool, and in this case the source code editor Notepad++ was used. Each entry corresponds to an element of the keystream and is represented by an array of byte values, that is, each element a of the array satisfies a ∈ Z_256, according to the ASCII character set [47].
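
Since Java's byte type is signed, printing a key as elements of Z_256 requires masking each byte. The snippet below is a small sketch of such a conversion; the class name ByteArrayFormat and the bracketed output layout are hypothetical and may differ from the layout of the actual .txt file.

```java
final class ByteArrayFormat {
    /** Renders each byte as its unsigned value in the range 0..255. */
    static String toUnsignedList(byte[] key) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < key.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(key[i] & 0xFF); // mask removes the sign extension
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0, 65, (byte) 0xFF};
        System.out.println(toUnsignedList(sample)); // [0, 65, 255]
    }
}
```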


Appendix C

Message Formats

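Layout of F∗1, in rows of 64 bits (bits 0 to 63), from top to bottom:

• h1(D): 256-bit HMAC, computed with HMAC-SHA-256 with key (kFHH)i;

• h2(D): 256-bit HMAC, computed with HMAC-SHA-256 with key (kFMH)i;

• D: confidential plaintext².

The two HMAC fields are labelled as the header.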

Figure C.1: Message format F∗1
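
A message with the field order h1(D), h2(D), D can be assembled as sketched below. This is an illustrative sketch under assumptions: the class name FormatF1Star, the method names and the placeholder keys are hypothetical, and plain concatenation of the fields is assumed as the wire layout.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;

final class FormatF1Star {
    static final int TAG_LEN = 32; // HMAC-SHA-256 output size in bytes

    /** Computes HMAC-SHA-256 of data under key. */
    static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds h1(D) || h2(D) || D; key1 and key2 stand for the two HMAC keys. */
    static byte[] build(byte[] key1, byte[] key2, byte[] d) {
        return ByteBuffer.allocate(2 * TAG_LEN + d.length)
                         .put(hmac(key1, d))
                         .put(hmac(key2, d))
                         .put(d)
                         .array();
    }

    /** Checks the tag stored at tagOffset (0 for h1, 32 for h2) against D. */
    static boolean verify(byte[] key, int tagOffset, byte[] message) {
        byte[] tag = Arrays.copyOfRange(message, tagOffset, tagOffset + TAG_LEN);
        byte[] d = Arrays.copyOfRange(message, 2 * TAG_LEN, message.length);
        return MessageDigest.isEqual(tag, hmac(key, d)); // constant-time comparison
    }

    public static void main(String[] args) {
        byte[] key1 = new byte[32]; // placeholder keys, for illustration only
        byte[] key2 = new byte[32];
        key2[0] = 1;
        byte[] message = build(key1, key2, new byte[] {1, 2, 3});
        System.out.println(message.length);            // 67
        System.out.println(verify(key1, 0, message));  // true
        System.out.println(verify(key2, 32, message)); // true
    }
}
```

Carrying two tags under different keys lets two different key holders each verify D independently, which is consistent with the two distinct keys named in the figure.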

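Layout of F1, in rows of 64 bits (bits 0 to 63), from top to bottom:

• Header: fid (bits 0 to 7), mid (bits 8 to 39) and the 128-bit initialization vector IV;

• Data with format F∗1, AES-CTR encrypted with key (kFHA)i and IV.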

Figure C.2: Message format F1

The gray field in Figure C.2 indicates that no element occupies that position; it is drawn this way to make the fields easier to visualize.
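
The packing of a message of format F1 can be sketched as follows, reading the bit ruler of Figure C.2 as an 8-bit fid followed by a 32-bit mid. Several points are assumptions made only for this example: the class name FormatF1, big-endian byte order, and the interpretation that the gray region is simply not transmitted, so that the header occupies 1 + 4 + 16 = 21 bytes.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

final class FormatF1 {
    static final int HEADER_LEN = 1 + 4 + 16; // fid + mid + IV (assumed packing)

    /** Runs AES-CTR in the given mode over data. */
    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds fid || mid || IV || AES-CTR(inner), where inner has format F*1. */
    static byte[] build(byte fid, int mid, byte[] iv, byte[] key, byte[] inner) {
        byte[] body = aesCtr(Cipher.ENCRYPT_MODE, key, iv, inner);
        return ByteBuffer.allocate(HEADER_LEN + body.length)
                         .put(fid)
                         .putInt(mid) // big-endian by default (assumption)
                         .put(iv)
                         .put(body)
                         .array();
    }

    /** Parses the header and returns the decrypted inner data. */
    static byte[] open(byte[] key, byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        buf.get();    // skip fid
        buf.getInt(); // skip mid
        byte[] iv = new byte[16];
        buf.get(iv);
        byte[] body = new byte[buf.remaining()];
        buf.get(body);
        return aesCtr(Cipher.DECRYPT_MODE, key, iv, body);
    }
}
```

A receiver would first read fid to select the format, then use the IV from the header with its copy of the key to recover the inner data and check the HMAC tags it contains.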

² Length may be variable.


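[Figure: message format, bit ruler 0, 1, 31. Fields, top to bottom: f; h1(D), HMAC-SHA-256 with key (kFHH)i; D, confidential plaintext.]

[Figure: message format, bit ruler 0, 31. Fields, top to bottom: mid; 128-bit IV. Labels: Header; AES-CTR encrypted with key (kFHA)j.]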


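[Figure: message format, bit ruler 0, 1, 31. Fields, top to bottom: f; h1(D), HMAC-SHA-256 with key (kFHH)i; h2(D), HMAC-SHA-256 with key (kFMH)i; D, confidential plaintext.]

[Figure: message format, bit ruler 0, 31. Fields, top to bottom: mid; 128-bit initialization vector IV. Labels: Header; AES-CTR encrypted with key (kFHA)j and IV.]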


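[Figure: message format, bit ruler 0, 7, 8, 63. Fields, top to bottom: fid; 128-bit initialization vector IV; h1(D), HMAC-SHA-256 with key (kFHH)i; h2(D), HMAC-SHA-256 with key (kFMH)i. Labels: Header; AES-CTR encrypted with key (kFHA)i and IV.]

[Figure: message format, bit ruler 0, 63. Fields, top to bottom: h3(enc pack), HMAC-SHA-256 with key kHMH; enc pack. Labels: Header; AES-CTR encrypted with key kHMA and IV.]

Layout of F∗∗5, in rows of 64 bits (bits 0 to 63), from top to bottom: h2(D), HMAC-SHA-256 with key (kFMH)i; D, confidential plaintext.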

Figure C.9: Message format F∗∗5


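[Figure: message format, bit ruler 0, 7, 8, 63. Fields, top to bottom: h3(enc pack); fid; enc pack, consisting of data with format F∗∗5, AES-CTR encrypted with key (kFHA)i and IV. Label: Header.]

[Figure: message format, bit ruler 0, 63. Labels: Header; AES-CTR encrypted with key kHMA and IV.]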
