Cryptography on a Customized Network
Ricardo Martinho Ferreira Miranda
Thesis to obtain the Master of Science Degree in
Mathematics and Applications
Examination Committee
Chairperson: Prof. Maria Cristina Sales Viana Serodio Sernadas
Supervisor: Prof. Paulo Alexandre Carreira Mateus
Co-supervisor: Bruno Neto de Oliveira Tavares
Member of the Committee: Prof. Andre Nuno Carvalho Souto
November 2017
Acknowledgements
I want to thank both my supervisors, Paulo Mateus and Bruno Tavares, for all their support and guidance.
I would also like to thank my dearest friend Sofia Brito, whose counseling and motivation were
crucial in overcoming the most difficult moments.
Resumo
Building a secure network for use in real-world applications, where restrictions are imposed
on the capabilities of the network's elements and on the transfer of information, requires a customized
cryptographic analysis in order to protect the communications and to detect and minimize the
vulnerabilities of the system that could be exploited. In this document a network under such conditions
is presented; an optimal topological scheme is sought before the cryptographic components embedded
in the network's communications and storage are chosen, and their security is subsequently analyzed.
Among the alternatives scrutinized, only one is chosen as the solution, by comparison in terms of
performance, security and adaptation to the imposed restrictions. This solution is implemented using
the C and Java programming languages. The chosen encryption schemes and protocols are shown to be
highly adequate options and their use in practice is advised. These results are valid only for this
specific case study, since if any of the restrictions were changed it is likely that a different,
more appropriate solution than the one suggested would exist.
Keywords: ciphertext indistinguishability; block cipher mode of operation; semantic security;
symmetric encryption system.
Abstract
Building a secure network for real-world applications, where strict constraints are imposed on the
capabilities of the network's elements and on the data flow, requires a customized cryptographic
analysis in order to protect the communications and to detect and minimize the system's exploitable
vulnerabilities. In this document a network under such conditions is presented, and the challenge is
to provide an optimal topological scheme before choosing the cryptographic components embedded in the
network's communication and data storage protocols and subsequently analyzing their security.
Among the scrutinized alternatives, a single one is selected as the solution through a comparison in
terms of performance, security and suitability under the enforced restrictions. This solution is
implemented using the C and Java programming languages. The selected encryption schemes and protocols
are shown to be highly reasonable options and their use in practice is advised. These results are valid
only for this specific case study: if any of the established constraints is lifted, a better solution
most likely exists.
Keywords: ciphertext indistinguishability; block cipher mode of operation; semantic security; symmetric
cryptosystem.
Glossary
$I_n$ The set $\{k \in \mathbb{N} : 1 \le k \le n\}$.
$P(A)$ Probability of occurrence of event $A$.
$A^*$ The Kleene star of $A$.
$I$ The set of unique identifiers of gathering devices.
$O$ $f \in O(g) \Leftrightarrow \exists M \in \mathbb{R}^+\ \exists x_0 \in \mathbb{R}\ \forall x \ge x_0 : |f(x)| \le M |g(x)|$.
bitstring An element of $\mathbb{Z}_2^*$.
byte A unit of data storage consisting of 1 octet.
kB 1 kB = 1024 bytes.
octet A sequence of 8 bits.
List of Abbreviations
$\lfloor x \rfloor$ Floor function of $x$, for some $x \in \mathbb{R}$.
$0^j$ The bitstring composed of $j$ '0's, for some $j \in \mathbb{N}$.
$1^j$ The bitstring composed of $j$ '1's, for some $j \in \mathbb{N}$.
$[w]_2$ Binary representation of the word $w$.
$\lceil x \rceil$ Ceiling function of $x$, for some $x \in \mathbb{R}$.
w|k Suffix of w of length k, for some k ∈ N.
w|k Prefix of w of length k, for some k ∈ N.
$x \parallel y$ Concatenation of words $x$ and $y$.
$x \setminus y$ Difference of $x$ and $y$.
$|w|_2$ The number of bits of the word $w$.
3DES Triple DES.
ACK Acknowledgement.
AES Advanced Encryption Standard.
BCMO block cipher mode of operation.
CA certificate authority.
CBC Cipher Block Chaining.
CCM Counter with CBC-MAC.
CFB Cipher Feedback.
CPU central processing unit.
CSPRNG cryptographically secure pseudo-random number generator.
CTR Counter.
CTR-H CTR mode with HMAC-256 checksum.
DAL Downstream Algorithm Lifecycle.
DB database.
DDL Downstream Data Lifecycle.
DES Data Encryption Standard.
EAP Extensible Authentication Protocol.
ECB Electronic Codebook.
ECC Elliptic Curve Cryptography.
FIFO First in first out.
GCM Galois Counter Mode.
GD Gathering device.
GDj gathering device with unique identifier j ∈ I.
GMAC Galois Message Authentication Code.
GTK Group Temporal Key.
HMAC hash-based message authentication code.
IEEE Institute of Electrical and Electronics Engineers.
IEEESA Institute of Electrical and Electronics Engineers Standards Association.
IND-CCA Indistinguishability under chosen-ciphertext attack.
IND-CPA Indistinguishability under chosen-plaintext attack.
IV initialization vector.
KDF key derivation function.
LAN local area network.
MAC message authentication code.
MIC message integrity code.
MiM Man-in-the-middle.
MnDM Mission and data manager.
MPP Middle-point party.
NIST National Institute of Standards and Technology.
PBKDF2 password-based key derivation function 2.
PCgF package ciphertext generator function.
PCuF package ciphertext unpacking function.
PMK Pairwise Master Key.
PMS pre-mission system.
POA Padding Oracle Attack.
PRF pseudo-random function.
PRNG pseudo-random number generator.
PSch packing scheme.
PSK Pre-shared key.
PTK Pairwise Transient Key.
RFC Request for Comments.
SEM-CPA Semantic security under chosen-plaintext attack.
SPN Substitution Permutation Network.
SSID Service Set Identifier.
UAL Upstream Algorithm Lifecycle.
UDL Upstream Data Lifecycle.
WLAN wireless local area network.
XOR exclusive-or operation.
List of Tables
4.1 Comparison between CTR and CFB features.
List of Figures
2.1 Encryption round of a SPN. Corresponds to the round function g from cryptosystem 4. It is used in all rounds except the last.
2.2 Network Layout.
2.3 Extensible Authentication Protocol (EAP).
2.4 WPA2 four-way handshake.
2.5 WPA2 group-key handshake.
2.6 Man in the middle attack. Eve is able to intercept the message and/or jam the communication channel at will.
3.1 General purpose and activity of the envisaged network.
3.2 General layout of the desired network.
3.3 Pre-deployment stage.
3.4 Topology of AP-based networks.
3.5 Topology of the ad-hoc network.
4.1 Key generation based on k users.
5.1 Pre-processing steps of the secret pass for the generation of the seed of the SHA-1 pseudo-random function.
5.2 Scatter plots of the average key generation time per number of gathering devices.
5.3 Upstream Algorithm Lifecycle.
5.4 Downstream Algorithm Lifecycle.
A.1 ECB mode encryption and decryption procedures using an arbitrary block cipher B.
A.2 CBC mode encryption and decryption procedures using an arbitrary block cipher B.
A.3 CFB mode encryption and decryption procedures using an arbitrary block cipher B.
A.4 CTR mode encryption and decryption procedures using an arbitrary block cipher B.
B.1 KeyGeneratorApp's initial screen.
B.2 KeyGeneratorApp's target choice screen.
B.3 KeyGeneratorApp's file details.
B.4 KeyGeneratorApp's key export final step.
B.5 KeyGeneratorApp's key checker example screen.
B.6 Pre-deployment stage secret information's revealment.
C.1 Message format $F_1^*$
C.2 Message format $F_1$
C.3 Message format $F_2^*$
C.4 Message format $F_2$
C.5 Message format $F_3^*$
C.6 Message format $F_3$
C.7 Message format $F_4^*$
C.8 Message format $F_4$
C.9 Message format $F_5^{**}$
C.10 Message format $F_5^*$
C.11 Message format $F_5$
xvi
Contents
Resumo
Abstract
Glossary
List of Abbreviations
List of Tables
List of Figures
1 Introduction
1.1 Summary
2 Basic Concepts
2.1 Cryptanalysis
2.2 Modern Cryptography
2.2.1 Block Ciphers
2.2.1.1 Linear and Differential Cryptanalysis
2.2.1.2 DES and 3DES
2.2.1.3 AES
2.2.2 Block Cipher Modes of Operation
2.2.2.1 ECB
2.2.2.2 CBC
2.2.2.3 CFB
2.2.2.4 CTR
2.2.2.5 CCM
2.2.2.6 GCM
2.2.2.7 Padding
2.2.3 Asymmetric Cryptography
2.3 Cryptographic Hash Functions
2.3.1 SHA-256
2.3.2 HMAC
2.4 Randomness
2.4.1 Key Derivation
2.5 Communication Protocols in Wireless Networks
2.5.1 WEP
2.5.2 WPA/WPA2
2.5.2.1 Initial Authentication
2.5.2.2 4-way Handshake
2.5.2.3 Group-key Handshake
2.6 Known Attacks
2.6.1 Brute Force and Dictionary Attacks
2.6.2 Man In The Middle Attack
2.6.3 Birthday Attack
2.6.4 Replay Attack
2.6.5 Padding Oracle Attack
2.6.6 Stream Cipher Attacks
2.6.6.1 Key Reuse
2.6.6.2 Bit-flipping
2.6.7 Weaknesses of Block Cipher Modes of Operation
2.6.8 Side-Channel Attack
2.6.9 Attacks on AES
3 Network
3.1 The Problem
3.2 Details
3.3 Network Topology
3.4 Protocol
3.4.1 Setup
3.4.2 Communication Protocol
3.5 Message Formats
4 Security Analysis
4.1 Strengths and Weaknesses
4.1.1 Key Generation
4.1.2 Packing Schemes and Protocols
4.1.2.1 Semantic security
4.1.2.2 Encryption Schemes
4.1.3 Attacks
4.2 Possible Solutions
4.2.1 Chosen-plaintext attack
4.2.2 Chosen-ciphertext attack
5 Implementation Details
5.1 Key Generation
5.2 Data Processing
5.2.1 Upstream Algorithm Lifecycle
5.2.2 Downstream Algorithm Lifecycle
6 Results
6.1 Future Work
References
A Schemes of Block Cipher Modes of Operation
B User Manual: Key Generation Application
C Message Formats
Chapter 1
Introduction
The first attempts to secure information date back to Ancient Greece. Ever since, mankind has
continuously developed new methods for protecting secrets: while some create methods to secure
information, others put great effort into discovering weaknesses in order to retrieve those secrets.
World War II is an example that reflects cryptography's tremendous relevance in modern times.
The victory of the Allies is considered to have been greatly influenced by their ability to eavesdrop
on the enemy's communications after breaking the Enigma [1] cipher and, as such, this war propelled
major advances in the fields of cipher construction and cryptanalysis. The continuous demand for
protecting information is the fuel that drives the evolution of computational security.
This work considers a network composed of several types of devices subject to restrictions on
memory, space and power consumption. The aim is to choose the most suitable topology for the network
and to create a security mechanism, to be included in the communication protocol, that guarantees
certain cryptographic properties for the messages travelling through the network. This work was
developed for a real-world project in a business environment, under the supervision of analysts and
developers of the company GMVIS Skysoft, S.A.; therefore, several limitations were imposed.
1.1 Summary
This dissertation is divided into six chapters and an appendix with three sections.
Chapter 2 addresses the state-of-the-art concepts required by the developed work. In chapter 3 the
problem is introduced, the options regarding the network's topology and communication protocols are
discussed, the general protocol for the upstream and downstream data lifecycles is presented and its
underlying message formats are defined. Chapter 4 contains a security analysis of the network defined
in chapter 3. In chapter 5 the implementation details are discussed and some parts of the code are
presented and analyzed as pseudocode. Chapter 6, the final chapter, contains a general overview of the
results obtained and motivation for future work on the subject.
Appendix A contains the figures regarding the construction of the block cipher modes of operation
presented in chapter 2, appendix B is a user manual for the key generation user-interface application
and appendix C shows all the message formats introduced in chapter 3.
Chapter 2
Basic Concepts
This chapter provides an overview of several cryptographic concepts and algorithms that are
applicable to the system considered in the next chapter; references are provided for topics that are
relevant but outside the central scope of the text. The reader is assumed to be familiar with basic
cryptographic theory. Section 2.1 addresses some security definitions for cryptosystems and the most
common techniques used by adversaries, section 2.2 introduces modern cryptographic concepts with
emphasis on private-key cryptosystems, section 2.3 contains the properties of cryptographic hash
functions as well as a high-level description of SHA-256, section 2.4 describes the problem of
generating random values, section 2.5 specifies the wireless communication protocols used in the
problem at hand and the last section covers currently known attacks on some of the systems described
throughout the chapter. Public-key cryptography is treated in less detail because this option was set
aside in the security solution proposed in chapter 3.
The suite of algorithms for key generation, encryption and decryption forms a cryptographic
system (cryptosystem); these are usually implemented to give the user the ability to conceal
classified information.
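Definition 2.0.1. A cryptosystem is a 5-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ such that:
• $\mathcal{P}$ is the set of all possible plaintexts.
• $\mathcal{C}$ is the set of all possible ciphertexts.
• $\mathcal{K}$ is the set of all possible keys, also denoted as the key space.
• $\mathcal{E} := \{E_k : \mathcal{P} \to \mathcal{C}\}_{k \in \mathcal{K}}$ is the family of all encryption functions.
• $\mathcal{D} := \{D_k : \mathcal{C} \to \mathcal{P}\}_{k \in \mathcal{K}}$ is the family of all decryption functions.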
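To make Definition 2.0.1 concrete, the following minimal sketch models the families of encryption and decryption functions as Python callables indexed by the key. The XOR-based toy instance and the names `Cryptosystem`, `xor_bytes` and `toy` are illustrative assumptions and not constructions from this work; the toy is only meaningful when the key is as long as the message and is never reused.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Cryptosystem:
    """Holds the two function families of the 5-tuple.

    encrypt(k, x) plays the role of E_k(x) and decrypt(k, y) of D_k(y).
    The sets P, C and K are implicit: here they are all byte strings.
    """
    encrypt: Callable[[bytes, bytes], bytes]
    decrypt: Callable[[bytes, bytes], bytes]


def xor_bytes(key: bytes, data: bytes) -> bytes:
    """Bytewise XOR of key and data (hypothetical toy cipher)."""
    if len(key) != len(data):
        raise ValueError("key and data must have the same length")
    return bytes(a ^ b for a, b in zip(key, data))


# XOR is its own inverse, so E_k and D_k coincide in this toy instance.
toy = Cryptosystem(encrypt=xor_bytes, decrypt=xor_bytes)
```

The defining requirement is that decryption undoes encryption under the same key, that is, `toy.decrypt(k, toy.encrypt(k, x)) == x` for every key `k` and plaintext `x` of equal length.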
Two types of cryptosystems can be defined: symmetric and asymmetric. In the former, the same key
is used for encryption and decryption, while in the latter there are two distinct keys, one for
encryption and one for decryption. Section 2.2.3 further discusses asymmetric cryptography, also known
as public-key cryptography; for now this distinction suffices. The following section focuses on
symmetric cryptography.
2.1 Cryptanalysis
It is possible to break¹ cryptographic systems without knowledge of the key(s) used or even of the
algorithm itself. As the name suggests, cryptanalysis is the study of cryptosystems with the objective
of finding flaws or weaknesses that allow unauthorized parties to gain information, without necessarily
discovering the secret key. Cryptanalysis methods can be categorized based on the information
available to the attacker. The most common methods are presented below:
Definition 2.1.1 (Ciphertext-only attack). A ciphertext-only attack is one where the adversary possesses
only the ciphertext and attempts to deduce either the corresponding plaintext or the key; in theory,
no details about the plaintext itself are provided. In practice, however, the attacker usually does
have access to useful information, such as the alphabet in which the plaintext is written.
Definition 2.1.2 (Known-plaintext attack). A Known-plaintext attack focuses on finding the secret key
(or key stream) of the cryptosystem at hand, provided the knowledge of both the ciphertext and its
corresponding plaintext.
Definition 2.1.3 (Chosen-plaintext attack). In a Chosen-plaintext attack, the adversary has access to
an encryption oracle, which encrypts any plaintext given by the attacker and outputs the corresponding
ciphertext.
A cryptosystem is said to be secure against chosen-plaintext attacks if and only if an adversary
who is able to choose any pair of plaintexts $x_0, x_1$, whose encryptions are $y_0, y_1$ respectively,
cannot decide which of the following is true
$$y_i = e_k(x_i) \quad \text{or} \quad y_i = e_k(x_{i+1}) \tag{2.1}$$
with probability greater than 0.5, where $i \in \{0, 1\}$ and the index $i+1$ is taken modulo 2.
Definition 2.1.4 (Chosen-ciphertext attack). In a Chosen-ciphertext attack, the adversary has access to
a decryption oracle, which decrypts any ciphertext given by the attacker and outputs the corresponding
plaintext.
The increasing complexity of cryptosystems throughout the years has demanded serious development
of cryptanalysis methods. A system is only as secure as its resilience to the most devious possible
attack, and every cryptosystem in use today is continuously targeted by attackers who constantly
develop new methods to increase their success rate. So, how can one be certain that a cryptosystem
will remain secure against certain types of attacks? The following definition captures such a
property.
Definition 2.1.5 (Semantic Security). Let $C = (X, Y, K, \mathcal{E}, \mathcal{D})$ be a cryptosystem and $e_k \in \mathcal{E}$. A
cryptosystem is said to be semantically secure if, given $y = e_k(x)$, then $V(A(y, l)) = V(B(l))$, where
$A$ and $B$ are two polynomial-time bounded adversaries, $l = |y|$ and $V$ is the advantage of the
adversary, which is defined by
$$V(a) = P(a \text{ chooses the wrong plaintext}) - P(a \text{ chooses the correct plaintext}) \tag{2.2}$$
for every adversary $a$, where $P(E)$ denotes the probability of occurrence of event $E$.
¹ By "break" one means being able to recover the corresponding plaintext for any given ciphertext.
Notwithstanding, it is possible that a cryptosystem is semantically secure against some types of
attacks while having some flaws with respect to its construction, which entail undesired properties that
can be used by an adversary to exploit vulnerabilities in the scheme at hand.
The upcoming definitions are helpful to the semantic security analysis of cryptosystems. The follow-
ing description is based on the results presented in [2] and some similar notation is used.
Consider the following scenario: an adversary $A$ living in one of two worlds² (left world $L$ or right
world $R$) is trying to break a cryptographic system $C$ and has access to an encryption oracle $O_e$.
$A$ does not know which world he lives in, but the world $W$ is defined a priori and cannot be changed
throughout the entire activity of $A$. The encryption oracle, given any two plaintexts $p_0, p_1$,
always returns the ciphertext $e_k(p_b)$, where $e_k$ is the encryption function of $C$ for some
$k \in K$ and $b \in \{0, 1\}$ is picked according to the following relation
$$b = \begin{cases} 0 & \text{if } W = L \\ 1 & \text{if } W = R \end{cases} \tag{2.3}$$
$O_e$ is known as an lr-oracle.
Definition 2.1.6 (IND-CPA). Let $C = (P, C, K, \mathcal{E}, \mathcal{D})$ be a cryptosystem with encryption and decryption
functions $e_k$ and $d_k$, respectively, for some $k \in K$, let $A$ be an adversary and $O$ an encryption
lr-oracle. Consider that $A$ is in possession of $X = (x_1, \ldots, x_n) \in P^n$ and also that
$|x_i| = |x_j|, \forall i \neq j$. Indistinguishability under chosen-plaintext attack (IND-CPA) is a game
defined as follows:
1. $A$ picks two messages $x'_0, x'_1 \in X$;
2. $A$ queries the oracle $O$ with $(x'_0, x'_1)$;
3. $O$ encrypts $x'_b$ yielding $e_k(x'_b)$, according to 2.3;
4. $O$ returns the encryption output $e_k(x'_b)$ to $A$;
($A$ can repeat steps 1 to 4 at will);
5. $A$ chooses $b' \in \{0, 1\}$;
6. $A$ wins if $b' = b$ and loses otherwise.
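The game above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the deterministic XOR cipher, the class `LROracle` and the functions `adversary` and `play_ind_cpa` are hypothetical names introduced here, and the sketch assumes the adversary may submit the same message twice in a query. It shows why any deterministic encryption function loses the game: the adversary first learns the encryption of a known message and then compares it with a second answer.

```python
import secrets


class LROracle:
    """Left-or-right encryption oracle: always encrypts x_b for a hidden bit b."""

    def __init__(self, encrypt, world_bit):
        self._encrypt = encrypt
        self._b = world_bit  # 0 for world L, 1 for world R; fixed for the whole game

    def query(self, x0, x1):
        if len(x0) != len(x1):
            raise ValueError("plaintexts must have equal length")
        return self._encrypt((x0, x1)[self._b])


def make_deterministic_cipher(key):
    """Toy deterministic cipher (assumption): bytewise XOR with a fixed key."""
    def encrypt(x):
        return bytes(a ^ b for a, b in zip(x, key))
    return encrypt


def adversary(oracle):
    """Guesses b by exploiting determinism."""
    m0, m1 = b"\x00" * 8, b"\xff" * 8
    reference = oracle.query(m0, m0)   # encryption of m0 in either world
    answer = oracle.query(m0, m1)      # encryption of m0 (b = 0) or m1 (b = 1)
    return 0 if answer == reference else 1


def play_ind_cpa(trials=200):
    """Returns the fraction of games won by the adversary."""
    wins = 0
    for _ in range(trials):
        key = secrets.token_bytes(8)
        b = secrets.randbelow(2)
        oracle = LROracle(make_deterministic_cipher(key), b)
        wins += int(adversary(oracle) == b)
    return wins / trials
```

Against this toy cipher the adversary wins every game, whereas a scheme that randomizes each encryption (for instance through a fresh initialization vector) would make the two answers differ in both worlds and defeat this particular strategy.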
If $A$ is able to correctly choose $b'$ with probability non-negligibly greater than 1/2, then the
system at hand is not semantically secure against chosen-plaintext attacks (see property 1). The answer
of the encryption oracle to the adversary's query is called a challenge ciphertext and, analogously,
the answer of the decryption oracle is called a challenge plaintext.
² A world can be seen as a binary state.
A stronger measure of security can be defined based on the previous definition. If the adversary
not only has access to an encryption oracle but also to a decryption oracle, then A is granted a serious
amount of resources that threaten the security of the cryptosystem at hand as this is the most critical
and undesirable type of attack to defend against.
Definition 2.1.7 (IND-CCA). Indistinguishability under chosen-ciphertext attack (IND-CCA) is a game
analogous to the one from Definition 2.1.6, but herein the adversary has access to two lr-oracles: an
encryption lr-oracle Oe and a decryption lr-oracle. This game has the additional requirement that the
adversary A may not query the decryption oracle with challenge ciphertexts.
Let $O_d$ be the decryption lr-oracle. The adversary is not allowed to query $O_d$ with any challenge
ciphertext, since otherwise it would be trivial for $A$ to gain advantage in the IND-CCA game, because
$A$ would immediately know the world $W$. The two properties that follow are desirable when analyzing
the security level of any cryptographic system.
Property 1 (IND-CPA secure). A cryptosystem is said to be IND-CPA secure if a polynomial-time
bounded adversary who plays the IND-CPA game cannot win with probability non-negligibly greater than
1/2.
Property 2 (IND-CCA secure). A cryptosystem is said to be IND-CCA secure if a polynomial-time
bounded adversary who plays the IND-CCA game cannot win with probability non-negligibly greater than
1/2.
According to [3], IND-CCA ⊂ IND-CPA, thus any system that is IND-CCA secure is also IND-CPA
secure. Nowadays, the minimum requirement for the security level of a cryptosystem to be considered
acceptable is that it satisfies property 1.
2.2 Modern Cryptography
During World War II, Claude Shannon [4] contributed to the development of Cryptography, especially
with his results presented in a 1945 classified paper [5], which influenced the development of modern
day cryptography.
This section describes linear and differential cryptanalysis techniques, some of the most important
block ciphers with focus on AES and finally several block cipher modes of operation and their properties.
2.2.1 Block Ciphers
Block ciphers can be defined as iterated product ciphers. The notion of an iterated cipher is
straightforward: its main components are a round function and a key schedule. As the name suggests,
the cipher consists of several rounds (iterations) of a round function applied to a state and a
round key, where the initial state is the plaintext, each subsequent state is the image under the round
function of the previous state and a round key, and each round key is one of the elements of the output
of the key schedule algorithm. Cryptosystem 1 formally illustrates this description.
Cryptosystem 1. Let $X = \Sigma_P^*$ and $Y = \Sigma_C^*$ be the sets of plaintext and ciphertext bitstrings, respectively,
and $K$ the key set. Let $round : S \times K_R \to S$ be the round function and $f : K \to K_R$ the key schedule,
where $S \supseteq X \cup Y$ is the set of states and $K_R$ the set of round keys with $|K_R| = r \in \mathbb{N}$, such that
$$\begin{aligned} x &= s_0; \\ y &= s_r; \\ f(k) &= (k_1, \ldots, k_r); \\ round(s_i, k_{i+1}) &= s_{i+1}; \end{aligned} \tag{2.4}$$
An iterated cipher is defined as follows:
$$\begin{aligned} e_k(x) &= round(\ldots round(round(s_0, k_1), k_2) \ldots, k_r); \\ d_k(y) &= round^{-1}(\ldots round^{-1}(round^{-1}(s_r, k_r), k_{r-1}) \ldots, k_1); \end{aligned} \tag{2.5}$$
where $round^{-1}$ is well defined iff $round(s, k)$ is injective for fixed $k$.
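The structure of equations 2.4 and 2.5 can be sketched as follows. This is a toy illustration only: the 16-bit state, the rotation-and-XOR round function and the key schedule are assumptions chosen for brevity (they offer no security), and the names `key_schedule`, `round_fn`, `round_inv`, `encrypt` and `decrypt` are hypothetical.

```python
MASK = 0xFFFF  # states and round keys are 16-bit integers in this toy example


def rotl16(v, n):
    """Rotate a 16-bit value left by n positions."""
    n %= 16
    return ((v << n) | (v >> (16 - n))) & MASK


def rotr16(v, n):
    """Rotate a 16-bit value right by n positions."""
    n %= 16
    return ((v >> n) | (v << (16 - n))) & MASK


def key_schedule(k, r):
    """f(k) = (k_1, ..., k_r): a hypothetical schedule deriving r round keys."""
    return [rotl16(k, i) ^ i for i in range(1, r + 1)]


def round_fn(s, rk):
    """round(s_i, k_{i+1}) = s_{i+1}; injective in s for a fixed round key."""
    return rotl16(s ^ rk, 3)


def round_inv(s, rk):
    """Inverse of round_fn for the same round key."""
    return rotr16(s, 3) ^ rk


def encrypt(x, k, r=4):
    s = x  # s_0 is the plaintext
    for rk in key_schedule(k, r):
        s = round_fn(s, rk)
    return s  # s_r is the ciphertext


def decrypt(y, k, r=4):
    s = y
    # Round keys are applied in reverse order: k_r, k_{r-1}, ..., k_1.
    for rk in reversed(key_schedule(k, r)):
        s = round_inv(s, rk)
    return s
```

Decryption works because each round is invertible for a fixed round key, which is exactly the injectivity condition stated above.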
For block ciphers, and in line with the opening statement of this section, round usually contains a
combination of S-Boxes and/or P-Boxes.
A common technique used to increase the security of a block cipher is called key whitening and
consists of performing an exclusive-or operation (XOR) with the round key in the initial and the last
rounds. Whitening increases the hardness of a brute-force attack. For a block cipher to be considered
robust, it must have good confusion and diffusion [5]; otherwise it may be susceptible to simple
statistical attacks, namely linear and differential cryptanalysis. Robust block ciphers are widely used
in current cryptographic algorithms, namely in cryptographic hash functions and pseudo-random number
generators.
2.2.1.1 Linear and Differential Cryptanalysis
Linear and differential cryptanalysis are the most common and devious known attacks on block
ciphers. Generally speaking, both focus on finding probabilistic linear relations and exploiting them
in such a manner that it becomes feasible to perform either known-plaintext or chosen-plaintext
attacks. Both techniques make use of the bias of a random variable, as opposed to its raw probability,
since the bias expresses how far the probability deviates from the value 1/2 expected of an unbiased
variable. For a Bernoulli distributed random variable ($X \sim Ber(p)$), this quantity is defined by
$$\varepsilon(X) = p - \frac{1}{2} \tag{2.6}$$
Linear Cryptanalysis
Assume that, for large $n$, the attacker Victor (V) has access to $(x_1, y_1), \ldots, (x_n, y_n)$ such that
$e_k(x_i) = y_i$ for fixed $k$, where $e_k$ is the encryption function of the block cipher at hand. Moreover,
suppose that V is able to relate subsets of the plaintext, ciphertext and key bits through a linear
approximation of the form
$$x_\oplus[a_1, \ldots, a_j] \oplus y_\oplus[b_1, \ldots, b_l] = k_\oplus[c_1, \ldots, c_m] \tag{2.7}$$
where $a_1, \ldots, a_j, b_1, \ldots, b_l, c_1, \ldots, c_m$ are fixed bit indexes and $v_\oplus[d_1, \ldots, d_h]$ represents
$v[d_1] \oplus \ldots \oplus v[d_h]$ such that $v[d_i] \in \{0, 1\}, \forall i \in I_h$, for fixed bit index $d_i$.
The attack consists in assigning a counter Ck, initialised to the same value, to each possible key k ∈ K.
For every pair (xi, yi) in V's possession, he evaluates equation 2.7 for each candidate k and increments
Ck each time the equation holds. At the end of the whole process, the key k with the highest counter
value is the key for which the bits k[c1], . . . , k[cm] are considered to be correct.
Considering T to be the random variable that represents the outcome of equation 2.7, the effectiveness
of a linear cryptanalytic attack is proportional to |ε(T)| [6]. According to [7], the number n of
plaintext-ciphertext pairs that V needs to know in order for the attack to succeed with high confidence is
approximately c × ε(T)^−2, where c ∈ R is usually small. Note that (ε(T) → 0) ⇒ (n → ∞) and n → 4c when
ε(T) → ±1/2.
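As a minimal sketch of the quantities above, the bias of a linear relation can be measured exhaustively on a small component. The 4-bit S-box, the bit masks and the constant c = 8 below are assumptions chosen for illustration only; they are not taken from this text, and the relation involves no key bits.

```python
# Toy illustration only: the 4-bit S-box, the masks and the constant c are
# arbitrary choices for this sketch, not values taken from the thesis.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def parity(v: int) -> int:
    """XOR of the bits of v, i.e. v[d1] xor ... xor v[dh] over the set bits."""
    return bin(v).count("1") & 1


def bias(in_mask: int, out_mask: int) -> float:
    """Bias p - 1/2 of the relation parity(x & in_mask) == parity(S(x) & out_mask)."""
    hits = sum(parity(x & in_mask) == parity(SBOX[x] & out_mask)
               for x in range(16))
    return hits / 16 - 0.5


def pairs_needed(eps: float, c: float = 8.0) -> float:
    """Estimate n ~ c * eps^-2 (c = 8 is an assumed placeholder)."""
    return c / (eps * eps)


eps = bias(0b1011, 0b0100)
print(eps, pairs_needed(eps))
```

The smaller the bias, the faster the estimated number of required pairs grows, which is the behaviour stated above for ε(T) → 0.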
Differential Cryptanalysis
Differential cryptanalysis is very similar to the aforementioned procedure of linear cryptanalysis, with
the exception that one does not look for a linear relation between the plaintext, ciphertext and key bits,
but instead for a linear approximation relating differences³ of the plaintext and ciphertext bits to key bits.
That is, instead of the attacker V having pairs (x, y), he is now able to choose x, x′ ∈ P and compute a
differential: the pair (∆_x^{x′}, ∆_y^{y′}), where ∆_x^{x′} = x ⊕ x′, ∆_y^{y′} = y ⊕ y′, ek(x) = y and ek(x′) = y′.
For each of the possible keys k, V checks if the linear approximation between the differentials holds
and if so, the counter of k is incremented. This process is very similar to the one of linear cryptanalysis,
as the key with highest counter will be the one for which the bits of the linear relation are most likely
correct.
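To make the notion of a differential concrete, the following sketch tabulates how often each input difference x ⊕ x′ leads to each output difference y ⊕ y′ through a small S-box. The S-box is an assumed toy component, and the table only covers a single unkeyed substitution, not a full cipher.

```python
# Assumed toy 4-bit S-box; not taken from the thesis.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def difference_table(sbox):
    """table[dx][dy] = number of inputs x with S(x) xor S(x xor dx) == dy."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for x in range(size):
        for dx in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            table[dx][dy] += 1
    return table


ddt = difference_table(SBOX)

# Most likely non-trivial differential (count, dx, dy): the kind of pair an
# attacker would try to chain across rounds.
best = max((ddt[dx][dy], dx, dy) for dx in range(1, 16) for dy in range(16))
print(best)
```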
2.2.1.2 DES and 3DES
In 1977 DES [8] was published as an official FIPS and although considered insecure nowadays, it
had a major influence in modern cryptography. It was designed as a Feistel cipher [9] (cryptosystem 2).
Cryptosystem 2 (Feistel cipher). Let S be the set of states such that s = (sL, sR), where |sL| = |sR|
and sL ‖ sR = s, ∀s ∈ S, let KR be the set of round keys, g : S × KR → S the round function and f the
function that (possibly) contains the non-linear operations of the block cipher. Then, the round function
used in the encryption procedure is given by
\[ g((s^L_{i-1}, s^R_{i-1}), k_i) = (s^R_{i-1},\; s^L_{i-1} \oplus f(s^R_{i-1}, k_i)) \tag{2.8} \]
where i = 1, . . . , N , for some number of rounds N ∈ N defined by the key schedule. Moreover, the
round function for the decryption procedure is given by
\[ g^{-1}((s^L_i, s^R_i), k_i) = (s^R_i \oplus f(s^L_i, k_i),\; s^L_i) \tag{2.9} \]
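The two round functions (2.8) and (2.9) can be sketched as follows. The function f used here (a truncated SHA-256 of the round key and the half-block) and the round keys are assumptions made only so that the example runs; DES uses its own f and key schedule.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def f(half: bytes, round_key: bytes) -> bytes:
    """Assumed non-linear function: truncated SHA-256 (illustrative only)."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]


def g(state, round_key):
    """Encryption round (2.8): (sL, sR) -> (sR, sL xor f(sR, k))."""
    left, right = state
    return right, xor(left, f(right, round_key))


def g_inv(state, round_key):
    """Decryption round (2.9): (sL, sR) -> (sR xor f(sL, k), sL)."""
    left, right = state
    return xor(right, f(left, round_key)), left


def encrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    state = (block[:half], block[half:])
    for k in round_keys:
        state = g(state, k)
    return state[0] + state[1]


def decrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    state = (block[:half], block[half:])
    for k in reversed(round_keys):  # round keys are consumed in reverse order
        state = g_inv(state, k)
    return state[0] + state[1]


keys = [b"k1", b"k2", b"k3", b"k4"]  # hypothetical round keys
ciphertext = encrypt(b"8 bytes!", keys)
```

Note that f itself never has to be inverted: decryption only recomputes f on the half that was left untouched by the round.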
In 1992, after having (re)discovered differential cryptanalysis, Shamir and Biham published the
first theoretical attack on DES, although it was practically infeasible at the time due to its complexity.
Later on, a practical attack was indeed discovered using linear cryptanalysis and in the years that
followed the complexity of the attacks on DES confirmed that the standard had become deprecated. The
need for a modification of DES or the design of a new algorithm had become of utmost importance.
³ By difference, one means an operation, usually XOR.
The abrupt growth of computational capability at the time made it clear to researchers that the 56-bit
key size of DES was too small for the security demanded of the algorithm. In order to increase
its security against brute-force attacks without changing its core procedure, several parties came
up with a straightforward solution (presented in cryptosystem 3): instead of using a single key and
performing one block encryption, each block of plaintext is subject to three successive block encryptions
using three (possibly) distinct keys. Note that the main goal of the cryptographers at the time hinged on
solving the problem without having to create a new algorithm, which would save time and money because
there would be no need to replace all hardware mechanisms that had DES implemented.
Cryptosystem 3 (Triple DES). Let Ek and Dk be the encryption and decryption procedures for the DES
algorithm using the 56-bit key k. The encryption and decryption procedures for Triple DES are given,
respectively, by:
\[ e_k(x) = E_{k_3}(E_{k_2}(E_{k_1}(x))), \qquad d_k(y) = D_{k_1}(D_{k_2}(D_{k_3}(y))) \tag{2.10} \]
where x ∈ P, y ∈ C and k = (k1, k2, k3) is either a 168, 112 or 56-bit key, based on the keying option.
Three options were available for the keys:
1. (k1 = k2 = k3) ⇒ 56-bit key;
2. (k1 = k3 ≠ k2) ⇒ 112-bit key;
3. (k1 ≠ k2 ∧ k1 ≠ k3 ∧ k2 ≠ k3) ⇒ 168-bit key.
3DES is still considered to be secure due to the impracticability of the currently known linear crypt-
analytic attacks that require an infeasible number of known plaintext-ciphertext pairs. However, the
previous statement is only true for 168-bit keys as options 1 and 2 have been considered deprecated.
2.2.1.3 AES
Aiming to replace the encryption standard in order to cope with modern-day security demands, NIST
decided to launch a call for proposals for the new encryption standard named AES. Several proposals
were submitted (21 to be precise) and after being subject to a thorough security analysis, each of the
five finalists was considered to be secure. The choice of the Rijndael cipher [10] as the algorithm for
the AES was based on its performance, versatility, simplicity and implementation details. In 2002, AES
was admitted as the official encryption standard. It is a symmetric cryptosystem based on an iterated
block cipher and unlike DES, the cryptosystem does not follow a Feistel network, but instead a SPN [11],
which is briefly described in cryptosystem 4 and whose round function is illustrated in Figure 2.1.
Cryptosystem 4 (Substitution Permutation Network). Let l,m ∈ Z+, let πS : {0, 1}^l → {0, 1}^l be an
S-Box and πP : {1, . . . , lm} → {1, . . . , lm} be a P-Box. Consider K = {(k1, . . . , kn+1) : kj ∈ {0, 1}^lm},
where n is the number of rounds and P = C = {0, 1}^lm. Let S be the set of states, let ui = si−1 ⊕ ki and
let x|^j and x|_j denote, respectively, the first and the last j bits of x. Consider two round functions:
g : S × K → S, which is given by:
\[ g(s_{i-1}, k_i) = \pi_P\big(\pi_S(u_i|^{l}) \,\|\, \pi_S((u_i|^{2l})|_{l}) \,\|\, \cdots \,\|\, \pi_S(u_i|_{l})\big) \tag{2.11} \]
and the round function f : S × K → S, given by
\[ f(s_{i-1}, k_i, k_{i+1}) = \big(\pi_S(u_i|^{l}) \,\|\, \pi_S((u_i|^{2l})|_{l}) \,\|\, \cdots \,\|\, \pi_S(u_i|_{l})\big) \oplus k_{i+1} \tag{2.12} \]
The encryption procedure consists in applying the g function n − 1 times followed by the f function,
where s0 = x.
Note that in the above cryptosystem the P-Box [7] is not applied in the last round, thus allowing the
algorithm to be used for decryption with appropriate modifications (inverse S-Box and P-Box, and the
round keys taken in reverse order).
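A toy instance of cryptosystem 4 may help to fix ideas. The sketch below assumes l = m = 4 (a 16-bit block), n = 4 rounds, a commonly used textbook S-box and P-box, and a hypothetical key schedule that slides a 16-bit window over a 32-bit key; none of these values come from this text.

```python
# Assumed toy parameters: l = m = 4, n = 4 rounds (hence 5 round keys).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]           # pi_S
PBOX = [1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, 4, 8, 12, 16]  # pi_P
L_BITS, M_BLOCKS = 4, 4
WIDTH = L_BITS * M_BLOCKS


def substitute(u: int) -> int:
    """Apply pi_S to each l-bit block of u and concatenate the results."""
    v = 0
    for j in range(M_BLOCKS):
        shift = L_BITS * (M_BLOCKS - 1 - j)
        v |= SBOX[(u >> shift) & ((1 << L_BITS) - 1)] << shift
    return v


def permute(v: int) -> int:
    """Move bit i of v (1 = leftmost) to position pi_P(i)."""
    w = 0
    for i in range(1, WIDTH + 1):
        bit = (v >> (WIDTH - i)) & 1
        w |= bit << (WIDTH - PBOX[i - 1])
    return w


def g(s: int, k: int) -> int:
    """Round function (2.11): key mixing, substitution, permutation."""
    return permute(substitute(s ^ k))


def f(s: int, k: int, k_next: int) -> int:
    """Last round (2.12): no permutation, final key whitening."""
    return substitute(s ^ k) ^ k_next


def key_schedule(key32: int):
    """Hypothetical schedule: five overlapping 16-bit windows of a 32-bit key."""
    return [(key32 >> (16 - 4 * r)) & 0xFFFF for r in range(5)]


def encrypt(x: int, keys) -> int:
    s = x
    for k in keys[:-2]:          # g is applied n - 1 times
        s = g(s, k)
    return f(s, keys[-2], keys[-1])


y = encrypt(0x26B7, key_schedule(0x3A94D63F))
print(hex(y))
```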
The AES block size is 128 bits and the standard specifies three⁴ possible key sizes: 128, 192 and 256
bits. There is a trade-off between security and performance directly related to the size of the key, since
the number of rounds of the algorithm varies according to the key length. For 128, 192 and 256-bit keys,
the number of rounds is, respectively, 10, 12 and 14. Nevertheless, even for 128-bit keys, the currently
known attacks and the computational capability foreseen for the near future lead to the conclusion that
AES is secure as a block cipher, regardless of the key length chosen. That is the reason AES is embedded
in a large majority of modern-day cryptographic schemes or protocols that need to provide secrecy.
A high-level description of the algorithm's main functions is given next, followed by the
algorithm's pseudocode in algorithm 1.
High-level Description
In the 2001 FIPS publication of AES [12] some functions were introduced. The same names are
used herein and their informal definitions are presented:
• AddRoundKey: performs an XOR operation between the current state and the current round key;
• SubBytes: replaces each byte of the current state for its correspondence on a fixed lookup table;
• ShiftRows: shifts the bytes of each row of the state according to a (fixed for each row) permutation;
• MixColumns: multiplies each column of the state by a fixed polynomial p(x).
⁴ The Rijndael cipher was more versatile in the subject, as it allowed more key and block sizes (multiples
of 32 bits between 128 and 256 for both cases).
[Figure 2.1 (diagram not reproduced): ui = si−1 ⊕ ki is split into the l-bit blocks ui<1>, . . . , ui<m>;
vi<j> = πS(ui<j>); si is obtained by applying πP to the bit indexes of vi.]
Figure 2.1: Encryption round of a SPN. Corresponds to the round function g from cryptosystem 4. It is
used in all rounds except the last.
Algorithm 1 AES algorithm
procedure AES(K, M)                          ▷ Encrypting M with K
    state ← M;
    (K1, . . . , KN+1) ← KeySchedule(K);
    AddRoundKey(K1, state);                  ▷ initial key whitening
    for r = 1 to N do
        SubBytes(state, πS);
        ShiftRows(state, πP);
        if r ≤ N − 1 then                    ▷ MixColumns is skipped in the last round
            MixColumns(state);
        end if
        AddRoundKey(Kr+1, state);            ▷ K1 was already used before the loop
    end for
    return c ← state;
end procedure
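The linear steps named above can be sketched in a few lines. The following is only an illustration under stated assumptions: the state is held as a list of 16 bytes in column-major order, SubBytes and the key schedule are omitted (they require the lookup table and constants of [12]), and the function names are hypothetical.

```python
ROUNDS = {128: 10, 192: 12, 256: 14}  # key size in bits -> number of rounds N


def add_round_key(state, round_key):
    """XOR the 16-byte state with the 16-byte round key."""
    return [s ^ k for s, k in zip(state, round_key)]


def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions.

    The byte at row r, column c is stored at index 4*c + r.
    """
    return [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]


def xtime(a: int) -> int:
    """Multiply by x in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def mix_column(col):
    """Multiply one 4-byte column by the fixed polynomial of MixColumns."""
    def mul3(a):
        return xtime(a) ^ a

    a0, a1, a2, a3 = col
    return [xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
            a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
            a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
            mul3(a0) ^ a1 ^ a2 ^ xtime(a3)]


def mix_columns(state):
    out = []
    for c in range(4):
        out.extend(mix_column(state[4 * c:4 * c + 4]))
    return out
```

AddRoundKey is its own inverse, which is why the same round keys, taken in reverse order, serve for decryption.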
2.2.2 Block Cipher Modes of Operation
Block ciphers are very useful in modern cryptography, but they are only able to encrypt or decrypt
one fixed-size block of data. Block cipher modes of operation were created so that one is able to encrypt
a piece of data of arbitrary length using a block cipher. The way these modes make use of the block
cipher at hand is very relevant for the security of the cryptosystem, for one can induce flaws in the
cryptosystem with a bad usage of the block cipher, even if the latter is considered to be secure against
all known attacks.
Throughout this section let B be an arbitrary block cipher with b-bit block size, let P be an m-bit
plaintext, C an m′-bit ciphertext and let n be the number of blocks of the message at hand. Moreover,
consider the following notation:
• E^B_k := B block cipher encryption function using key k;
• D^B_k := B block cipher decryption function using key k;
• pi := ith b-bit block of the plaintext;
• ci := ith b-bit block of the ciphertext;
• ‖_{i=1}^{n} xi := x1 ‖ x2 ‖ · · · ‖ xn, where ‖ is the concatenation operator.
2.2.2.1 ECB
ECB mode is the most straightforward mode of operation for block ciphers. Its encryption and
decryption functions are, respectively, the following:
\[
\begin{aligned}
e^{ECB}_k(P) &= \big\Vert_{i=1}^{n} c_i, \quad \text{where } c_i = E^B_k(p_i),\ \forall i \in I_n \\
d^{ECB}_k(C) &= \big\Vert_{i=1}^{n} p_i, \quad \text{where } p_i = D^B_k(c_i),\ \forall i \in I_n
\end{aligned}
\tag{2.13}
\]
where m = m′ is a multiple of the block cipher's block size. For this reason padding is advised and it is
discussed in section 2.2.2.7. A graphic representation of the encryption and decryption procedures for
the ECB mode is presented in figures A.1a and A.1b, respectively.
2.2.2.2 CBC
Unlike ECB mode, CBC is a widely used mode of operation and is well suited to authentication
purposes due to its ripple effect. Its encryption and decryption procedures are as follows:
\[
\begin{aligned}
e^{CBC}_k(P, IV) &= \big\Vert_{i=1}^{n} c_i, \quad \text{where } c_i = E^B_k(c_{i-1} \oplus p_i),\ \forall i \in I_n \\
d^{CBC}_k(C, IV) &= \big\Vert_{i=1}^{n} p_i, \quad \text{where } p_i = c_{i-1} \oplus D^B_k(c_i),\ \forall i \in I_n
\end{aligned}
\tag{2.14}
\]
and such that c0 = IV . Figure A.2 contains a graphic representation of both cases.
Both encryption and decryption functions in (2.14) have an additional argument IV , which is a
random⁵ b-bit initialization vector (IV) and whose role is to contribute to the XOR of the first iteration. By
construction, one can easily observe that CBC mode encryption cannot be computed in parallel, since
each iteration depends on the previous ciphertext; on the other hand though, the decryption mechanism
can be parallelized, since each plaintext block pi can be obtained deterministically provided knowledge
of the tuple (k, c_{i−1}, c_i). Due to this chaining, CBC is sensitive to errors in transmission, mainly
triggered by noise in the communication channel induced either by an adversary or by environmental
conditions: a corrupted ciphertext block c_i garbles the whole plaintext block p_i and flips the
corresponding bits of p_{i+1}, whereas an error introduced before encryption propagates to every
subsequent ciphertext block.
⁵ The predictability of the IV gives room for feasible attacks on the cryptosystem. It is further discussed in section 2.6.7.
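The difference between (2.13) and (2.14) can be observed on a toy cipher. In the sketch below the block cipher B is an assumed 4-round Feistel construction over SHA-256, used only because it is invertible and needs no external library; it is not AES and offers no security guarantee. Messages are assumed to be already padded.

```python
import hashlib

BLOCK = 16  # block size in bytes (b = 128 bits)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(half: bytes, key: bytes, rnd: int) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]


def E(key: bytes, block: bytes) -> bytes:
    """Toy block cipher encryption (assumed stand-in for E^B_k)."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(right, key, rnd))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    """Toy block cipher decryption (assumed stand-in for D^B_k)."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(left, key, rnd)), left
    return left + right


def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key, p):
    return b"".join(E(key, b) for b in blocks(p))


def ecb_decrypt(key, c):
    return b"".join(D(key, b) for b in blocks(c))


def cbc_encrypt(key, iv, p):
    prev, out = iv, []
    for b in blocks(p):
        prev = E(key, _xor(prev, b))      # c_i = E(c_{i-1} xor p_i)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, c):
    prev, out = iv, []
    for b in blocks(c):
        out.append(_xor(prev, D(key, b)))  # p_i = c_{i-1} xor D(c_i)
        prev = b
    return b"".join(out)


key, iv = b"toy key", bytes(BLOCK)   # fixed IV only to keep the demo repeatable
message = b"A" * (2 * BLOCK)         # two identical plaintext blocks
ecb = ecb_encrypt(key, message)
cbc = cbc_encrypt(key, iv, message)
print(ecb[:BLOCK] == ecb[BLOCK:], cbc[:BLOCK] == cbc[BLOCK:])
```

With ECB, equal plaintext blocks yield equal ciphertext blocks, which follows directly from c_i = E(p_i); the chaining of CBC removes that regularity.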
2.2.2.3 CFB
CFB mode can be seen as a self-synchronising (asynchronous) stream cipher [13], since its keystream
depends on the previously produced ciphertext. Each plaintext block pi is encrypted by applying an XOR
operation with a keystream element ki, yielding the corresponding ciphertext block ci. Each element ki
of the keystream is generated according to
\[ k_i = E^B_k(c_{i-1}),\ \forall i \in \mathbb{N} \tag{2.15} \]
and for the first block (i = 1) one has c0 = IV . This procedure is illustrated in Figure A.3a and can be
interpreted as follows:
\[
\begin{aligned}
e^{CFB}_k(P) &= \big\Vert_{i=1}^{n} (p_i \oplus k_i) \\
d^{CFB}_k(C) &= \big\Vert_{i=1}^{n} (c_i \oplus k_i)
\end{aligned}
\tag{2.16}
\]
where ci = pi ⊕ ki for i = 1, . . . , n. As for the CFB decryption, it is important to note that the block
cipher's encryption E^B_k is used instead of the block cipher's decryption D^B_k.
Since the keystream generator function depends on the previous ciphertext block, this mode cannot
be parallelized for encryption.
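A minimal sketch of CFB follows. Since only the encryption direction of the block cipher is needed, E is replaced here by an assumed stand-in (a truncated SHA-256 of key and input), which is not a real block cipher; the point is solely the feedback structure.

```python
import hashlib

BLOCK = 16  # block size in bytes


def E(key: bytes, block: bytes) -> bytes:
    """Assumed stand-in for E^B_k (not a real block cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def _xor(a: bytes, b: bytes) -> bytes:
    # zip() truncates, so a short final block uses only part of the keystream
    return bytes(x ^ y for x, y in zip(a, b))


def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    for p in blocks(plaintext):
        c = _xor(p, E(key, prev))   # keystream element = E(previous ciphertext)
        out.append(c)
        prev = c                    # feedback: forces sequential encryption
    return b"".join(out)


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for c in blocks(ciphertext):
        out.append(_xor(c, E(key, prev)))  # E, not D, is used here as well
        prev = c
    return b"".join(out)


key, iv = b"toy key", b"\x01" * BLOCK
ct = cfb_encrypt(key, iv, b"a message that is not block aligned")
```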
2.2.2.4 CTR
The original modes of operation published as FIPS in 1981 did not include CTR mode. Only in
2001 was it added as a standard mode of operation, the same year in which AES was publicly announced
as the new standard block cipher.
The general idea behind this block cipher mode of operation (BCMO) is to handle a counter throughout
the operations. A b-bit value IV is chosen as the initial counter value and thereafter every counter
is computed based on the previous one. The block cipher is used to encrypt the counter block and
its output is XORed with the plaintext block, yielding the ciphertext block. This procedure
is illustrated in Figure A.4 and as one can easily observe, CTR mode can be seen as a synchronous
stream cipher [13].
More formally,
\[
\begin{aligned}
e^{CTR}_k(P, IV) &= \big\Vert_{i=1}^{n} (E^B_k(t_i) \oplus p_i), \quad \text{where } t_1 = IV \\
d^{CTR}_k(C, IV) &= \big\Vert_{i=1}^{n} (E^B_k(t_i) \oplus c_i), \quad \text{where } t_1 = IV
\end{aligned}
\tag{2.17}
\]
where E^B_k is the encryption procedure of the block cipher B, pi and ci are the ith plaintext and
ciphertext block, respectively, and ti is the ith counter block such that
\[ t_i = ctr(t_{i-1}) \tag{2.18} \]
where ctr : {0, 1}^b → {0, 1}^b is the counter function.
There are several possibilities for the behaviour of the aforementioned ctr function, but the NIST
recommendation [14] goes for the Standard Incrementing Function, which is given by
\[ ctr(x) = x|^{b-m} \,\|\, \big((x|_{m} + 1) \bmod 2^{m}\big) \tag{2.19} \]
where x is a b-bit word, x|_m represents the last m bits of x, x|^{b−m} represents the first b − m bits of x
and m ∈ N : m ≤ b is the counter length.
In contrast with CBC mode, the IV herein used does not need to be random, it just needs to be unique
for each encryption under the same key. In other words, CTR mode's security lies in the uniqueness of
the pair (ti, k) for all the encryptions performed. Therefore, there is an upper bound ul for the length (in
bits) of a message to be given as input to the CTR encryption scheme, which is given by
\[ u_l = b \times 2^{m} \tag{2.20} \]
because if the number of blocks exceeds the cardinality of the set of possible counters (n > 2^m), then
t_{2^m+i} = t_i, ∀i ≤ 2^m, which cannot happen, otherwise CTR's security becomes compromised⁶.
CTR mode is very suitable to be used in situations where the time complexity of the encryption
algorithm is of the essence, as it can be fully parallelized. The only pre-processing needed for this mode
is the computation of the counter blocks, which is done in O(n) time, where n is the length of the input
data, because there are n/b blocks, where b is the fixed block length, and each increment is done in time
O(1).
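The counter function (2.19), the bound (2.20) and the keystream structure (2.17) can be sketched together. As before, E is an assumed stand-in built from SHA-256 rather than a real block cipher, and the default counter length of 32 bits is an arbitrary choice for the example.

```python
import hashlib

BLOCK = 16  # block size in bytes (b = 128 bits)


def E(key: bytes, block: bytes) -> bytes:
    """Assumed stand-in for E^B_k (not a real block cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ctr_next(t: bytes, m_bits: int) -> bytes:
    """Standard incrementing function: only the last m bits are incremented."""
    x = int.from_bytes(t, "big")
    mask = (1 << m_bits) - 1
    x = (x & ~mask) | (((x & mask) + 1) & mask)
    return x.to_bytes(len(t), "big")


def max_message_bits(b_bits: int, m_bits: int) -> int:
    """Upper bound u_l = b * 2^m on the message length, in bits."""
    return b_bits * 2 ** m_bits


def ctr_crypt(key: bytes, iv: bytes, data: bytes, m_bits: int = 32) -> bytes:
    """Encryption and decryption are the same operation in CTR mode."""
    if 8 * len(data) > max_message_bits(8 * BLOCK, m_bits):
        raise ValueError("message too long: counter blocks would repeat")
    t, out = iv, []
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        out.append(bytes(x ^ y for x, y in zip(chunk, E(key, t))))
        t = ctr_next(t, m_bits)
    return b"".join(out)


key, iv = b"toy key", b"\x00" * 12 + b"\x00\x00\x00\x01"
ct = ctr_crypt(key, iv, b"counter mode needs no padding")
```

Because each keystream block depends only on the counter, the loop body could be evaluated for all blocks independently, which is the parallelism mentioned above.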
2.2.2.5 CCM
The block cipher modes of operation discussed so far provide secrecy to the data at hand. Notwith-
standing, there are modes which apart from secrecy, also provide authentication. CCM is one of those
modes and combines the CTR mode with the CBC-MAC mode, the former for secrecy purposes and the
latter for authentication.
CBC-MAC is very similar to CBC mode’s encryption (see Figure A.2a), with the exception that instead
of the algorithm returning C, it returns only the last block cn, i.e.
\[ \text{CBC-MAC}(x, k) = e^{CBC}_k(x)|_{b} \tag{2.21} \]
CCM [15] mode interleaves the authentication and confidentiality steps, taking as input a 3-tuple
(N,H,P ) such that N is a nonce (number used only once) intended to be used as the IV by the CTR
mode of operation, H is the header, which is data to be authenticated but not encrypted and P is the
plaintext that is going to be subject not only to authentication but also to encryption. The algorithm has
several pre-requisites, namely all the operations are done using the same key k and there is a formatting
function that takes as input a 3-tuple (N,H,P ) as above and returns a sequence of bitstring blocks.
There are some situations in which pieces of data may be of public knowledge and therefore there
is no need to encrypt them, as one would only be increasing the memory and computational overhead.
CCM provides a thorough solution to this potential problem, as it allows the authentication of
non-encrypted data without extending the length of the ciphertext.
⁶ Section 2.6.7 further discusses this topic.
2.2.2.6 GCM
Galois Counter Mode (GCM) [16] is another mode of operation that provides both authentication
and confidentiality. This combined mode makes use of an adapted version of CTR to encrypt the data,
while integrity and authenticity are granted by the Galois mode of authentication. The latter is known
as Galois Message Authentication Code (GMAC) and is based on a keyed hash function which, even
though it does not qualify as a cryptographic hash function (see section 2.3), is well suited for the job.
With this mode it is imperative that the pair (v, k) is never reused for any given input data, where v is
the IV and k is the key. The uniqueness requirement on the IVs is necessary to grant the system
immunity to malleability by the authentication mechanism.
2.2.2.7 Padding
For the ECB and CBC modes, the length of the plaintext must be a multiple of the block size, thus
one must⁷ pad the plaintext prior to encryption. There are several padding techniques used by distinct
types of algorithms.
Bit Padding
One of the most used padding techniques for BCMO is called bit padding and consists in appending
a bit 1 to the end of the plaintext and filling the remaining r = n × b − (m + 1) bit fields with the bit value
0 such that the length of the nth block is b, where n = ⌊m/b⌋ + 1.
PKCS7
PKCS7 padding [18] is another widely used padding technique and consists in checking how many
bytes remain until the end of the block (k = (n × b − m)/8, with n = ⌊m/b⌋ + 1) and padding the message
with k bytes, each valued k. Note that if m is already a multiple of the block size, the padding must be
performed either way, because the recipient of the message always expects a padded message; thus, a
new block must be added to the end of the plaintext. Since each padding byte stores the value k, k is
bounded by 255; hence this technique must not be used with block ciphers whose block size is 256 bytes
(2048 bits) or greater.
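Both padding techniques can be sketched on byte strings. The version of bit padding below assumes byte-aligned messages, so the single bit 1 followed by zeros becomes the byte 0x80 followed by zero bytes; the function names are hypothetical.

```python
def bit_pad(data: bytes, block: int) -> bytes:
    """Bit padding on byte-aligned data: 0x80, then zero bytes up to the block end."""
    pad_len = block - len(data) % block      # between 1 and block bytes
    return data + b"\x80" + b"\x00" * (pad_len - 1)


def bit_unpad(padded: bytes) -> bytes:
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid bit padding")
    return stripped[:-1]


def pkcs7_pad(data: bytes, block: int) -> bytes:
    """Append k bytes, each of value k, where k completes the last block."""
    if not 1 <= block <= 255:
        raise ValueError("block size must fit in one byte")
    k = block - len(data) % block            # a full block if already aligned
    return data + bytes([k]) * k


def pkcs7_unpad(padded: bytes, block: int) -> bytes:
    k = padded[-1]
    if not 1 <= k <= block or padded[-k:] != bytes([k]) * k:
        raise ValueError("invalid PKCS7 padding")
    return padded[:-k]


print(pkcs7_pad(b"abc", 8))
print(bit_pad(b"abc", 8))
```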
⁷ Although out of the scope of this text, there are methods to avoid the use of padding for ECB and CBC.
These are named ciphertext stealing methods [17] and allow the ciphertext to have exactly the same length
as the plaintext, while increasing the complexity of the algorithm.
Usually, by padding a message, one gains the advantage of hiding the true length of the plaintext.
However, if the padding is not executed properly, some vulnerabilities may arise and an adversary may
be able to successfully exploit them. In particular, when using a padding scheme, the plaintext may
become vulnerable to a Padding Oracle Attack (POA), which is explained in section 2.6.5. Therefore,
one should be careful whenever applying a padding algorithm.
2.2.3 Asymmetric Cryptography
There are clearly some issues related to symmetric cryptographic systems, since all the users
that can either encrypt or decrypt data must know the unique key k a priori. If it is physically impossible
for the users to share this information and if there is no secure channel to transmit the key, then they
cannot use a symmetric key cryptographic system in order to exchange confidential information. The
concept of asymmetric cryptography emerged to successfully work around this problem.
A public-key cryptographic system is based on a key-pair (kpub, kpriv), where kpub stands for the
public key and kpriv for the private key, the first being known publicly and the latter known only by the
user. Herein, there is a slightly distinct mode of operation when compared to a symmetric cryptosystem,
since the owner of the private key kpriv uses it only to decrypt any received information, which has
been encrypted with his public key kpub and could have been sent by anyone. For example, when
Alice wants to send Bob a message, she encrypts it with Bob's public key Bpub and then he decrypts it
with his private key Bpriv. Asymmetric cryptosystems rely upon the infeasibility of certain mathematical
problems.
The major shortcoming of asymmetric cryptographic systems is their high computational complexity,
when compared with symmetric cryptosystems.
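The exchange between Alice and Bob can be illustrated with textbook RSA. This scheme is an assumption made for the example only (the text above does not fix a particular cryptosystem), and the primes are far too small to offer any security; no padding is applied.

```python
# Textbook RSA with toy parameters (illustrative assumption, insecure).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (Python 3.8+)

bob_pub, bob_priv = (e, n), (d, n)


def encrypt(message: int, public_key) -> int:
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key) -> int:
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


# Alice only needs Bob's public key; only Bob can undo the operation.
c = encrypt(65, bob_pub)
print(c, decrypt(c, bob_priv))
```

Here the underlying hard problem is the factorisation of n, which is trivial for such small numbers and is what makes the example a toy.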
2.3 Cryptographic Hash Functions
A hash function outputs a fixed-length message on any given input of arbitrary length. Therefore,
it can be a very useful tool in modern cryptography, which has led many researchers to study the
properties of such functions. For a given hash function to be considered a cryptographic hash function
it must satisfy the following properties:
(i) Efficiency: The computation of the hash value must be fast.
(ii) One-Way Function: It is infeasible to invert.
(iii) Avalanche Effect: A small change in the input of the hash function produces a very distinct output.
(iv) Collision Resistance: It is very hard to find two distinct inputs with the same image.⁸
⁸ Note that the complexity of a birthday attack, O(2^{m/2}) for an m-bit message digest, is an upper bound
for the best collision resistance.
Several hashing algorithms were created and among the most popular are the Message Digest
(MD) and Secure Hash Algorithm (SHA) families. The most widely used hash function from the first
family is MD5, which was deprecated as soon as an attack was shown to succeed within a considerably
short time frame. As for the second family, the most widely used function is SHA-1 [19] and it is
considered to be insecure due to a successful collision attack [20] published by Google in February of
2017. Several theoretical attacks had already been found and thus this cryptographic hash function was
considered to be on the edge of failure, foreseeing that a collision would be found soon enough. Hence
the creation of SHA-2 [19] was mandatory, a version with four variants (SHA-224, SHA-256, SHA-384
and SHA-512) that extends the set of possible hashes to a point where the presently known collision
attacks become infeasible.
The most straightforward use one can give to a hash function is for integrity purposes, i.e., given a
message m and a cryptographic hash function h, the computation of h(m) yields an n-bit digest that can
be used to check the integrity of the message m. Due to the collision resistance property of cryptographic
hash functions it is most likely that h(m) ≠ h(m′) for any m ≠ m′ and cryptographic hash function h.
Thus, for any word y, if h(y) = h(m) one can consider with a high level of trust y to be equal to m. This
capability of providing integrity to the messages at hand is specified in Example 2.3.1.
Example 2.3.1 (Integrity). Let h be a cryptographic hash function of public knowledge and consider that
Alice sends Bob a message x along with its digest h(x). Bob receives the pair (m, d), where m is the
message and d the message digest and wants to verify whether m is in fact x, the message that was
sent by Alice. Thus he computes h(m) and accepts the message as valid if and only if h(m) = d.
The reader should not be misled, for the example above is only successful in an unrealistic situation.
Since h is of public knowledge, any adversary capable of interfering with the communications would
be able to change (m, d) for (m′, h(m′)), for some malicious message m′, and Bob would successfully
conclude the integrity verification without noticing that the message had been tampered with. The use
of cryptographic hash functions requires a deep understanding of the involved components in order to
satisfy the desired properties and strengthen the system at hand against malicious attacks. Most
modern-day cryptographic algorithms make use of cryptographic hash functions, as their properties provide
extremely advantageous behaviours to prevent eventual vulnerabilities.
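Example 2.3.1 and its weakness can be reproduced in a few lines. SHA-256 from the Python standard library is assumed as the public hash function h, and the messages are made up for the illustration.

```python
import hashlib


def h(message: bytes) -> bytes:
    """Public hash function; SHA-256 is assumed for the example."""
    return hashlib.sha256(message).digest()


def bob_accepts(m: bytes, d: bytes) -> bool:
    """Bob's check: accept the pair (m, d) if and only if h(m) = d."""
    return h(m) == d


x = b"pay 10 EUR to Bob"
sent = (x, h(x))

# Accidental or naive modification of the message alone is detected...
corrupted = (b"pay 90 EUR to Bob", sent[1])

# ...but an adversary who also recomputes the digest goes unnoticed.
forged_message = b"pay 90 EUR to Eve"
forged = (forged_message, h(forged_message))

print(bob_accepts(*sent), bob_accepts(*corrupted), bob_accepts(*forged))
```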
2.3.1 SHA-256
A member of the SHA family, SHA-256 is able to generate a 256-bit message digest of any message
with binary length b satisfying 0 ≤ b < 2^64, for the padding scheme associated to the algorithm's
construction requires b to be written as a 64-bit number. A high-level description of the steps of
the SHA-256 algorithm [19] follows, for an arbitrary b-bit input message x:
1. Pad and parse x into x1, . . . , xn;
2. Initialize i = 1 and set the 32-bit hash values h1, . . . , h8 to the fixed initial constants given in [19];
3. Build the message schedule mi based on xi;
4. Build working variables {vk}k∈I8 , each based on the value of hk;
5. Update the values of global variables using mi and vk, ∀k ∈ I8;
6. Update the hash values hk, ∀k ∈ I8, using the values of the variables obtained in the previous step;
7. Compute i = i + 1 and if i ≤ n then go to step 3, otherwise return the value h = h1 ‖ · · · ‖ h8.
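Step 1 of the list above can be sketched as follows for byte-aligned messages: a bit 1 is appended, then zeros, then the message length b as a 64-bit big-endian number, so that the total length is a multiple of 512 bits. The function names are hypothetical; the remaining steps are left to the standard library implementation.

```python
import hashlib


def sha256_pad(x: bytes) -> bytes:
    """Pad x to a multiple of 64 bytes (512 bits) as required by SHA-256."""
    bit_len = 8 * len(x)
    if bit_len >= 2 ** 64:
        raise ValueError("message too long for SHA-256")
    zeros = (55 - len(x)) % 64             # leaves room for 0x80 and 8 length bytes
    return x + b"\x80" + b"\x00" * zeros + bit_len.to_bytes(8, "big")


def parse(padded: bytes):
    """Split the padded message into the 512-bit blocks x1, ..., xn."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]


blocks_abc = parse(sha256_pad(b"abc"))
print(len(blocks_abc), hashlib.sha256(b"abc").hexdigest())
```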
2.3.2 HMAC
The use of cryptographic hash functions has been associated with authentication purposes since the
first definition of a hash-based message authentication code (HMAC) [21] in 1997.
Definition 2.3.1. Let m be a message, k an l-bit key and h a cryptographic hash function whose
compression function's block size is n bits. The following function f defines the HMAC-h:
\[ f(k, m) = h\big((k' \oplus opad) \,\|\, h((k' \oplus ipad) \,\|\, m)\big) \tag{2.22} \]
where ipad and opad are fixed strings and k′ is the resulting key such that for j = n − l:
\[
k' =
\begin{cases}
k \,\|\, 0^{j} & \text{if } l < n \\
h(k) & \text{if } l > n \\
k & \text{otherwise}
\end{cases}
\tag{2.23}
\]
In the case l > n, the digest h(k) is itself padded with zeros up to n bits.
HMAC grants both integrity and authenticity to the input data, but while the first follows trivially from
the use of a cryptographic hash function, the latter requires the key k to be shared solely between the
two involved parties. Clearly, if there are more than two parties with access to k, the recipient of the
messages will never be able to authenticate the sender.
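Definition 2.3.1 can be checked against the standard library. The sketch below assumes h = SHA-256, whose block size is n = 512 bits (64 bytes), and the usual constants ipad = 0x36 and opad = 0x5C repeated over the block; the key and message are arbitrary.

```python
import hashlib
import hmac


def hmac_sha256(k: bytes, m: bytes) -> bytes:
    """HMAC-h of Definition 2.3.1 with h = SHA-256 (assumed for the example)."""
    n = 64                                   # block size of h, in bytes
    if len(k) > n:
        k = hashlib.sha256(k).digest()       # case l > n
    k_prime = k.ljust(n, b"\x00")            # zero padding up to n bytes
    inner_key = bytes(b ^ 0x36 for b in k_prime)   # k' xor ipad
    outer_key = bytes(b ^ 0x5C for b in k_prime)   # k' xor opad
    inner = hashlib.sha256(inner_key + m).digest()
    return hashlib.sha256(outer_key + inner).digest()


tag = hmac_sha256(b"shared key", b"message")
# The receiver recomputes the tag and compares in constant time.
print(hmac.compare_digest(tag, hmac.new(b"shared key", b"message", hashlib.sha256).digest()))
```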
2.4 Randomness
This section introduces the randomness concept and some useful definitions regarding this topic.
Randomness is a desired property for several algorithms, as it is a measurement of uniqueness and
unpredictability, very suitable for solving various problems in the field of cryptography. The highest
level of randomness is a theoretical ideal; the next best thing is extracted from physical phenomena,
for instance the movement of electrons. Throughout the years, researchers have been trying to develop
software algorithms that behave like a true random generator, but to no avail: true randomness is a
property still out of reach of modern algorithms. This entails the well-known fact that hardware
randomness is better than software's.
Definition 2.4.1. A random number generator (RNG) is an algorithm that generates an unpredictable⁹
sequence of values, i.e., if one uses a RNG to generate a sequence a1 · · · an for ai ∈ Σ, then a third
party cannot guess ai with probability non-negligibly greater than 1/|Σ|.
⁹ Infeasible to be computed by a polynomial time algorithm.
When the hardware at hand lacks a RNG and one needs to implement a random behaviour in a
certain algorithm, the only solution is to implement a RNG based on the entropy generated by features
available in software. The problem is that there is no such mechanism providing true randomness: in
computer programming, the random number generators are pseudo-random number generators (PRNGs),
since the stream of values produced by these algorithms is only seemingly non-deterministic: the
whole process requires an input value, called seed, which makes the algorithm deterministic. So,
whenever using a PRNG it is required that an adversary cannot feasibly obtain the seed used, which
means that not all PRNGs are suitable to be used in cryptographic algorithms. In order to make use of a
PRNG for cryptographic primitives, it must satisfy two very important properties:
1. Given an initial state of a sequence of numbers generated by the PRNG, say the first k bits of
the sequence, it is infeasible to compute the (k + 1)th bit with probability of success non-negligibly
greater than 1/2 (see next-bit test [22]).
2. It is infeasible to reconstruct the stream of numbers generated by the PRNG based on a known
internal state of the algorithm.
A PRNG satisfying the above properties and therefore suitable for cryptographic applications is named
a cryptographically secure pseudo-random number generator (CSPRNG). It is, however, extremely
difficult to find a CSPRNG, since most PRNGs are either vulnerable to tailored statistical
attacks or leak information upon the unveiling of some internal state. Many cryptographic algorithms
are very sensitive with respect to predictability, meaning that a CSPRNG is usually used in steps where
randomness is of the essence, as for instance the generation of cryptographic keys or salts.
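The role of the seed, and the difference between a general-purpose PRNG and a source intended for cryptographic use, can be seen with the Python standard library. The sketch assumes that the `random` module stands for an ordinary PRNG and that the `secrets` module (backed by the operating system's generator) stands for a CSPRNG; the key and salt lengths are arbitrary.

```python
import random
import secrets

# A general-purpose PRNG is fully determined by its seed: whoever learns the
# seed can reproduce the whole stream.
stream_a = random.Random(1234)
stream_b = random.Random(1234)
same_stream = [stream_a.getrandbits(32) for _ in range(4)] == \
              [stream_b.getrandbits(32) for _ in range(4)]


def new_key(n_bytes: int = 32) -> bytes:
    """Key material drawn from the operating system's CSPRNG."""
    return secrets.token_bytes(n_bytes)


def new_salt(n_bytes: int = 16) -> bytes:
    return secrets.token_bytes(n_bytes)


key, salt = new_key(), new_salt()
print(same_stream, len(key), len(salt))
```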
There is yet another definition [23] that needs to be addressed in order for the reader to properly
understand the concepts described in section 2.4.1.
Definition 2.4.2 (PRF). A family of functions {Fk : X → Y }k∈{0,1}∗ is a pseudo-random function (PRF)
if, for a randomly chosen instance function Fk, its output is indistinguishable (for a polynomial-time
algorithm) from the output of a random function R : X → Y , where X and Y are the domain and range
sets of the functions of the family, respectively.
PRFs are applicable in a wide variety of solutions, as their output cannot feasibly be told apart from truly random values.
2.4.1 Key Derivation
Regardless of the level of security of an underlying cryptographic algorithm, if one is able to obtain
the secret key used in the process then the algorithm becomes unreliable, with the danger of compromising
all the data that has been and/or is to be processed by it. In some cryptosystems, however, it is possible
for a cryptographic key to be compromised without compromising any of the prior messages
encrypted under that cryptosystem. These systems are said to provide forward secrecy. Nevertheless,
one always wants to prevent adversaries from discovering the envisaged keys.
Throughout the years researchers have been using CSPRNGs to create and refine algorithms for the
generation of stronger cryptographic keys. These methods are called key derivation functions (KDFs)
and output an enhanced key for a given input secret. The increased resistance to attacks of the resulting
cryptographic keys makes them suitable for most real-world applications.
In 2000, RSA Laboratories published a specification [24] in which the PBKDF2 key derivation function
was introduced; it became quite popular and is one of the most widely used nowadays. The
password-based key derivation function 2 (PBKDF2) takes as input five parameters:
PBKDF2(PRF, pass, salt, iter, len) (2.24)
where PRF is a pseudo-random function, pass and salt are octet strings such that the former is the
secret password and the latter the cryptographic salt, both to be used in the inherent PRF; iter is an
integer value corresponding to the number of iterations of the PRF and len is the length, in octets, of
the envisaged output key. The number of iterations is directly related to the level of security of the
procedure. The steps that describe the PBKDF2 algorithm can be found in more detail in [25].
There are several known KDFs but, among the most secure of its kind, PBKDF2 is considered to be
the best suited for use in real-world applications, for it is the one with the best performance [26].
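The five parameters of (2.24) map directly onto the PBKDF2 implementation of the Python standard library. In the sketch below HMAC-SHA-256 is assumed as the PRF, and the password, iteration count and output length are arbitrary illustrative values, not recommendations taken from this text.

```python
import hashlib
import secrets


def derive_key(password: bytes, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """PBKDF2(PRF, pass, salt, iter, len) with PRF = HMAC-SHA-256 (assumed)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=length)


salt = secrets.token_bytes(16)            # fresh salt from the OS CSPRNG
key = derive_key(b"correct horse", salt, iterations=1_000)
print(len(key))
```

The salt is not secret and must be stored alongside the derived key's context, since the same (pass, salt, iter, len) tuple is needed to derive the key again.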
2.5 Communication Protocols in Wireless Networks
The Institute of Electrical and Electronics Engineers Standards Association (IEEE-SA) is an association
that develops standards for several technological fields, namely telecommunication and information
technology. They have been developing standards for over a century and among the published works is a
family of network protocols for parties trying to connect to a local area network (LAN) or wireless local
area network (WLAN), specified by the set S = {IEEE 802.1X : X is a unique identifier for the standard}.
The most relevant elements of S are going to be discussed, as one of them (WPA2) is considered the
most suitable protocol for wireless communication and is used nowadays throughout the world to provide
indirect access to the Internet to either personal or corporate devices without a cable connection.
2.5.1 WEP
The standard IEEE 802.11 ∈ S [27] contains the description of WEP [28], an algorithm intended to provide
data secrecy and integrity to wireless networks at a level of security equivalent to that of a wired
network. WEP was proved insecure mainly because its IV space is too small for busy networks, given that
k is usually fixed in practice10 (recall that a stream cipher is vulnerable to key reuse attacks). After
the break of WEP's security was published, automated tools were developed to recover the key used in
the algorithm, and nowadays WEP encryption can be broken in less than a minute.
10A personal computer is usually connected to a router acting as an AP and the password for the router is fixed, unless the user
changes it manually.
2.5.2 WPA/WPA2
IEEE 802.11i [29], an amendment to IEEE 802.11, was put into effect because of WEP's exploitable
flaws. The standard includes two new security protocols for communicating over a wireless channel,
WPA and WPA2, which were intended to replace WEP. Both provide authentication either by a PSK or
by an EAP, the latter requiring an authentication server. WPA's encryption process differs from WEP's
in such a way that it does not suffer from the same fragilities as its predecessor. Nevertheless, WPA
was deprecated in 2012 due to its vulnerability to a message integrity code (MIC) recovery attack
[30]. WPA was mainly created as an interim measure for hardware that could not support WPA2, the most
recent version of the protocol, which includes a CCM-AES-based encryption mode named CCMP [29] as a
replacement for the TKIP [27] encryption and grants data confidentiality, authentication and access
control.
The WPA2 communication protocol is composed of three main stages:
1. Initial authentication;
2. 4-way handshake;
3. Group-key handshake.
There is an entity called Authenticator whose role is to authenticate the parties that intend to join
the network. If WPA2-PSK is in effect then it solely communicates with the client, but for EAP mode it
acts as an intermediate point between each supplicant and the authentication server. This is the general
layout of wireless networks that make use of the WPA2 communication protocol and it is illustrated in
Figure 2.2.
Figure 2.2: Network Layout. [Diagram: Client, Authenticator, Server.]
As mentioned above, there are two distinct modes for the WPA2 protocol: WPA2-PSK and WPA2-EAP.
These two modes only differ in the first stage of the protocol, the initial authentication step, whose
objective is to derive the PMK, an intermediate key that is used in the 4-way handshake to derive the PTK
and GTK. These two unique-per-session keys contain sub-keys that are necessary for encrypting and
decrypting the data flow between the client and authenticator. The group-key handshake is used to
update the GTK so that the authenticator can securely distribute it to all the authenticated
clients in the network; this key is used by the clients to decrypt multicast or broadcast data sent by the
authenticator.
2.5.2.1 Initial Authentication
The PSK mode of the WPA2 protocol has the advantage of being faster than EAP because it does not
need to go through an initial authentication step. In fact, the authenticator has a pre-defined password
pass and an SSID ssid (usually the name of the network) and computes PBKDF2(HMAC-SHA-1, pass, ssid, 4096)
in order to build the PSK. The client must follow the same procedure, but in order to do so it must
possess the (private) pass and the (public) ssid. This option is usually chosen in personal networks where
each client trusts any other client that is able to successfully authenticate itself and connect to the
network. On the other hand, there may be situations where a client does not completely trust the other
clients that may connect to the network. Consider, for instance, a corporate wireless network in which
a router authenticates several employees who dislike each other; there must be a way to prevent each
of them from tampering with data that is not intended for them. EAP mode, illustrated in simplified form
in Figure 2.3, must be used in those situations, as its initial authentication step provides pairwise
authentication by deriving a PMK for each client.
Figure 2.3: Extensible Authentication Protocol (EAP). [Diagram: Client, Authenticator, Server. Messages:
Request; Response; Accept Client; Confirm acceptance; authentication protocol between Client and Server
(Client and Authenticator derive PMK).]
Independently of the chosen mode for WPA2, at the end of the initial authentication step both the
client and the authenticator possess the PMK, which is the PSK for the case of WPA2-PSK.
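The PSK computation can be sketched with Python's standard library as follows. The function name `wpa2_psk` and the sample passphrase and SSID are hypothetical; the 32-octet output length is the value used by IEEE 802.11i for the PSK and is stated here as an assumption, since the call above does not show the len parameter of (2.24).

```python
import hashlib


def wpa2_psk(passphrase: str, ssid: str) -> bytes:
    """PSK = PBKDF2(HMAC-SHA-1, pass, ssid, 4096, 32 octets)."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)


# Hypothetical network credentials, for illustration only.
psk_home = wpa2_psk("a long private passphrase", "HomeNetwork")
psk_other = wpa2_psk("a long private passphrase", "OtherNetwork")
```

Because the public ssid acts as the salt, the same passphrase yields unrelated keys on networks with different names, which prevents an attacker from reusing one precomputed table across networks.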
2.5.2.2 4-way Handshake
After the initial authentication step, the authenticator will confirm that the client possesses the correct
PMK by asking for the decryption of certain data. Moreover, the GTK is also transmitted to the client.
The 4-way handshake is depicted in Figure 2.4 and comprises the following steps:
1. Client and Authenticator each generate a nonce, nonce1 and nonce2, respectively;
2. Authenticator sends nonce2 to Client;
3. Client derives the PTK such that
\[ \mathrm{PTK} = \mathrm{PRF}(gen) \quad \text{and} \quad gen = \mathrm{PMK} \,\|\, nonce_1 \,\|\, nonce_2 \,\|\, \mathrm{MAC}^{1}_{\mathrm{ADDRESS}} \,\|\, \mathrm{MAC}^{2}_{\mathrm{ADDRESS}} \tag{2.25} \]
where PRF is a pseudo-random function and $\mathrm{MAC}^{1}_{\mathrm{ADDRESS}}$ and $\mathrm{MAC}^{2}_{\mathrm{ADDRESS}}$ are the MAC addresses
of the client and authenticator, respectively.
4. Client sends nonce1 and a MIC of that nonce to Authenticator.
5. Authenticator derives the PTK and generates the GTK.
6. Authenticator encrypts the GTK with PTK (eGTK) and computes a MIC of the encrypted GTK (mGTK)
and sends the pair (eGTK,mGTK) to Client.
7. Client decrypts eGTK and sends an acknowledgement to Authenticator consisting of a MIC of the
decrypted GTK.
Figure 2.4: WPA2 four-way handshake. [Diagram: Client generates nonce1 and Authenticator generates nonce2;
Authenticator sends nonce2; Client derives PTK and sends (nonce1, MIC); Authenticator derives PTK,
generates GTK and sends (eGTK, mGTK); Client decrypts eGTK and sends an acknowledgement.]
After performing the four-way handshake, both client and authenticator have the PTK and GTK with-
out ever disclosing these two keys and each of them knows that the other party also possesses the
same keys.
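The seven steps above can be simulated in a few lines of Python. This is only a sketch of the simplified description in this section: the PRF is replaced by HMAC-SHA-256 in counter mode, the MIC by a truncated HMAC and the GTK encryption by a toy XOR wrap. All of these, together with the function names, are assumptions made for the illustration and are not the constructions mandated by IEEE 802.11i.

```python
import hashlib
import hmac
import os


def derive_ptk(pmk, nonce1, nonce2, mac1, mac2, length=48):
    """Stand-in for PTK = PRF(PMK || nonce1 || nonce2 || MAC1 || MAC2)."""
    material = nonce1 + nonce2 + mac1 + mac2
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(pmk, bytes([counter]) + material, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def mic(ptk, data):
    """Message integrity code keyed with the first 16 octets of the PTK."""
    return hmac.new(ptk[:16], data, hashlib.sha256).digest()[:16]


def xor_wrap(ptk, data):
    """Toy key wrap (its own inverse): XOR with a pad derived from the PTK."""
    pad = hashlib.sha256(b"wrap" + ptk[16:32]).digest()
    return bytes(d ^ p for d, p in zip(data, pad))


def four_way_handshake(pmk_client, pmk_auth, mac_client, mac_auth):
    """Return (PTK, client GTK, authenticator GTK), or None if a check fails."""
    nonce1, nonce2 = os.urandom(32), os.urandom(32)                    # step 1
    # Step 2: Authenticator -> Client: nonce2.
    ptk_client = derive_ptk(pmk_client, nonce1, nonce2, mac_client, mac_auth)  # step 3
    mic1 = mic(ptk_client, nonce1)                                     # step 4
    ptk_auth = derive_ptk(pmk_auth, nonce1, nonce2, mac_client, mac_auth)      # step 5
    if not hmac.compare_digest(mic1, mic(ptk_auth, nonce1)):
        return None  # the client does not hold the same PMK
    gtk = os.urandom(16)
    e_gtk = xor_wrap(ptk_auth, gtk)                                    # step 6
    m_gtk = mic(ptk_auth, e_gtk)
    if not hmac.compare_digest(m_gtk, mic(ptk_client, e_gtk)):         # step 7
        return None
    gtk_client = xor_wrap(ptk_client, e_gtk)
    ack = mic(ptk_client, gtk_client)
    if not hmac.compare_digest(ack, mic(ptk_auth, gtk)):
        return None
    return ptk_client, gtk_client, gtk


pmk = os.urandom(32)
result = four_way_handshake(pmk, pmk, b"\x02" * 6, b"\x04" * 6)
```

Neither the PMK nor the PTK is ever transmitted: only nonces, MICs and the wrapped GTK cross the channel, and a client holding a different PMK fails the first MIC check.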
2.5.2.3 Group-key Handshake
The GTK needs to be updated every time a client disconnects from the AP (authenticator) or upon
the expiry of a timer, as a security measure. The group-key handshake is a two-way handshake, as
depicted in Figure 2.5, and comprises the following steps:
1. Authenticator updates GTK;
2. Authenticator encrypts GTK with PTK (eGTK) and generates a MIC (mGTK);
3. Authenticator sends the pair (eGTK,mGTK) to Client;
4. Client decrypts eGTK and sends an acknowledgement message to Authenticator consisting of a MIC
of the decrypted GTK.
Figure 2.5: WPA2 group-key handshake. [Diagram: Authenticator encrypts GTK, generates MIC and sends
(eGTK, mGTK); Client verifies MIC, decrypts GTK and sends an acknowledgement reply.]
2.6 Known Attacks
Cryptographic systems are as trustworthy as their robustness to attacks: a cryptosystem that has
survived a large number of distinct attacks is considered to be reliable for practical use, while
systems that have not undergone such tests do not inspire the same confidence.
2.6.1 Brute Force and Dictionary Attacks
Given a certain cryptographic algorithm, a brute force attack consists in trying all possible inputs to
the algorithm and checking whether each input leads to a desired output. For example, a hacker might
try to break into a third party's personal computer by trying all possible passwords, one at a time.
A dictionary attack is a brute force attack that narrows the space of possible words by only
considering specific words, or subsets of those words, over some alphabet. Dictionary attacks
have proven to be very effective against some cryptographic systems, especially for password cracking,
and they are one of the main threats to passwords that are dictionary-based.
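The difference between the two attacks can be sketched as follows, assuming (for the sake of the example only) that the attacker holds the unsalted SHA-256 digest of the password; the function names and the word list are hypothetical.

```python
import hashlib
from itertools import product


def digest(word: str) -> str:
    return hashlib.sha256(word.encode()).hexdigest()


def search_space(alphabet: str, max_len: int) -> int:
    """Number of words of length 1..max_len over the alphabet."""
    return sum(len(alphabet) ** n for n in range(1, max_len + 1))


def brute_force(target: str, alphabet: str, max_len: int):
    """Try every word over the alphabet, shortest first."""
    for length in range(1, max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if digest(word) == target:
                return word
    return None


def dictionary_attack(target: str, words):
    """Try only the words of a given list."""
    for word in words:
        if digest(word) == target:
            return word
    return None


found_by_brute_force = brute_force(digest("cab"), "abc", 3)
found_by_dictionary = dictionary_attack(digest("dragon"), ["password", "dragon", "letmein"])
```

The brute force search grows as |Σ|^n with the word length n, while the dictionary attack costs only as many trials as there are words in the list, which is why long random passwords defeat the latter.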
Countermeasures
In order to prevent brute force attacks, the number of words that can be written with the alphabet Σ
must be as high as possible without exceeding the computational capability of the system at hand. As for
dictionary attacks, not only must the previous requirement be met, but the passwords or secret keys
must also be long enough and as random as possible.
2.6.2 Man In The Middle Attack
In a Man-in-the-middle (MiM) attack the adversary is able not only to eavesdrop on the communications,
but also to actively participate in the exchange of messages, in such a way that the other parties are
not aware of the adversary's true identity.
Example 2.6.1. Consider the scenario represented in Figure 2.6: Alice and Bob are
communicating through a non-secure channel C, which is being eavesdropped on by Eve, a third party
to whom none of the messages sent in the channel are addressed. Since C is not secure, Eve can
listen to the communication and may be able to impersonate either Alice or Bob (or even both of them,
in a worst case scenario). Alice may send a message M intended for Bob, but Eve is able to intercept
the message M, replace it with a malicious message M′ and send M′ to Bob, who has no idea that the
original message was M instead of M′.
A MiM attack can even be performed by an adversary who does not gain any additional information
on the ciphertext and whose sole purpose is to disrupt all communications by jamming the channel(s)
with junk data, thereby preventing the reception of any message by the targeted end user.
Nevertheless, most MiM attackers intend to extract information by acting as the other end party for each
of the communicating entities.
Figure 2.6: Man in the middle attack. Eve is able to intercept the message and/or jam the communication
channel at will. [Diagram: Alice sends M; Eve intercepts it and forwards M′ to Bob.]
Countermeasures
It is very hard to detect every intrusion in a wireless network, especially one where the elements of the
network are under restrictions of power consumption and activity extent. The most common procedure for
preventing an active MiM attack is to always verify the integrity and authenticity of each received message.
Even though there are intrusion detection systems that raise alerts on such undesired interference in
wireless networks and methods that grant extra layers of security, such as VPN connections, they are
out of the scope of this text, since their usage is not suitable to the discussed problem due to constraints
imposed on some parties.
2.6.3 Birthday Attack
Categorized as a collision attack, a birthday attack is a brute force attack in which the attacker
uses a probabilistic insight that reduces the number of outputs that must be computed, for the same
digest length, before a collision is found, making it more efficient than a simple brute force search.
Problem 2.6.1 (Birthday Problem). Given a room with n people, what is the probability that k of those
people have the same birthday?
Let $P_k(n, d)$ be the probability that answers the Birthday Problem, where d is
the number of possible values for each element, i.e., d = 365 for this specific case. Birthday attacks on
cryptographic hash functions are based on Problem 2.6.1 with k = 2. In fact, the probability that at least
two out of n people have the same birthday is given by
\[ P_2(n, 365) = 1 - \frac{365!}{(365 - n)! \times 365^{n}} \]
which follows from counting the assignments in which all n birthdays are distinct.
Consider an arbitrary hash function $h : X \to Y$ such that $\forall y \in Y : |y|_2 = b$, where $|y|_2$ is the number
of bits of y. One can adapt the Birthday Problem to ask the following question: "Given n randomly
chosen inputs $\{x_1, \ldots, x_n\} =: X_n$ such that $x_i \in X$ for all $1 \le i \le n$, what is the probability that
$h(x_{i_1}) = h(x_{i_2})$ for some distinct $x_{i_1}, x_{i_2} \in X_n$?". The answer to this question is simply the
value of $P_2(n, 2^b)$ and from [7, 31] one can conclude that
\[ P_2(n, 2^{b}) \approx 1 - e^{-n^{2}/2^{b+1}} \]
which is the probability of finding a collision for a cryptographic hash function whose outputs are b-bit words.
Let $m_b$ be the number of outputs of h that must be computed so that $P_2(m_b, 2^b) \ge 0.5$. Setting the
approximation above equal to 0.5 gives $m_b \approx \sqrt{2 \ln 2} \cdot 2^{b/2} \approx 1.18 \cdot 2^{b/2}$, i.e., $m_b \approx 2^{b/2}$ up to a small
constant factor. This value represents the number of outputs of h to be computed before a collision is
expected to occur and is usually referred to as the birthday bound.
This probabilistic approach entails a reduction of every cryptographic hash function's bit security
from b to b/2 bits with respect to collisions, and in order to prevent these types of attacks one has to
ensure that it is computationally infeasible to compute $2^{b/2}$ distinct outputs, for hash functions that
return b-bit digests.
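The quantities discussed in this section can be checked numerically. The sketch below computes the exact value of P2(n, d), the exponential approximation and the resulting birthday bound; the function names are illustrative choices.

```python
import math


def collision_probability(n: int, d: int) -> float:
    """Exact P2(n, d): at least two of n uniform samples over d values coincide."""
    if n > d:
        return 1.0
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (d - i) / d
    return 1.0 - p_all_distinct


def collision_probability_approx(n: int, d: int) -> float:
    """Approximation 1 - exp(-n^2 / (2d)); for d = 2^b this is 1 - exp(-n^2 / 2^(b+1))."""
    return 1.0 - math.exp(-n * n / (2 * d))


def birthday_bound(bits: int) -> float:
    """Number of b-bit digests after which a collision has probability about 0.5."""
    return math.sqrt(2 * math.log(2)) * 2 ** (bits / 2)


p_23_people = collision_probability(23, 365)
outputs_for_128_bit_hash = birthday_bound(128)
```

With 23 people the collision probability already exceeds one half, and a 128-bit digest offers only about 64 bits of collision resistance.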
2.6.4 Replay Attack
A replay attack is a special case of a MiM attack. Here an adversary eavesdrops on and saves a
transmitted message, or part of a message, and later re-transmits it, either in another protocol or in
another run of the same communication protocol that was eavesdropped, in order to gain additional
information. This attack is based on the re-transmission of data.
Countermeasures
There are several possible procedures to prevent a system from being vulnerable to a replay attack.
In general, for a cryptographic system to be resistant to this type of attack, each communication
session must be uniquely identified, which can be achieved by attaching a session identifier to each
message.
Another method for preventing replay attacks relies on timestamps. Suppose that Bob has a clock
whose current time t he periodically broadcasts, together with a message authentication code
(MAC) for authentication purposes. Whenever Alice wants to send Bob a message x, she encrypts x
into y with some cipher and then generates a guess $timestamp_{t'}$ of Bob's real current time t′, based on t,
which is then authenticated with a MAC. Upon receiving the whole package at time t′′, Bob only accepts
the message for further checking of its authenticity and integrity if $t'' - timestamp_{t'} < \varepsilon$, for some ε > 0.
Note that if Eve wants to replay a message she is able to do it for as long as $t'' - timestamp_{t'} < \varepsilon$, i.e.,
this procedure is not completely reliable for replay attack protection: if either ε is not small
enough or an attacker can replay quickly enough regardless of the value of ε, then the cryptographic
scheme is compromised.
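A minimal sketch of Bob's acceptance rule is given below. The packet layout (an 8-octet timestamp, the ciphertext and an HMAC-SHA-256 tag), the function names and the sample values are assumptions made for the example; the acceptance condition is the one stated above.

```python
import hashlib
import hmac
import struct


def make_packet(key: bytes, ciphertext: bytes, timestamp: float) -> bytes:
    """Alice: attach her timestamp guess and authenticate everything with a MAC."""
    header = struct.pack(">d", timestamp)
    tag = hmac.new(key, header + ciphertext, hashlib.sha256).digest()
    return header + ciphertext + tag


def accept(key: bytes, packet: bytes, now: float, epsilon: float) -> bool:
    """Bob: check the MAC, then accept only if now - timestamp < epsilon."""
    header, body, tag = packet[:8], packet[8:-32], packet[-32:]
    expected = hmac.new(key, header + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    (timestamp,) = struct.unpack(">d", header)
    return now - timestamp < epsilon


shared_key = b"k" * 32
packet = make_packet(shared_key, b"ciphertext y", timestamp=100.0)
first_delivery = accept(shared_key, packet, now=100.5, epsilon=1.0)
quick_replay = accept(shared_key, packet, now=100.9, epsilon=1.0)
late_replay = accept(shared_key, packet, now=102.0, epsilon=1.0)
```

The quick replay is still accepted because it arrives inside the window ε, which is precisely the limitation pointed out above; only the late replay is rejected.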
2.6.5 Padding Oracle Attack
To perform a padding oracle attack one must have access to a padding oracle11 O. Among the
BCMOs discussed in section 2.2.2, the modes that require padding (ECB and CBC) are vulnerable to a
padding oracle attack, but while this attack does not completely break ECB, for CBC it is lethal due to its
encryption and decryption mechanisms. Nevertheless, note that if padding is used with CFB, OFB
or CTR, then these encryption schemes also become vulnerable to this type of attack, provided
that no authentication layer is associated with them.
A padding oracle attack on CBC encryption is exemplified and discussed throughout
this section. Consider the following situation:
• Alice (A) and Bob (B) want to communicate through a communication channel and share a common
secret s, which they decide to use as the key for a block cipher of their choice, whose block
size is b, together with CBC mode to provide secrecy to their messages.
• A and B agree on padding the messages with padding as in [32].
• Victor (V), an adversary, has access to the communication channel and is able to perform a MiM
attack. He also has access to a padding oracle that answers whether an encrypted message y is
correctly padded.
11A padding oracle is a system that gives a binary answer to the question: "Is this message properly padded?".
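Consider that A encrypts a message x and sends the encrypted result y, along with the IV, through
the communication channel, where it is intercepted by V. Assume w.l.o.g. that n = 2, i.e., $y = y_1 \| y_2$,
where
\[ y_i = \mathop{\Big\Vert}_{j=1}^{b} y_{ij}, \ \forall i \in I_2 \qquad \text{and} \qquad O(y) = \text{true}. \tag{2.26} \]
Recall CBC decryption:
\[ d^{CBC}_s(C) = \mathop{\Big\Vert}_{i=1}^{n} P_i \]
where $P_i = d^b_s(C_i) \oplus C_{i-1}$ and $C_0 = IV$. In what follows the intercepted ciphertext y is written
as $C = C_1 \| C_2$ and the corresponding plaintext as $P = P_1 \| P_2$.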
V now decides to change the last byte of $C_1$ in the following way: $C^*_{1b} = C_{1b} \oplus z_b \oplus \mathtt{0x01}$, where
$z_b$ is a guess for the last byte of $P_2$ and 0xh is the hexadecimal representation of a byte, with
h ∈ {00, 01, . . . , FE, FF}. After having replaced $C_{1b}$ with $C^*_{1b}$, V has $C^* = C^*_1 \| C_2$, where
$C^*_1 = C_{11} \| \ldots \| C_{1(b-1)} \| C^*_{1b}$. Then V makes use of the oracle in order to know whether $C^*$ is properly
padded by calling $O(C^*)$. Note that
\[ d^b_s(C^*_1) \oplus IV = P^*_{11} \| \ldots \| P^*_{1b} \neq P_1 \tag{2.27} \]
due to the avalanche effect of the block cipher; on the other hand,
\[ d^b_s(C_2) \oplus C^*_1 = P_{21} \| \ldots \| P_{2(b-1)} \| P^*_{2b} \tag{2.28} \]
meaning that only the last byte of $P_2$ is changed while the remaining b − 1 bytes have not been altered.
Let $d_b$ correspond to the last byte of $d^b_s(C_2)$. Since $P^*_{2b} = C_{1b} \oplus z_b \oplus \mathtt{0x01} \oplus d_b$ and $C_{1b} \oplus d_b = P_{2b}$, then
$P^*_{2b} = P_{2b} \oplus z_b \oplus \mathtt{0x01}$ by the associativity and commutativity of the XOR operation.
At this point, there are two possible cases:
1. P was not padded prior to encryption.
2. P was padded prior to encryption.
In case 1, the conclusion is straightforward:
(a) $(z_b = P_{2b}) \Rightarrow (P^*_{2b} = \mathtt{0x01})$, which corresponds to correct padding for PKCS7 and in this case
$O(C^*) = \text{true}$. This means that V has found the last byte of $P_2$.
(b) $(z_b \neq P_{2b}) \Rightarrow (P^*_{2b} \neq \mathtt{0x01})$, meaning that $O(C^*) = \text{false}$, so V chooses a fresh12 guess $z_b$ and
repeats the process.
Thus $O(C^*) = \text{true}$ if and only if $z_b = P_{2b}$.
In case 2, situation (b) is still valid, but $O(C^*) = \text{true} \nRightarrow z_b = P_{2b}$, because now
$P_2 = x_1 \| \ldots \| x_r \| x'_1 \| \ldots \| x'_k$ where r + k = b, $k \in (\mathbb{Z}_{256} \setminus \{0\})$ and $x'_i$ represent the padding bytes
such that $x'_i = k, \forall i \in I_k$. For k > 1 there are two possible values of $z_b$ that entail an acceptance of
the modified word by the oracle, which are:
(2a) $(z_b = P_{2b}) \Rightarrow (P^*_{2b} = \mathtt{0x01})$;
(2b) $(z_b = t) \Rightarrow (P^*_{2b} = k)$, for some $t \in (\mathbb{Z}_{256} \setminus \{P_{2b}\})$.
Therefore, in order to tell which of these two acceptances reveals the true last byte of $P_2$, V
modifies the second to last byte of $C_1$ by flipping an arbitrary positive number of bits, which necessarily
yields a value distinct from the previous one. After doing so, V runs the previous procedure again over
the guesses $z_b$ until he finds $z_b \in \mathbb{Z}_{256}$ such that $O(C^*) = \text{true}$, in which case V is sure to have found
the value of $P_{2b}$.
After discovering the last byte $P_{2b}$, V proceeds to find the next (second to last) byte of $P_2$,
i.e., $P_{2(b-1)}$. The same arguments can be applied to this situation, as follows:
\[ \begin{aligned} C^*_{1b} &= C_{1b} \oplus P_{2b} \oplus \mathtt{0x02} \\ C^*_{1(b-1)} &= C_{1(b-1)} \oplus z_{(b-1)} \oplus \mathtt{0x02} \end{aligned} \tag{2.29} \]
thus $C^*_1 = C_{11} \| \ldots \| C_{1(b-2)} \| C^*_{1(b-1)} \| C^*_{1b}$ and V keeps making calls to the oracle $O(C^*)$ until it accepts
the input, in which case V has found the second to last byte of $P_2$.
Note that in this case $d^{CBC}_s(C^*) = P^*_1 \| P^*_2$ where $P^*_1 \neq P_1$ due to the avalanche effect but
$P^*_2 = P_{21} \| \ldots \| P_{2(b-2)} \| P^*_{2(b-1)} \| P'_{2b}$, where $P'_{2b} = \mathtt{0x02}$ is fixed independently of the guess $z_{(b-1)}$ and, on the
other hand, $P^*_{2(b-1)} = C_{1(b-1)} \oplus z_{(b-1)} \oplus \mathtt{0x02} \oplus d_{(b-1)}$, where $d_{(b-1)}$ is the second to last byte of $d^b_s(C_2)$.
Again, the same arguments 1 and 2 can be applied to $P^*_{2(b-1)}$ and the attacker only needs 511 attempts
in a worst case scenario in order to find the correct byte $P_{2(b-1)}$. Algorithm 2 contains the pseudocode
for the whole procedure and one can easily see that V is able to recover the whole plaintext P in time
O(mb), where m is the size of the ciphertext and b the block size. Since these values are usually not
large, this algorithm runs in efficient time. Lastly, note that the algorithm does not take into account case
2 but one can easily adapt it for this situation.
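The attack described in this section can be reproduced end to end with the following Python sketch. Since the standard library offers no block cipher, a 4-round Feistel network built from SHA-256 is used as a stand-in for the block cipher; this toy cipher, the 16-byte block size and all function names are assumptions made for the demonstration. The attack itself only talks to the oracle and handles case 2 by perturbing the second to last byte, as explained above.

```python
import hashlib
import os

BLOCK = 16  # block size b, in bytes


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, r, half):
    # Round function of the toy Feistel cipher (an assumption, not AES).
    return hashlib.sha256(key + bytes([r]) + half).digest()[:BLOCK // 2]


def encrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for r in range(4):
        left, right = right, _xor(left, _f(key, r, right))
    return left + right


def decrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for r in reversed(range(4)):
        left, right = _xor(right, _f(key, r, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    pad = BLOCK - len(plaintext) % BLOCK
    data = plaintext + bytes([pad]) * pad  # PKCS#7 padding
    out, prev = b"", iv
    for i in range(0, len(data), BLOCK):
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))
        out += prev
    return out


def make_oracle(key):
    """Padding oracle O: it knows the key but only answers 'properly padded?'."""
    def oracle(prev, block):
        plain = _xor(decrypt_block(key, block), prev)
        pad = plain[-1]
        return 1 <= pad <= BLOCK and plain.endswith(bytes([pad]) * pad)
    return oracle


def recover_block(oracle, prev, block):
    """Recover the plaintext of `block`, given the preceding ciphertext block."""
    plain = bytearray(BLOCK)
    for pad in range(1, BLOCK + 1):
        pos = BLOCK - pad
        forged = bytearray(prev)
        for k in range(pos + 1, BLOCK):
            forged[k] = prev[k] ^ plain[k] ^ pad  # known bytes now decrypt to `pad`
        for z in range(256):
            forged[pos] = prev[pos] ^ z ^ pad
            if not oracle(bytes(forged), block):
                continue
            if pad == 1:
                # Case 2: discard acceptances caused by a longer valid padding.
                probe = bytearray(forged)
                probe[pos - 1] ^= 0xFF
                if not oracle(bytes(probe), block):
                    continue
            plain[pos] = z
            break
    return bytes(plain)


def padding_oracle_attack(oracle, iv, ciphertext):
    blocks = [iv] + [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    padded = b"".join(recover_block(oracle, blocks[i - 1], blocks[i])
                      for i in range(1, len(blocks)))
    return padded[:-padded[-1]]  # strip the padding


key, iv = os.urandom(16), os.urandom(BLOCK)
secret = b"attack at dawn, bring snacks"
recovered = padding_oracle_attack(make_oracle(key), iv, cbc_encrypt(key, iv, secret))
```

The attacker never touches the key: at most a few hundred oracle calls per byte suffice, which illustrates why an authentication layer must reject modified ciphertexts before any padding check takes place.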
2.6.6 Stream Cipher Attacks
The underlying security of stream ciphers is based on their good usage, as an adversary can take
advantage if certain precautions are not taken. In general, stream ciphers are considered to be very
secure, provided that one does not reuse the key and runs an authenticity check on every encrypted
message.
12By fresh, one means a value in the set of possible values that has not yet been chosen.
Algorithm 2 Padding Oracle Attack on CBC encryption
procedure POA(C, O)    ▷ Discovering P without s; C_0 = IV
    Initialize zero valued array P with length nb bytes;
    for i = n to 1 do
        x ← 1;
        for j = b to 1 do
            C′ ← C_(i−1);    ▷ work on a copy of the previous block
            for k = j + 1 to b do
                C′_k ← C_(i−1)k ⊕ P_ik ⊕ x;
            end for
            z ← −1;
            A ← false;
            while A == false do
                z ← z + 1;
                C′_j ← C_(i−1)j ⊕ z ⊕ x;
                A ← Ask oracle O if C′ ‖ C_i is properly padded;
            end while
            P_ij ← z;
            x ← x + 1;
        end for
    end for
end procedure
Let S be a stream cipher with encryption and decryption functions ek and dk, for a given key k ∈ K
and assume that Eve is an adversary.
2.6.6.1 Key Reuse
Suppose that Eve is able to perform a MiM attack and let $m_1$ and $m_2$ be two messages such that
$m_1 \neq m_2$ and assume w.l.o.g. that $|m_1| = |m_2| = l$. Since S is a stream cipher, there is a keystream
generator function g which produces the keystream $k_1 k_2 \cdots$ based on some internal state. Let k′ be the
substring $k_1 \ldots k_l$, such that $y_1 = e_k(m_1) = m_1 \oplus k'$ and $y_2 = e_k(m_2) = m_2 \oplus k'$. Upon intercepting
$y_1$ and $y_2$, Eve is able to compute $y_1 \oplus y_2 = m_1 \oplus m_2$ due to the commutative and self-inverse properties
of the XOR operator.
Statistical analysis can now be applied to recover $m_1$ and $m_2$ with a high degree of confidence. Let
Σ be the alphabet at hand, let $m_3 := m_1 \oplus m_2 = m_{31} \ldots m_{3l}$ such that $m_{3i} \in \Sigma\ \forall\, 1 \le i \le l$, let $X_i$ be
a random variable representing the value of the ith element of an arbitrary plaintext x and consider
$P(X_i = m_{3i}) = p_i,\ \forall i \in \{1, \ldots, l\}$. The set
\[ C_i = \{(a, b) : a \oplus b = m_{3i} \wedge a, b \in \Sigma\} \tag{2.30} \]
contains the (possibly many) pairs whose XOR yields the intended ith element of $m_3$. Assuming that the
probability distribution of the alphabet elements is not uniform, a simple approach is to choose
the pair $(a, b) \in C_i$ that attains
\[ \max_{(a,b) \in C_i} P(X_i = a)\, P(X_i = b) \tag{2.31} \]
By applying this method, Eve is able to recover the plaintexts with high confidence without knowing
the secret key used in the stream cipher's encryption procedure. More complex probabilistic relations
may be required to increase the degree of confidence in the choice of the pairs (a, b) that satisfy
equation 2.31, for all 1 ≤ i ≤ l.
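The key reuse leak and the pair selection of (2.31) can be illustrated as follows. The keystream generator (SHA-256 over a counter), the messages and the toy frequency table are assumptions made for this sketch and do not correspond to any particular stream cipher.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream generator g: SHA-256 over key || counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_encrypt(key: bytes, message: bytes) -> bytes:
    return xor(message, keystream(key, len(message)))


def most_likely_pair(xor_value: int, frequency: dict):
    """Pick (a, b) with a XOR b == xor_value maximizing P(a) * P(b), as in (2.31)."""
    candidates = [(a, a ^ xor_value) for a in frequency if (a ^ xor_value) in frequency]
    return max(candidates, key=lambda pair: frequency[pair[0]] * frequency[pair[1]])


key = b"reused key"
m1, m2 = b"transfer 100 EUR", b"meet me at 9 p.m"
y1, y2 = stream_encrypt(key, m1), stream_encrypt(key, m2)
leak = xor(y1, y2)  # equals m1 XOR m2: the keystream cancels out

# Hypothetical symbol probabilities for a tiny alphabet.
frequency = {ord("e"): 0.5, ord(" "): 0.3, ord("z"): 0.01, ord("?"): 0.01}
pair = most_likely_pair(ord("e") ^ ord(" "), frequency)
```

Once the keystream cancels, any known or guessed fragment of one message reveals the corresponding fragment of the other, since `xor(leak, m1)` returns `m2`.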
Countermeasures
The only countermeasure for this situation is to never use a key more than once. For ciphers that
include an IV as part of their input, the pair (IV, k) can be seen as the general key to the cryptosystem
and the key k may be used more than once, as long as the initialization vector IV does not repeat, which
is done in practice by randomly choosing an IV out of the set of possible IVs. Given the generally high
cardinality of the latter, CSPRNGs are the most common choice in order to maximize the underlying
algorithm’s performance.
2.6.6.2 Bit-flipping
Suppose that Alice and Bob communicate through a communication channel C on which Eve is
able to perform a MiM attack. Moreover, assume that Alice wants to send Bob a message m such that
$[m]_2 = m_1 \ldots m_n$ and that $\exists\, i, j \in \mathbb{N} : 1 \le i \le j \le n$ for which Eve knows $m_i \ldots m_j$. Alice encrypts m using
S and sends it to Bob through C. Upon intercepting the message $y = m \oplus k'$, where
k′ is the most significant n-bit substring of the keystream produced by the keystream generator
function, Eve computes
\[ y' := y \oplus (m_E \oplus v) \tag{2.32} \]
where $[m_E]_2 = 0^{i-1} m_i \ldots m_j 0^{n-j}$ and $[v]_2 = 0^{i-1} v_i \ldots v_j 0^{n-j}$, such that the bitstring $v_i \ldots v_j$ is an
evil bitstring chosen by Eve. After performing the operation in (2.32) she sends the resulting message
through C to Bob, who upon receiving y′ decrypts it as follows
\[ \begin{aligned} d_k(y') &= k' \oplus y' \\ &= k' \oplus (m \oplus k') \oplus (m_E \oplus v) \\ &= (m \oplus m_E) \oplus v \\ &= \bar{m} \oplus v \end{aligned} \tag{2.33} \]
where $[\bar{m}]_2 = m_1 \ldots m_{i-1} 0^{j-i+1} m_{j+1} \ldots m_n$, thus
\[ [\bar{m} \oplus v]_2 = m_1 \ldots m_{i-1} v_i \ldots v_j m_{j+1} \ldots m_n \tag{2.34} \]
Note that Eve does not know the secret key k shared between Alice and Bob, but her knowledge of
bits of the message m being sent makes it possible to alter them at will, without Bob noticing.
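Equation (2.32) can be exercised at byte granularity with the short sketch below. The toy keystream generator, the message and the function names are assumptions made for the example; only the XOR manipulation mirrors the text.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream generator: SHA-256 over key || counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_crypt(key: bytes, data: bytes) -> bytes:
    """Encryption and decryption coincide for a stream cipher."""
    return xor(data, keystream(key, len(data)))


def flip(ciphertext: bytes, offset: int, known: bytes, wanted: bytes) -> bytes:
    """Eve: y' = y XOR (m_E XOR v), with m_E the known part and v the chosen part."""
    mask = bytearray(len(ciphertext))
    mask[offset:offset + len(known)] = xor(known, wanted)
    return xor(ciphertext, bytes(mask))


key = b"secret shared by Alice and Bob"
y = stream_crypt(key, b"PAY 100 EUR TO BOB")
y_tampered = flip(y, offset=4, known=b"100", wanted=b"999")  # Eve never uses `key`
received = stream_crypt(key, y_tampered)
```

Bob decrypts the tampered ciphertext to a well-formed message with the amount replaced, and nothing in the decryption itself signals the change.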
Countermeasures
At first glance one might think that an integrity check would suffice to prevent this attack. However,
since Eve is assumed to know bits of the message being sent (possibly the whole message), producing a
simple unkeyed digest of the message is not enough: if she knows the entire message m, she can easily
compute h(m), and the digest of any modified message, for any cryptographic hash function h. Therefore
a keyed authentication tag is needed in this situation, so that Eve is not able to silently tamper with
the message without causing a mismatch between the tag and the corrupted message.
2.6.7 Weaknesses of Block Cipher Modes of Operation
Even if a block cipher is considered to be secure and there is seemingly no way to break the cryp-
tosystem by an analysis on the cipher itself, one can make use of that block cipher repeatedly in order
to encrypt or decrypt messages of size larger than the input block size and do so in such a way that
compromises the security of the whole system. This subsection discusses the advantages and disad-
vantages of some of the BCMOs.
ECB
ECB mode is considered to be the least secure block cipher mode of operation, as it is not semantically
secure. An adversary can indeed gain information about the plaintext based solely on the ciphertext,
since a given plaintext block is always encrypted to the same ciphertext block.
CBC
Recall the CBC block encryption function depicted in Figure A.2 and defined by equation 2.14 for an
n-block message $p = p_1 \ldots p_n$. Allowing the IV to be predicted by an adversary gives room to a feasible
chosen-plaintext attack on the cryptosystem at hand, in which the adversary can confirm guesses for the
blocks of any previously sent message and thus recover it.
Assume that Alice and Bob communicate through a channel C and that Eve is an adversary eavesdropping
on C with access to an encryption oracle O, such that $O(p) = e^{CBC}_k(p)$ for any b-bit block p.
Moreover, the oracle has an intrinsic IV generator function (equal to Alice's) that produces a
new initialization vector for each call. Now, Alice intends to send a word $m = m_1 \ldots m_n$ to Bob and
in order to do so, she computes an initialization vector IV and encrypts m as in equation 2.14, yielding
the encrypted message y. Then she sends the pair (y, IV) through C so that Bob is able to decrypt
the message. Consider that Eve is able to predict the next IV used by Alice (and therefore by the oracle
as well); upon intercepting (y, IV), she can recover m according to Algorithm 3 by applying n calls to
procedure PREDICT($y_i$, $y_{i-1}$), where $y = y_1 y_2 \ldots y_n$, $|y_i|_2 = b\ \forall i \in \mathbb{N} : 1 \le i \le n$, shown below:
procedure PREDICT(y_i, y_(i−1))    ▷ y_i, y_(i−1) are b-bit values
    Initialize a b-bit value y′;
    while y′ ≠ y_i do
        Predict the initialization vector used in the next encryption: IV_p;
        Guess a fresh b-bit value m′;
        Compute M := y_(i−1) ⊕ IV_p ⊕ m′;
        Call oracle: y′ ← O(M);
    end while
    return m′;
end procedure
The capability of Eve to predict the IV successfully is the key to the feasibility of the attack. Indeed,
suppose that it is highly probable that Eve is not able to correctly predict the IV, i.e., her guess $IV_p$ is such
that $P(IV_p \neq IV_{new}) \to 1$, where $IV_{new}$ is the new random IV generated by the oracle O. Note that for
any b-bit block x, the query O(x) returns $E^B_k(IV_{new} \oplus x)$. By taking a guess m′, Eve wants to compute
M such that $m' \oplus IV_{old} = IV_p \oplus M \Rightarrow M = m' \oplus IV_{old} \oplus IV_p$, where $IV_{old}$ is either the IV of
the pair (y, IV), when Eve is trying to find the first plaintext block, or the value of $y_{i-1}$, when Eve is
trying to find the ith plaintext block. Then, by calling the oracle with input M, the following holds:
\[ \begin{aligned} O(M) &= E^B_k(IV_{new} \oplus M) \\ &= E^B_k(IV_{new} \oplus m' \oplus IV_{old} \oplus IV_p) \end{aligned} \tag{2.35} \]
However, since $[IV_{new} \oplus IV_p]_2 \neq 0^b$, even if equation 2.35 yields the intended ciphertext
block $y_i$ one cannot conclude that m′ is the original plaintext block $m_i$, because it only implies that
$IV_{new} \oplus m' \oplus IV_p = m_i$.
On the other hand, if Eve is able to correctly predict the IV used in the next encryption, then
\[ \begin{aligned} O(M) &= E^B_k(IV_{new} \oplus M) \\ &= E^B_k(IV_{new} \oplus m' \oplus IV_{old} \oplus IV_p) \\ &= E^B_k(m' \oplus IV_{old}) \end{aligned} \tag{2.36} \]
where the last equality holds because $IV_p = IV_{new}$. Lastly, note that $E^B_k(m' \oplus IV_{old}) = y_i \Rightarrow m' = m_i$,
since $y_i = E^B_k(m_i \oplus IV_{old})$ and $E^B_k$ is injective.
Algorithm 3 CBC Predictable IV attack
procedure PREDICTATTACK(y, IV)    ▷ Discovering x : e^CBC_k(x, IV) = y
    Split y into n blocks y_1, . . . , y_n;
    Initialize empty bitstring x;
    for i = 1 to n do
        a ← PREDICT(y_i, y_(i−1));    ▷ y_0 = IV
        x ← x ‖ a;
    end for
    return x;
end procedure
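A runnable sketch of PREDICT and PREDICTATTACK is given below, under several assumptions that are not part of the text: the block cipher is a toy 4-round Feistel network built from SHA-256, the IV generator is a simple counter (hence trivially predictable) and Eve's guesses are drawn from a small hypothetical list of candidate blocks rather than from all b-bit values.

```python
import hashlib
import os

BLOCK = 16


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key, block):
    """Toy 4-round Feistel permutation standing in for the block cipher."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for r in range(4):
        f = hashlib.sha256(key + bytes([r]) + right).digest()[:BLOCK // 2]
        left, right = right, _xor(left, f)
    return left + right


class PredictableCbcOracle:
    """Encrypts one block in CBC mode with a counter IV (the flaw being exploited)."""

    def __init__(self, key):
        self._key = key
        self._counter = 0

    def next_iv(self):
        """The value an adversary can predict for the next call."""
        return (self._counter + 1).to_bytes(BLOCK, "big")

    def encrypt(self, block):
        self._counter += 1
        iv = self._counter.to_bytes(BLOCK, "big")
        return iv, encrypt_block(self._key, _xor(block, iv))


def predict(oracle, y_i, y_prev, candidates):
    for guess in candidates:
        iv_p = oracle.next_iv()                    # predicted IV of the next call
        crafted = _xor(_xor(y_prev, iv_p), guess)  # M = y_{i-1} XOR IV_p XOR m'
        _, y = oracle.encrypt(crafted)
        if y == y_i:
            return guess
    return None


def predict_attack(oracle, iv, blocks, candidates):
    recovered, prev = [], iv
    for y_i in blocks:
        recovered.append(predict(oracle, y_i, prev, candidates))
        prev = y_i
    return recovered


key = os.urandom(16)
oracle = PredictableCbcOracle(key)
candidates = [c.ljust(BLOCK) for c in (b"VOTE: ALICE", b"VOTE: BOB", b"VOTE: CAROL")]
message = [candidates[1], candidates[2]]
iv, y1 = oracle.encrypt(message[0])            # Alice encrypts the first block
y2 = encrypt_block(key, _xor(message[1], y1))  # second block, chained as in CBC
recovered = predict_attack(oracle, iv, [y1, y2], candidates)
```

The attack is practical whenever the set of plausible plaintext blocks is small, as in this example; choosing the IV unpredictably removes Eve's ability to cancel it out of the crafted query.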
Apart from IV predictability, CBC mode is also susceptible to POA: provided the absence of cipher-
text stealing methods, every message must be padded prior to encryption. Algorithm 2 describes the
procedure for attacking CBC given a padding oracle.
CTR
Let $t_i$, $p_i$ and $c_i$ be the ith counter block, plaintext block and ciphertext block, respectively. Due to
CTR's construction, changing the last byte of $c_i$ results in changing only the last byte of $p_i$, so if
padding is used the same padding oracle attack described for CBC can be applied here.
Let x be an m-bit message and y an n-bit message, with n < m. Then the following holds:
x⊕ y = x|n ⊕ y
where x|n represents x truncated to its first n bits. This observation makes it clear that there is no need
for padding messages that are encrypted via CTR and the resulting ciphertext will have exactly the same
length as the original plaintext, as the XOR operation is performed bitwise.
As already mentioned in section 2.2.2.4, the pair $(t_i, k)$ needs to be unique for all i ∈ N, otherwise
CTR mode's security is compromised. Consider the following scenario: using CTR mode with an arbitrary
block cipher B and the standard incrementing function, Alice encrypts two messages p and m (of the
same length, w.l.o.g.) such that the nonces chosen for each encryption, $nonce_1$ and $nonce_2$, satisfy
\[ nonce_1 + i = nonce_2 + j \tag{2.37} \]
for some i, j ≤ n, where n is the number of b-bit blocks, yielding the ciphertexts w and z such that
\[ \begin{aligned} w &= e^{CTR}_k(p, nonce_1) \\ z &= e^{CTR}_k(m, nonce_2) \end{aligned} \tag{2.38} \]
that are available to Eve. Given equality (2.37), the following holds
\[ E^B_k(nonce_1 + i) = E^B_k(nonce_2 + j) \tag{2.39} \]
meaning that Eve, who is in possession of the ciphertexts w and z, is able to compute
\[ \begin{aligned} w_i \oplus z_j &= (E^B_k(nonce_1 + i) \oplus p_i) \oplus (E^B_k(nonce_2 + j) \oplus m_j) \\ &= p_i \oplus m_j \end{aligned} \tag{2.40} \]
where the last equality follows from (2.39) and from the commutativity and self-inverse properties of
XOR. Now, a statistical analysis technique would be the most straightforward approach to find both $p_i$
and $m_j$.
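The counter overlap of (2.37) to (2.40) can be observed with the following sketch. CTR mode only needs the forward direction of the block cipher, so a truncated HMAC-SHA-256 is used here as a stand-in for $E^B_k$; this choice, the integer nonces, the zero-based block indices and the sample messages are assumptions made for the illustration.

```python
import hashlib
import hmac

BLOCK = 16


def block_function(key: bytes, counter: int) -> bytes:
    """Stand-in for E_k(counter): truncated HMAC-SHA-256 of the counter block."""
    return hmac.new(key, counter.to_bytes(BLOCK, "big"), hashlib.sha256).digest()[:BLOCK]


def ctr_encrypt(key: bytes, nonce: int, message: bytes) -> bytes:
    out = bytearray()
    for index in range(0, len(message), BLOCK):
        pad = block_function(key, nonce + index // BLOCK)
        out += bytes(m ^ k for m, k in zip(message[index:index + BLOCK], pad))
    return bytes(out)


def block(data: bytes, index: int) -> bytes:
    return data[index * BLOCK:(index + 1) * BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"k" * 16
p = bytes(range(64))       # 4 blocks
m = bytes(range(100, 164)) # 4 blocks
nonce1, nonce2 = 1000, 1002  # nonce1 + 2 == nonce2 + 0, so counters overlap
w = ctr_encrypt(key, nonce1, p)
z = ctr_encrypt(key, nonce2, m)
leak = xor(block(w, 2), block(z, 0))  # equals p_2 XOR m_0, with no key involved
short = ctr_encrypt(key, nonce1, b"no padding")
```

The last line also shows that the ciphertext has exactly the length of the plaintext, so no padding is needed in CTR mode.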
CFB
Generally speaking, CFB suffers from the same fragilities as CTR mode: the IV must be unique
and it may be susceptible to a POA. Furthermore, each block of CFB encryption depends on the previous
ciphertext block, hence there is no way of parallelizing the encryption process.
2.6.8 Side-Channel Attack
Whenever cryptographic systems are embedded within devices that are physically exposed in
such a way that third parties can extract information from their electromagnetic field, temperature, sound,
energy consumption, or any other variation of a physical quantity, one says that they are vulnerable to
side-channel attacks.
Countermeasures
The countermeasures for side-channel attacks can be categorized into two main activity clusters:
1. Prevent the leak of information;
2. Remove or smoothen the relation between secret data and environmental changes.
Both of these actions demand a considerable amount of resources, especially access to and knowledge of
the hardware development, hence it is usually not easy to prevent side-channel threats.
2.6.9 Attacks on AES
Since it was published as a standard in 2001, AES has been the target of continual break attempts. It remains unbroken in terms of direct security, i.e., there is no known efficient practical attack on the cipher itself. AES overcomes the weaknesses of DES that were exploited by differential cryptanalysis, but the development of integral cryptanalysis [33], which is based on sets of chosen plaintexts that share a common fixed part rather than on XOR differences, gave rise to the first attack on this robust standard other than the brute-force approach. The latter is simply infeasible for any of the possible key lengths. The first theoretical key recovery attack on AES [34] was published in 2011 and it was approximately four times faster than a brute-force attack. Even with the improvements made to this attack since then, it is not yet possible to implement these attacks efficiently due to their time complexity.
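To give a sense of scale for the infeasibility claim, the following back-of-the-envelope sketch estimates the time needed to enumerate every key. The trial rate of 10^12 keys per second is a purely hypothetical figure chosen for illustration, and the factor of four is the approximate speed-up quoted above.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
TRIAL_RATE = 10 ** 12  # hypothetical attacker speed in keys per second (an assumption)

def years_to_search(key_bits: int, speedup: float = 1.0) -> float:
    """Years needed to try all 2**key_bits keys at TRIAL_RATE, optionally divided by a speed-up."""
    return (2 ** key_bits) / speedup / TRIAL_RATE / SECONDS_PER_YEAR

for bits in (128, 192, 256):
    plain = years_to_search(bits)
    faster = years_to_search(bits, speedup=4.0)
    print(f"AES-{bits}: {plain:.2e} years, {faster:.2e} years with a 4x speed-up")
```

Under this assumed rate, a fourfold speed-up only removes two bits of effective key length, which is why such an attack remains theoretical.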
There are many possible ways to break a cipher, and some of the most devious attacks do not directly target the cipher but instead work around it and try to gain information leaked by the behaviour of external components. The type of attack that deviates the most from the cryptographic features of the encryption scheme that provides privacy is the side-channel attack. In 2016 a very efficient attack of this kind was created [35] that relies on aspects of the cache memory of the central processing unit (CPU) and can break AES in less than a minute. Notwithstanding, most modern-day CPUs are already resilient to this category of time-based side-channel attacks.
Chapter 3
Network
The present technological advancements entail an increasing complexity in computational security. Given a network whose elements are restricted in autonomy, capacity and connectivity, the problem arises of transmitting data with integrity, authenticity and non-repudiation using present-day cryptographic standards.
The goal of this chapter is to provide a topological solution together with a communication protocol for a specific network. The problem being studied is addressed in section 3.1, followed by a description of the components and of the restrictions imposed on the network in section 3.2. Then, some possible solutions for the network's topology are compared and a choice is made in section 3.3, and section 3.4 contains a step-by-step description of the communication protocol under the chosen option. Lastly, section 3.5 introduces some concepts for the global characterization of the network's inherent encryption and decryption mechanisms.
3.1 The Problem
The main purpose of this work is to develop an optimal solution for the topological and cryptographical components of a restricted network. Given the network requirements, which are constraints either on the network's elements and their connections or on the capabilities of the communication channels, one must choose the encryption and authentication schemes and analyze their level of security against state-of-the-art properties and definitions. The goal of these schemes is to provide the cryptographic properties that strengthen the resilience of the data stored in the network elements' memory against possible threats. The security layer of any of the protocols used for the transmission of data between any two network parties is also a relevant subject of study, for it determines the level of security of the communications.
Consider the scenario depicted in figure 3.1. Certain measurable elements from the environment are processed into digital data by a specific type of device, which stores the information after processing. Then, the data is to be transmitted to a secure database, where it is stored and used as required. The problem is to come up with a secure means of transmitting the data from the device to the database, given restrictions on the device's lifetime, autonomy, processing capacity and memory space. Thus, a network for the transmission of the data is to be constructed, which consists of distinct clusters such that each is composed of a fixed number of elements, the elements of the same cluster share a set of features, and the connections between clusters are restrained by some pre-defined rules. The parties' features range from the computational power and the available cryptographic algorithms and their respective keys to the mission assigned to the network element. More specifically, the purpose of the network is to gather real-time data and transmit it to a secure database while granting the collected evidence the properties of confidentiality, integrity, authenticity and non-repudiation.
Figure 3.1: General purpose and activity of the envisaged network. (Diagram labels: measurable element, detects activity; device, processes and stores information; data transmission; database, data storage.)
3.2 Details
Let the network be composed of three main components:
• Gathering devices (GDs): field-deployable parties that gather the raw data (with a maximum threshold of 256kB/s), process it and subsequently send it to an authorized party via an asynchronous channel. The length of each of the generated messages is a multiple of a defined minimum length l1 and is at most 256 octets. These elements are restricted with respect to memory (256kB RAM) and autonomy, as their energy source is a non-rechargeable battery, and they remain in the same geographic location throughout their lifetime.
• Middle-point party (MPP): a gateway party that is near the deployed gathering devices in order to wirelessly receive the data and/or send command messages. It may also possess a serial connectivity option for subsequently transmitting the sensitive data physically to an authorized party.
• Mission and data manager (MnDM): a device positioned at headquarters that receives the data from the middle-point party, makes the necessary verifications and stores it in a secure centralized database. It is also capable of generating and sending command messages, whose length is a multiple of a pre-defined minimum length l2.
Figure 3.2 summarizes the interactions between the abovementioned components. Note that the GDs do not communicate directly with the MnDM and vice-versa. The data flow from the GDs to the MPP and subsequently to the MnDM is called the Upstream Data Lifecycle (UDL), and the flow in the inverse direction is called the Downstream Data Lifecycle (DDL).
Figure 3.2: General layout of the desired network. (Diagram labels: Gathering Devices, data collection; Middle-Point Party, data transmission; Mission and Data Manager, data transmission; data storage; Upstream Data Lifecycle; Downstream Data Lifecycle.)
The description of the network entities introduces the term command message. These are messages with a pre-defined format whose contents are intended to give an instruction to another network party, and they can be generated by the MPP and the MnDM. It is important to note that the (binary) lengths of the messages generated by the GDs and of the command messages herein introduced are multiples of the values $l_1$ and $l_2$, respectively, for some $l_1, l_2 \in \mathbb{N} : l_1 \le 2048 \wedge l_2 \le 2048$. With respect to the gathering process, a message of length $L = m \times l$ is equivalent to $m$ messages of length $l$. That is, for every message $x$ and command message $m$
$$\exists_{L_1, L_2 \in \mathbb{N}} : (L_1 = m_1 l_1 \wedge L_2 = m_2 l_2) \wedge (|x| = L_1 \wedge |m| = L_2) \qquad (3.1)$$
for some $m_1, m_2 \in \mathbb{N} : m_i \le 2^{11}/l_i$, $\forall_{i \in I_2}$.
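The length constraint of expression (3.1) can be checked mechanically. The sketch below is illustrative only: the function name `is_valid_length` and the unit length of 128 bits are assumptions, and the multiplier is assumed to be at least 1 so that empty messages are rejected.

```python
MAX_BITS = 2 ** 11  # 2048 bits, i.e. 256 octets

def is_valid_length(length_bits: int, unit_bits: int) -> bool:
    """True if length_bits = m * unit_bits for an integer m with 1 <= m <= 2**11 / unit_bits."""
    if not 1 <= unit_bits <= MAX_BITS:
        return False
    m, remainder = divmod(length_bits, unit_bits)
    # m is an integer, so m <= 2**11 / unit_bits is the same as m <= floor(2**11 / unit_bits)
    return remainder == 0 and 1 <= m <= MAX_BITS // unit_bits

l1 = 128  # hypothetical minimum message length in bits

print(is_valid_length(2048, l1))  # 16 units of 128 bits, exactly the 256-octet maximum
print(is_valid_length(2176, l1))  # 17 units, exceeds the maximum
print(is_valid_length(200, l1))   # not a multiple of the unit length
```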
For secrecy, authentication and non-repudiation purposes some cryptographic algorithms are going to be used, which require cryptographic keys. Prior to deployment there must be a setup stage, in which the required keys are generated, transmitted to the envisaged target and stored in solid memory. These keys are generated inside secure headquarters, called the pre-mission system (PMS), which will be discussed further on. Figure 3.3 contains a simple diagram that depicts the whole step: at the PMS, a family of keys $K = (K_1, K_2, K_3)$ is generated such that $K_2$ and $K_3$ are the sets of keys transmitted to the MPP and the MnDM, respectively, and $K_1 = \bigcup_{j=1}^{n} K_1^j$, where $n$ is the total number of GDs and $K_1^j$ is the set of keys transmitted to $GD_j$, $\forall_{j \in \mathbb{N}} : 1 \le j \le n$. When the setup stage is concluded the devices meet the required constraints for the set up of the network.
Figure 3.3: Pre-deployment stage. (Diagram labels: the Pre-Mission System generates the key family $K$; $K_1^j$ goes to $GD_j$, $K_2$ to the MPP and $K_3$ to the MnDM; each party stores its keys; ready for deployment stage.)
The GDs, once deployed, can be in one of two states: active mode or sleep mode. When the devices are in active mode they keep on gathering evidence according to their data collecting schedule and send the processed data to the envisaged end party via an asynchronous communication channel [36] according to the data flow schedule. The sleep mode, in turn, is a low power consumption state in which the devices are not actively performing any activity other than periodically searching for a connection to an asynchronous communication channel. There must be an activity schedule on which the GDs' actions rely, and it can be one of the components affected by the command messages that either the MPP or the MnDM sends to the GDs.
Due to the restrictions imposed on the gathering devices, there must be a device within range that
generates the WLAN on which the devices share information. There are two options for this situation,
which are discussed in detail in the next section:
1. The network is generated by a field-deployed AP;
2. The network is generated by the MPP directly.
When in active mode, the gathering devices are intended to be collecting and internally storing relevant data around the clock, but the transmission of this information to another party does not need to be performed at the same time, meaning that the WLAN can be created periodically, and during that timespan the transmission of data should be prioritized. For this reason, the data stored in the solid memory of the gathering devices is expected to be encrypted. The GDs are connected to the MPP via Wi-Fi, which becomes a security vulnerability since the messages are transmitted as radio signals, hence an attacker may be able to eavesdrop on the communication channel and attempt to break the cryptosystem. For these reasons, WPA2-PSK has been chosen as the protocol for the communication between the GDs and the MPP, since WPA2 is the most robust option for Wi-Fi communication channels. The choice of PSK over EAP follows from two facts: firstly, the gathering devices do not need to hide information from one another and, secondly, the EAP mode of the WPA2 protocol requires more computations for the initial authentication step and therefore increases the energy consumption when compared with PSK. As for the Internet communication protocol between the MPP and the MnDM, it has been decided that the TCP protocol [37] should be used for the Transport Layer along with IP for the Internet Layer.
3.3 Network Topology
There are several possibilities to be considered for the network's topology, and their suitability depends on which purpose carries the most weight, which in turn renders the remaining options cumbersome. In the following, some proposals are presented and their practicality is discussed.
AP-based network
This approach considers that the gathering devices solely communicate with the access point. There
are two possibilities for this infrastructure mode, both represented in Figure 3.4.
Figure 3.4: Topology of AP-based networks. (a) Including a deployed router on the field. (b) Middle-point party performs the AP role. (Diagram labels in (a): access point, MPP, GD1, GD2, GD3, MnDM, DB, with links labelled Wi-Fi and Internet; in (b): MPP, GD1, GD2, GD3, MnDM, DB, with links labelled Wi-Fi and Internet.)
Proposal A: The first choice is represented in Figure 3.4a and considers the deployment of a fixed AP on the field, which would be able to maintain the WLAN in effect continuously. The main advantage of this option lies in the reduction of the gathering devices' energy consumption: the encryption and subsequent internal storing of the gathered data would be performed by the AP itself, i.e., the GD would only need to spend energy on the communication protocol and not on the storing mechanism. Notwithstanding, the AP would not only require a high amount of energy to be running, but would also be very difficult to hide due to its dimensions, which would allow an adversary to easily detect it on the field and eventually destroy it or try to crack the communications. These two very strong arguments lead to this possible solution being set aside, since the AP is required to be a centralized element of the network.
Proposal B: The second possible option is represented in Figure 3.4b for a total of 3 gathering
devices. The arrowed edge represents the uni-directional data flow between the MnDM and the DB,
whilst the simple edges represent a bi-directional data flow. It features the MPP as the access point of
the network. Since the MPP is considered to be a versatile element in the sense that it is not deployed
on the field, this option is considered to be very suitable for the features at hand.
So far, the topology presented in proposal A has been set aside, leaving B as the only viable option. Notwithstanding, another potentially highly reliable solution is going to be discussed.
Ad-hoc network
Proposal C: Considering the case where the gathering devices may communicate with one another, one is presented with an ad-hoc network (Figure 3.5). The dashed lines in that figure represent a possible communication, i.e., upon agreeing on a certain frequency for the communication channel, the devices can either broadcast a message or send it to a number of targets of their choice. An advantage of this option over the previously mentioned AP-based case is the network's scalability and self-management.
This layout is indeed a strong option for the topology of the network, since it may allow the GDs to never broadcast and therefore save energy. More explicitly, each GD can communicate with a finite chosen number of other GDs and/or the MPP. This would imply that the GD would have previously set up targets, allowing low energy consumption communications. However, the more possible connections there are, the more keys one needs to store in each GD, due to the required secrecy layer on the data stored within the gathering devices. Moreover, in the absence of broadcast, the end-to-end transmission in an ad-hoc network is usually slower, given that the message has to be transmitted from one party to the next until it arrives at the desired target. This drawback may have a relevant impact on the system at hand, not due to the time needed for the MPP to gather the data, but due to the energy spent by the GDs in the transmission process (recall that the GDs must save as much energy as possible for an extended autonomy on the field).
Figure 3.5: Topology of the ad-hoc network. (Diagram labels: MPP, GD1, GD2, GD3, MnDM, DB, with links labelled Wi-Fi and Internet.)
In this case, the use of Elliptic Curve Cryptography (ECC) would be very useful on the grounds that the memory savings (as a result of the smaller key sizes), the lower computational complexity of both the encryption and authentication processes and the reduction of the GDs' power consumption are all desirable features, given the restrictions at hand. However, as opposed to standardized cryptosystems, it is not yet common to find embedded systems hardware-programmed with ECC, which is why proposal B has been chosen over C in practice.
3.4 Protocol
This section describes the protocol that specifies the key generation stage and the data processing
that occurs on the network’s data lifecycle in both directions, i.e., UDL and DDL.
Recall that the pre-deployment stage follows Figure 3.3 and that the network's topology is as presented in Figure 3.4b. Let n be the number of gathering devices and GDi be the GD at hand for some fixed index i ∈ I. Moreover, consider the following notation:
• $(k^{GMP}_A)_i$: The 128-bit key shared between the $GD_i$ and the MPP, used by the AES cipher.
• $k^{MPM}_A$: The 128-bit key shared between the MPP and the MnDM, used by the AES cipher.
• $(k^{GMP}_H)_i$: The 256-bit key shared between the $GD_i$ and the MPP, used in the HMAC-SHA-256 algorithm.
• $(k^{GMD}_H)_i$: The 256-bit key shared between the $GD_i$ and the MnDM, used in the HMAC-SHA-256 algorithm.
• $k^{MPM}_H$: The 256-bit key shared between the MPP and the MnDM, used in the HMAC-SHA-256 algorithm.
These abbreviations are to be considered throughout the entire text.
3.4.1 Setup
The setup stage must take place in a secure location and be performed by trusted users, given that
herein all the necessary keys are generated and inserted into the corresponding targets. There are three
types of keys that need to be generated and distributed: keys for encryption, keys for authentication and
a single key for the Wi-Fi communication protocol.
The key generation protocol occurs at PMS and comprises the following steps:
1. The user inputs (n, pass), a tuple consisting of the number of gathering devices that are going to be deployed and a password, respectively;
2. The key generation algorithm is applied with input (n, pass) and outputs $K = (K_1, K_2, K_3)$ (as defined in section 3.2);
3. Export the keys to the envisaged devices according to the following distribution:
• $K_1^j = \{(k^{GMP}_A)_j, (k^{GMP}_H)_j, (k^{GMD}_H)_j\}$;
• $K_2 = \{k^{MPM}_A, k^{MPM}_H\} \cup \{(k^{GMP}_A)_j\}_{j \in I}$;
• $K_3 = K_1 \cup K_2$;
4. The keys in $K_1^j$ are inserted into $GD_j$, $\forall j \in \mathbb{N} : 1 \le j \le n$;
5. The keys in $K_2$ are inserted into the MPP;
6. The keys in $K_3$ are inserted into the MnDM.
After all the steps are concluded the devices are ready for the deployment stage, in which the GDs are distributed among the desired initial locations $l_0^i$, $\forall_{i \in I}$. Each device $GD_i$ will remain in $l_0^i$ for its entire lifetime without any key schedule algorithm to update the keys.
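The key distribution of the setup stage can be sketched as follows. This is a rough illustration under stated assumptions, not the key generation algorithm of the PMS: the keys are drawn from the operating system's random source, the password `pass` is not used because its role in the derivation is not specified here, and the function name and dictionary labels are hypothetical.

```python
import secrets

def generate_key_family(n: int):
    """Return (K1, K2, K3) for n gathering devices, mirroring the distribution of step 3."""
    # K1[j] holds the three keys inserted into GD_j, for j = 1..n
    K1 = {
        j: {
            "k_GMP_A": secrets.token_bytes(16),  # 128-bit AES key shared by GD_j and the MPP
            "k_GMP_H": secrets.token_bytes(32),  # 256-bit HMAC key shared by GD_j and the MPP
            "k_GMD_H": secrets.token_bytes(32),  # 256-bit HMAC key shared by GD_j and the MnDM
        }
        for j in range(1, n + 1)
    }
    # K2 holds the MPP-MnDM keys plus the per-device AES keys, as listed in step 3
    K2 = {
        "k_MPM_A": secrets.token_bytes(16),
        "k_MPM_H": secrets.token_bytes(32),
        "k_GMP_A": {j: K1[j]["k_GMP_A"] for j in K1},
    }
    # K3 is the union of K1 and K2, inserted into the MnDM
    K3 = {"K1": K1, "K2": K2}
    return K1, K2, K3

K1, K2, K3 = generate_key_family(3)
```

Keeping the per-device sets separate makes it easy to export only the keys in K1[j] to the j-th device.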
3.4.2 Communication Protocol
In this section, only the steps of the protocol are described. The utility of each of the components of
the ciphertexts is explained in chapter 4.
Subsequent to the setup stage, the GDs are deployed into the field of action and start gathering the
data to be sent to the MPP. This data is encrypted and saved in the GD solid memory, waiting to be
sent through the communication channel to the MPP, via Wi-Fi and using the WPA2-PSK protocol. The
communication protocol of the whole network encompasses the two directions of the data flow: UDL
and DDL.
Consider the devices to be already deployed in the field. Let $GD_i$ be one of the deployed gathering devices for some $i \in \{1, \ldots, n\}$, let $f_{id} \in \mathbb{Z}_{256}$ be a unique identifier of $GD_i$ stored in its solid memory and consider the following abbreviations:
• $e_1^{IV}$ ≡ Encryption mode AES-CTR-128 with key $(k^{GMP}_A)_i$ using $IV$ as the initialization vector.
• $d_1^{IV}$ ≡ Decryption mode AES-CTR-128 with key $(k^{GMP}_A)_i$ using $IV$ as the initialization vector.
• $e_2^{IV}$ ≡ Encryption mode AES-CTR-128 with key $k^{MPM}_A$ using $IV$ as the initialization vector.
• $d_2^{IV}$ ≡ Decryption mode AES-CTR-128 with key $k^{MPM}_A$ using $IV$ as the initialization vector.
• $h_1$ ≡ HMAC-SHA-256 with key $(k^{GMP}_H)_i$.
• $h_2$ ≡ HMAC-SHA-256 with key $(k^{GMD}_H)_i$.
• $h_3$ ≡ HMAC-SHA-256 with key $k^{MPM}_H$.
Upstream Data Lifecycle
The data is gathered by GDi, encrypted and stored in solid memory. Then, subject to the wireless
channel’s communication protocol, it is sent to the MPP where its integrity and authenticity are verified
and another layer of security is applied prior to being saved in the solid memory of the MPP. Lastly,
the package is sent from the MPP to the MnDM either via Internet or serial connection and if all the
verifications succeed at the MnDM then the plain data is sent to a secure database (DB).
The operations that are carried out in each device will now be listed.
Gathering Devices
1. $GD_i$ transforms the gathered analog raw data to digital data $D$ and assigns to it a 4-octet message identifier $m_{id}$;
2. Compute $h_1(D)$;
3. Compute $h_2(D)$;
4. Compute $\text{inner pack} := h_1(D) \| h_2(D) \| D$;
5. Generate a 16-octet initialization vector $IV_1$;
6. Perform an encryption: $e_1^{IV_1}(\text{inner pack})$;
7. Build the final package $\text{Pack}_1 := f_{id} \| m_{id} \| IV_1 \| e_1^{IV_1}(\text{inner pack})$;
8. Store $\text{Pack}_1$ in solid memory.
The MPP gets in range and starts listening to incoming requests. GDi attempts to connect to the WLAN
hosted by the MPP and as soon as the connection is established, the messages that the GDi had stored
in memory are sent, subject to the WPA2-PSK protocol.
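The packing performed by the gathering device in steps 1 to 8 can be sketched as follows. This is an illustration only: `stand_in_ctr` replaces AES-CTR-128 with a keystream derived from HMAC-SHA-256, since the Python standard library has no AES, and the function names, the use of `os.urandom` for the IV and the sample values are assumptions.

```python
import hashlib
import hmac
import os

def stand_in_ctr(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR-128 (NOT AES): XOR data with a keystream built from key, IV and counter."""
    start = int.from_bytes(iv, "big")
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        block_input = ((start + counter) % 2 ** 128).to_bytes(16, "big")
        keystream += hmac.new(key, block_input, hashlib.sha256).digest()[:16]
        counter += 1
    return bytes(d ^ s for d, s in zip(data, keystream))

def build_pack1(f_id: int, m_id: int, data: bytes, k_aes: bytes, k_h1: bytes, k_h2: bytes) -> bytes:
    h1 = hmac.new(k_h1, data, hashlib.sha256).digest()   # step 2
    h2 = hmac.new(k_h2, data, hashlib.sha256).digest()   # step 3
    inner_pack = h1 + h2 + data                          # step 4
    iv1 = os.urandom(16)                                 # step 5 (random IV is an assumption)
    encrypted = stand_in_ctr(k_aes, iv1, inner_pack)     # step 6
    # step 7: 1-octet device identifier, 4-octet message identifier, IV, ciphertext
    return f_id.to_bytes(1, "big") + m_id.to_bytes(4, "big") + iv1 + encrypted

# Hypothetical keys and reading
k_aes, k_h1, k_h2 = os.urandom(16), os.urandom(32), os.urandom(32)
data = b"sample reading"
pack1 = build_pack1(7, 42, data, k_aes, k_h1, k_h2)
```

Because CTR mode only XORs a keystream, applying `stand_in_ctr` again with the same key and IV recovers the inner pack, which is what the MPP does on reception.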
Middle-Point Party
9. $\text{Pack}_1$ is parsed into its main components: $f_{id}$, $m_{id}$, $IV_1$ and $e_1^{IV_1}(\text{inner pack})$;
10. Decrypt $e_1^{IV_1}(\text{inner pack})$, i.e., compute $d_1^{IV_1}(e_1^{IV_1}(\text{inner pack}))$ in order to obtain inner pack;
11. Parse inner pack into its main components: $h_1(D)$, $h_2(D)$ and $D$, such that $h_1(D) = \text{inner pack}|_{256}$, $h_2(D) = (\text{inner pack} \setminus \text{inner pack}|_{256})|_{256}$ and $D = \text{inner pack} \setminus \text{inner pack}|_{512}$;
12. Verify the message's integrity, i.e., compute $h_1(D)'$ and check whether $h_1(D)' = h_1(D)$, where $h_1(D)'$ is a new instance of the function $h_1$ applied to the data $D$ found in the decrypted package. If the verification is unsuccessful, then consider the message at hand as compromised and abort at this step by clearing all the memory associated with it;
13. Send the 32-bit message identifier $m_{id}$ back to $GD_i$ as an acknowledgement of the message at hand (the acknowledgement is subject to the WPA2-PSK protocol and thus is protected while travelling through the network; upon receiving it, $GD_i$ will trust that the message has been successfully delivered to the intended party);
14. Compute $\text{Pack}_2 := f_{id} \| IV_1 \| h_1(D) \| h_2(D) \| e_1^{IV_1}(\text{inner pack})$, where $h_1(D)$ and $h_2(D)$ are extracted from step 11;
15. Generate a 16-octet initialization vector $IV_2$;
16. Encrypt the previously built package: $e_2^{IV_2}(\text{Pack}_2)$;
17. Generate a digest of the encrypted data: $h_3(e_2^{IV_2}(\text{Pack}_2))$;
18. Build the final package $\text{Pack}_3 := h_3(e_2^{IV_2}(\text{Pack}_2)) \| IV_2 \| e_2^{IV_2}(\text{Pack}_2)$;
19. Store $\text{Pack}_3$ in solid memory.
After the data has been gathered the MPP closes the WLAN and connects to the MnDM via Internet,
transmitting all the recently stored data subject to the TCP/IP protocol.
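The parsing and verification carried out at the MPP (steps 11 and 12) and the tagging of the outgoing package (steps 17 and 18) can be sketched without any cipher, treating ciphertexts as opaque octet strings. The function names are hypothetical, and the constant-time comparison via `hmac.compare_digest` is a design choice of this sketch rather than something stated in the protocol.

```python
import hashlib
import hmac

TAG = 32  # length in octets of an HMAC-SHA-256 output (256 bits)

def split_inner_pack(inner_pack: bytes):
    """Step 11: h1(D) is the first 256 bits, h2(D) the next 256 bits, D the remainder."""
    if len(inner_pack) < 2 * TAG:
        raise ValueError("inner pack too short")
    return inner_pack[:TAG], inner_pack[TAG:2 * TAG], inner_pack[2 * TAG:]

def verify_h1(inner_pack: bytes, k_h1: bytes) -> bytes:
    """Step 12: recompute h1 over D and compare it with the received value; return D on success."""
    h1, _h2, data = split_inner_pack(inner_pack)
    expected = hmac.new(k_h1, data, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, h1):
        raise ValueError("message compromised")
    return data

def build_pack3(encrypted_pack2: bytes, iv2: bytes, k_h3: bytes) -> bytes:
    """Steps 17-18: compute h3 over the ciphertext, then prepend the digest and IV2."""
    tag = hmac.new(k_h3, encrypted_pack2, hashlib.sha256).digest()
    return tag + iv2 + encrypted_pack2
```

Note that h2(D) is carried along unverified at this point, since its key is shared between the GD and the MnDM.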
Mission and Data Manager
20. Parse $\text{Pack}_3$ into its main components: $h_3(e_2^{IV_2}(\text{Pack}_2))$, $IV_2$ and $e_2^{IV_2}(\text{Pack}_2)$, where
$h_3(e_2^{IV_2}(\text{Pack}_2)) = \text{Pack}_3|_{256}$;
$IV_2 = (\text{Pack}_3 \setminus \text{Pack}_3|_{256})|_{128}$;
$e_2^{IV_2}(\text{Pack}_2) = \text{Pack}_3 \setminus \text{Pack}_3|_{384}$;
21. Verify the integrity of the encrypted data by computing a new HMAC instance $h_3(e_2^{IV_2}(\text{Pack}_2))'$ and checking whether $h_3(e_2^{IV_2}(\text{Pack}_2))' = h_3(e_2^{IV_2}(\text{Pack}_2))$. If successful, proceed to the next step, otherwise consider the message as compromised and abort at this step.
22. Perform the decryption $d_2^{IV_2}(e_2^{IV_2}(\text{Pack}_2))$ in order to obtain $\text{Pack}_2$;
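23. Parse $\text{Pack}_2$ into its main components, following the field order and lengths of step 14 (8, 128, 256 and 256 bits):
$f_{id} = \text{Pack}_2|_{8}$;
$IV_1 = (\text{Pack}_2|_{136})|_{128}$;
$h_1(D) = (\text{Pack}_2|_{392})|_{256}$;
$h_2(D) = (\text{Pack}_2|_{648})|_{256}$;
$e_1^{IV_1}(\text{inner pack}) = \text{Pack}_2 \setminus \text{Pack}_2|_{648}$;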
24. Perform the decryption $d_1^{IV_1}(e_1^{IV_1}(\text{inner pack}))$ in order to obtain inner pack;
25. Parse inner pack into its main components: $h_1(D)^*$, $h_2(D)^*$ and $D^*$.
26. Check whether $h_i(D) = h_i(D)^*$, $\forall_{i \in I_2}$. If successful, then proceed to the integrity check in the next step. Otherwise, consider this message to be incorrect and abort the execution at this step;
27. Compute two new HMAC instances of the data $D$: $h_1(D)'$ and $h_2(D)'$ and check whether $h_i(D)' = h_i(D)$, $\forall_{i \in I_2}$. If successful, send $\text{Pack}_3$ to the DB, for it can be assumed with a high level of trust that the message has not been tampered with during the whole course. Otherwise, consider the message as compromised and abort the execution.
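The two parsing steps at the MnDM can be sketched with octet offsets, again treating the encrypted parts as opaque. The offsets for Pack2 are derived from the field order and sizes given in step 14 (1 octet for the identifier, 16 for the IV, 32 for each HMAC); the function names are hypothetical.

```python
import hashlib
import hmac

def parse_pack3(pack3: bytes, k_h3: bytes):
    """Steps 20-21: split digest | IV2 | ciphertext and verify the digest before any decryption."""
    tag, iv2, ciphertext = pack3[:32], pack3[32:48], pack3[48:]
    expected = hmac.new(k_h3, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("message compromised")
    return iv2, ciphertext

def parse_pack2(pack2: bytes):
    """Step 23: f_id (1 octet), IV1 (16 octets), h1(D) (32), h2(D) (32), then the encrypted inner pack."""
    if len(pack2) < 81:
        raise ValueError("Pack2 too short")
    return pack2[0], pack2[1:17], pack2[17:49], pack2[49:81], pack2[81:]
```

Verifying the digest first means that a tampered package is rejected before any key is used for decryption.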
Downstream Data Lifecycle
Both the MnDM and the MPP can generate messages, usually called command messages, whose
purpose is to give an instruction to another network element; for instance they can order a GD to change
its data gathering time frame. The format of the command message is pre-defined and varies according
to the type of inherent command.
All messages generated and sent by the MnDM fall into the category of command messages, and one can distinguish two distinct clusters: the commands intended for the MPP and the commands intended for the GDs. Either way, the messages originating at the MnDM are ciphered prior to being stored in the MnDM's solid memory and sent to the MPP. Then the data is sent via Internet to the MPP and subjected to a verification process upon arrival, after which it is either encrypted and saved in the MPP's solid memory while waiting to be sent to the envisaged GD via Wi-Fi, or read and applied on the fly. Moreover, the MPP can also generate local instructions intended for GDi, thus any command message that leaves the MPP via Wi-Fi must be flagged according to its sender. When the command message reaches the target GDi, its authenticity and integrity are verified and it is saved in a stack while waiting to be read and applied by the internal command manager protocol.
The list of steps that are carried out in each device is now presented.
Mission and Data Manager
1. Generate a message $M$;
2. If $M$ is a command for a GD then generate a HMAC of the message: $h_2(M)$ and proceed to step 4. Otherwise proceed to step 3;
3. Consider $\text{inner pack} := M$ and $f_{id}$ a zero valued 8-bit identifier. Proceed to step 5;
4. Prepend the HMAC to the message: $\text{inner pack} := h_2(M) \| M$ and choose a GD identifier $f_{id} \in (\mathbb{Z}_2)^8 : f_{id} \ne 0$. Proceed to step 5;
5. Generate a pseudo-random 16-octet initialization vector $IV_1$;
6. Encrypt inner pack by computing $e_1^{IV_1}(\text{inner pack})$;
7. Generate a HMAC of the encrypted package: $h_3(e_1^{IV_1}(\text{inner pack}))$;
8. Build the package $\text{Pack}_0 = h_3(e_1^{IV_1}(\text{inner pack})) \| f_{id} \| IV_1 \| e_1^{IV_1}(\text{inner pack})$. If the command is not intended for a GD, then the device's identifier field will hold a full zero 8-bit array, in which case it will flag that the recipient of the message is the MPP;
9. Generate a pseudo-random 16-octet initialization vector $IV_2$;
10. Perform the encryption $e_2^{IV_2}(\text{Pack}_0)$;
11. Build $\text{Pack}_1 = IV_2 \| e_2^{IV_2}(\text{Pack}_0)$;
12. Store $\text{Pack}_1$ in solid memory.
Pack1 is now sent via Internet to the MPP subject to the TCP/IP protocol.
Middle-Point Party
Steps 13 to 27 (block of execution A) represent the phase of the protocol where the MPP receives the MnDM's command message, proceeds to the necessary verifications and processes it accordingly, whereas steps 28 to 36 (block of execution B) describe the case where the MPP generates the command message to be sent directly to the GDi. The two blocks need not be executed synchronously; for instance, the MnDM may send a message to the MPP while the latter is processing its own command. Nevertheless, at the end of either execution block (A or B) the protocol proceeds to step 37.
13. Parse $\text{Pack}_1$ into its main components:
$IV_2 = \text{Pack}_1|_{128}$;
$e_2^{IV_2}(\text{Pack}_0) = \text{Pack}_1 \setminus \text{Pack}_1|_{128}$;
14. Perform the decryption: $d_2^{IV_2}(e_2^{IV_2}(\text{Pack}_0))$ in order to obtain $\text{Pack}_0$;
15. Parse $\text{Pack}_0$ into its main components:
$h_3(e_1^{IV_1}(\text{inner pack}))$;
$f_{id}$;
$IV_1$;
$e_1^{IV_1}(\text{inner pack})$;
16. Compute a new HMAC instance $h_3(e_1^{IV_1}(\text{inner pack}))^*$;
17. If $h_3(e_1^{IV_1}(\text{inner pack}))^* \ne h_3(e_1^{IV_1}(\text{inner pack}))$ then assume the message to be compromised, discard it and abort the execution. Otherwise continue;
18. Perform the decryption: $d_1^{IV_1}(e_1^{IV_1}(\text{inner pack}))$ in order to obtain inner pack;
19. If $f_{id} = 0$ then apply the corresponding command and finish the execution at this step, otherwise continue (note that in the latter case $\text{inner pack} := h_2(M) \| M$).
20. Initialize a single bit $\mathit{flag} \in \mathbb{Z}_2 : \mathit{flag} = 1$;
21. Parse inner pack and generate a HMAC of the message: $h_1(M)$;
22. Build $\text{inner pack}_1 := \mathit{flag} \| h_1(M) \| h_2(M) \| M$;
23. Generate a pseudo-random 16-octet initialization vector $IV_3$;
24. Perform the encryption $e_1^{IV_3}(\text{inner pack}_1)$;
25. Generate a unique message identifier $m^*_{id}$;
26. Build the final package $\text{Pack}_2 := m^*_{id} \| IV_3 \| e_1^{IV_3}(\text{inner pack}_1)$;
27. Store $\text{Pack}_2$ in solid memory;
As already stated, the following steps 28 to 36 are related with the case in which the command message $M$ is generated at the MPP instead of the MnDM. This block of execution does not necessarily follow from the previous one (steps 13 to 27) and may be triggered either by a user interaction on the MPP or by a scheduled command.
28. Generate a command message $M$;
29. Compute a HMAC of $M$: $h_1(M)$;
30. Initialize a single bit $\mathit{flag} \in \mathbb{Z}_2 : \mathit{flag} = 0$;
31. Build $\text{inner pack}_2 := \mathit{flag} \| h_1(M) \| M$;
32. Generate a pseudo-random 16-octet initialization vector $IV_4$;
33. Perform the encryption $e_1^{IV_4}(\text{inner pack}_2)$;
34. Generate a unique message identifier $m^{**}_{id}$;
35. Build $\text{Pack}_2 := m^{**}_{id} \| IV_4 \| e_1^{IV_4}(\text{inner pack}_2)$;
36. Store $\text{Pack}_2$ in solid memory;
At the end of either block of execution A or block of execution B, the package $\text{Pack}_2$ is sent to the $GD_i$ via Wi-Fi and subject to the WPA2-PSK protocol.
Gathering Devices
37. Parse $\text{Pack}_2$ into its main components:
$m_{id} = \text{Pack}_2|_{32}$;
$IV = (\text{Pack}_2|_{160})|_{128}$;
$e_1^{IV}(\text{inner pack}) = \text{Pack}_2 \setminus \text{Pack}_2|_{160}$;
38. Perform the decryption $d_1^{IV}(e_1^{IV}(\text{inner pack}))$ in order to obtain inner pack;
39. Read $\mathit{flag} = \text{inner pack}|_{1}$. If $\mathit{flag} = 0$ go to step 40, if $\mathit{flag} = 1$ go to step 44, otherwise abort the execution and consider the message as corrupted.
40. Parse inner pack into its main components
$h_1(M) = (\text{inner pack}|_{257})|_{256}$;
$M = \text{inner pack} \setminus \text{inner pack}|_{257}$;
41. Compute a new HMAC of $M$: $h_1(M)'$;
42. If $h_1(M)' \ne h_1(M)$ consider the message to have been tampered with and abort the execution, otherwise continue;
43. Store $M$ in the commands' FIFO stack, waiting to be applied as soon as possible. Successfully exit the downstream data protocol after applying the envisaged command.
44. Parse inner pack into its main components
$h_1(M) = (\text{inner pack}|_{257})|_{256}$;
$h_2(M) = (\text{inner pack}|_{513})|_{256}$;
$M = \text{inner pack} \setminus \text{inner pack}|_{513}$;
45. Compute two new HMAC instances of $M$: $h_1(M)'$ and $h_2(M)'$;
46. If $h_i(M)' \ne h_i(M)$ for some $i \in \{1, 2\}$ then consider the message to have been tampered with and abort the execution, otherwise acknowledge the integrity of $M$ and continue;
47. Store $M$ in the commands' FIFO stack, waiting to be applied as soon as possible. Successfully exit the downstream data protocol after applying the envisaged command.
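The flag-dependent layout of the inner pack can be sketched at the bit level, since the single-bit flag shifts every later field off the octet boundary. In this illustration the inner pack is held as a string of '0' and '1' characters, the message is assumed to be a whole number of octets, and all function names, keys and the sample command are hypothetical; encryption is left out.

```python
import hashlib
import hmac

def to_bits(data: bytes) -> str:
    return "".join(f"{byte:08b}" for byte in data)

def from_bits(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def tag_bits(key: bytes, message: bytes) -> str:
    """HMAC-SHA-256 of the message, as a 256-character bit string."""
    return to_bits(hmac.new(key, message, hashlib.sha256).digest())

def build_inner_pack(message: bytes, k_h1: bytes, h2_bits=None) -> str:
    """flag 0: flag | h1(M) | M (MPP command); flag 1: flag | h1(M) | h2(M) | M (MnDM command)."""
    if h2_bits is None:
        return "0" + tag_bits(k_h1, message) + to_bits(message)
    return "1" + tag_bits(k_h1, message) + h2_bits + to_bits(message)

def read_inner_pack(bits: str, k_h1: bytes, k_h2: bytes) -> bytes:
    """Steps 39-46: branch on the flag, slice the fields and verify every HMAC that is present."""
    flag = bits[:1]
    if flag == "0":
        message = from_bits(bits[257:])
        checks = [(k_h1, bits[1:257])]
    elif flag == "1":
        message = from_bits(bits[513:])
        checks = [(k_h1, bits[1:257]), (k_h2, bits[257:513])]
    else:
        raise ValueError("message corrupted")
    for key, received in checks:
        if not hmac.compare_digest(tag_bits(key, message), received):
            raise ValueError("message tampered with")
    return message

k_h1, k_h2 = b"\x11" * 32, b"\x22" * 32   # hypothetical HMAC keys of one GD
command = b"sleep 3600"                   # hypothetical command message
from_mpp = build_inner_pack(command, k_h1)
from_mndm = build_inner_pack(command, k_h1, tag_bits(k_h2, command))
```

A command originating at the MnDM therefore carries two digests, so the GD can check both the MPP and the MnDM as senders.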
3.5 Message Formats
This section aims to identify the distinct message formats built in the protocol description of section 3.4.2 and to formally define the network's packing and unpacking mechanisms. The five distinct message formats comprised in the protocol are visually presented in Appendix C according to the following specification:
$F_1$ : encrypted by a GD and decrypted by the MPP.
$F_2$ : encrypted by the MPP and decrypted by a GD (plaintext generated by the MPP).
$F_3$ : encrypted by the MPP and decrypted by a GD (plaintext originally generated by the MnDM).
$F_4$ : encrypted by the MPP and decrypted by the MnDM.
$F_5$ : encrypted by the MnDM and decrypted by the MPP.
Hereinafter this notation is to be considered. All the abovementioned message formats are built based on two other categories of message formats, $F_i^*$ and $F_i^{**}$, which are, respectively, the formats corresponding to the outer and inner layers of encryption within $F_i$ for every $i \in I_5$. These can also be consulted in Appendix C. The following definitions are useful for the upcoming discussion.
Definition 3.5.1 (Confidential plaintext). The raw data gathered by the GDs as well as the command messages are called confidential plaintext.
The previous definition highlights the piece of information within the packages that is of utmost importance and is intended to be transmitted to and read by the desired parties. The aforementioned figures may clarify the subject.
The only cryptosystem involved in the processing of the packages is the AES cipher. Since it is a
128-bit block cipher and the encrypted data does not necessarily have 128 bits, a BCMO is required
and, as already stated, it has been decided to be the CTR mode of operation. Let G be the ciphertext
generator operator such that
G(C,M, v, k, x) = y (3.2)
where $y$ is the result of encrypting $x$ via block cipher $C$ using mode of operation $M$ with initialization vector $v$ (if applicable) and key $k$. Analogously, $G^{-1}$ is the inverse operator and returns the corresponding plaintext:
G−1(C,M, v, k, y) = x (3.3)
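For the CTR mode chosen here, both operators can be written out block by block. The following is a sketch under assumed notation: $E_k$ stands for the AES block encryption under key $k$, $x_i$ and $y_i$ are the $i$-th blocks of $x$ and $y$, and the counter input for block $i$ is taken to be $v + i$.

```latex
\begin{align*}
G(\mathrm{AES}, \mathrm{CTR}, v, k, x)_i &= x_i \oplus E_k(v + i) = y_i, \\
G^{-1}(\mathrm{AES}, \mathrm{CTR}, v, k, y)_i &= y_i \oplus E_k(v + i)
  = x_i \oplus E_k(v + i) \oplus E_k(v + i) = x_i .
\end{align*}
```

Under these assumptions $G^{-1}$ applies exactly the same keystream as $G$, so both directions only need the encryption direction of the block cipher.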
Consider the set of all words with format $F_i$,
$S_i = \{ y \in \Sigma^* : y \text{ is of format } F_i \}$ (3.4)
Moreover, consider a family of functions
$E_i : I \times \Sigma^* \times \Sigma^* \times K \times P \to S_i$ (3.5)
such that $E_i(j, a, b, k, x)$ outputs an element of $S_i$ for a confidential
plaintext x, key k, GD identifier j and global parameters a and b. For instance, the format $F_1$
satisfies
$E_1(j, m, v, k, x) = j \,\|\, m \,\|\, v \,\|\, G(\mathrm{AES}, \mathrm{CTR}, v, k, h_1(x) \,\|\, h_2(x) \,\|\, x)$ (3.6)
where the HMAC functions $h_1$ and $h_2$ are defined as in section 3.4. Each $E_i$ is called a
package ciphertext generator function (PCgF) and is addressed as such hereinafter. For a fixed
identifier j, the function
$E_i^j : \Sigma^* \times \Sigma^* \times K \times P \to S_i$ (3.7)
is an instance of the family of PCgFs with the same expression.
Definition 3.5.2 (Package ciphertext). Let x be a confidential plaintext, j ∈ I, k a key and a, b ∈ Σ∗
two parameters of choice. The data resulting from the computation of Ei(j, a, b, k, x) is designated as
package ciphertext.
The inverse of the PCgF is defined by
$D_i : I \times K \times S_i \to P$ (3.8)
where K is the set of keys and P is the set of all confidential plaintexts; it is called the package ciphertext
unpacking function (PCuF). This means that for every z ∈ P:
$D_i^j(k, E_i^j(a, b, k, z)) = z$ (3.9)
for every j ∈ I, where
$D_i^j : K \times S_i \to P$ (3.10)
is an instance of the family $D_i$ of PCuFs.
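The round trip of expression 3.9 can be illustrated for format F1 with the following minimal sketch. It rests on several assumptions that are not part of this document: the function names are hypothetical, the field widths (1-byte identifier, 4-byte message identifier, 16-byte IV) are illustrative, the HMACs use SHA-256, a single party holds both HMAC keys, and, since the Python standard library has no AES, the operator G is replaced by an HMAC-derived keystream that only mimics the XOR structure of CTR mode.

```python
import hashlib
import hmac

TAG = 32  # HMAC-SHA256 output length in bytes (assumed hash)

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for G(AES, CTR, iv, key, .). NOT real AES: the keystream blocks
    are HMAC-SHA256(key, counter)[:16]. As in CTR, applying it twice is the identity."""
    stream = bytearray()
    counter = int.from_bytes(iv, "big")
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:16]
        counter = (counter + 1) % (1 << 128)
    return bytes(a ^ b for a, b in zip(data, stream))

def pack_f1(j: bytes, m: bytes, v: bytes, k: bytes, kh1: bytes, kh2: bytes, x: bytes) -> bytes:
    """E_1(j, m, v, k, x) = j || m || v || G(AES, CTR, v, k, h1(x) || h2(x) || x)."""
    h1 = hmac.new(kh1, x, hashlib.sha256).digest()
    h2 = hmac.new(kh2, x, hashlib.sha256).digest()
    return j + m + v + ctr_xor(k, v, h1 + h2 + x)

def unpack_f1(k: bytes, kh1: bytes, kh2: bytes, y: bytes) -> bytes:
    """D_1: split the header, decrypt the body and verify both embedded HMACs."""
    v, body = y[5:21], y[21:]          # skip 1-byte j and 4-byte m (assumed widths)
    w = ctr_xor(k, v, body)
    h1, h2, x = w[:TAG], w[TAG:2 * TAG], w[2 * TAG:]
    ok1 = hmac.compare_digest(h1, hmac.new(kh1, x, hashlib.sha256).digest())
    ok2 = hmac.compare_digest(h2, hmac.new(kh2, x, hashlib.sha256).digest())
    if not (ok1 and ok2):
        raise ValueError("HMAC verification failed")
    return x

key, kh1, kh2 = b"k" * 16, b"a" * 32, b"b" * 32
package = pack_f1(b"\x01", b"\x00\x00\x00\x07", bytes(16), key, kh1, kh2, b"reading")
recovered = unpack_f1(key, kh1, kh2, package)
```

Under these assumptions the package is 21 header bytes plus 64 HMAC bytes longer than the confidential plaintext, which is the quantity denoted by the added length of the format.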
The following definition is based on the previously presented ones and will be useful to the security
analysis presented in chapter 4.
Definition 3.5.3 (Packing Scheme). A packing scheme (PSch) is defined by a 3-tuple (E,D,K) where
E is a family of PCgF, D is a family of PCuF and K is the key set.
Let $P_S^i$ be the PSch associated with message format $F_i$ such that, for every i ∈ I5,
$P_S^i = (E_i, D_i, K)$ and $E_i = \bigcup_{j=1}^{n} E_i^j$ and $D_i = \bigcup_{j=1}^{n} D_i^j$ (3.11)
where $E_i^j$ and $D_i^j$ are as in expressions 3.7 and 3.10, respectively.
For every i ∈ I5, the packing schemes PiS define the distinct message formats and their security will
be thoroughly studied in section 4.1.
Consider x to be an n-bit confidential plaintext generated by a given GD. The construction of the CTR
mode of operation imposes restrictions on the pair (v, k), where v stands
for the initialization vector and k for the key. Even though the initialization vector v is not required to be
unpredictable by an adversary, it is mandatory that the pair (v, k) does not repeat for the same block of
plaintext. Thus, there is an upper bound on the length of the message to be encrypted using the CTR
mode: the number of blocks of the message must not be greater than $2^m$, where m is the block size of
the block cipher at hand, in bits⁴. Since AES is a 128-bit block cipher, the bit-length n of the plaintext
must satisfy
$n \le 128 \times 2^{128} = 2^{135}$ (3.12)
which is roughly $4.4 \times 10^{40}$ bits and does not restrain the set of possible plaintexts in practice.
However, since the RAM of the GDs is assumed to be upper bounded by 256kB, the length of the plaintext
messages is bounded by this value, i.e.,
$n < 2^{18} - a_i$ (3.13)
where $a_i$ represents the binary length of the data that was added in order to build the message with
format $F_i$, that is
$a_i = |E_i^j(u, v, x)|_2 - n, \quad \forall j \in I,\ i \in I_5$ (3.14)
where u and v are the required external variables for the construction of the package $F_i$. Moreover, note
that for fixed i, $|E_i^j(u, v, x)|_2 = |E_i^k(u, v, x)|_2$ for all j, k ∈ I. The strict inequality is due to the usage of
some of the RAM by internal processes of the system at hand, which are needed for it to properly execute
certain required background tasks.
⁴ Recall that there are $2^m$ distinct values for a bitstring of length m.
Chapter 4
Security Analysis
In real-world projects where computer security is of the essence, there are always limitations
caused by one or more of many factors, such as available funding or ethical restrictions. This means
that one is not provided with unlimited resources in practice, and choosing the best possible scenario
is both unavoidable and arduous. In general there is an inverse relation
between security and performance but each system must be analyzed individually, as its features may
entail that this relation does not hold, in which case an optimal solution in terms of security may
imply a greater amount of computational resources.
This chapter aims to analyze the cryptographic properties of the packing
schemes described in section 3.5 that are considered most important with respect to security. The strengths
and weaknesses of the proposed message formats and protocols are scrutinized, followed by suggested
solutions for the observed flaws.
4.1 Strengths and Weaknesses
As one would expect, the network considered in section 3.3 is not perfectly secure: there are
some ingrained weaknesses induced by the chosen topology or by the communication protocol described in
section 3.4. This section discusses some of those flaws, together with key aspects of the key generation
stage and of the packing schemes built within the scope of the communication protocol.
4.1.1 Key Generation
In the key generation stage, as described in 3.4.1, all the required cryptographic keys are generated.
In order to increase the resilience of a key against key-recovery attacks it must be generated as
randomly as possible. Based on the results presented in [26], PBKDF2 is a good option for the generation
of the keys, namely with HMAC-SHA-1, because it is the keyed-hash function with the best performance
and provides enough security [38], even though the underlying hash function is not strong with respect to
collision resistance [20].
The password serves both as the seed for the pseudo-random generator that constructs the salt byte
array and as the password passed as argument to the PBKDF2 algorithm. Ideally, one would not want
the key generation process to depend solely on a single password input by the user, because this
clearly lowers the security level of the process: breaking the key generator comes
down to finding a single password and replicating the process. Nevertheless, this approach was
the one agreed upon because of its simplicity and the high level of trust placed in the user
U operating the MnDM. One very straightforward way to increase the security level of this very important
stage of the protocol would be for U to provide two passwords: one to be used in the construction
of the seed for the pseudo-random algorithm and the other to be used as the password for the PBKDF2
algorithm. However, both these approaches require complete trust in the user who generates the
keys; if U is malicious, then the whole network becomes compromised. The answer to this
problem lies in the Two-Person Concept, a mechanism based on the following requirement: to
launch a nuclear missile, two distinct individuals, each possessing a distinct
key that is not known by the other party, must insert their credentials in the launching computer at the same
time. By adapting this concept, one could build a similar behaviour for the generation of keys in the PMS,
where a pre-processing stage would construct a master key out of the keys of each of the
k chosen parties, for some k > 1 (e.g., by means of an XOR operation). This key would then be used by the
key generation process and no party could ever single-handedly replicate the construction of the whole
key set and attack the system. Figure 4.1 briefly depicts this procedure for k = 3.
Figure 4.1: Key generation based on k users (diagram: User1, User2 and User3 supply secret1, secret2 and secret3 to the Key Constructor Algorithm, whose master key is passed to the Key Generation Algorithm)
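The k-party variant discussed above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the equal-length requirement on the secrets, the iteration count and the 16-byte key length are hypothetical choices, and the salt is passed in directly instead of being produced by the seeded pseudo-random generator of the protocol. Only the XOR combination and the use of PBKDF2 with HMAC-SHA-1 come from the text.

```python
import hashlib
from functools import reduce

def master_secret(shares: list) -> bytes:
    """Key Constructor Algorithm (assumed form): XOR of the k party secrets, k > 1."""
    if len(shares) < 2:
        raise ValueError("at least two parties are required")
    if len({len(s) for s in shares}) != 1:
        raise ValueError("all secrets must have the same length")
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), shares)

def derive_keys(master: bytes, salt: bytes, count: int,
                key_len: int = 16, iterations: int = 100_000) -> list:
    """Key Generation Algorithm (assumed form): PBKDF2-HMAC-SHA-1 stretched to
    count * key_len bytes and cut into `count` independent-looking keys."""
    material = hashlib.pbkdf2_hmac("sha1", master, salt, iterations,
                                   dklen=count * key_len)
    return [material[i * key_len:(i + 1) * key_len] for i in range(count)]

# Three parties, as in Figure 4.1 (toy secrets, low iteration count for speed).
shares = [bytes([1]) * 16, bytes([2]) * 16, bytes([4]) * 16]
master = master_secret(shares)
keys = derive_keys(master, b"salt", 3, iterations=1000)
```

With an XOR combination, any k − 1 parties together still lack the master secret, which is the property the Two-Person Concept is after.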
One can also exploit flaws embedded in the distribution of the keys. Theoretically speaking, one
does not even question the integrity of the MnDM, but in practice all situations must be considered. Let
Tom be a malicious user monitoring both the sent data and the results received at the MnDM. Since
he is in possession of all the keys, he is able to generate a corrupted message M and produce a
package E4(M) to be inserted into the DB, whose forged origin cannot be detected by anyone who
attempts to verify it. That is, using the keys $(k_A^{GMP})_j$, $k_A^{MPM}$, $(k_H^{GMP})_j$, $(k_H^{GMD})_j$ and $k_H^{MPM}$, for
some 1 ≤ j ≤ n, Tom would be able to produce a package P with format F4, holding M as its core
message and such that any party who attempted to trace the message would conclude that P is a
package sent by the MPP to the MnDM whose principal components originated in GDj without being
tampered with in the process. A possible solution for this issue would be to not provide the MnDM with
the keys $(k_H^{GMP})_i$, 1 ≤ i ≤ n. This way, Tom would still be able to read messages from and send messages
to the MPP, but he would not be able to produce a malicious package P and send it to the DB
without being flagged as compromised by a trusted authority. Apart from this problem, the
current key distribution does not give the MPP full control of the network, since
this device does not hold any of the keys $(k_H^{GMD})_i$, 1 ≤ i ≤ n. Even if the MPP becomes compromised, it
does not possess all the necessary knowledge to trick the GDs or the MnDM into accepting corrupted
messages. Nevertheless, the possible compromise of any of the active elements of the network is a subject
of utmost concern.
4.1.2 Packing Schemes and Protocols
Apart from the IVs and HMACs there are two elements that may occur in the packages’ headers:
the gathering device identifier $d_{id}$ and the message identifier $m_{id}$. The former provides
the MPP with a means of knowing which key of the set $K_1 = \{(k_A^{GMP})_i : 1 \le i \le n \wedge i \in \mathbb{N}\}$ was used in the
encryption, so that the same key is used in the decryption, and/or of knowing to which
GD the message must be sent, while the latter is helpful in the GD’s memory management. According
to the protocol described in section 3.4.2, upon receiving a message, the MPP checks its integrity and
authenticity and, if this verification succeeds, an acknowledgment message containing $m_{id}$ is sent
back to the envisaged GD. When the latter receives the ACK, it can release the memory associated with
the message having identifier $m_{id}$, using a look-up table for instance.
Even though neither $d_{id}$ nor $m_{id}$ is immune to tampering or unintentional errors, the following
proposition holds.
Proposition 4.1.1. It is infeasible for an adversary to perform replay attacks or trick the MPP into
assuming the gathering material is located elsewhere.
Proof. Let i ≠ j and let Fj and Fi be two gathering devices with identifiers $d_{id}^j$ and $d_{id}^i$, located at
positions X and Y, respectively. Consider an adversary who wants to carry out malicious activities at
location X, which would be detected by Fj. If he is aware of the presence of Fj, he might attempt to
tamper with the reports on the location by replacing $d_{id}^j$ with $d_{id}^i$; this way the information transmitted to
the MPP would be that the malicious activity is in effect at location Y instead of X. However, changing
the GD identifier value to $d_{id}^i$ will result in the MPP calling the PCuF of $P_S^1$ using the keys shared with
Fi, meaning that the resulting decrypted text would be distinct from the original plaintext, due to the
injectivity of AES, and would thus fail the integrity and authenticity check performed by the MPP.
Such an attack is therefore infeasible, since the adversary has only a negligible probability
of success. The resistance to replay attacks follows directly from the WPA2 protocol [29].
Hereinafter, for simplicity, let $E_1^j(x)$ denote the PCgF with the global
parameters a and b omitted and for which the keys used in the encryption scheme are associated with GDj.
Proposition 4.1.2. PSch $P_S^i$ grants secrecy, integrity and authenticity to the confidential plaintext, for
every i ∈ I5.
Proof. Let A be an adversary and j ∈ I a fixed identifier representing the operational gathering device
GDj. For some unknown confidential plaintext x suppose that A is in possession of $y := E_i^j(x)$, the
associated package ciphertext.
• i = 1: The secrecy of the plaintext follows directly from the secrecy property of the AES-CTR encryption
scheme; A cannot obtain x simply from the knowledge of y because x is encrypted with
AES-CTR using the key $(k_A^{GMP})_j \in K_1$, which is known only to the MPP and GDj.
Based on figure C.2, recall that $E_1^j(x) = d_{id} \,\|\, m_{id} \,\|\, IV \,\|\, G(\mathrm{AES}, \mathrm{CTR}, IV, (k_A^{GMP})_j, w)$, where
$w = h_1(x) \,\|\, h_2(x) \,\|\, x$. The ciphertext itself carries no integrity or authenticity protection,
and the malleability of the AES-CTR encryption scheme gives the attacker an opportunity to
tamper with it. If A interferes with the ciphertext, then after decryption one ends up
with the word $w^* = h_1(x)^* \,\|\, h_2(x)^* \,\|\, x^*$, due to the properties of AES-CTR. It is very unlikely
that $h_i(x)^* = h_i(x^*)$ for every i ∈ I2, because of the avalanche property of cryptographic hash functions, which
means that the adversary has a negligible probability of corrupting an encrypted message without
failing the plaintext’s integrity check. Integrity and authenticity follow from the fact that
the HMAC keys $(k_H^{GMP})_j$ and $(k_H^{GMD})_j$ are shared only within the pairs (GDj, MPP) and (GDj,
MnDM), respectively.
• The same arguments can be applied analogously for i = 2, . . . , 5.
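The argument for i = 1 can be observed on a small scale with the sketch below. It is not the implementation of this work: the helper names are hypothetical, a single HMAC-SHA256 tag stands for h1 and h2, and the AES-CTR layer is replaced by an HMAC-derived keystream because the Python standard library offers no AES. The stand-in keeps the one property the proof relies on, namely that a flipped ciphertext bit flips exactly the same plaintext bit.

```python
import hashlib
import hmac

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR (NOT real AES): XOR with an HMAC-derived keystream."""
    stream = bytearray()
    counter = int.from_bytes(iv, "big")
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:16]
        counter = (counter + 1) % (1 << 128)
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(k_enc: bytes, k_mac: bytes, iv: bytes, x: bytes) -> bytes:
    """Encrypt w = h(x) || x, i.e. the HMAC is computed first and encrypted with x."""
    return ctr_xor(k_enc, iv, hmac.new(k_mac, x, hashlib.sha256).digest() + x)

def open_sealed(k_enc: bytes, k_mac: bytes, iv: bytes, c: bytes):
    """Decrypt to w* = h* || x* and return x* only if h* equals h(x*), else None."""
    w = ctr_xor(k_enc, iv, c)
    tag, x = w[:32], w[32:]
    expected = hmac.new(k_mac, x, hashlib.sha256).digest()
    return x if hmac.compare_digest(tag, expected) else None

def flip_last_bit(data: bytes) -> bytes:
    return data[:-1] + bytes([data[-1] ^ 1])

k_enc, k_mac, iv = b"e" * 16, b"m" * 32, bytes(16)
c = seal(k_enc, k_mac, iv, b"reading")
tampered = flip_last_bit(c)                      # adversary touches only the ciphertext
malleated = ctr_xor(k_enc, iv, tampered)[32:]    # what x* looks like after decryption
verdict = open_sealed(k_enc, k_mac, iv, tampered)
```

Here the tampering is malleable at the CTR level (the decrypted word differs from x in exactly the flipped bit) yet the unpacking step rejects it, since the adversary cannot adjust the encrypted tag without the HMAC key.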
The previous proposition expresses that the data going through the UDL is authenticated and not
tampered with in the process. However, once the data has reached the DB, how can an umpire prove to a
third party that the data is legitimate? The following proposition addresses this question.
Proposition 4.1.3 (Non-repudiation). The data stored in the DB is granted the non-repudiation property,
provided complete trust in the user of the PMS.
Proof. Let U be the user of the PMS, J the umpire, V the entity asking for the verification and y an
arbitrary message stored in the DB. In order for J to show V that y corresponds to a package ciphertext
of some message that originated at GDj, he requires of U a simulation of the key generation stage.
Because U is the only one with access to the password used in that process, V, given the
correct PCuF f, just needs to execute f with the keys that were assigned to GDj. If the
output is a valid message then all the verifications of the unpacking procedure succeeded and J has
proved to V that y is legitimate, as well as its secret contents.
Even though an adversary is not able to directly find the exact plaintext from its corresponding
ciphertext, this does not mean that the cryptosystem fully protects the message’s secrecy. A could
gain information on the plaintext if the IV were reused¹, since the key remains static for
the device’s entire lifetime. At the GDs the IVs are generated through the standard incrementing
function, which means that the leak of information would only occur if the threshold of the state space
had been reached. Given that the IVs are fixed-size words of 16 octets (128 bits), the total number of
distinct IVs is $2^{128} - 1$, which is a very large number.
¹ See section 2.6.6.1 for key-reuse attack.
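What an adversary would learn from a repeated (IV, key) pair can be made concrete with a short sketch. The names are hypothetical and the keystream generator is an HMAC-based stand-in for AES-CTR (the Python standard library has no AES); the observation itself only depends on CTR being an XOR with a keystream determined by the IV and the key.

```python
import hashlib
import hmac

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR (NOT real AES): XOR with an HMAC-derived keystream."""
    stream = bytearray()
    counter = int.from_bytes(iv, "big")
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:16]
        counter = (counter + 1) % (1 << 128)
    return bytes(a ^ b for a, b in zip(data, stream))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def reuse_leak(key: bytes, iv: bytes, p0: bytes, p1: bytes) -> bytes:
    """Encrypt two plaintexts under the same (iv, key) and return c0 XOR c1,
    which is all an eavesdropper needs to compute."""
    return xor(ctr_xor(key, iv, p0), ctr_xor(key, iv, p1))

p0, p1 = b"attack at dawn", b"retreat at six"
leak = reuse_leak(b"k" * 16, bytes(16), p0, p1)
```

The keystream cancels out, so the eavesdropper obtains p0 XOR p1 without knowing the key; knowing or guessing either plaintext then reveals the other.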
Every GD is limited to processing at most 256 kB of data per second and each plaintext’s
length is upper bounded by 256 octets; therefore the maximum number of data packages that a GD
can process per second is 1024. Continuously gathering data at such a rate, a GD would exhaust the
space of possible message identifiers ($m_{id}$) in around 48 days. Thus, the recommended lifetime for a
GD under these conditions is 1 month on average, which means that the danger of a key-reuse attack under the
current encryption scheme is negligible. This conclusion confirms that the absence of a key scheduling
algorithm for the GDs was a good option, for it would be a waste of energy to perform the computations
to update the keys when there is virtually no threat from a brute-force attack.
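The figures in the previous paragraph can be checked with a few lines of arithmetic. The width of the message identifier is not stated in this section; the value of 32 bits used below is an assumption, chosen because it reproduces the quoted figure of around 48 days.

```python
RATE_BYTES_PER_S = 256 * 1024      # 256 kB of data per second
MAX_PLAINTEXT_BYTES = 256          # upper bound on each plaintext's length
MID_BITS = 32                      # ASSUMPTION: width of the message identifier

# Maximum number of packages a GD can process per second.
packages_per_second = RATE_BYTES_PER_S // MAX_PLAINTEXT_BYTES

# Time needed to run through every message identifier at that rate.
seconds_to_exhaust = 2 ** MID_BITS / packages_per_second
days_to_exhaust = seconds_to_exhaust / 86_400

print(packages_per_second, round(days_to_exhaust, 1))
```

Under the same rate, the IV space of roughly 2^128 values would take vastly longer to exhaust, so the message identifier is the first counter to wrap around.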
4.1.2.1 Semantic security
The leak of partial information of the plaintext from the ciphertext is an undesired property that must
be seen as a very dangerous threat when exploited by a capable adversary. This is the notion of
semantic security introduced in chapter 2. The following proposition is very important for it expresses
the level of security of the message formats inherent to the packing schemes PiS ∀i ∈ I5.
Proposition 4.1.4. For every i ∈ I5, the PSch $P_S^i$ is semantically secure against chosen-plaintext
attacks.
Proof. The fields $d_{id}$ and $m_{id}$ within message format F1 are independent of the plaintext and so A cannot
infer any relation with the associated plaintext. As has already been stated, the IV of the AES-CTR
encryption scheme must be exposed as plaintext, and it does not leak any information for the
adversary to exploit. Therefore, for i = 1, . . . , 5 the semantic security of $P_S^i$ follows from the IND-CPA
security of AES-CTR proved in [2] and from the equivalence between IND-CPA and SEM-CPA proved in
[39].
Informally, and assuming every confidential plaintext to be of equal size, this means that the package
ciphertext does not reveal any information whatsoever about the confidential plaintext. That is, for any
two adversaries A and B who are given two confidential plaintexts x0 and x1 where (x0 ≠ x1) ∧ (|x0| =
|x1|), and such that B is also given the package ciphertext $E_i^j(x_b)$ for any i ∈ I5 and j ∈ I, B has no
advantage over A when trying to discover relevant information about xb, for b ∈ Z2.
However, the confidential plaintexts are not all of equal length and no padding
method is used in the packing schemes. As a result POAs are infeasible and the process is more efficient, but
the ciphertext reveals the plaintext’s length to the attacker. This
leak of information makes the cryptosystem vulnerable to chosen-plaintext attacks in which the adversary
is able to pick plaintexts of unequal length: an adversary who knows two plaintexts
p0 and p1 such that |p0| ≠ |p1| is able to decide which of them corresponds to the oracle’s
answer cb, for some random b ∈ {0, 1}, with probability far from 1/2, given that |cb| = h + |pb|, where
h is the message header’s length.
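The length-based distinguisher can be written down in a few lines. In the sketch below the oracle is hypothetical and deliberately crude: since only the length of the package ciphertext matters for this attack, random bytes of length h + |p_b| stand in for the header and the CTR output, and the header length of 85 bytes is an illustrative assumption.

```python
import secrets

HEADER_LEN = 85  # ASSUMPTION: h in bytes; any fixed, publicly known value works

def oracle(p0: bytes, p1: bytes):
    """Pick b at random and return (b, c_b). Random bytes stand in for the real
    package ciphertext; only its length, h + |p_b|, is faithful."""
    b = secrets.randbelow(2)
    chosen = (p0, p1)[b]
    return b, secrets.token_bytes(HEADER_LEN + len(chosen))

def guess(p0: bytes, p1: bytes, challenge: bytes) -> int:
    """Adversary's strategy for |p0| != |p1|: compare |c_b| - h with |p0|."""
    return 0 if len(challenge) - HEADER_LEN == len(p0) else 1

p0, p1 = b"short", b"a much longer reading"
rounds = [oracle(p0, p1) for _ in range(50)]
wins = sum(guess(p0, p1, c) == b for b, c in rounds)
```

The adversary wins every round without touching the key, which is why the chosen-plaintext result above has to assume plaintexts of equal size.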
Let A be the event corresponding to the storage in memory of package ciphertexts² whose corresponding
confidential plaintexts have lengths $L_j$ for j ∈ Ik according to equation 3.1, and let B be the
event analogous to A but in which the confidential plaintexts are all partitioned into equal-length
subwords of size l prior to being ciphered and stored. Let ∆ represent the difference between the memory
payoff for k messages in event A and the memory payoff associated with event B. Then
$\Delta = h \sum_{j=1}^{k} (m_j - 1)$ (4.1)
The memory restriction imposed on the GDs is very unforgiving, as shown by the following
worst-case example. First, note that the PSchs associated with the GDs are $P_S^i$ for i = 1, 2, 3
and let $h = \max_{i \in I_3} h_i$ where $h_i = |y|_2 - |x|_2$ for any $y \in S_i$ and $x \in \Sigma^*$. Furthermore, consider a
worst-case scenario where l = 1 byte and $L_j = 2^{11}$ for every j ∈ Ik, thus $m_j = 2^8$ for every j ∈ Ik. This means that
$\Delta = (2^8 - 1)kh$. The GDs’ total allocated memory for storing the messages equals $2^{21}$ bits, according
to section 3.2. If event A takes place instead of B then the maximum number of messages that the GD
at hand can store at a time in its solid memory is
$k = 2^{21} / 2^{11} = 2^{10}$ (4.2)
On the other hand, if event B occurs, one has
$2^{21} = \Delta + k \, 2^{11} \Rightarrow k \simeq 11.95$ (4.3)
which means that the maximum number of messages that the GD can store in this case is 11, under the
same restrictions. This example highlights the relevance of memory optimization within the GD.
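Equations 4.1 to 4.3 can be reproduced numerically. The header length h is not given in this section; the value of 680 bits used below is an assumption (for instance an 8-bit device identifier, a 32-bit message identifier, a 128-bit IV and two 256-bit HMACs), chosen because it reproduces the value k ≃ 11.95 of equation 4.3.

```python
TOTAL_BITS = 2 ** 21      # memory allocated for storing messages
L_BITS = 2 ** 11          # length L_j of every confidential plaintext
SUBWORD_BITS = 8          # l = 1 byte
H_BITS = 680              # ASSUMPTION: header length h in bits

m = L_BITS // SUBWORD_BITS                 # m_j = L_j / l = 2^8 pieces per message

# Event A (equation 4.2): messages stored whole.
k_a = TOTAL_BITS // L_BITS

# Event B (equation 4.3): 2^21 = Delta + k * 2^11 with Delta = (m - 1) * k * h,
# hence k = 2^21 / ((m - 1) * h + 2^11).
k_b = TOTAL_BITS / ((m - 1) * H_BITS + L_BITS)

print(k_a, round(k_b, 2), int(k_b))
```

Under this assumption, splitting every message into one-byte pieces reduces the storage capacity from 1024 messages to 11, because each piece carries its own header.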
Next, an analysis with respect to a stronger type of attack is performed: the adversary
has access not only to an encryption oracle but also to a decryption oracle. This is considered to be the
strongest level of semantic security [40].
Proposition 4.1.5. For every i ∈ I5, PSch PiS is not semantically secure against chosen-ciphertext
attacks.
Proof. Let A be an adversary playing the IND-CCA game [2] with the words $w_0 = 0^n$ and $w_1 = 1^n$ for
some n ∈ N and, w.l.o.g., fix j ∈ I. Let Oe and Od be the oracles with access to the PCgF and PCuF of
$P_S^1$. The following strategy grants A a non-negligible IND-CCA advantage:
1. A queries Oe with (w0, w1);
2. Oe encrypts wb into $y := E_1^j(w_b)$, for b ∈ Z2, and returns it to A;
3. A flips the last bit of y and obtains y′;
4. A queries Od with y′;
5. Od returns w′b;
6. If $w'_b = 0^{n-1}1$ then A chooses b′ = 0. If $w'_b = 1^{n-1}0$ then A chooses b′ = 1.
The last step is only feasible due to the error propagation properties of the CTR mode of operation
[14]. Thus, A can tell to which confidential plaintext the package ciphertext belongs with probability far
from 1/2, meaning that $P_S^1$ is not IND-CCA secure. The result follows from the equivalence between
semantic security and ciphertext indistinguishability in [3].
² In this case it is irrelevant which PCgF was used to generate the package ciphertext.
For i = 2, 3 and 5 the proof is analogous, with a single remark for the case i = 5, in which the
associated PCgF makes a double call to the ciphertext generator operator G whose cryptosystem and
BCMO arguments are AES and CTR, respectively (see figures C.7 to C.11). In this case, there are two
layers of encryption on the confidential plaintext but the result is the same as in the previous cases
because CTR transmits bit errors directly. That is, flipping the last bit of the package ciphertext
flips the last bit of the decryption of the outer layer of CTR encryption, which in turn
makes the last bit of the plaintext wrong after the final decryption, leading to the same outcome.
For i = 4 one faces an encrypt-then-MAC procedure, which is the only layout that could be IND-CCA
secure. However, the fact that the IV is not covered by the HMAC entails that $P_S^4$ is also IND-CCA
insecure. In fact, note that if an adversary has access to a decryption oracle and the HMAC does not
include the IV, then the adversary can change the value of the IV at will in order to obtain the keystream
for the new IVs. Hence A will be able to decrypt any message that was encrypted using any of these
new IVs.
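The six-step strategy of the proof can be played out in code. The sketch below is a simplified model and not the scheme of this work: the oracles encrypt and decrypt the raw word with a CTR-like keystream and, as in step 5 of the proof, the decryption oracle hands back the decrypted word as is. The keystream is an HMAC-based stand-in for AES-CTR, and all names, as well as the word length of 8 bytes, are hypothetical.

```python
import hashlib
import hmac

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR (NOT real AES): XOR with an HMAC-derived keystream."""
    stream = bytearray()
    counter = int.from_bytes(iv, "big")
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:16]
        counter = (counter + 1) % (1 << 128)
    return bytes(a ^ b for a, b in zip(data, stream))

KEY, IV = b"k" * 16, bytes(16)     # known to the oracles only
N = 8                              # word length in bytes (illustrative)
W0, W1 = b"\x00" * N, b"\xff" * N  # w0 = 0^n and w1 = 1^n

def play(b: int) -> int:
    """Run steps 1 to 6 for a challenge bit b and return the adversary's guess b'."""
    y = ctr_xor(KEY, IV, (W0, W1)[b])            # steps 1-2: O_e encrypts w_b
    y_prime = y[:-1] + bytes([y[-1] ^ 1])        # step 3: flip the last bit
    w_prime = ctr_xor(KEY, IV, y_prime)          # steps 4-5: O_d decrypts y'
    return 0 if w_prime == W0[:-1] + b"\x01" else 1   # step 6

guesses = (play(0), play(1))
```

Since y′ differs from the challenge y, the decryption oracle is allowed to answer, and the answer identifies b with certainty.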
The previous result states that an adversary with (temporary) access to a decryption oracle may gain
enough knowledge to perform a partial or, in a worst-case scenario, complete break of the cryptosystem. This
could be prevented if an encrypt-then-MAC mechanism were adopted and the header of the message
were covered by the HMAC.
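A minimal sketch of the countermeasure named above is given next, assuming a package layout of header, IV, ciphertext and tag, where the tag is an HMAC-SHA256 over everything that precedes it. The layout, the field widths and the function names are hypothetical, and the AES-CTR layer is again replaced by an HMAC-derived keystream because the Python standard library has no AES.

```python
import hashlib
import hmac

HEADER_LEN, IV_LEN, TAG_LEN = 5, 16, 32   # ASSUMED widths: d_id (1) + m_id (4), IV, tag

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR (NOT real AES): XOR with an HMAC-derived keystream."""
    stream = bytearray()
    counter = int.from_bytes(iv, "big")
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:16]
        counter = (counter + 1) % (1 << 128)
    return bytes(a ^ b for a, b in zip(data, stream))

def seal_etm(k_enc: bytes, k_mac: bytes, header: bytes, iv: bytes, x: bytes) -> bytes:
    """Encrypt first, then authenticate header || IV || ciphertext."""
    body = header + iv + ctr_xor(k_enc, iv, x)
    return body + hmac.new(k_mac, body, hashlib.sha256).digest()

def open_etm(k_enc: bytes, k_mac: bytes, package: bytes) -> bytes:
    """Verify the tag before any decryption takes place; reject on mismatch."""
    body, tag = package[:-TAG_LEN], package[-TAG_LEN:]
    if not hmac.compare_digest(tag, hmac.new(k_mac, body, hashlib.sha256).digest()):
        raise ValueError("authentication failed")
    iv = body[HEADER_LEN:HEADER_LEN + IV_LEN]
    return ctr_xor(k_enc, iv, body[HEADER_LEN + IV_LEN:])

k_enc, k_mac = b"e" * 16, b"m" * 32
pkg = seal_etm(k_enc, k_mac, b"\x01\x00\x00\x00\x07", bytes(16), b"reading")
```

In this layout a flipped ciphertext bit, a modified IV or a modified identifier all invalidate the tag, so a decryption oracle built on `open_etm` returns nothing for the altered packages used in the strategies above.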
4.1.2.2 Encryption Schemes
Now that the details of the chosen packing schemes have been presented, it is time to discuss
whether the encryption schemes involved in their construction were the right choices for the job.
In order to make use of asymmetric cryptosystems in the development of digital signatures there is the
need for a certificate authority (CA) whose role is to evaluate the authenticity of the messages travelling
throughout the nodes of the network. The need for such an entity immediately rules out this option,
because this trusted entity would have to provide all the required certificates on the fly and, as
discussed in chapter 3, the chosen network topology does not include any permanent party on the field
other than the GD. RSA is the only asymmetric cryptosystem implemented in the device’s hardware
and there is currently no middleware for calling this mechanism from software applications.
Moreover, the memory overhead entailed by the usage of asymmetric cryptographic systems
and the increased key size make these types of systems unsuitable under the current restrictions,
either for authentication or privacy purposes, when compared with methods based on
cryptographic hash functions. Hence the choice of symmetric methods is considered to be the best
option.
                      CTR                                  CFB
Encryption            Parallelizable                       Non-parallelizable
Decryption            Parallelizable                       Parallelizable
Transmission errors   Only the wrong bits are affected     Affects the wrong bits in the current block; completely destroys the following blocks.
IV                    May be predictable, must be unique.  May be predictable, must be unique.
Table 4.1: Comparison between CTR and CFB features.
AES was the cipher chosen to provide secrecy to the confidential messages. Since it is a long-established
standard published by NIST, a very fast algorithm and hardware-implemented in the GDs, it is
the most suitable choice for the case. CTR was the mode of operation chosen to deal with messages of
variable length and it turns the encryption scheme into a stream cipher³. Since ECB is not semantically
secure, choosing CTR over ECB was the right decision. As for the CBC mode, it requires an
unpredictable IV because it is susceptible to a predictable IV attack as described in algorithm 3, while
CTR only requires the IV to be unique, a relevant fact for the decision made because it improves the execution
time of the encryption procedures in devices for which power consumption is critical. Table 4.1
specifies some features of CTR and CFB in order to discuss their applicability to the presented problem
and, as one can observe, the critical consequences of transmission errors and the non-parallelizable
encryption are facts that lead to the rejection of CFB in favour of CTR.
As discussed so far, the CTR mode of operation is the best option among the classical block cipher modes
of operation ECB, CBC and CFB under the constraints at hand. The choice of CTR mode with an HMAC-256
checksum (CTR-H) over CCM and GCM is not such a straightforward outcome, since
the latter are two authenticated modes which have been well studied and perform better,
especially in terms of memory optimization; for CTR to achieve the same level of security as these
authenticated modes it will definitely perform poorly with respect to memory optimization due
to the added length of the MAC. Notwithstanding, it is in fact faster to execute the CTR-H mode of
operation than either of the other two modes because the former is implemented in the chosen device’s
hardware, unlike the latter; CCM is in fact a very slow mode due to its double block encryption
procedures. Also, note that in this case, because of the device’s energy consumption, it is preferable to
choose an encryption algorithm with lower execution time rather than lower memory usage: there is an upper
bound on the data gathering rate and the memory associated with the gathered messages is released
upon confirmation of their arrival at the recipient, meaning that the available 256kB suffices for the storage
of the messages. Moreover, the deployment of the devices onto the field of action requires not only
financial but also human resources, hence the durability of the devices is of utmost importance. Another
strong argument for the choice of CTR mode alongside an HMAC is the fact that it makes use of an
extra key (the authenticity and privacy keys are distinct) when compared with the authenticated modes
of operation GCM and CCM, which use a single key for the whole process. The devices’ short lifetime
entails a system that is highly resilient against brute-force attacks and is the feature that justifies the absence
of a key scheduler algorithm.
³ See section 2.6.6 for attacks on stream ciphers.
With this in mind the choice of CTR-H over all the other presented modes is considered to be suitable
for the situation, provided a good usage of the authentication mechanism, i.e., a usage that prevents an
attacker from exploiting weaknesses of the encryption scheme at hand, such as malleability.
4.1.3 Attacks
This subsection discusses some of the aspects that may be exploitable by an adversary in practice,
under the assumption that all the restrictions imposed on the network and its elements hold. In practice,
the GDs are deployed at a fixed location in order to keep on gathering data. It is then possible for an
adversary to physically tamper with the devices, which is why the data is encrypted in solid memory in the
first place. Let A be a polynomial-time bounded adversary with physical access to the GDs. If A tries
to read the memory with the aim of directly retrieving the secret data gathered by the GDs then he/she
will find it hard to do so, because these data are encrypted using the PSch PiS defined in equation
3.11, and as seen in section 4.1 these are semantically secure apart from chosen-ciphertext attacks.
Moreover, the adversary will not be able to retrieve the keys from memory because these are stored in a
memory location of restricted access [41]. Now assume that the GDs are deployed at time 0 and that at
time t there is an event E which will directly or indirectly provide information to be gathered by some GD.
The latter will get to know this information upon collecting the data within the time interval T = [t, t+ ε[,
where ε > 0 is the duration of the intelligence leak of the event E. Suppose also that A is aware of both
E and ε. Then, the adversary just needs to interfere with the envisaged GDs in the time interval T in
order to prevent them from collecting any of the relevant data. Thus, for an intelligent adversary who is
able to physically interfere with the GDs, this technique is less likely to be spotted by a tamper detection
mechanism, while allowing A to optimize his/her energy consumption on the attack.
Another attack that could be performed is the reading of volatile memory [42] by a capable adversary.
No message is ever stored as plaintext in solid memory, but both before encryption and after decryp-
tion, the confidential plaintext is automatically stored in volatile memory, even if for a short time frame.
Nonetheless, this time frame may suffice for the adversary to harvest the secret information.
Assuming that A cannot physically harm or interfere with the GDs, a straightforward attack would be
for the adversary to perform a MiM attack known as a wireless denial of service attack, which consists in
jamming the communication channel with junk data, making the exchange of data between the GDs and
the MPP impossible, provided that he finds the correct frequency. This method can be seen
as a last resort for an adversary who is unable to partially or completely break the cryptosystem, for it
would be in his interest to stealthily eavesdrop on the communications in order to eventually tamper with the
messages or acquire information and change his strategy accordingly.
4.2 Possible Solutions
Some suggested solutions for the problems discussed in the previous section are now presented.
It is important to note that these are just suggestions that aim to improve the selected choices. Even
though they may seem better from a theoretical point of view, it could happen that in practice these
approaches are not fit for a variety of possible reasons. The reader should notice that the possible
solutions to the issues related to the key generation stage have already been discussed in section
4.1.1.
A reasonable approach to prevent an attack where the adversary takes advantage of the physical
exposure of the device is to choose a device whose hardware provides a tamper detection mechanism
and memory management [43] such that in case of memory compromise it clears all the memory
associated with the confidential data. This is a last resort and leads to the disablement of the device
at hand. With respect to the jamming attack on the wireless communication channel, there is no way
to prevent it from happening. The only solution would be to wire the connection, where the adversary
would be unable to jam the channel without being targeted by the MPP.
4.2.1 Chosen-plaintext attack
Let x1, . . . , xk be confidential plaintexts for some k ∈ N and y1, . . . , yk their corresponding package
ciphertexts, respectively. Recall that, for all i ∈ I5, PiS is not secure against chosen-plaintext attacks in which
the adversary is able to pick plaintexts of distinct length. It is known from equation 3.1 that the length Lj
of confidential plaintext xj is a multiple of a minimum length l, i.e.,
$L_j = l \, m_j \quad \forall j \in I_k$ (4.4)
Furthermore, let h be the length of the header of a package ciphertext, which is assumed to be fixed for
any PSch, for simplicity. If one does not consider the header’s overhead, then it is equivalent
to partition each message of length Lj into mj messages of length l, and the system would become
resistant to variable-length chosen-plaintext attacks. This is a purely theoretical solution to this problem
because the assumption of the absence of the header does not hold in practice. The only way to
overcome this issue is to define all confidential plaintexts to have the same length.
4.2.2 Chosen-ciphertext attack
Proposition 4.1.5 refers to the fragility of the PSchs PiS with respect to chosen-ciphertext attacks.
This result holds under the assumption that the adversary A has access to a decryption oracle.
However, in practice there is no feasible way for A to be granted such resources. The only way A
would be able to succeed would be to impersonate an element of the network, use the correct⁴ PCgF
and then send it to the end party and be able to extract the plaintext from its volatile memory at the
moment of decryption. This approach is infeasible in practice because no element of the network can
be impersonated by a polynomial-time bounded adversary, as shown in proposition 4.1.2.
⁴ By correct one means the correct function using the correct keys.
Chapter 5
Implementation Details
This chapter covers the majority of the developed code by presenting pieces of pseudocode
and, in some cases, analysing their complexity. Two mock-up applications were created:
one for the key generation step and the other for the data flow of the discussed
network. Appendix B contains a user guide for the first. The application concerning the
communication is not presented in this text due to the confidential nature of its execution requirements.
The code for the GDs has been developed in the C programming language, whilst all the remaining code
is in Java. Low-level programming allows a more versatile working environment,
but this advantage is offset by the inherent difficulty of performance and memory optimization.
5.1 Key Generation
As described in section 3.4.1 all the keys are generated in the PMS. This process is triggered by a
user-input on the key generation program specified by Algorithm 5. The latter depends on Algorithm 4,
which is the atomic procedure for securely generating keys of a given size.
The PSK for the WPA2 protocol is generated as in section 2.5.2:
PBKDF2(HMAC-SHA-1, wipass, ssid, 4096, 32) (5.1)
where ssid is the wireless network’s SSID and wipass is the password shared by the authorized parties,
i.e., the GD and the MPP. The primitives within these two devices used for the WPA2 secured wireless
network require the SSID and password as input, since the generation of the PMK occurs internally.
Hence, for the generation of the Wi-Fi pre-shared key it suffices for the tuple (ssid, wipass) to be provided.
Therefore the trusted user starts by inputting the SSID and password for the WLAN as well as the
number of GDs and a seed. Informally, the seed can be seen as a password that makes
the pseudo-random key generation process behave deterministically, meaning that knowledge of
that seed may serve as proof of the authenticity of the keys and validate any future report, if needed.
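As an illustration of equation 5.1, the derivation can be reproduced with the PBKDF2 implementation in Python's standard library. This is only a sketch: the function name wpa2_psk and the example SSID and password are placeholders, and UTF-8 encoding of both strings is an assumption.

```python
import hashlib

def wpa2_psk(wipass: str, ssid: str) -> bytes:
    """PBKDF2(HMAC-SHA-1, wipass, ssid, 4096, 32) as written in equation 5.1."""
    return hashlib.pbkdf2_hmac(
        "sha1",                   # underlying PRF: HMAC-SHA-1
        wipass.encode("utf-8"),   # password shared by the GD and the MPP
        ssid.encode("utf-8"),     # the SSID plays the role of the salt
        4096,                     # iteration count
        32,                       # output length in octets
    )

# Placeholder values, not taken from the deployed network.
psk = wpa2_psk("example passphrase", "ExampleNet")
print(psk.hex())
```

Since the derivation is deterministic, both devices obtain the same 32-octet value from the tuple (ssid, wipass) alone.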
Let $\Sigma_K = \mathbb{Z}_{256}$ be the key alphabet and consider $A = \{k : k \in \Sigma_K^{16}\}$ and $H = \{k : k \in \Sigma_K^{32}\}$ to be
the sets of keys generated for encryption and authentication, respectively. The key generation process
for the elements of A and H is very similar, with the exception of the key length. According to [44], the
length of the keys in H must be at least equal to the length of the output of the hash function used, i.e.,
32 octets.
Algorithm 4 contains a high-level representation (in pseudocode) of the generation step for the encryption
and authentication keys. The main layout of the algorithm is as follows: a customized process is
applied to the input secret in order to output a value that will be used as a seed to the cryptographic hash
function SHA-1. This function is used as a PRF in order to produce the salt used in the PBKDF2 function
that generates the key. The two main methods of the algorithm are explained next.
InitRand(pass) is a customized deterministic procedure that makes use of the user-input pass ∈ Σ∗
in order to seed a CSPRNG (namely the SHA-1 function) which produces the salt used in the password-
based key derivation function. The general idea behind it is to process the input in order to increase
the password-complexity in such a way that an adversary who is able to obtain the password chosen by
the user does not know immediately the seed used in the generation of the salt. Let $n = \lfloor l_2/32 \rfloor$ be the
number of 32-bit blocks of pass where $l_2$ is the binary length of pass, let $p_i$ be the binary representation
of the $i$th block of 32 bits $\forall i \in I_n$ and let $p_{n+1}$ be the last block with $k$ bits, for some $0 \le k < 32$. It
is useful to convert each of the blocks $p_i$ into their decimal representation $x_i \in \mathbb{Z}_{256}$ to perform the
necessary addition operations. The seed $s$ of the SHA-1 pseudo-random function is computed out of
two sub-seeds $s_1$ and $s_2$. The first is given by:
$s_1 = \sum_{i=1}^{n} x_i + x^*_{n+1}$ (5.2)
where $x^*_{n+1}$ is the decimal representation of $p_{n+1} \,\Vert\, 0^m$ for $m = 32 - k$ whenever $k > 0$, or $x^*_{n+1} = 0$ if
$k = 0$. Figure 5.1 describes the aforementioned method.
Figure 5.1: Pre-processing steps of the secret pass for the generation of the seed of the SHA-1 pseudo-random
function. The character string pass is converted into a binary string and split into 32-bit blocks
p1, . . . , pn plus a last block pn+1 of k bits (0 < k < 32), which is padded to 32 bits as pn+1 ‖ 0m; the decimal
values x1, . . . , xn and x∗n+1 are then added to obtain s1.
As for the computation of the sub-seed s2, let l be the length of pass with respect to elements of Σ
and consider
$X = \left( \big\Vert_{i=1}^{n} x_i \right) \Vert\, x^*_{n+1}$ (5.3)
Clearly, X ∈ Z is a value whose number of digits is greater than or equal to l because in the worst case
each xi contains a single digit, which means that one can compute the second sub-seed as $s_2 = X|_l$
and the value used for seeding the SHA-1 cryptographic hash function is given by
$s = s_1 + s_2$ (5.4)
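The pre-processing behind equations 5.2 to 5.4 can be sketched as follows. Two points are assumptions rather than statements of the text: each character of pass is taken to occupy 8 bits (ASCII), and the restriction of X to l is read as keeping the first l decimal digits of X. The name init_rand_seed is hypothetical.

```python
def init_rand_seed(password: str) -> int:
    """Seed s = s1 + s2 following equations 5.2 to 5.4 (under the stated assumptions)."""
    data = password.encode("ascii")                 # assumption: 8 bits per character
    bits = "".join(f"{byte:08b}" for byte in data)  # binary string of pass
    n, k = divmod(len(bits), 32)                    # n full 32-bit blocks, k leftover bits
    xs = [int(bits[32 * i:32 * (i + 1)], 2) for i in range(n)]
    # x*_{n+1}: leftover bits right-padded with zeros to 32 bits, or 0 when k == 0
    x_last = int(bits[32 * n:].ljust(32, "0"), 2) if k > 0 else 0
    s1 = sum(xs) + x_last                           # equation 5.2
    big_x = "".join(str(x) for x in xs) + str(x_last)   # equation 5.3, as a digit string
    s2 = int(big_x[:len(password)])                 # assumption: first l digits of X
    return s1 + s2                                  # equation 5.4

print(init_rand_seed("abcdefgh"))
```

For printable ASCII input every full block has nine or ten decimal digits, so X always has at least l digits and the truncation is well defined for the admissible password lengths.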
GenerateSalt() is a procedure that calls the previously seeded SHA-1 CSPRNG in order to produce
a value composed of 16 octets. The value output by this method is used as salt for the PBKDF2
function.
Algorithm 4 Key generation for a fixed length
1: procedure KEYGEN(n, pass, len) ▷ Generating n keys with len bits under the password pass
2: keystream← null;
3: InitRand(pass);
4: for r = 1 to n do
5: salt← GenerateSalt();
6: key ← PBKDF2(hmac-sha-1, pass, salt, 10000, len);
7: keystream← keystream ‖ key;
8: end for
9: return keystream;
10: end procedure
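A minimal Python sketch of Algorithm 4 is given below. The internals of InitRand and GenerateSalt are not reproduced here: the salt source is a hypothetical stand-in that hashes an integer seed together with a counter using SHA-1 and keeps 16 octets, and the seed is passed explicitly instead of being derived from pass.

```python
import hashlib

def make_salt_source(seed: int):
    """Hypothetical stand-in for InitRand/GenerateSalt: SHA-1 over the seed and a counter."""
    counter = 0

    def generate_salt() -> bytes:
        nonlocal counter
        counter += 1
        material = f"{seed}:{counter}".encode("ascii")
        return hashlib.sha1(material).digest()[:16]   # keep 16 of the 20 output octets

    return generate_salt

def keygen(n: int, password: str, length_bits: int, seed: int) -> bytes:
    """Generate n keys of length_bits bits each and return them concatenated."""
    generate_salt = make_salt_source(seed)            # plays the role of InitRand(pass)
    keystream = b""
    for _ in range(n):
        salt = generate_salt()
        key = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                                  10000, length_bits // 8)
        keystream += key
    return keystream
```

Because every step is deterministic, calling keygen twice with the same arguments reproduces the same keystream, which is the property the seed is meant to provide.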
After calling GLOBALKEYGEN(n, pass) there are 3n + 2 keys in stack memory, out of which the first
n+1 are designated to be used by the AES algorithm and the remaining 2n+1 to be used in the HMAC-
SHA-256 algorithm. Let k1, . . . , k3n+2 be all the keys generated, ordered as they were generated, i.e.,
k1 was the first and k3n+2 the last. Then, the following correspondences have been defined:
• (kGMPA)i := ki for 1 ≤ i ≤ n.
• kMPMA := kn+1.
• (kGMPH)i := kj ∀i,j : (n + 2) ≤ j ≤ (2n + 1) and i = j − (n + 1).
• (kGMDH)i := kj ∀i,j : (2n + 2) ≤ j ≤ (3n + 1) and i = j − (2n + 1).
• kMPMH := k3n+2.
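The correspondences above amount to slicing the ordered list k1, . . . , k3n+2. The sketch below uses zero-based Python indices and hypothetical dictionary labels; the fourth group is labelled k_GMD_H on the assumption that it holds the GD to MnDM authentication keys used in algorithms 12 and 14.

```python
def assign_keys(keys, n):
    """Split the ordered keys k_1..k_{3n+2} into the five groups listed above."""
    if len(keys) != 3 * n + 2:
        raise ValueError("expected exactly 3n + 2 keys")
    return {
        "k_GMP_A": keys[0:n],                  # k_1 .. k_n        (AES, one per GD)
        "k_MPM_A": keys[n],                    # k_{n+1}           (AES, MPP and MnDM)
        "k_GMP_H": keys[n + 1:2 * n + 1],      # k_{n+2} .. k_{2n+1}
        "k_GMD_H": keys[2 * n + 1:3 * n + 1],  # k_{2n+2} .. k_{3n+1} (assumed label)
        "k_MPM_H": keys[3 * n + 1],            # k_{3n+2}
    }

# With n = 2 and the keys replaced by their 1-based positions:
print(assign_keys(list(range(1, 9)), 2))
```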
Algorithm 5 Key generation algorithm
1: procedure GLOBALKEYGEN(n, pass) ▷ Generating the 3n + 2 keys for n GDs under the password pass
2: a← KEYGEN(n+ 1, pass, 128);
3: b← KEYGEN(2n+ 1, pass, 256);
4: return a ‖ b;
5: end procedure
Considering n to be the total number of GDs, the time complexity of algorithm 4 is O(n). In fact,
note that InitRand(pass) ∈ O(m) for m = |pass|. However, the restriction imposed on the password
(8 ≤ |pass| ≤ 63) entails an upper bound on the run time of this procedure, independently of the chosen
password. Thus, under this constraint, the running time of InitRand is upper bounded by a
constant, i.e., O(1). The password-based key derivation function PBKDF2 runs in time O(1), since the
iteration count and the key length are fixed, therefore the
n-step cycle will run in time O(n). The plots in figures 5.2a and 5.2b show how the time spent in the
key generation process grows with the total number of GDs. The
y-axis contains the average time for the generation of the key set and the x-axis the number of GDs;
each point in figures 5.2a and 5.2b is the average of 100 and 20 measurements, respectively. The first
plot shows the results for a realistic deployment, which would become impractical
if the number of deployed devices exceeded the largest value shown (100),
and the second shows the results for a theoretical large number of GDs (5000).
Figure 5.2: Scatter plots of the average key generation time per number of gathering devices: (a) up to 100 GDs; (b) up to 5000 GDs.
Note that there is a slight variation in the average key generation time when the number of
GDs is near 85, depicted in figure 5.2a. This variation can be explained by the usage of the CPU by
external processes (e.g. operating system processes). Both scatter plots are in agreement with
the stated time complexity for the generation of the keys. Of course, these plots do not prove the
asymptotically linear time complexity but give an insight into what happens in practice.
5.2 Data Processing
In order to make use of the keys generated according to the previous section, all the elements of the
network must agree on algorithms that pack and unpack the data into each of the formats Fi, i ∈ I5,
while making the required verifications. These can be seen as implementations based on the protocol
described in section 3.4.2.
Let I = {k ∈ Z : k is a GD identifier} and consider the function valid : Z → Z2 given by
$\mathrm{valid}(f) = \begin{cases} 1 & \text{if } f \in I \\ 0 & \text{otherwise} \end{cases}$ (5.5)
Clearly |I| equals the total number of deployed GDs, meaning that there are only |I| values
in the domain of valid for which this function returns 1. This function is very useful in the upcoming
discussion because it determines whether a given 4-octet field represents a valid gathering device’s
identifier; the associated message should be discarded whenever it returns 0. Moreover, let
$\text{CTR-ENC}^{C}_{v}(k, x)$ be the result of the encryption of x with the block cipher mode of operation CTR using
the block cipher C, initialization vector v and key k, where the initial blocks given as input to C are
built according to equation 2.18. Analogously, $\text{CTR-DEC}^{C}_{v}(k, y)$ represents the decryption operation on y.
Lastly, let HMAC-SHA-256(k,m) be the result of the HMAC-SHA-256 function applied to m using k as key,
in accordance with equation 2.22.
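Since the CTR mode is defined for an arbitrary block cipher C, its structure can be sketched independently of AES. In the sketch below the block function is a placeholder (a truncated HMAC-SHA-256, which is not AES and is used only because CTR needs the forward direction of C), and the counter blocks are assumed to be v, v + 1, v + 2, . . . modulo 2^128; the exact layout of equation 2.18 is not reproduced.

```python
import hashlib
import hmac

BLOCK = 16  # block size in octets (128 bits, as for AES)

def stand_in_block(key: bytes, block: bytes) -> bytes:
    """Placeholder for the block cipher C. This is NOT AES."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def ctr_crypt(key: bytes, iv: bytes, data: bytes, block_fn=stand_in_block) -> bytes:
    """CTR mode: XOR the data with C_k(v), C_k(v + 1), ... (assumed counter layout)."""
    counter = int.from_bytes(iv, "big")
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        pad = block_fn(key, (counter % 2**128).to_bytes(BLOCK, "big"))
        chunk = data[offset:offset + BLOCK]
        out.extend(a ^ b for a, b in zip(chunk, pad))   # last chunk may be shorter
        counter += 1
    return bytes(out)

# Encryption and decryption are the same operation in CTR mode.
ctr_enc = ctr_dec = ctr_crypt
```

The ciphertext has exactly the length of the plaintext, which is why the variable plaintext length discussed in section 4.2.1 is visible to an eavesdropper.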
5.2.1 Upstream Algorithm Lifecycle
The algorithms that represent the implementation of the data processing throughout the UDL are
now illustrated and discussed. After gathering the relevant data the GDs will proceed to its processing,
which is described in algorithm 6.
GenIvGd takes as input n ∈ N and generates an array with n bits according to the standard incre-
menting function, starting with a hardcoded value for the first initialization vector. That is, whenever a
new message is processed by the GD, the new IV vnew is computed as follows:
$[v_{new}]_2 = [v_{old}]_2 + 1$ (5.6)
This procedure is done in time O(m), where m is the size of the initialization vector. Since the latter
is always fixed independently of the size of the confidential plaintext given as input to the procedure
PACKGD, the GenIvGd method runs in constant time, i.e., O(1).
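The incrementing function of equation 5.6 can be sketched as an addition on the integer value of the IV. The wrap-around to zero after the all-ones value is an assumption, as is the big-endian interpretation of the bit array; the name next_iv is hypothetical.

```python
def next_iv(old_iv: bytes) -> bytes:
    """Return the IV whose binary value is one more than old_iv (equation 5.6)."""
    size = len(old_iv)
    value = (int.from_bytes(old_iv, "big") + 1) % (1 << (8 * size))  # assumed wrap-around
    return value.to_bytes(size, "big")

iv = bytes(16)        # placeholder for the hardcoded first 128-bit IV
iv = next_iv(iv)      # IV used for the next message
print(iv.hex())
```

The cost depends only on the fixed IV size, in line with the constant running time stated above.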
PACKGD runs in time O(n) where n is the size of its input, because the methods HMAC-SHA-256 and
CTR-ENC run in time O(n), as well as the concatenation operator.
Algorithm 6 GD’s data packing algorithm
Input: x ▷ Confidential plaintext
Output: Ej1(x) ▷ For some j ∈ I
1: procedure PACKGD(x) ▷ Pack the data into format F1
2: h1 ← HMAC-SHA-256((kGMPH)j, x);
3: h2 ← HMAC-SHA-256((kGMDH)j, x);
4: inner ← h1 ‖ h2 ‖ x;
5: v ← GenIvGd(128);
6: enc ← CTR-ENCAESv((kGMPA)j, inner);
7: return fid ‖ mid ‖ v ‖ enc;
8: end procedure
Upon taking as input a confidential plaintext, the algorithm PACKGD transforms it into an element
of S1 using the keys that are stored in the GD’s solid memory with restricted access. After the data
has been packed it is sent to the MPP and upon reception the algorithm PROCESSMPPGD,
illustrated in Algorithm 10, is triggered. The latter makes a call to each of the procedures described in algorithms 7
to 9.
Algorithm 7 MPP’s data unpacking algorithm: sender GD
1: procedure UNPACKMPPGD(y) ▷ y ∈ S1
2: t ← Parse(y, F1); ▷ t is a 4-tuple
3: if !CheckMsgId(t[1]) or valid(t[0]) == 0 then
4: return error;
5: else
6: p ← CTR-DECAESt[2]((kGMPA)j, t[3]); ▷ The key (kGMPA)j follows from t[0] in time O(1)
7: end if
8: return (t, p);
9: end procedure
The first procedure called by Algorithm 10 is UNPACKMPPGD, which receives as input y ∈ S1 and
returns a 2-tuple (t, p) such that t is the 4-tuple for which each element corresponds to the parsed
principal components of y (represented in step 7 of algorithm 6) and p is the result of the decryption of
the body¹ of y.
Parse(x, f) is a function that parses the main components of x according to format f. The first of
the input parameters, x, is the piece of data to be parsed and f is the message format to be considered,
i.e., $f \in F$, where $F = \{F_i\}_{i \in I_5} \cup \{F^*_i\}_{i \in I_5}$. It returns a tuple such that the jth element of the tuple
corresponds to the jth element according to the format f. The time complexity for the parsing of the
word x is linear in the size of the input, that is O(n) where n is the length of x.
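A format-driven parser of this kind can be sketched as a split by fixed field widths, with the last field taking whatever remains. The widths used for F1 below are partly assumptions: the 4-octet identifier and the 128-bit IV follow from the text, whereas the 4-octet width of the message identifier mid is hypothetical.

```python
from typing import Sequence, Tuple

# Field widths in octets for F1: fid, mid, v; the ciphertext takes the rest.
F1_WIDTHS = (4, 4, 16)   # the width of mid is an assumption

def parse(data: bytes, widths: Sequence[int]) -> Tuple[bytes, ...]:
    """Split data into fixed-width fields followed by one variable-length field."""
    fields = []
    pos = 0
    for width in widths:
        if pos + width > len(data):
            raise ValueError("input too short for this format")
        fields.append(data[pos:pos + width])
        pos += width
    fields.append(data[pos:])          # variable-length tail (e.g. the ciphertext)
    return tuple(fields)

packet = bytes([0, 0, 0, 7]) + bytes([0, 0, 0, 1]) + bytes(16) + b"ciphertext"
fid, mid, v, enc = parse(packet, F1_WIDTHS)
```

Each octet of the input is copied once, which matches the linear running time stated above.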
CheckMsgId(m) is a function that looks for the value m in a look-up table and in case the value is
found it returns false, otherwise returns true. This table contains all the identifiers of the messages
already received and serves a memory management purpose, especially useful at the GDs for memory
optimization. The time complexity for the search of a value in the table is O(1) on average.
Moreover, valid is also executed in constant time. Therefore, algorithm 7 runs in time O(n).
¹ By ’body’ one means apart from the header.
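The two constant-time filters can be sketched with hash-based sets. The class and method names are hypothetical, and so are two behaviours: a fresh message identifier is recorded when it is first checked, and the sender identifier is tested before the message identifier so that packets from unknown senders do not fill the table.

```python
class MessageFilter:
    """Sketch of valid (equation 5.5) and CheckMsgId backed by hash sets."""

    def __init__(self, gd_identifiers):
        self._known_gds = frozenset(gd_identifiers)   # the set I
        self._seen_ids = set()                        # look-up table of message ids

    def valid(self, fid: int) -> int:
        return 1 if fid in self._known_gds else 0

    def check_msg_id(self, mid: int) -> bool:
        if mid in self._seen_ids:
            return False              # identifier already received
        self._seen_ids.add(mid)       # assumption: record the fresh identifier here
        return True

    def accept(self, fid: int, mid: int) -> bool:
        """Condition under which UNPACKMPPGD proceeds to decryption."""
        return self.valid(fid) == 1 and self.check_msg_id(mid)

flt = MessageFilter({10, 11})
```

Set membership in Python is O(1) on average, which agrees with the stated cost of the table search.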
Algorithm 8 Authenticity and integrity verification algorithm
Input: (h, x, k)
Output: true or false ▷ Success or failure, respectively
1: procedure VERIFY(h, x, k) ▷ Verifies whether h is an HMAC of x using key k
2: h∗ ← HMAC-SHA-256(k, x);
3: if h∗ == h then
4: return true;
5: else
6: return false;
7: end if
8: end procedure
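The check performed by VERIFY can be sketched with Python's hmac module. The comparison below uses hmac.compare_digest, a constant-time comparison; this is a choice made for the sketch and is not claimed to be part of the original implementation.

```python
import hashlib
import hmac

def verify(tag: bytes, message: bytes, key: bytes) -> bool:
    """Return True when tag is the HMAC-SHA-256 of message under key."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # timing-independent comparison

key = bytes(range(32))                          # placeholder 32-octet key
tag = hmac.new(key, b"sensor reading", hashlib.sha256).digest()
print(verify(tag, b"sensor reading", key))
```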
Upon performing the required decryption and splitting the package into its main components, it is
necessary for the MPP to call VERIFY in order to make the required authenticity and integrity checks.
If this verification succeeds then it can proceed with building the final message Ej4(x), where x is the
confidential plaintext sent by the GD with identifier j ∈ I. To this end, a call to PACKMPPMNDM is made.
Algorithm 9 MPP’s data packing algorithm: recipient MnDM
1: procedure PACKMPPMNDM(z) ▷ Desired recipient: MnDM
2: v ← PrGenIv(128);
3: e← CTR-ENCAESv (kMPMA , z);
4: h← HMAC-SHA-256(kMPMH , e);
5: return h ‖ v ‖ e;
6: end procedure
PrGenIv(n) is a method that uses the function SHA-1 as a PRNG to construct a unique² n-bit
value and its execution time is linear in the size of the input. PACKMPPMNDM calls this method in order
to generate the IV used in the AES-CTR encryption. As one can easily observe PACKMPPMNDM ∈ O(n),
where n is the size of the input.
Thus, the whole process of unpacking and verifying the data that reaches the MPP and is sent from
a GD is performed in time O(n), where n is the size of the data given as input to PROCESSMPPGD.
² For a negligible probability of collision.
Algorithm 10 MPP’s data processing algorithm: recipient MnDM
Input: y ▷ y ∈ S1
Output: z ▷ z ∈ S4
1: procedure PROCESSMPPGD(y)
2: (t, p) ← UNPACKMPPGD(y);
3: u ← Parse(p, F∗1);
4: if VERIFY(u[0], u[2], (kGMPH)j) then
5: r ← t[0] ‖ t[2] ‖ u[0] ‖ u[1] ‖ t[3];
6: return PACKMPPMNDM(r);
7: else
8: return error;
9: end if
10: end procedure
Subsequent to the construction of the package, its delivery to the envisaged recipient takes place.
Upon the arrival of the message, the MnDM executes the steps specified in Algorithm 13 in order to
process and extract the confidential plaintext. This procedure comprises two methods, UNPACKMNDM
and VERIFYMNDM, specified in algorithms 11 and 12, respectively.
Algorithm 11 MnDM’s data unpacking algorithm
Input: y ▷ y ∈ S4
Output: z ▷ z ∈ S∗4
1: procedure UNPACKMNDM(y)
2: u ← Parse(y, F4);
3: if VERIFY(u[0], u[2], kMPMH) then
4: return CTR-DECAESu[1](kMPMA, u[2]);
5: else
6: return error;
7: end if
8: end procedure
In the UNPACKMNDM method, first the data is unpacked into its principal components (see figure
C.8), then the authenticity and integrity verifications take place and lastly a decryption is performed to
retrieve the plain data within the outer layer of encryption, with format F∗4 (see figure C.7). This procedure
runs in time O(n).
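The outer layer built by PACKMPPMNDM and removed by UNPACKMNDM follows an encrypt-then-MAC pattern, which can be sketched end to end as below. Several elements are stand-ins and not the thesis implementation: the cipher is a SHA-256 based keystream used in place of AES-CTR, the IV comes from the operating system instead of the SHA-1 based PrGenIv, and the function names are hypothetical. As in algorithm 9, the tag covers the ciphertext e only.

```python
import hashlib
import hmac
import secrets

TAG, IV = 32, 16   # sizes in octets: HMAC-SHA-256 tag and 128-bit IV

def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR (NOT AES): XOR with SHA-256(key || iv || counter) blocks."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + iv + (i // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def pack_outer(k_a: bytes, k_h: bytes, z: bytes) -> bytes:
    v = secrets.token_bytes(IV)                       # stand-in for PrGenIv(128)
    e = keystream_xor(k_a, v, z)                      # encrypt first
    h = hmac.new(k_h, e, hashlib.sha256).digest()     # then authenticate the ciphertext
    return h + v + e

def unpack_outer(k_a: bytes, k_h: bytes, y: bytes) -> bytes:
    if len(y) < TAG + IV:
        raise ValueError("package too short")
    h, v, e = y[:TAG], y[TAG:TAG + IV], y[TAG + IV:]
    expected = hmac.new(k_h, e, hashlib.sha256).digest()
    if not hmac.compare_digest(h, expected):          # verify before decrypting
        raise ValueError("authentication failed")
    return keystream_xor(k_a, v, e)
```

Verifying the tag before decrypting means that a modified package is rejected without any decryption taking place.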
Algorithm 12 MnDM’s data verification algorithm
Input: y ▷ y ∈ S∗4
Output: t ▷ 3-tuple corresponding to the decrypted part of F∗4
1: procedure VERIFYMNDM(y)
2: u ← Parse(y, F∗4);
3: d ← CTR-DECAESu[1]((kGMPA)j, u[4]);
4: t ← Parse(d, F∗1);
5: if t[0] != u[2] or t[1] != u[3] or !VERIFY(t[0], t[2], (kGMPH)j) or !VERIFY(t[1], t[2], (kGMDH)j) then
6: return error;
7: else
8: return t;
9: end if
10: end procedure
The VERIFYMNDM procedure expects an input with format F∗4; any other input will
result in an error message and the input data will be discarded. This function implements the
behaviour presented in steps 23 to 27 of the UDL in section 3.4.2. Moreover, VERIFYMNDM ∈ O(n).
Algorithm 13 MnDM’s data processing algorithm
Input: y ▷ y ∈ S4
Output: x ▷ Confidential plaintext
1: procedure PROCESSMNDM(y)
2: o ← UNPACKMNDM(y);
3: t ← VERIFYMNDM(o);
4: return t[2];
5: end procedure
As the name states, PROCESSMNDM ∈ O(n) combines algorithms 11 and 12 in order to process the
data that arrives at the mission and data manager entity. Figure 5.3 contains a high-level visualization of
the interaction between the specified algorithms used in the UDL, hence the name Upstream
Algorithm Lifecycle (UAL).
Figure 5.3: Upstream Algorithm Lifecycle. PACKGD, followed by PROCESSMPPGD (UNPACKMPPGD, VERIFY,
PACKMPPMNDM), followed by PROCESSMNDM (UNPACKMNDM, VERIFYMNDM).
5.2.2 Downstream Algorithm Lifecycle
In this section the algorithms that represent the implementation of the data processing throughout
the DDL are described.
The MnDM generates a command message, processes it according to algorithm 14 and sends it to the
envisaged recipient (MPP).
Algorithm 14 MnDM’s data packing algorithm
Input: x ▷ Confidential plaintext
Output: y ▷ y ∈ S5
1: procedure PACKMNDM(x)
2: h1 ← HMAC-SHA-256((kGMDH)j, x);
3: inner ← h1 ‖ x;
4: v1 ← PrGenIv(128);
5: e ← CTR-ENCAESv1((kGMPA)j, inner);
6: h2 ← HMAC-SHA-256(kMPMH, e);
7: v2 ← PrGenIv(128);
8: outer ← h2 ‖ fid ‖ v1 ‖ e;
9: r ← CTR-ENCAESv2(kMPMA, outer);
10: return v2 ‖ r;
11: end procedure
PACKMNDM treats the input x as the confidential plaintext and builds Ej5(x). This algorithm runs in
time O(n) where n is the size of x, due to the time complexity of all the inherent methods, which are
described in section 5.2.1.
Algorithm 15 MPP’s data unpacking algorithm: sender MnDM
Input: y ▷ y ∈ S5
Output: p or error ▷ p ∈ S∗∗5
1: procedure UNPACKMPPMNDM(y)
2: t ← Parse(y, F5);
3: d ← CTR-DECAESt[0](kMPMA, t[1]);
4: f ← Parse(d, F∗5);
5: if !VERIFY(f[0], f[3], kMPMH) or valid(f[1]) == 0 then
6: return error;
7: else
8: return CTR-DECAESf[2]((kGMPA)j, f[3]);
9: end if
10: end procedure
Upon receiving the data, the MPP will make a call to the first procedure represented in algorithm
17. PROCESSMPPMNDM ∈ O(n) calls two methods (presented in algorithms 15 and 16), both with time
complexity O(n), where n is the size of the input. Algorithm 15 contains the pseudocode associated with
the steps required for unpacking the data with format F5 into the plaintext with format F∗∗5 and algorithm
16 contains the packing method for the MPP, which takes as input a 2-tuple and packs the first entry of
that tuple into format F2 or F3, depending on the second element of the input.
Algorithm 16 MPP’s data packing algorithm: recipient GD
Input: (x, flag)
Output: y ▷ y ∈ S2 or y ∈ S3, depending on flag
1: procedure PACKMPPGD(x, flag)
2: if flag == 1 then
3: t ← Parse(x, F∗∗5);
4: h1 ← HMAC-SHA-256((kGMPH)j, t[1]);
5: else
6: h1 ← HMAC-SHA-256((kGMPH)j, x);
7: end if
8: inner ← flag ‖ h1 ‖ x;
9: v ← PrGenIv(128);
10: e ← CTR-ENCAESv((kGMPA)j, inner);
11: mid ← mid + 1; ▷ Global value
12: return mid ‖ v ‖ e;
13: end procedure
However, it could also happen that the MPP generates the command message x itself, instead of
acting solely as a communication bridge between the GD and MnDM. In this situation, the procedure
PROCESSMPP ∈ O(n) described in algorithm 17 is called and returns a package ciphertext belonging
to the set S2.
Algorithm 17 MPP’s data processing algorithm
Input: x
Output: y ▷ y ∈ S2 or y ∈ S3, depending on the procedure called
1: procedure PROCESSMPPMNDM(x) ▷ Sender MnDM
2: p ← UNPACKMPPMNDM(x);
3: return PACKMPPGD(p, 1); ▷ y ∈ S3
4: end procedure
1: procedure PROCESSMPP(x) ▷ MPP generates the confidential plaintext x
2: return PACKMPPGD(x, 0); ▷ y ∈ S2
3: end procedure
Regardless of the method used by the MPP to process the data, it will send it to the envisaged GD
and upon reaching the recipient GDi the procedure described in algorithm 20 is triggered. PROCESSGD
∈ O(n), where n is the size of the input data that was received through the asynchronous communication
channel established between the MPP and the GD. This method calls the procedures described in
algorithms 18 and 19.
Algorithm 18 GD’s data unpacking algorithm
Input: y
Output: d or error ▷ d ∈ S∗2 or d ∈ S∗3, depending on y
1: procedure UNPACKGD(y)
2: t ← Parse(y, F2); ▷ Parsing with format F3 would have the same effect
3: if !CheckMsgId(t[0]) then
4: return error;
5: end if
6: return CTR-DECAESt[1] ((kGMPA )j , t[2]);
7: end procedure
UNPACKGD parses the input with format F2 into its main components, decrypts the ciphered compo-
nent and returns the obtained plaintext; it runs in time O(n). Note that parsing for F3 achieves the same
result since these two formats only differ in the inner format layers F∗2 and F∗3 .
Algorithm 19 GD’s verification algorithm
Input: d
Output: x or error ▷ Confidential plaintext
1: procedure VERIFYGD(d)
2: if msb(d) == 1 then
3: t ← Parse(d, F∗3);
4: if VERIFY(t[1], t[3], (kGMPH)j) and VERIFY(t[2], t[3], (kGMDH)j) then
5: return t[3];
6: end if
7: else
8: t ← Parse(d, F∗2);
9: if VERIFY(t[1], t[2], (kGMPH)j) then
10: return t[2];
11: end if
12: end if
13: return error;
14: end procedure
After the ciphered contents are revealed, the system proceeds to the required verification. To this
end, the procedure VERIFYGD ∈ O(n) described in algorithm 19 is called.
msb(d) returns the most significant bit of the word d and is performed in constant time (O(1)) because
it only needs to extract the first bit in memory from the required field.
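The dispatch on the most significant bit can be sketched as follows. The layout is an assumption made for illustration: the flag is taken to occupy one octet (0x80 for flag 1, 0x00 for flag 0) and each tag 32 octets. The second tag of the MnDM-originated format is checked under a separate GD to MnDM key, on the assumption that this is the key with which algorithm 14 produced it. None is returned in place of error.

```python
import hashlib
import hmac

TAG = 32  # HMAC-SHA-256 output length in octets

def msb(word: bytes) -> int:
    """Most significant bit of the first octet of word."""
    if not word:
        raise ValueError("empty word")
    return word[0] >> 7

def _tag_ok(tag: bytes, message: bytes, key: bytes) -> bool:
    return hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest())

def verify_gd(d: bytes, k_gmp_h: bytes, k_gmd_h: bytes):
    """Return the confidential plaintext, or None when verification fails."""
    if msb(d) == 1:                                    # command issued by the MnDM
        if len(d) < 1 + 2 * TAG:
            return None
        h1, h2, x = d[1:1 + TAG], d[1 + TAG:1 + 2 * TAG], d[1 + 2 * TAG:]
        return x if _tag_ok(h1, x, k_gmp_h) and _tag_ok(h2, x, k_gmd_h) else None
    if len(d) < 1 + TAG:                               # command issued by the MPP
        return None
    h1, x = d[1:1 + TAG], d[1 + TAG:]
    return x if _tag_ok(h1, x, k_gmp_h) else None
```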
Algorithm 20 Gathering device’s data processing algorithm
Input: y
Output: x or error ▷ Confidential plaintext
1: procedure PROCESSGD(y)
2: d← UNPACKGD(y);
3: return VERIFYGD(d);
4: end procedure
Figure 5.4 contains a high-level visualization of the interaction between the specified algorithms used
in the DDL, hence the name Downstream Algorithm Lifecycle (DAL).
Figure 5.4: Downstream Algorithm Lifecycle. PACKMNDM, followed by PROCESSMPPMNDM (UNPACKMPPMNDM,
PACKMPPGD), followed by PROCESSGD (UNPACKGD, VERIFYGD).
Chapter 6
Results
With this work one can easily observe that what is better in theory may not always be more suitable
for the specific practical case at hand, where the real constraints must be thoroughly taken into account.
The study, decisions and analysis of this specific network were performed under the supervision of
analysts and developers of the company GMVIS Skysoft, S.A.
The key generation process is very reliable in the sense that it is not only performed within secured
headquarters but is also very efficient with regard to security and time. More specifically, it is a
linear-time process with respect to the number of gathering devices that are to be deployed.
The selected network topology is considered to be the one that best suits the practical needs of
the mission, whilst in theory an ad-hoc network might have a better performance when combined with
elliptic curves [45].
The set of chosen packing schemes is considered to be a robust and secure option for the case,
but would achieve a higher level of security by adopting an encrypt-then-MAC method of encryption
and authentication, in addition to including the header in the input to the HMAC; this approach would
ensure that the system is IND-CCA secure. However, even though it would strengthen the theoretical
level of security, it would have no impact in practice because the variable size of the plaintexts induces an
inexorable fragility. As for the encryption scheme, AES-GCM should be preferred over AES-CTR-HMAC
in order to grant authenticity, integrity and privacy to the plaintexts from a theoretical point of view. The
former was ruled out simply because it is not implemented in the hardware of this particular type of
devices. Had any other devices with distinct characteristics been chosen, the outcome would
certainly differ from the one presented. All packing schemes are vulnerable to chosen-ciphertext attacks,
which is a fact of some concern because an attacker with access to a decryption oracle might be able to
break the system, even if just partially. There is virtually no way of preventing an adversary from performing
a lunchtime attack [40] when the devices are in sleep mode.
All the data processing methods are linear in the size of the input. Note that the size of each of the
inputs to the data processing algorithms described in section 5.2 depends on the size of the confidential
plaintext. It is indeed the only dependency since the size of the confidential plaintext is the only
variable term when computing the size of each of the message formats. Given the GDs’ memory limitation,
the size of the confidential plaintexts generated by these elements has an upper bound according
to equation 3.13, which implies that the running time of the previously mentioned data processing
algorithms is also upper bounded due to this constraint, for their time complexity is O(n). Thus, in practice,
these algorithms are time-efficient.
6.1 Future Work
One possible improvement to the scale of the given network would be to allow the parallel activity
of more than one MPP. This would require more keys to be generated, not only for privacy and integrity
purposes on the data, but also so that all the MPPs are uniquely recognizable by the network parties (that
is, to provide an authentication mechanism).
Another subject with good prospects is the hardware improvement of the devices such that their
capabilities allow more efficient and secure packing schemes. By efficient one means both in terms of
time and space complexity. A good example is to implement in hardware a randomized primitive that
makes use of analogue entropy sources in order to obtain fairly randomized values. This feature would
be extremely useful for the IV scheduler within the GDs and, in the event of increasing the devices’
battery lifetime, it would also be very fruitful for the development and maintenance of a key scheduler
algorithm.
In addition, a potential improvement would be to implement in hardware standardized authenticated
modes of operation such as GCM. Adopting AES-GCM instead of AES-CTR-HMAC would then optimize
the system’s memory usage and therefore allow the GDs to store more messages as well as
shorten the GDs’ sleep mode time-frame.
References
[1] G. Bertrand. Enigma: ou, La plus grande enigme de la guerre 1939-1945. Plon, 1973.
[2] Mihir Bellare and Phillip Rogaway. Course Notes: Introduction to Modern Cryptography. University
of California, San Diego.
[3] Yodai Watanabe, Junji Shikata, and Hideki Imai. Equivalence between Semantic Security and
Indistinguishability against Chosen Ciphertext Attacks. RIKEN Brain Science Institute, 2003.
[4] Claude Shannon. https://en.wikipedia.org/wiki/Claude_Shannon.
[5] Claude Shannon. Communication Theory of Secrecy Systems. 1949.
[6] Mitsuru Matsui. Linear Cryptanalysis Method for DES Cipher. Computer and Information Systems
Laboratory.
[7] Douglas Stinson. Cryptography: Theory and Practice, Third Edition. CRC/C&H, 3rd edition, 2005.
[8] National Institute of Standards and Technology. FIPS PUB 46-3: Data Encryption Standard (DES).
National Institute of Standards and Technology, Gaithersburg, MD, USA, October 1999. Super-
sedes FIPS PUB 46-2 1993 December 30.
[9] Michael Luby and Charles Rackoff. How to construct pseudorandom permutations from pseudo-
random functions. SIAM Journal on Computing, 17(2):373–386, 1988.
[10] Joan Daemen and Vincent Rijmen. AES Proposal: Rijndael. April 2003.
[11] Douglas Stinson. Substitution-permutation networks. In Cryptography: Theory and Practice, Third
Edition, pages 74–79. CRC/C&H, 2005.
[12] National Institute of Standards and Technology. FIPS PUB 197: Advanced Encryption Standard
(AES). National Institute of Standards and Technology, Gaithersburg, MD, USA, November 2001.
[13] M.J.B. Robshaw. Stream ciphers. Technical report, RSA Data Security, Inc.
ftp://ftp.rsasecurity.com/pub/pdfs/tr701.pdf.
[14] National Institute of Standards and Technology. Recommendation for Block Cipher Modes of Op-
eration. National Institute of Standards and Technology, Gaithersburg, MD, USA, 2001.
[15] Morris Dworkin. Recommendation for Block Cipher Modes of Operation: The CCM Mode for
Authentication and Confidentiality. National Institute of Standards and Technology, Gaithersburg,
MD, USA, May 2014.
[16] Morris Dworkin. Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode
(GCM) and GMAC. National Institute of Standards and Technology, Gaithersburg, MD, USA,
November 2007.
[17] Phillip Rogaway, Mark Wooding, and Haibin Zhang. The Security of Ciphertext Stealing, pages
180–195. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
[18] R. Housley. Cryptographic Message Syntax (CMS). STD 70, RFC Editor, September 2009.
http://www.rfc-editor.org/rfc/rfc5652.txt.
[19] National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. National
Institute of Standards and Technology, Gaithersburg, MD, USA, April 1995. Supersedes FIPS PUB
180-3 2012 March 6.
[20] Marc Stevens, Elie Bursztein, Ange Albertini, and Yaric Markov. The first collision for full
SHA-1. 2017.
[21] Hugo Krawczyk, Mihir Bellare, and Ran Canetti. HMAC: Keyed-hashing for message authentication.
RFC 2104, RFC Editor, February 1997. http://www.rfc-editor.org/rfc/rfc2104.txt.
[22] Andrew Chi-Chih Yao. Theory and applications of trapdoor functions. In 23rd IEEE Symposium on
Foundations of Computer Science, 1982.
[23] Oded Goldreich. Pseudorandom functions. In Foundations of Cryptography: Volume 1, pages
106–113, New York, NY, USA, 2006. Cambridge University Press.
[24] B. Kaliski. PKCS #5: Password-Based Cryptography Specification Version 2.0. RFC 2898, RFC
Editor, September 2000. http://www.rfc-editor.org/rfc/rfc2898.txt.
[25] B. Kaliski. PBKDF2. In PKCS #5: Password-Based Cryptography Specification Version 2.0, pages
9–11. RFC Editor, 2000.
[26] Ertaul, Levent and Kaur, Manpreet and Gudise, V. A. K. R. Implementation and Performance
Analysis of PBKDF2, Bcrypt, Scrypt Algorithms.
http://www.mcs.csueastbay.edu/~lertaul/PBKDFBCRYPTCAMREADYICWN16.pdf.
[27] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications.
IEEE Std. 802.11i-2004, August 1999.
[28] National Institute of Standards and Technology. FIPS PUB 800-48: Guide to Securing Legacy
IEEE 802.11 Wireless Networks. National Institute of Standards and Technology, Gaithersburg,
MD, USA, July 2008.
[29] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications
Amendment 6: Medium Access Control (MAC) Security Enhancements. IEEE Std. 802.11i-2004,
July 2004.
[30] Tews, Erik and Beck, Martin. Practical attacks against WEP and WPA. November 2008.
[31] Kohno, Tadayoshi and Bellare, Mihir. Hash Function Balance and its Impact on Birthday Attacks.
May 2004.
[32] Burt Kaliski. PKCS #7: Cryptographic message syntax version 1.5. RFC 2315, RFC Editor, March
1998. http://www.rfc-editor.org/rfc/rfc2315.txt.
[33] Lars Knudsen and David Wagner. Integral Cryptanalysis, pages 112–127. Springer Berlin
Heidelberg, Berlin, Heidelberg, 2002.
[34] Andrey Bogdanov, Dmitry Khovratovich, and Christian Rechberger. Biclique Cryptanalysis of the
Full AES. August 2011.
[35] A. C, R. P. Giri, and B. Menezes. Highly efficient algorithms for AES key retrieval in cache access
attacks. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pages 261–275,
March 2016.
[36] Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent Systems: Specification.
Springer New York, 2012.
[37] Jon Postel. Transmission control protocol. STD 7, RFC Editor, September 1981.
http://www.rfc-editor.org/rfc/rfc793.txt.
[38] Bellare, Mihir. New Proofs for NMAC and HMAC: Security without Collision-Resistance. June 2006.
[39] A.C.A. Nascimento and P. Barreto. Information Theoretic Security: 9th International Conference,
ICITS 2016, Tacoma, WA, USA, August 9-12, 2016, Revised Selected Papers. Lecture Notes in
Computer Science. Springer International Publishing, 2016.
[40] P. Rogaway, D. Pointcheval, A. Desai, and M. Bellare. Relations Among Notions of Security for
Public-Key Encryption Schemes. June 2001.
[41] Douglas Cook. Measuring memory protection. In Proceedings of the 3rd International Conference
on Software Engineering, ICSE ’78, pages 281–287, Piscataway, NJ, USA, 1978. IEEE Press.
[42] Kristine Amari. Techniques and Tools for Recovering and Analyzing Data from Volatile Memory.
March 2009.
[43] Grand, Joe. Practical Secure Hardware Design for Embedded Systems. In Proceedings of the
2004 Embedded Systems Conference. CMP Media, April 2004.
[44] Krawczyk, Hugo and Bellare, Mihir and Canetti, Ran. RFC 2104: HMAC: Keyed-Hashing for
Message Authentication. 1997.
[45] Douglas Stinson. Elliptic curves. In Cryptography: Theory and Practice, Third Edition, pages
254–266. CRC/C&H, 2005.
[46] The Java Tutorials. https://docs.oracle.com/javase/tutorial, 1995.
[47] The ASCII Character Set.
http://ee.hawaii.edu/~tep/EE160/Book/chap4/subsection2.1.1.1.html, August 1994.
Appendix A
Schemes of Block Cipher Modes of Operation
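[Diagram. (a) ECB encryption: each plaintext block p1, p2, ..., pn of the input P is passed through
E^B_k to give the ciphertext block c1, c2, ..., cn of the output C. (b) ECB decryption: each ciphertext
block c1, c2, ..., cn of the input C is passed through D^B_k to give the plaintext block p1, p2, ..., pn
of the output P.]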
Figure A.1: ECB mode encryption and decryption procedures using an arbitrary block cipher B.
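[Diagram. (a) CBC encryption: inputs IV and plaintext blocks p1, p2, ..., pn of P, one application of
E^B_k per block, output ciphertext blocks c1, c2, ..., cn of C. (b) CBC decryption: inputs IV and
ciphertext blocks c1, c2, ..., cn of C, one application of D^B_k per block, output plaintext blocks
p1, p2, ..., pn of P.]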
Figure A.2: CBC mode encryption and decryption procedures using an arbitrary block cipher B.
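[Diagram. (a) CFB encryption: inputs IV and plaintext blocks p1, p2, ..., pn, one application of E^B_k
per block producing the keystream blocks k1, k2, ..., kn, output ciphertext blocks c1, c2, ..., cn.
(b) CFB decryption: inputs IV and ciphertext blocks c1, c2, ..., cn of C, one application of E^B_k per
block, output plaintext blocks p1, p2, ..., pn of P.]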
Figure A.3: CFB mode encryption and decryption procedures using an arbitrary block cipher B.
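[Diagram. (a) CTR encryption: counter blocks t1 (labelled IV), t2, ..., tn, each passed through E^B_k,
plaintext blocks p1, p2, ..., pn, output ciphertext blocks c1, c2, ..., cn. (b) CTR decryption: the same
counter blocks, each passed through E^B_k, ciphertext blocks c1, c2, ..., cn, output plaintext blocks
p1, p2, ..., pn.]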
Figure A.4: CTR mode encryption and decryption procedures using an arbitrary block cipher B.
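The CTR scheme of Figure A.4 can be sketched in Java as follows. This is only an illustration under stated assumptions, not the implementation developed for this work: B is taken to be AES (16-byte blocks), t1 is taken to be the IV, and each following counter block is assumed to be the previous one incremented by 1 as a big-endian integer. The class name CtrSketch and the method ctr are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CtrSketch {
    static final int BLOCK = 16; // AES block size in bytes

    // XORs block i of the input with E_k(t_i), where t_1 = IV and
    // t_{i+1} = t_i + 1 (assumed big-endian increment).
    // The same routine performs both encryption and decryption.
    static byte[] ctr(byte[] key, byte[] iv, byte[] input) throws Exception {
        // AES/ECB/NoPadding is used here only as the raw block primitive E_k.
        Cipher block = Cipher.getInstance("AES/ECB/NoPadding");
        block.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));

        byte[] counter = Arrays.copyOf(iv, BLOCK); // t_1 = IV
        byte[] output = new byte[input.length];
        for (int offset = 0; offset < input.length; offset += BLOCK) {
            byte[] keystream = block.doFinal(counter); // E_k(t_i)
            int n = Math.min(BLOCK, input.length - offset);
            for (int j = 0; j < n; j++) {
                output[offset + j] = (byte) (input[offset + j] ^ keystream[j]);
            }
            // Increment the counter block, propagating the carry leftwards.
            for (int j = BLOCK - 1; j >= 0; j--) {
                if (++counter[j] != 0) {
                    break;
                }
            }
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // all-zero demo key, for illustration only
        byte[] iv = new byte[16];  // all-zero demo IV, for illustration only
        byte[] p = "confidential plaintext".getBytes(StandardCharsets.US_ASCII);
        byte[] c = ctr(key, iv, p);
        // Applying the same operation to the ciphertext recovers the plaintext.
        System.out.println(Arrays.equals(ctr(key, iv, c), p));
    }
}
```

Since only the encryption direction E^B_k is ever applied, decryption needs no separate routine, and no padding is required for a last block shorter than 16 bytes.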
Appendix B
User Manual: Key Generation Application
An application was developed to give the reader a close look at how the keys are generated in
practice. KeyGeneratorApp is the executable JAR file [46] that contains the mock-up program for key
generation. The application can upload the keys either directly to a serial-connected device or to an
encrypted file with a predefined data format. The first case is outside the scope of this text, so only
the second is discussed, step by step.
When KeyGeneratorApp is run, a window similar to Figure B.1a should appear, in which the user
fills in the required fields. Figure B.1b shows a sample filling of the fields, which is used from
here on, since these fields define the produced keystream. Upon clicking the "Generate Keys" button,
the program generates the whole keystream and keeps it in volatile memory while waiting for the next
instruction. A pop-up window similar to Figure B.1c should then appear; clicking "Yes" makes the
program proceed.
[Screenshots: panels (a), (b) and (c).]
Figure B.1: KeyGeneratorApp’s initial screen.
The interface should now change to the one shown in Figure B.2a. Several options are available. One
of them, triggered by the button "Generate another key set", returns to the previous key generation
step and overwrites the current keystream with a new one based on new inputs chosen by the user;
it is the advised option if the user wants to change some of the previous inputs. The same figure
shows three options for the destination target, i.e., where the keys are to be uploaded. The first two
depend on a serial-connected device and, as mentioned previously, that scenario is not discussed
here, so the only remaining viable choice is "Encrypted File". Figure B.2b details the chosen
sequence: the keys held in the program's memory that are associated with the GD with identifier
did = 6 are going to be exported to a file.
[Screenshots: panels (a) and (b).]
Figure B.2: KeyGeneratorApp’s target choice screen.
Upon clicking "Next", a window similar to Figure B.3a should appear. The user can now choose the
name and path, in the file system, of the (encrypted) file that will hold the keys, as well as the
password used by the password-based key derivation function, which outputs the key used in the CTR
mode of operation with the AES cipher. The length n of this password must satisfy 8 ≤ n ≤ 63. The
values filled in the fields of this image are merely illustrative. Nevertheless, the same values can
be used, apart from the file path, which must be chosen according to the user's local file system.
After the fields are filled, the button "Generate File" creates the .enc encrypted file, protected with
the chosen password, in the desired location, containing the envisaged keys.
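The password check and the derive-then-encrypt step described above can be sketched as follows. This is a hedged illustration, not the code of KeyGeneratorApp: it assumes PBKDF2 with HMAC-SHA-256 as the password-based key derivation function, 10 000 iterations and a 128-bit AES key, and it leaves open how the salt and IV are chosen and stored in the .enc file. The class KeyFileSketch and all of its members are hypothetical names.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class KeyFileSketch {
    static final int ITERATIONS = 10_000; // assumed iteration count
    static final int KEY_BITS = 128;      // assumed AES key size

    // The password length n must satisfy 8 <= n <= 63.
    static boolean isValidPassword(String password) {
        int n = password.length();
        return 8 <= n && n <= 63;
    }

    // Derives the AES key from the password (PBKDF2 is an assumption here).
    static byte[] deriveKey(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();
    }

    // AES in CTR mode; mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE.
    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }
}
```

With the illustrative password fileEncPass (11 characters), isValidPassword returns true, and the derived key can be passed to aesCtr together with a 16-byte IV to encrypt the exported key material.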
[Screenshots: panels (a) and (b).]
Figure B.3: KeyGeneratorApp’s file details.
Once all these steps are completed, the "Finish" button is enabled; clicking it brings up the screen
depicted in Figure B.4. Here, the user has several options:
• Communication Application Test: starts the communication mock-up application for the message
interaction between the GD and the MPP. This application is outside the scope of this text, since
running the executable has additional requirements that only the developers can meet;
• Key Checker Application Test: decrypts the keys within a previously encrypted file in the file
system and exports them into a file with extension .txt;
• Export to another target: goes back to the selection of the destination target for the current key set
(Figure B.2);
• Generate another key set: resets the program by clearing the memory associated with the current
keystream and goes back to the initial screen (Figure B.1);
• Exit: safely exits the program.
Figure B.4: KeyGeneratorApp’s key export final step.
Selecting "Key Checker Application Test" as the next step brings up a window similar to Figure B.5a.
In the upper right corner, the "Show Help" button drops down a description of the program's
behaviour. The user can now choose one of the previously created files in the file system and fill
the password field with the password that was used in the file's encryption. The type of file being
decrypted must also be selected, since the program needs it to parse the file's contents. The
parameters chosen throughout this guide are the following:
• WLAN SSID: MISSION2801WINET;
• WLAN password: secretpassword;
• Seed: myrandomseed;
• Number of GD: 73;
• Target file type: GD;
• ID: 6;
• File name: GD6Keys.enc;
• File path: C:\Users\Ricardo\Desktop;
• Password for encrypting the file: fileEncPass.
[Screenshots: panels (a) and (b).]
Figure B.5: KeyGeneratorApp’s key checker example screen.
Figure B.6: Pre-deployment stage secret information’s revealment.
The options chosen for this case should therefore match Figure B.5a. The file with extension .txt
is created in the same location of the file system as the file with extension .enc. Figure B.6 shows
the contents of the GD6Keys.txt file for the abovementioned parameters; the file can be opened with
any tool of the reader's choice, and in this case the source code editor Notepad++ was used. Each
entry corresponds to an element of the keystream and is represented by an array of byte values, that
is, each element a of the array satisfies a ∈ Z256, according to the ASCII character set [47].
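As a small illustration of the representation a ∈ Z256, the sketch below converts a Java byte array into values between 0 and 255. It says nothing about the exact layout of GD6Keys.txt; the class ByteValuesSketch and the method toUnsignedValues are hypothetical names, and the seed string is used only as sample input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ByteValuesSketch {
    // Java bytes are signed (-128..127); masking with 0xFF maps each one
    // to its representative in Z256 = {0, ..., 255}.
    static int[] toUnsignedValues(byte[] bytes) {
        int[] values = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            values[i] = bytes[i] & 0xFF;
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] sample = "myrandomseed".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(toUnsignedValues(sample)));
    }
}
```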
Appendix C
Message Formats
Field layout (each row spans bits 0 to 63):
Header:
• 256-bit HMAC h1(D): HMAC-SHA-256 with key (kFHH)i
• 256-bit HMAC h2(D): HMAC-SHA-256 with key (kFMH)i
Body:
• D: confidential plaintext (see footnote 2)
Figure C.1: Message format F∗1
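Field layout (each row spans bits 0 to 63):
Header:
• fid (bits 0 to 7), mid (bits 8 to 39), gray field (bits 40 to 63)
• 128-bit initialization vector IV
Body:
• Data with format F∗1, AES-CTR encrypted with key (kFHA)i and IV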
Figure C.2: Message format F1
The gray field in Figure C.2 represents the absence of elements in that position. It is drawn this
way only to make the fields easier to visualize.
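The nesting of formats F∗1 and F1 can be sketched in Java as follows. This is a hedged illustration and not the implementation of this work: the byte-level packing (fid as 1 byte, mid as 4 big-endian bytes, no bytes emitted for the gray field, fields simply concatenated) is an assumption, and the class F1Sketch and its method names are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class F1Sketch {
    static final int TAG_LEN = 32;             // HMAC-SHA-256 output, 256 bits
    static final int IV_LEN = 16;              // 128-bit IV
    static final int HEADER_LEN = 1 + 4 + IV_LEN; // fid, mid, IV (assumed packing)

    static byte[] hmacSha256(byte[] key, byte[] data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(data);
    }

    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    // F*1 = h1(D) || h2(D) || D, with h1 keyed by (kFHH)i and h2 by (kFMH)i.
    static byte[] buildF1Star(byte[] kFHH, byte[] kFMH, byte[] d) throws Exception {
        return ByteBuffer.allocate(2 * TAG_LEN + d.length)
                .put(hmacSha256(kFHH, d))
                .put(hmacSha256(kFMH, d))
                .put(d)
                .array();
    }

    // F1 = fid || mid || IV || AES-CTR((kFHA)i, IV, F*1).
    static byte[] buildF1(int fid, int mid, byte[] iv, byte[] kFHA, byte[] f1Star)
            throws Exception {
        byte[] body = aesCtr(Cipher.ENCRYPT_MODE, kFHA, iv, f1Star);
        return ByteBuffer.allocate(HEADER_LEN + body.length)
                .put((byte) fid)
                .putInt(mid)
                .put(iv)
                .put(body)
                .array();
    }

    // Reads the IV from the header and recovers the F*1 data from the body.
    static byte[] decryptBody(byte[] f1, byte[] kFHA) throws Exception {
        byte[] iv = Arrays.copyOfRange(f1, 5, HEADER_LEN);
        byte[] body = Arrays.copyOfRange(f1, HEADER_LEN, f1.length);
        return aesCtr(Cipher.DECRYPT_MODE, kFHA, iv, body);
    }
}
```

Under this packing, a plaintext D of length m bytes yields an F1 message of 21 + 64 + m bytes: the header, the two 256-bit HMAC values and D itself.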
Footnote 2: Length may be variable.
Field layout (each row spans bits 0 to 31):
• f (bit 0)
• h1(D): HMAC-SHA-256 with key (kFHH)i
• D: confidential plaintext
Figure C.3: Message format F∗2
Field layout (each row spans bits 0 to 31):
Header:
• mid
• 128-bit IV
Body:
• Data with format F∗2, AES-CTR encrypted with key (kFHA)j
Figure C.4: Message format F2
Field layout (each row spans bits 0 to 31):
• f (bit 0)
• h1(D): HMAC-SHA-256 with key (kFHH)i
• h2(D): HMAC-SHA-256 with key (kFMH)i
• D: confidential plaintext
Figure C.5: Message format F∗3
Field layout (each row spans bits 0 to 31):
Header:
• mid
• 128-bit initialization vector IV
Body:
• Data with format F∗3, AES-CTR encrypted with key (kFHA)j and IV
Figure C.6: Message format F3
Field layout (each row spans bits 0 to 63):
Header:
• fid (bits 0 to 7)
• 128-bit initialization vector IV
• h1(D): HMAC-SHA-256 with key (kFHH)i
• h2(D): HMAC-SHA-256 with key (kFMH)i
Body:
• Data with format F∗1, AES-CTR encrypted with key (kFHA)i and IV
Figure C.7: Message format F∗4
Field layout (each row spans bits 0 to 63):
Header:
• h3(enc pack): HMAC-SHA-256 with key kHMH
• 128-bit initialization vector IV
Body:
• enc pack: data with format F∗4, AES-CTR encrypted with key kHMA and IV
Figure C.8: Message format F4
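In format F4 the HMAC h3 is computed over the encrypted package rather than over the plaintext, so a receiver can check the tag before decrypting. A minimal Java sketch of this order of operations is given below. It is an illustration under assumptions, not the implementation of this work: the fields are assumed to be concatenated as h3 || IV || enc pack, h3 is computed over enc pack only (as the figure states), and the class F4Sketch with its methods seal and open is hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.util.Arrays;

final class F4Sketch {
    static final int TAG_LEN = 32; // HMAC-SHA-256 output, 256 bits
    static final int IV_LEN = 16;  // 128-bit IV

    static byte[] hmacSha256(byte[] key, byte[] data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(data);
    }

    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    // F4 = h3(encPack) || IV || encPack, with encPack = AES-CTR(kHMA, IV, F*4).
    static byte[] seal(byte[] kHMH, byte[] kHMA, byte[] iv, byte[] f4Star) throws Exception {
        byte[] encPack = aesCtr(Cipher.ENCRYPT_MODE, kHMA, iv, f4Star);
        return ByteBuffer.allocate(TAG_LEN + IV_LEN + encPack.length)
                .put(hmacSha256(kHMH, encPack))
                .put(iv)
                .put(encPack)
                .array();
    }

    // Verifies h3 first and decrypts only if it matches.
    // Returns the F*4 data, or null if the message is too short or the tag is wrong.
    static byte[] open(byte[] kHMH, byte[] kHMA, byte[] f4) throws Exception {
        if (f4.length < TAG_LEN + IV_LEN) {
            return null;
        }
        byte[] tag = Arrays.copyOfRange(f4, 0, TAG_LEN);
        byte[] iv = Arrays.copyOfRange(f4, TAG_LEN, TAG_LEN + IV_LEN);
        byte[] encPack = Arrays.copyOfRange(f4, TAG_LEN + IV_LEN, f4.length);
        // MessageDigest.isEqual compares the two tags in constant time.
        if (!MessageDigest.isEqual(tag, hmacSha256(kHMH, encPack))) {
            return null;
        }
        return aesCtr(Cipher.DECRYPT_MODE, kHMA, iv, encPack);
    }
}
```

Any modification of the encrypted package changes the expected value of h3, so open rejects the message without ever running the decryption.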
Field layout (each row spans bits 0 to 63):
• h2(D): HMAC-SHA-256 with key (kFMH)i
• D: confidential plaintext
Figure C.9: Message format F∗∗5
Field layout (each row spans bits 0 to 63):
Header:
• h3(enc pack)
• fid (bits 0 to 7 of its row)
• 128-bit initialization vector IV
Body:
• enc pack: data with format F∗∗5, AES-CTR encrypted with key (kFHA)i and IV
Figure C.10: Message format F∗5
Field layout (each row spans bits 0 to 63):
Header:
• 128-bit initialization vector IV
Body:
• Data with format F∗5, AES-CTR encrypted with key kHMA and IV
Figure C.11: Message format F5