latincrypt2019.cryptojedi.org · 2019-10-17 · title: fast white-box implementations of dedicated...


Fast White-Box Implementations of Dedicated Ciphers on the ARMv8 Architecture

Felix Carvalho Rodrigues, H. Fujii, A. C. Serpa, G. Sider, R. Dahab, J. Lopez

October 3, 2019

Laboratory of Security and Cryptography, Institute of Computing, University of Campinas (Unicamp)

This research was partially supported by Samsung Eletronica da Amazonia Ltda., through the “White Box Cryptography” project, within the scope of the Informatics Law No. 8248/91.


Index

Introduction

Dedicated Ciphers

SPACE

WEM

SPNbox

Implementation

Optimizing SPNbox

Results


Introduction

White-box threat model: direct access to environment

Black-Box:

• Access only to plaintexts and ciphertexts

• No leakage from implementation

• No access to executing environment


Grey-Box (hardware side-channel):

• Some leakage from implementation available

• Timing analysis, power analysis

• No (direct) access to executing environment


In a White-Box context, an attacker can:

• Access the memory

• Manipulate the execution

• Analyze binary code

Figure: three attacker models. (a) Black Box Model: access to plaintext and ciphertext. (b) Grey Box Model: additionally, access to side-channel information. (c) White Box Model: additionally, access to the execution environment.

How to protect against such powerful adversaries?



White-Box Cryptography: first attempts

White-Box Cryptography:

• Design and secure the implementations of cryptographic algorithms running in untrusted environments

First attempt: standard block cipher (e.g., AES):

• Protect implementation through a network of lookup tables

• Several proposed implementations [Chow et al., 2003, Bringer et al., 2006, Xiao and Lai, 2009, Karroumi, 2011]

• Academic proposals mostly broken: [Billet et al., 2005, Goubin et al., 2007, Michiels et al., 2009, De Mulder et al., 2010, Lepoint et al., 2014, De Mulder et al., 2013]

• Some hope? The CHES 2019 white-box challenge [WhibOx, 2019] had three implementations that initially remained unbroken; they are now BROKEN!

AES seems to be hard to protect in a white-box context...


Dedicated Ciphers

Dedicated White-Box Block Ciphers

Idea:

• Design a block cipher from the ground up to be “secure” in a white-box context

Focus of currently proposed dedicated ciphers:

• Unbreakability
  • Protection against key extraction
  • Given access to a white-box implementation, the attacker must not be able to extract the secret key embedded in the cipher

• Incompressibility


  • Mitigation against code lifting
  • Given full access to a white-box cipher implementation, the attacker must not be able to produce a smaller implementation
  • Given “almost full” access to a white-box cipher implementation, the attacker must not be able to encrypt or decrypt any message with a significant probability


Proposals considered in this work

Proposals:

• SPACE [Bogdanov and Isobe, 2015]

• SPNbox [Bogdanov et al., 2016a]

• WEM [Cho et al., 2017].

Parameters:

• n_in: determines the lookup table input size of the ciphers:
  • SPACE-16 ≡ SPACE instantiated with n_in = 16
  • We consider n_in either 8 or 16 in this work

• R: number of rounds of the cipher

No complete comparison of these ciphers against each other was made in previous works!


SPACE Family of Ciphers

• Based on a Feistel Network

• In each round, the state is rotated left

• Number of rounds:
  • R = 128 for n_in = 16
  • R = 300 for n_in = 8

Example: single round of SPACE-16

SPACE-16: Feistel function

In the black-box:

• Extract 16 bits from the state and concatenate them with a zero vector

• Encrypt with AES under the master key

• Discard 16 bits of the output and return the remaining 112 bits

In the white-box:

• Extract 16 bits from the state

• Encrypt with a 16-to-112-bit lookup table

In both cases, a round constant is added to the result
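The round structure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the table is filled with SHA-256 output as a stand-in for the truncated AES call, the round index is used as the round constant, and the names `build_table`, `encrypt` and `decrypt` are hypothetical rather than taken from the slides.

```python
import hashlib

N, N_IN = 128, 16                  # block size and table input size (SPACE-16)
OUT_BITS = N - N_IN                # 112-bit table output
OUT_MASK = (1 << OUT_BITS) - 1
ROUNDS = 128                       # R = 128 for n_in = 16

def build_table(key: bytes) -> list:
    """White-box table mapping 16 bits to 112 bits.
    Assumption: SHA-256 stands in for the truncated AES encryption."""
    table = []
    for x in range(1 << N_IN):
        digest = hashlib.sha256(key + x.to_bytes(2, "big")).digest()
        table.append(int.from_bytes(digest[:OUT_BITS // 8], "big"))
    return table

def encrypt(block: int, table: list) -> int:
    for r in range(1, ROUNDS + 1):
        x0, rest = block >> OUT_BITS, block & OUT_MASK
        f = table[x0] ^ r                    # table lookup, then round constant (assumed: r)
        block = ((rest ^ f) << N_IN) | x0    # Feistel XOR, then rotate left by n_in bits
    return block

def decrypt(block: int, table: list) -> int:
    for r in range(ROUNDS, 0, -1):
        x0, mixed = block & ((1 << N_IN) - 1), block >> N_IN
        block = (x0 << OUT_BITS) | (mixed ^ table[x0] ^ r)
    return block
```

Because the Feistel function is only ever evaluated in the forward direction, decryption reuses the same table, which is why a single 16-to-112-bit table suffices for both directions.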


White-box Even-Mansour ciphers

⇒ Based on the Even-Mansour scheme:

• Keys are replaced by incompressible S-boxes

• Public permutation is defined as 5 rounds of AES with a zeroed key

• The proposed cipher uses 12 rounds


WEM: S-box generation

To generate each m-to-m S-box:

• Generate a long sequence of pseudo-random bits from the secret key k

• Generate the sequence T = (0, ..., 2^m − 1)

• Shuffle T using the pseudo-random bits generated from k
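The three steps above amount to a keyed shuffle. A minimal Python sketch follows, assuming a Fisher-Yates shuffle with rejection sampling and a SHA-256 counter construction as the pseudo-random generator; both choices, and the names `prf_words` and `generate_sbox`, are assumptions of this sketch, not the generator specified for WEM.

```python
import hashlib

def prf_words(key: bytes):
    """Yield 32-bit pseudo-random words derived from the key.
    Assumption: SHA-256 in counter mode stands in for the real generator."""
    counter = 0
    while True:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for k in range(0, 32, 4):
            yield int.from_bytes(block[k:k + 4], "big")
        counter += 1

def generate_sbox(key: bytes, m: int) -> list:
    """Return a key-dependent permutation of (0, ..., 2^m - 1)."""
    table = list(range(1 << m))
    words = prf_words(key)
    for i in range(len(table) - 1, 0, -1):
        bound = i + 1
        limit = (1 << 32) - ((1 << 32) % bound)   # reject values that would bias j
        r = next(words)
        while r >= limit:
            r = next(words)
        j = r % bound
        table[i], table[j] = table[j], table[i]   # Fisher-Yates swap
    return table
```

The same key always reproduces the same S-box, so only the key (black-box) or the full table (white-box) needs to be stored.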


SPNbox Family of Ciphers

Algorithm:

• Substitution Layer (S_{n_in}): divide the state into n_in-bit blocks and run a mini block cipher with a secret key on each block

• Permutation Layer (θ): multiply the state by a matrix over GF(2^{n_in})

• Affine Layer (σ): add a round constant

• Repeat for 10 rounds

Figure: one round of SPNbox-16’s outer cipher


SPNbox: small inner cipher

Figure: one round of SPNbox-32’s inner cipher

• A smaller SPN implements the substitution phase of its bigger counterpart

• Repurposes some AES operations:
  • SB uses the SubBytes operation from AES
  • MC uses the MixColumns operation from AES
  • AK is a simple key addition

• Number of rounds depends on its size n_in

• In a white-box context, this inner SPN cipher becomes a lookup table
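How a small keyed cipher turns into a lookup table can be illustrated with a toy 8-bit example in Python. The AES S-box is computed from its definition; everything else (the AK-then-SB round order, the omission of MC for a one-byte state, the round keys and the names `inner_cipher` and `tabulate`) is an assumption made for the sketch, not the SPNbox specification.

```python
def gf8_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def aes_sbox(x: int) -> int:
    """AES SubBytes: field inversion followed by the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):              # x^254 is the inverse of x in GF(2^8)
            inv = gf8_mul(inv, x)
    out = 0x63
    for shift in range(5):                # inv XOR its rotations by 1..4 bits
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out

SBOX = [aes_sbox(x) for x in range(256)]

def inner_cipher(x: int, round_keys: bytes) -> int:
    """Toy inner cipher: key addition (AK) then SubBytes (SB) in every round."""
    for k in round_keys:
        x = SBOX[x ^ k]
    return x

def tabulate(round_keys: bytes) -> list:
    """White-box view: evaluate the keyed cipher on every input once."""
    return [inner_cipher(x, round_keys) for x in range(256)]
```

After tabulation, encryption needs only `table[x]`; the round keys no longer appear in the implementation.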


Implementation

ARMv8-A Architecture

• 32 SIMD/NEON 128-bit registers (NEON mode):
  • each register can be interpreted as 16 bytes, 8 halfwords, 4 words or 2 doublewords

• Cortex-A75:
  • Two 8-stage NEON instruction pipelines
  • One separate load/store pipeline for NEON instructions

• Important NEON instructions: tbl/tbx, rev, ext


Pipeline Vs Cache Optimization

Pipelined implementation (4-way):


Horizontal implementation (“all blocks”-way):


SPACE and WEM Implementation details

SPACE:

• Benefits from pipelined memory access

• For a single block, only a couple of NEON operations:
  • A couple of eor additions
  • A byte rotation with an ext

• Favors large pipelined implementations, less suitable for H-way

• Pad the lookup table for better memory alignment

WEM:

• Both horizontal and intercalated strategies have merit

• For the pipelined implementation, we split each 16-to-16 lookup table into two 16-to-8 tables:
  • This allows a lookup table to fit into an L2 cache on lower-end hardware

• Use hardware crypto extensions (AES functions)


SPNbox Optimizations

• Allows for greater optimization opportunities

• Four main implementations: one block, multiple blocks (transposed), lookup table multiplications, and horizontal

• Main point for optimization: its matrix multiplication (θ layer)


Single Block: Permutations and Multiplications

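• Let T_i(S) = S × i:
  • Multiplication of the state S by a polynomial i translates to a series of constant-time polynomial additions and multiplications by x in GF(2^16):

sshr v2.8h, v0.8h, #15       // v0 is the state; v2 = all-ones in lanes whose top bit is set
shl  v0.8h, v0.8h, #1        // multiply every lane by x
and  v2.16b, v1.16b, v2.16b  // v1 is the mask 0x002B; keep it only in lanes that overflowed
eor  v0.16b, v0.16b, v2.16b  // reduce those lanes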

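• The result of R = M_16 × S can be written as:

$$R = T_1(S) \oplus P_1(T_3(S)) \oplus P_2(T_4(S)) \oplus P_3(T_5(S)) \oplus P_4\big(T_6(S) \oplus P_1(T_8(S)) \oplus P_2(T_B(S)) \oplus P_3(T_7(S))\big)$$

where P_1, ..., P_4 are permutations of the 16-bit words of the state.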
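The multiply-by-x sequence used for T_i can be emulated on a single 16-bit lane in Python. The sketch below assumes the reduction polynomial x^16 + x^5 + x^3 + x + 1 implied by the mask 0x002B, and builds T_c for a public constant c by Horner's rule; the names `xtime16` and `mul_const` are hypothetical. Python itself gives no constant-time guarantee, so this shows only the arithmetic.

```python
LANE_MASK = 0xFFFF
REDUCTION = 0x002B        # low bits of x^16 + x^5 + x^3 + x + 1

def xtime16(lane: int) -> int:
    """Multiply one 16-bit lane by x in GF(2^16), mirroring sshr/shl/and/eor."""
    overflow = LANE_MASK if lane & 0x8000 else 0   # sshr #15: broadcast the top bit
    shifted = (lane << 1) & LANE_MASK              # shl #1
    return shifted ^ (overflow & REDUCTION)        # and with the mask, then eor

def mul_const(lane: int, c: int) -> int:
    """T_c: multiply a lane by the constant c using only xtime16 and XOR.
    The control flow depends on c alone, which is assumed to be public."""
    acc = 0
    for bit in reversed(range(c.bit_length())):
        acc = xtime16(acc)
        if (c >> bit) & 1:
            acc ^= lane
    return acc
```

For example, `mul_const(s, 0x3)` computes `xtime16(s) ^ s`, which corresponds to one multiplication by x followed by one polynomial addition.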


Transposing Multiple Blocks

By transposing blocks we can eliminate permutations:

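The result R_i, for i from 0 to 7, can be seen as:

$$R_i = T_{a_{i,0}}(S'_0) \oplus T_{a_{i,1}}(S'_1) \oplus T_{a_{i,2}}(S'_2) \oplus T_{a_{i,3}}(S'_3) \oplus T_{a_{i,4}}(S'_4) \oplus T_{a_{i,5}}(S'_5) \oplus T_{a_{i,6}}(S'_6) \oplus T_{a_{i,7}}(S'_7)$$

where a_{i,j} is the entry in row i, column j of M_16 and S'_j is the j-th register after transposition.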
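The effect of transposition can be checked with a small Python model: after transposing eight blocks, register j holds word j of every block, so each output register is a plain XOR of constant multiples of whole registers and no permutation within a register is needed. The matrix used in a test would be arbitrary (the actual M_16 entries are not reproduced here), and the function names are hypothetical.

```python
def gf16_mul(a: int, b: int, poly: int = 0x1002B) -> int:
    """Multiply in GF(2^16) modulo x^16 + x^5 + x^3 + x + 1."""
    r = 0
    for _ in range(16):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= poly
    return r

def matvec(matrix, block):
    """Direct form for one block: R_i = XOR_j a[i][j] * S_j."""
    out = []
    for row in matrix:
        acc = 0
        for a_ij, s_j in zip(row, block):
            acc ^= gf16_mul(s_j, a_ij)
        out.append(acc)
    return out

def transpose(regs):
    return [list(col) for col in zip(*regs)]

def matvec_transposed(matrix, blocks):
    """Process all blocks at once: register j holds word j of every block."""
    regs = transpose(blocks)
    result = []
    for row in matrix:
        acc = [0] * len(blocks)
        for a_ij, reg in zip(row, regs):
            # one constant multiplication applied to a whole register
            acc = [x ^ gf16_mul(w, a_ij) for x, w in zip(acc, reg)]
        result.append(acc)
    return transpose(result)
```

The price is the two transpositions at the start and end, which is why this layout pays off only when several blocks are processed together.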


SPNbox8: Constant-Time Vs Lookup Tables

Constant-Time:

• Use the tbl operation to perform constant-time lookups

Apply the masks 0x40, 0x80, 0xC0 to the index vector v0, storing the results in v1, v2, v3. Then:

tbl v0.16b, {v16.16b - v19.16b}, v0.16b
tbl v1.16b, {v20.16b - v23.16b}, v1.16b
tbl v2.16b, {v24.16b - v27.16b}, v2.16b
tbl v3.16b, {v28.16b - v31.16b}, v3.16b

Sum (eor) v0, v1, v2, v3 to obtain the final lookup

• Requires lots of registers: 16 for the lookup table alone

• Use the horizontal strategy to prevent register spills

• About two times slower than the best implementation
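The four-way tbl lookup can be modelled in Python. The model relies on the fact that tbl returns 0 for any index outside its table, so exactly one of the four partial lookups contributes to each byte. That the masks are applied with XOR is an assumption of this sketch (the slide only says the masks are applied), and `tbl` and `lookup256` are hypothetical helper names.

```python
def tbl(table64, indices):
    """Model of TBL with a four-register (64-byte) table:
    indices of 64 or more produce 0."""
    return [table64[i] if i < 64 else 0 for i in indices]

def lookup256(table, indices):
    """Look up every index in a 256-byte table using four 64-byte TBLs."""
    out = [0] * len(indices)
    for group, mask in enumerate((0x00, 0x40, 0x80, 0xC0)):
        sub_table = table[64 * group: 64 * (group + 1)]
        masked = [i ^ mask for i in indices]     # lands in 0..63 only for this group
        partial = tbl(sub_table, masked)
        out = [o ^ p for o, p in zip(out, partial)]   # eor the partial results
    return out
```

Every index touches all four sub-tables regardless of its value, which is what makes the access pattern independent of the data.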

Lookup θ layer (LUT):

• Similar to lookup table implementations of AES, decompose the matrix into column multiplications

• Store the precomputed multiplications in lookup tables, composed with the γ layer

• Great when the lookup tables fit into cache

• Can be adjusted based on cache size: permutations can replace lookups
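The table-based θ layer follows the same idea as AES T-tables: for each state position j, a table maps the input byte directly to the j-th matrix column scaled by the substituted byte. A minimal Python sketch, assuming a small 4-byte state, the AES field polynomial, and an arbitrary matrix and S-box supplied by the caller (all illustrative choices, not the SPNbox parameters):

```python
def gf8_mul(a: int, b: int, poly: int = 0x11B) -> int:
    """Multiply in GF(2^8) modulo the given polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def build_tables(matrix, sbox):
    """tables[j][x] = column j of the matrix multiplied by sbox[x]."""
    n = len(matrix)
    return [[[gf8_mul(matrix[i][j], sbox[x]) for i in range(n)]
             for x in range(256)]
            for j in range(n)]

def round_lut(tables, state):
    """Substitution and matrix multiplication as one lookup per state byte."""
    out = [0] * len(state)
    for j, x in enumerate(state):
        out = [o ^ t for o, t in zip(out, tables[j][x])]
    return out

def round_direct(matrix, sbox, state):
    """Reference: substitute first, then multiply by the matrix."""
    subs = [sbox[x] for x in state]
    out = []
    for row in matrix:
        acc = 0
        for a, s in zip(row, subs):
            acc ^= gf8_mul(a, s)
        out.append(acc)
    return out
```

Each table here has 256 entries of n bytes, so the total size grows with the state width; this is the quantity that has to fit into cache for the strategy to pay off.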


Results

Experimental Setup

CPUs:

Cortex-A57: Exynos 7420 Cortex-A57 core clocked at 2100 MHz, equipped with 2 MiB of L2 cache shared across all A57 cores (from a Samsung Galaxy S6)

Cortex-A75: SDM845 Cortex-A75 core clocked at 2803 MHz, equipped with 256 KiB of L2 cache for each core and 2 MiB of shared L3 cache (from a Samsung Galaxy S9)

• Used CTR mode of operation

• Average of 2^15 iterations for messages of 2 KiB

• Used C code with NEON intrinsics, compiled with clang


Performance on different strategies

For SPNbox8 and SPNbox16 (Cortex-A75):

• Different n_in leads to differences in the performance of the strategies

• LUT is clearly better for SPNbox8

• The 8-way strategy is slightly better for SPNbox16


Performance comparison

Performance for Cortex A75 and A57:


SPNbox16 Analysis

Additional tests for SPNbox16:

• Use ECB to remove possible interference from CTR

• Make a version removing the impact of cache misses

• Make “partial” ciphers:
  • ECB-Gamma: only the γ layer
  • ECB-Theta: both θ and σ

Found a difference in measured times when compared to the implementation of Bogdanov et al. [Bogdanov et al., 2016b].


Conclusions

• First comparison of the three ciphers on the same hardware (ARMv8 Cortex-A75, A57, etc.)

• While still far from hardware-aided AES implementations, dedicated ciphers can be competitive when considering software implementations

• On a Cortex-A75:
  • Optimized AES in software ≈ 20 CPB
  • Best dedicated cipher (SPNbox) ≈ 30 CPB
  • Worst dedicated cipher (SPACE) ≈ 100 CPB

• Cache matters! Some ciphers presented better results on a Cortex-A57 because of the different cache configurations
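To put the cycles-per-byte (CPB) figures in perspective, they can be converted to throughput. The short Python sketch below assumes the core runs steadily at the 2803 MHz quoted for the Cortex-A75 and treats the approximate CPB values above as exact; the function name is hypothetical.

```python
def throughput_mb_per_s(clock_mhz: float, cycles_per_byte: float) -> float:
    """(clock_mhz * 1e6 cycles/s) / (cycles/byte) / 1e6 = MB/s, with 1 MB = 10^6 bytes."""
    return clock_mhz / cycles_per_byte

CLOCK_MHZ = 2803   # Cortex-A75 clock from the experimental setup (assumed constant)

for name, cpb in (("AES (software)", 20), ("SPNbox", 30), ("SPACE", 100)):
    print(f"{name}: roughly {throughput_mb_per_s(CLOCK_MHZ, cpb):.1f} MB/s")
```

Under these assumptions, the gap between 20 and 30 CPB corresponds to a throughput ratio of 1.5, while SPACE at 100 CPB is five times slower than software AES.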


Thank you. Any questions?


References

Billet, O., Gilbert, H., and Ech-Chatbi, C. (2005). Cryptanalysis of a white box AES implementation. In Handschuh, H. and Hasan, M. A., editors, Selected Areas in Cryptography, pages 227–240, Berlin, Heidelberg. Springer Berlin Heidelberg.

Bogdanov, A. and Isobe, T. (2015). White-box cryptography revisited: Space-hard ciphers. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pages 1058–1069, New York, NY, USA. ACM.

Bogdanov, A., Isobe, T., and Tischhauser, E. (2016a). Towards practical whitebox cryptography: Optimizing efficiency and space hardness. In Cheon, J. H. and Takagi, T., editors, Advances in Cryptology – ASIACRYPT 2016, pages 126–158, Berlin, Heidelberg. Springer Berlin Heidelberg.

Bogdanov, A., Isobe, T., and Tischhauser, E. (2016b). Towards practical whitebox cryptography: Optimizing efficiency and space hardness. In Cheon, J. H. and Takagi, T., editors, Advances in Cryptology – ASIACRYPT 2016, pages 126–158, Berlin, Heidelberg. Springer Berlin Heidelberg.

Bringer, J., Chabanne, H., and Dottax, E. (2006). White box cryptography: Another attempt. Cryptology ePrint Archive, Report 2006/468. https://eprint.iacr.org/2006/468.

Cho, J., Choi, K. Y., Dinur, I., Dunkelman, O., Keller, N., Moon, D., and Veidberg, A. (2017). WEM: A new family of white-box block ciphers based on the Even-Mansour construction. In Handschuh, H., editor, Topics in Cryptology – CT-RSA 2017, pages 293–308, Cham. Springer International Publishing.

Chow, S., Eisen, P., Johnson, H., and Van Oorschot, P. C. (2003). White-box cryptography and an AES implementation. In Nyberg, K. and Heys, H., editors, Selected Areas in Cryptography, pages 250–270, Berlin, Heidelberg. Springer Berlin Heidelberg.

De Mulder, Y., Roelse, P., and Preneel, B. (2013). Cryptanalysis of the Xiao–Lai white-box AES implementation. In Knudsen, L. R. and Wu, H., editors, Selected Areas in Cryptography, pages 34–49, Berlin, Heidelberg. Springer Berlin Heidelberg.

De Mulder, Y., Wyseur, B., and Preneel, B. (2010). Cryptanalysis of a perturbated white-box AES implementation. In Gong, G. and Gupta, K. C., editors, Progress in Cryptology - INDOCRYPT 2010, pages 292–310, Berlin, Heidelberg. Springer Berlin Heidelberg.

Goubin, L., Masereel, J.-M., and Quisquater, M. (2007). Cryptanalysis of white box DES implementations. In Adams, C., Miri, A., and Wiener, M., editors, Selected Areas in Cryptography, pages 278–295, Berlin, Heidelberg. Springer Berlin Heidelberg.

Karroumi, M. (2011). Protecting white-box AES with dual ciphers. In Rhee, K.-H. and Nyang, D., editors, Information Security and Cryptology - ICISC 2010, pages 278–291, Berlin, Heidelberg. Springer Berlin Heidelberg.

Lepoint, T., Rivain, M., De Mulder, Y., Roelse, P., and Preneel, B. (2014). Two attacks on a white-box AES implementation. In Lange, T., Lauter, K., and Lisonek, P., editors, Selected Areas in Cryptography – SAC 2013, pages 265–285, Berlin, Heidelberg. Springer Berlin Heidelberg.

Michiels, W., Gorissen, P., and Hollmann, H. D. L. (2009). Cryptanalysis of a generic class of white-box implementations. In Avanzi, R. M., Keliher, L., and Sica, F., editors, Selected Areas in Cryptography, pages 414–428, Berlin, Heidelberg. Springer Berlin Heidelberg.

WhibOx (2019). WhibOx Contest Edition 2. https://whibox.cyber-crypt.com/. Accessed: 2019-09-30.

Xiao, Y. and Lai, X. (2009). A secure implementation of white-box AES. In 2009 2nd International Conference on Computer Science and its Applications, pages 1–6.