morus - nanyang technological university | ntu · morus a fast authenticated cipher hongjun wu tao...

25
MORUS A Fast Authenticated Cipher Hongjun Wu Tao Huang Nanyang Technological University DIAC 2016, Nagoya 26 Sep 2016

Upload: trinhkien

Post on 06-Jun-2019

219 views

Category:

Documents


0 download

TRANSCRIPT

MORUSA Fast Authenticated Cipher

Hongjun Wu Tao Huang

Nanyang Technological University

DIAC 2016, Nagoya26 Sep 2016

MORUS

DIAC 2016 MORUS

Different Design Approaches:

Fast

Lightweight

DIAC 2016 MORUS

AES-NI (AEGIS)

SIMD (MORUS)

Mode (JAMBU)

Dedicated (ACORN)

Design Motivation and Main Features

• To design a high-speed authenticated cipher: • No AES-NI • Make use of the SIMD (SSE2, AVX2) instructions

• Features• Fast in software: 0.69 cpb on Haswell• Fast in hardware: 95.8 Gbps on Xilinx Virtex 7

250 Gbps on 65 nm ASIC (ETH implementation)

• Nonce-based

DIAC 2016 MORUS

Changes in MORUS v2

• Tweaks are only applied to the finalization of MORUS• Remove register 𝑆3 in the message word of finalization

• Change the tag generation to the same way as the keystream generation• Increase the number of steps from 8 to 10 (compensating the change in tag generation)

• Rationale for tweaks• Improve the hardware efficiency of MORUS

DIAC 2016 MORUS

MORUS: Parameters

State size(bits)

Key size(bits)

Tag size(bits)

Plaintext size(bits)

AD size(bits)

MORUS-1280-128 1280 128 128 <264 <264

MORUS-640-128 640 128 128 <264 <264

MORUS-1280-256 1280 256 128 <264 <264

DIAC 2016 MORUS

MORUS: State and Operations

• State organization • MORUS-1280: five 256-bit words

• MORUS-640 : five 128-bit words

•Operations:

• XOR, AND, SHIFT

• 𝑅𝑜𝑡𝑙_128_32(𝑥, 𝑛): Divide a 128-bit block 𝑥 into 4 32-bit words, rotate each word left by 𝑛 bits.

• 𝑅𝑜𝑡𝑙_256_64(𝑥, 𝑛): Divide a 256-bit block 𝑥 into 4 64-bit words, rotate each word left by 𝑛 bits.

DIAC 2016 MORUS

MORUS: State Update (Overview)

One step: 5 rounds

DIAC 2016 MORUS

MORUS: Initialization

•Load IV, key and constants into the initial state

•Update state: 16 steps

•Key is XORed to the state at the end of the initialization

DIAC 2016 MORUS

MORUS: Keystream Generation

• State 𝑆 = {𝑆0, 𝑆1, 𝑆2, 𝑆3, 𝑆4}

• For MORUS-640: • 𝑘𝑒𝑦𝑠𝑡𝑟𝑒𝑎𝑚 = 𝑆0 ⊕ 𝑆1 <<< 96 ⊕ (𝑆2 & 𝑆3)

• For MORUS-1280• 𝑘𝑒𝑦𝑠𝑡𝑟𝑒𝑎𝑚 = 𝑆0 ⊕ 𝑆1 <<< 192 ⊕ (𝑆2 & 𝑆3)

DIAC 2016 MORUS

MORUS: Finalization (Tweaked!)

MORUS v1

• State update: 8 steps

• Message𝑆3 ⊕ (𝑎𝑑𝑙𝑒𝑛||𝑚𝑠𝑔𝑙𝑒𝑛)

• Tag generation𝑆1 ⊕𝑆2 ⊕𝑆3 ⊕𝑆4

MORUS v2• State update: 10 steps• Message

(𝑎𝑑𝑙𝑒𝑛||𝑚𝑠𝑔𝑙𝑒𝑛)

• Tag generation𝑆0 ⊕ 𝑆1 <<< 𝐶

∗⊕

(𝑆2 & 𝑆3)

* 𝐶 = 96 for MORUS-640; 𝐶 = 192 for MORUS-1280

DIAC 2016 MORUS

MORUS: Security Goal

Confidentiality (bits) Integrity (bits)

MORUS-640-128 128 128

MORUS-1280-128 128 128

MORUS-1280-256 256 128

DIAC 2016 MORUS

Security of MORUS: Initialization

•Algebraic degree•After 10 steps, the algebraic degree exceeds 256

•Differential cryptanalysis•differential probability < 2-256

DIAC 2016 MORUS

Security of MORUS: Encryption

•Guess-and-determine attack • state size of MORUS is at least five times of key size• keystream generation function

• state bits are not directly known to the adversary

DIAC 2016 MORUS

Security of MORUS: Finalization

• Internal state collision •Probability < 2-128

•Differential forgery attack on the finalization •10 steps, differential probability < 2-256

DIAC 2016 MORUS

Security of MORUS

•Remark on the analysis by Mileva et al. in BalkanCryptSec 2015•Not that relevant to the security of MORUS

• Collison on the state update function: assuming special difference in the state – unrealistic

• Distinguisher in nonce-reuse scenarios – excluded in our security claim

• “differential bias” – becomes invalid when a different key is used

DIAC 2016 MORUS

MORUS: Hardware Performance

•State update function of MORUS is designed to be fast in hardware•AND and XOR gates are used• Short critical path

DIAC 2016 MORUS

MORUS: Hardware Performance

•Current implementation on FPGA using CAESAR API• Virtex 7, Xilinx Vivado 2016.2

Area(Slice)

Area(LUT)

Frequency(MHz)

TP(Gbps)

TP/LUT(Mbps/LUT)

MORUS-640 681 2129 342.4 43.8 20.6

MORUS-1280 1045 3746 370.4 95.8 25.6

DIAC 2016 MORUS

MORUS: Hardware Performance

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

Area(LUT)

MORUS v1 MORUS v2

0

10

20

30

40

50

60

70

80

90

100

Throughput(Gbps)

MORUS v1 MORUS v2

0

5

10

15

20

25

30

Throughput/LUT(Mbps/LUT)

MORUS v1 MORUS v2

Comparison between MORUS-1280 v1 and MORUS-1280 v2

DIAC 2016 MORUS

MORUS: Hardware Performance

• Performance on ASIC: high throughtput/area (Michael Muehlberghuber and Frank K. Gürkaynak, DIAC 2015)

DIAC 2016 MORUS

• Performance on ASIC: high throughput (250Gbps) (Michael Muehlberghuber and Frank K. Gürkaynak, DIAC 2015)

DIAC 2016 MORUS

MORUS: Software Performance

16B 64B 512B 1024B 4096B 16384B

MORUS-640(EA) 40.64 10.35 2.30 1.72 1.30 1.19

MORUS-640(DV) 38.47 10.13 2.30 1.72 1.29 1.18

MORUS-1280(EA) 45.32 10.38 1.85 1.24 0.80 0.69

MORUS-1280(DV) 45.74 10.66 1.91 1.28 0.81 0.70

• Speed on Haswell, AVX2 is used in MORUS-1280

DIAC 2016 MORUS

MORUS: Software Performance

•Faster than AES-GCM on Haswell (1.03 cpb)

•Almost the same as MORUS v1 for long message

•Reasons: •Benefits from SIMD•Removed the redundant operations in the cipher

DIAC 2016 MORUS

Conclusion

•MORUS• The fastest candidate on the platforms with SIMD but

with no AES-NI (0.69 cpb with AVX2)• The most efficient candidate in hardware

MORUS-1280: 95.88 Gbps, 3764 LUTs, 25.6 Mbps/LUT

•MORUS v2• Tweaked finalization to reduce hardware area.

Throughput/Area is increased by 28%

DIAC 2016 MORUS

Thanks for your attention!

DIAC 2016 MORUS