IMPLEMENTATION OF FINITE FIELD INVERSION
Debdeep Mukhopadhyay Chester Rebeiro
Dept. of Computer Science and Engineering
Indian Institute of Technology Kharagpur
INDIA
Finite Field Inverse
23-27 May 2011, Anurag Labs, DRDO
Itoh-Tsujii Method for Binary Fields
The Steps
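A minimal Python sketch of the basic Itoh-Tsujii idea, assuming a polynomial-basis representation with field elements stored as integers. It relies on the identity $a^{-1} = a^{2^m - 2} = (a^{2^{m-1} - 1})^2$ and on $\beta_{k+j}(a) = \beta_k(a)^{2^j} \cdot \beta_j(a)$, where $\beta_k(a) = a^{2^k - 1}$. The helper names (`gf2_mul`, `gf2_pow2k`, `ita_inverse`) and the binary addition chain for $m - 1$ are illustrative assumptions, not taken from the slides; a shorter addition chain may exist for a given $m$.

```python
def gf2_mul(a, b, poly, m):
    """Multiply a and b in GF(2^m); poly includes the x^m term."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:      # degree reached m: subtract the modulus
            a ^= poly
    return r


def gf2_pow2k(a, k, poly, m):
    """Raise a to the power 2^k by k repeated squarings."""
    for _ in range(k):
        a = gf2_mul(a, a, poly, m)
    return a


def ita_inverse(a, poly, m):
    """Inverse of a in GF(2^m), m >= 2, via beta_k(a) = a^(2^k - 1)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    beta, k = a, 1                        # beta_1(a) = a
    for bit in bin(m - 1)[3:]:            # bits of m-1 after the leading 1
        # doubling step: beta_2k = beta_k^(2^k) * beta_k
        beta = gf2_mul(gf2_pow2k(beta, k, poly, m), beta, poly, m)
        k *= 2
        if bit == "1":
            # increment step: beta_(k+1) = beta_k^2 * a
            beta = gf2_mul(gf2_pow2k(beta, 1, poly, m), a, poly, m)
            k += 1
    return gf2_pow2k(beta, 1, poly, m)    # a^-1 = beta_(m-1)^2


# GF(2^4) with x^4 + x + 1: the inverse of x^3 + x^2 + 1 is x^2
print(bin(ita_inverse(0b1101, 0b10011, 4)))  # 0b100
```

Multiplying the result back by the input should give 1, which is a convenient self-check for any field size.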
How do we do a Squaring?
Consider (again) the field $GF(2^4)$, with irreducible polynomial $x^4 + x + 1$. What is $(x^3 + x^2 + 1)^2$ in this field?
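The question can be worked through in a few lines of Python. This is a sketch that assumes field elements are stored as integers whose bit $i$ is the coefficient of $x^i$; the function name `gf2_square` is hypothetical. Squaring over GF(2) moves coefficient $i$ to position $2i$, and the result is then reduced modulo the irreducible polynomial.

```python
POLY = 0b10011  # x^4 + x + 1
M = 4


def gf2_square(a, poly=POLY, m=M):
    """Square a in GF(2^m): spread the bits, then reduce modulo poly."""
    sq = 0
    for i in range(m):
        if (a >> i) & 1:
            sq |= 1 << (2 * i)            # x^i squared is x^(2i)
    # Clear every term of degree >= m, highest degree first
    for deg in range(2 * m - 2, m - 1, -1):
        if (sq >> deg) & 1:
            sq ^= poly << (deg - m)
    return sq


a = 0b1101                                # x^3 + x^2 + 1
print(bin(gf2_square(a)))                 # 0b1110, i.e. x^3 + x^2 + x
```

By hand: $(x^3 + x^2 + 1)^2 = x^6 + x^4 + 1$, and with $x^4 = x + 1$ and $x^6 = x^3 + x^2$ this reduces to $x^3 + x^2 + x$.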
Squaring
Squaring can be represented in the form of a matrix multiplication $T \cdot a$
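A short sketch of how such a matrix can be built, assuming the same $GF(2^4)$ example with $x^4 + x + 1$. Column $i$ of $T$ holds the coefficients of $x^{2i}$ reduced modulo the irreducible polynomial. The names `reduce_poly`, `squaring_matrix` and `mat_vec` are illustrative, not from the slides.

```python
POLY, M = 0b10011, 4  # x^4 + x + 1


def reduce_poly(c, poly=POLY, m=M):
    """Reduce polynomial c modulo poly (degree m)."""
    while c.bit_length() > m:
        c ^= poly << (c.bit_length() - 1 - m)
    return c


def squaring_matrix(poly=POLY, m=M):
    """Return T as a list of rows; T[j][i] is the x^j coefficient of x^(2i) mod poly."""
    cols = [reduce_poly(1 << (2 * i), poly, m) for i in range(m)]
    return [[(cols[i] >> j) & 1 for i in range(m)] for j in range(m)]


def mat_vec(T, a, m=M):
    """Compute T * a over GF(2); bit i of a is the coefficient a_i."""
    out = 0
    for j in range(m):
        bit = 0
        for i in range(m):
            bit ^= T[j][i] & ((a >> i) & 1)
        out |= bit << j
    return out


T = squaring_matrix()
for row in T:
    print(row)
print(bin(mat_vec(T, 0b1101)))  # 0b1110
```

Each output bit is an XOR of the input bits selected by one row of $T$, which is why the squarer maps directly onto LUTs.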
Quad Operation
A quad operation (raising to the fourth power) can be done by two squaring operations.
A quad operation can be written in the form $T^2 \cdot a$
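The sketch below checks this on the $GF(2^4)$ example, assuming the polynomial $x^4 + x + 1$ and an integer encoding of field elements. It stores the matrix $T^n$ by columns, where column $i$ is $(x^i)^{2^n}$ reduced modulo the polynomial; `power_columns` and `apply_cols` are hypothetical helper names.

```python
POLY, M = 0b10011, 4  # x^4 + x + 1


def square_mod(c, poly=POLY, m=M):
    """Square c in GF(2^m): spread bits, then reduce."""
    sq = 0
    for b in range(m):
        if (c >> b) & 1:
            sq |= 1 << (2 * b)
    while sq.bit_length() > m:
        sq ^= poly << (sq.bit_length() - 1 - m)
    return sq


def power_columns(n, poly=POLY, m=M):
    """Columns of T^n: column i is (x^i)^(2^n) mod poly."""
    cols = []
    for i in range(m):
        c = 1 << i
        for _ in range(n):
            c = square_mod(c, poly, m)
        cols.append(c)
    return cols


def apply_cols(cols, a):
    """Matrix-vector product over GF(2): XOR the columns selected by a."""
    out = 0
    for i, c in enumerate(cols):
        if (a >> i) & 1:
            out ^= c
    return out


T1 = power_columns(1)   # squarer
T2 = power_columns(2)   # quad circuit
a = 0b1101
print(bin(apply_cols(T2, a)))                   # 0b1011
print(bin(apply_cols(T1, apply_cols(T1, a))))   # 0b1011, same result
```

In hardware the quad block is a single combinational circuit; the two-squarings view is only used here to validate $T^2$.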
Advantage of using Quad Operations
Quad circuits have better LUT utilization compared to Squarer circuits
Generalization of the Itoh-Tsujii Algorithm
Theorem 1
Theorem 2
Quad Itoh-Tsujii Inversion Algorithm
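A hedged software sketch of a quad-based inversion for odd $m$. It assumes $\alpha_k(a) = a^{4^k - 1}$, the recurrence $\alpha_{k+j}(a) = \alpha_k(a)^{4^j} \cdot \alpha_j(a)$, and $a^{-1} = \alpha_{(m-1)/2}(a)^2$, all of which follow from $a^{-1} = a^{2^m - 2}$. The function names and the binary addition chain for $(m-1)/2$ are assumptions made for illustration; the slides' exact algorithm and addition chain may differ.

```python
def gf2_mul(a, b, poly, m):
    """Multiply a and b in GF(2^m); poly includes the x^m term."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r


def quad_n(a, n, poly, m):
    """Apply the quad operation n times, i.e. compute a^(4^n)."""
    for _ in range(2 * n):                 # one quad = two squarings
        a = gf2_mul(a, a, poly, m)
    return a


def quad_ita_inverse(a, poly, m):
    """Inverse of a in GF(2^m) for odd m >= 3 using alpha_k = a^(4^k - 1)."""
    if m % 2 == 0:
        raise ValueError("this sketch covers odd m only")
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    alpha1 = gf2_mul(gf2_mul(a, a, poly, m), a, poly, m)   # a^3
    alpha, k = alpha1, 1
    for bit in bin((m - 1) // 2)[3:]:      # bits after the leading 1
        # alpha_2k = alpha_k^(4^k) * alpha_k
        alpha = gf2_mul(quad_n(alpha, k, poly, m), alpha, poly, m)
        k *= 2
        if bit == "1":
            # alpha_(k+1) = alpha_k^4 * alpha_1
            alpha = gf2_mul(quad_n(alpha, 1, poly, m), alpha1, poly, m)
            k += 1
    return gf2_mul(alpha, alpha, poly, m)  # final squaring


POLY233 = (1 << 233) | (1 << 74) | 1       # x^233 + x^74 + 1
a = 0x1234567890ABCDEF
inv = quad_ita_inverse(a, POLY233, 233)
print(gf2_mul(a, inv, POLY233, 233))       # 1
```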
A Circuit for Inversion
At every clock cycle, either the multiplier or the quadblock is active.
The output of the multiplier is stored in the mout register.
Finding the Inverse
Finding the Inverse Step 2
Finding the Inverse Step 2
Control Signals for the Inverse
Performance Charts
Higher Powered Itoh-Tsujii
• We have seen that quad circuits utilize LUTs better than squarer circuits.
• LUT size is also increasing as silicon technology scales down.
• We have seen 4-LUT become 6-LUT, and now 8-LUT.
• This motivates investigating power circuits higher than quad.
Revisiting the Theorems
$2^n$ Itoh-Tsujii Inversion
These are the overheads
Higher Powered
Overhead in $2^n$ Itoh-Tsujii
• Computation of .
• Using addition chain for , can be computed in clock cycles, where is the length of addition chain for .
• Computation of , for
• Using addition chain for , that contains , can be
computed during computation, because .
$2^n$ Itoh-Tsujii Design
Configurable Parameters
• Addition chain.
• Power circuit used in power block.
• Number of cascaded power circuits in the power block.
• These have an effect on
– Number of clock cycles.
– Critical path delay.
Building the Optimal Design
For a given field and a given FPGA, how do we decide the optimal design?
Estimating AREA required on an FPGA
• A $k$-input LUT ($k$-LUT) can implement any function of at most $k$ input variables.
• Total number of k-LUTs to implement a function with variables can be expressed as
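A small sketch of one way to estimate this count. It assumes a simple packing model: the first $k$-LUT absorbs $k$ inputs, and every further LUT takes the previous LUT's output plus $k - 1$ new inputs. The function name `lut` and the model itself are assumptions for illustration and may not match the expression on the slide exactly.

```python
def lut(x, k):
    """Estimated number of k-LUTs for a function of x input variables.

    Assumed model: no LUT for a single variable (a wire), one LUT for up
    to k variables, then one more LUT per additional k-1 variables.
    """
    if x <= 1:
        return 0
    if x <= k:
        return 1
    extra = x - k
    return 1 + -(-extra // (k - 1))   # 1 + ceil(extra / (k - 1))


for x in (1, 4, 5, 7, 8):
    print(x, lut(x, 4))
# prints: 1 0, 4 1, 5 2, 7 2, 8 3 (one pair per line)
```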
Estimating Delay of a Design in an FPGA
• Delay in FPGAs comprises LUT delay and routing delay.
• For this ITA architecture, we have found experimentally that the total delay is proportional to the number of LUTs in the critical path.
• We denote the number of LUTs in a delay path as maxlutpath.
• In k-LUT, maxlutpath of an variable function is
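A sketch of a depth estimate under the assumption that the LUTs are arranged as a balanced tree, so that a function of $x$ variables needs $\lceil \log_k x \rceil$ LUT levels. This tree model is an assumption made for illustration; the loop avoids floating-point logarithms so that exact powers of $k$ are handled correctly.

```python
def maxlutpath(x, k):
    """Estimated number of k-LUT levels for a function of x variables,
    assuming a balanced tree: ceil(log_k(x)), computed with integers."""
    depth, reach = 0, 1
    while reach < x:       # 'reach' = inputs coverable with 'depth' levels
        reach *= k
        depth += 1
    return depth


for x in (1, 4, 5, 16, 17):
    print(x, maxlutpath(x, 4))
# prints: 1 0, 4 1, 5 2, 16 2, 17 3 (one pair per line)
```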
Recap : Karatsuba Multiplier
Hybrid Karatsuba Multiplier for $GF(2^{233})$
Note that the School-Book multiplier has replaced the general Karatsuba multiplier.
School Book Multiplier
Estimating LUT Requirement for Hybrid Karatsuba Multiplier
• The field multiplier is a hybrid Karatsuba multiplier.
• An $m$-bit hybrid Karatsuba multiplier consists of two $\lceil m/2 \rceil$-bit and one $\lfloor m/2 \rfloor$-bit multipliers. This happens in a recursive manner.
• At the threshold ( ) level, the School-Book multiplier is invoked.
• Total area of an $m$-bit hybrid Karatsuba multiplier is given by
• Total area for the School-Book multiplier is
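The recursion can be sketched in software as follows. This is an illustrative model only: operands are polynomials over GF(2) stored as integers, the split uses $\lceil m/2 \rceil$ low bits, and the threshold value of 8 is an arbitrary assumption rather than the design's actual threshold. The function names are hypothetical. The result is the unreduced product; modular reduction is a separate step.

```python
def schoolbook_mul(a, b):
    """Shift-and-XOR polynomial multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def hybrid_karatsuba(a, b, m, threshold=8):
    """Multiply two polynomials with at most m coefficients each.

    Above the threshold, split into three half-size products (Karatsuba);
    at or below it, fall back to the school-book multiplier.
    """
    if m <= threshold:
        return schoolbook_mul(a, b)
    half = (m + 1) // 2                    # ceil(m / 2)
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = hybrid_karatsuba(a_lo, b_lo, half, threshold)            # ceil(m/2) bits
    hi = hybrid_karatsuba(a_hi, b_hi, m - half, threshold)        # floor(m/2) bits
    mid = hybrid_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, half, threshold)
    # mid ^ lo ^ hi is the cross term a_hi*b_lo + a_lo*b_hi
    return lo ^ ((mid ^ lo ^ hi) << half) ^ (hi << (2 * half))


a = (1 << 232) | 0x123456789ABCDEF
b = (1 << 231) | 0xFEDCBA987654321
print(hybrid_karatsuba(a, b, 233) == schoolbook_mul(a, b))  # True
```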
Estimating Delay of Hybrid Karatsuba Multiplier
• The hybrid Karatsuba multiplier decomposes into smaller multipliers arranged like a tree. The height of the tree is
• Each level of the Simple Karatsuba tree introduces one LUT delay.
• At the threshold ( ) level, the School-Book multiplier delay is added.
• Delay of School-Book multiplier is
• Delay of the entire multiplier in LUTs is given by
Estimating Area & Delay for Modular Reduction
• For fields generated by trinomials, the area of modular reduction is almost equal to the field size and the delay is one LUT considering LUT size .
• For fields generated by pentanomials,
– and 2 LUT for .
– and 2 LUT for .
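To illustrate why trinomial reduction is cheap, here is a sketch of reduction modulo $x^m + x^t + 1$ in Python, using the substitution $x^m = x^t + 1$. The function name `reduce_trinomial` is hypothetical, and the example polynomial $x^{233} + x^{74} + 1$ is the standard trinomial for $GF(2^{233})$, assumed here because the slides target that field.

```python
def reduce_trinomial(c, m, t):
    """Reduce polynomial c modulo x^m + x^t + 1, with 0 < t < m.

    The part of c at or above degree m is folded back using
    x^m = x^t + 1; the loop repeats until nothing is left above degree m-1.
    """
    mask = (1 << m) - 1
    while c >> m:
        high = c >> m
        c = (c & mask) ^ high ^ (high << t)
    return c


# GF(2^4) with x^4 + x + 1: x^6 + x^4 + 1 reduces to x^3 + x^2 + x
print(bin(reduce_trinomial(0b1010001, 4, 1)))   # 0b1110

# A product-sized input for x^233 + x^74 + 1
c = (1 << 464) | (0xDEADBEEF << 300) | 0x12345
print(reduce_trinomial(c, 233, 74).bit_length() <= 233)  # True
```

Each output bit is an XOR of only a few input bits, which is consistent with the small area and delay stated above for trinomials.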
Area & Delay Estimates for $2^n$ Circuit
• The output of a $2^n$ circuit, which raises an input can be expressed as , where is binary field matrix and ,
• LUT requirement per output bit is
• Total LUT requirement for the $2^n$ circuit is
• LUT delay per output bit is
• Since all bits are in parallel, delay of $2^n$ circuit is
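The estimate can be sketched as follows, under stated assumptions: each output bit is an XOR of the input bits selected by one row of the matrix $T^n$, the LUT count per bit follows an assumed chain model (`lut`), and the depth per bit follows an assumed balanced-tree model (`maxlutpath`). All function names are hypothetical and the numbers are estimates from this model, not results from the slides.

```python
def lut(x, k):
    """Assumed k-LUT count for a function of x variables."""
    if x <= 1:
        return 0
    if x <= k:
        return 1
    return 1 + -(-(x - k) // (k - 1))


def maxlutpath(x, k):
    """Assumed LUT depth: ceil(log_k(x)) via integer arithmetic."""
    depth, reach = 0, 1
    while reach < x:
        reach *= k
        depth += 1
    return depth


def square_mod(c, poly, m):
    sq = 0
    for b in range(m):
        if (c >> b) & 1:
            sq |= 1 << (2 * b)
    while sq.bit_length() > m:
        sq ^= poly << (sq.bit_length() - 1 - m)
    return sq


def row_weights(n, poly, m):
    """Number of input bits feeding each output bit of the 2^n circuit."""
    weights = [0] * m
    for i in range(m):
        c = 1 << i
        for _ in range(n):
            c = square_mod(c, poly, m)      # column i of T^n
        for j in range(m):
            weights[j] += (c >> j) & 1
    return weights


def power_circuit_estimate(n, poly, m, k):
    """Return (total LUTs, LUT delay) for the 2^n circuit."""
    w = row_weights(n, poly, m)
    return sum(lut(x, k) for x in w), max(maxlutpath(x, k) for x in w)


print(power_circuit_estimate(1, 0b10011, 4, 4))  # (2, 1) squarer in GF(2^4)
print(power_circuit_estimate(2, 0b10011, 4, 4))  # (3, 1) quad in GF(2^4)
```

The total is a sum over output bits, while the delay is a maximum, because all output bits are computed in parallel.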
Area & Delay Estimates for Multiplexer
• For a $2^s : 1$ MUX, there are $s$ selection lines and thus the output is a function of $2^s + s$ variables.
• For a MUX in , each of the $2^s$ input lines is of width $m$ bits.
• Total LUT requirement is
• Total LUT delay of the MUX is
• When number of inputs to MUX , the above gives a close upper bound
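A sketch of the multiplexer estimate implied by the bullets above: each of the $m$ output bits is a function of $2^s + s$ variables, so the area is $m$ times the per-bit LUT count and the delay is the per-bit depth. The `lut` and `maxlutpath` models are assumptions (chain packing and balanced tree respectively), and `mux_estimate` is a hypothetical name.

```python
def lut(x, k):
    """Assumed k-LUT count for a function of x variables."""
    if x <= 1:
        return 0
    if x <= k:
        return 1
    return 1 + -(-(x - k) // (k - 1))


def maxlutpath(x, k):
    """Assumed LUT depth: ceil(log_k(x)) via integer arithmetic."""
    depth, reach = 0, 1
    while reach < x:
        reach *= k
        depth += 1
    return depth


def mux_estimate(s, m, k):
    """Return (total LUTs, LUT delay) for an m-bit wide 2^s : 1 MUX."""
    inputs_per_bit = 2 ** s + s            # data lines plus select lines
    return m * lut(inputs_per_bit, k), maxlutpath(inputs_per_bit, k)


print(mux_estimate(1, 233, 4))   # (233, 1): 2:1 MUX on 4-LUTs
print(mux_estimate(2, 233, 4))   # (466, 2): 4:1 MUX on 4-LUTs
print(mux_estimate(2, 233, 6))   # (233, 1): 4:1 MUX fits one 6-LUT per bit
```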
Area & Delay of PowerBlock
• Let the Powerblock contain $u_s$ cascaded $2^n$ circuits.
• The has selection lines, where
• LUT requirement for is
• Total LUT requirement for Powerblock is
• Delay of is
• Total LUT delay of Powerblock is
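One possible way to combine the pieces, written as a hedged sketch. It assumes the Powerblock is a chain of $u_s$ identical $2^n$ circuits followed by a multiplexer that selects one of the $u_s$ intermediate outputs, so that the number of selection lines is $s = \lceil \log_2 u_s \rceil$, the areas add, and the worst-case delay runs through the whole cascade and then the MUX. These structural assumptions, the helper models and the sample figure of 100 LUTs per circuit are illustrative only.

```python
def lut(x, k):
    """Assumed k-LUT count for a function of x variables."""
    if x <= 1:
        return 0
    if x <= k:
        return 1
    return 1 + -(-(x - k) // (k - 1))


def maxlutpath(x, k):
    """Assumed LUT depth: ceil(log_k(x)) via integer arithmetic."""
    depth, reach = 0, 1
    while reach < x:
        reach *= k
        depth += 1
    return depth


def powerblock_estimate(circuit_luts, circuit_delay, u_s, m, k):
    """Return (total LUTs, LUT delay) for u_s cascaded circuits plus a MUX.

    circuit_luts / circuit_delay describe ONE 2^n circuit (hypothetical
    inputs, e.g. taken from a separate per-circuit estimate).
    """
    s = (u_s - 1).bit_length()             # ceil(log2(u_s)) selection lines
    per_bit_inputs = 2 ** s + s
    mux_luts = m * lut(per_bit_inputs, k)
    mux_delay = maxlutpath(per_bit_inputs, k)
    return u_s * circuit_luts + mux_luts, u_s * circuit_delay + mux_delay


print(powerblock_estimate(100, 1, 2, 233, 4))  # (433, 3)
print(powerblock_estimate(100, 1, 4, 233, 6))  # (633, 5)
```

The example shows the trade-off in this model: adding cascades raises both area and the delay through the Powerblock.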
Area & Delay for the Entire Architecture
• LUT estimate for the entire architecture is
• There are two parallel delay paths.
– LUT delay of first path is
– LUT delay of second path is
– LUT delay of entire architecture is
Optimal Number of Cascades
• For a given field and based FPGA, Powerblock can be configured with different power circuits and cascades .
• Increase in reduces clock cycles, but increases delay of Powerblock.
• is fixed, but depends on and .
• is minimum when
• Minimum delay of the ITA architecture is thus
Power Circuit Selection to achieve Minimum Clock Cycles
• Number of clock cycles for the inversion can be approximated as
• Number of clock cycles for increases linearly with .
• The term reduces with increase in .
• When is small, the reduction in is significant for increase in .
• But, for large values of n, the increase in dominates over the decrease in
• So, increases with increase in for large values of .
Tuning Design for Optimality
• The performance metric is
• Minimization of without increasing gives the best performance. Area remains almost the same.
• The following steps are performed to achieve optimal performance
• The optimal architecture is given by
Validation of Theoretical Estimates
• Our estimation model uses maxlutpath to find LUT delay.
• Routing delay is difficult to model in FPGAs.
• To get the overall delay, we have used experimental results for a reference ITA architecture.
• Total delay of reference architecture is the
• Let the LUT delay of the reference architecture be
• Total delay of any other ITA architecture in the same field is approximately
• Here is a constant and depends on FPGA technology.
• In 4-LUT based and 6-LUT based Xilinx FPGAs, has values 0.2 and 0.1 respectively.
Validation on 4-input LUT FPGAs
Validation on 6-input LUT FPGAs
Experimental Results
Comparison Charts