Resource-Efficient FPGA Pseudorandom Number Generation
Husrev Cılasun*, Ivy Peng†, Maya Gokhale†
*University of Minnesota, Twin Cities
†Lawrence Livermore National Laboratory
Introduction
- Probability distributions play a critical role in diverse application domains.
  - In simulations, they model physical properties of materials, processes, or behaviors.
  - For instance, molecular dynamics codes often use the Maxwell-Boltzmann distribution for modeling temperature.
- We introduce a resource-efficient hardware pseudorandom number generator (RNG) and two optimizations:
  - Alias table partitioning: separates a target distribution into multiple subranges and facilitates local optimizations in each subrange to improve overall resource utilization.
  - Adaptive threshold resolution: adjusts the bit size used to represent threshold values to the precision required by the underlying partition.
- Our main contributions:
  - An analytic study driven by the dual considerations of improving accuracy and optimizing the hardware mapping.
  - Automated HDL generation, including both simulation and synthesis scripts.
  - Diverse use cases: emulating a Gaussian delay profile in the FPGA-based LiME memory system emulator [1], and a random number server for HPC applications.
Methodology
- Walker's Alias Method [2] is an efficient algorithm for FPGA hardware implementation. It generates arbitrary discrete distributions from uniformly generated random numbers. For a target distribution E(·), the method generates and uses a table of real threshold values F(·) and a table of alternative index values A(·), where F(·), A(·), and E(·) all have the same length. Each output sample Y is generated as

  $$Y = \begin{cases} X, & U \le F(X) \\ A(X), & U > F(X), \end{cases}$$

  where U is a real uniform random number and X is a uniform random integer. The output quality is a function of the precision of U: increasing its bit size, or representing U as a floating-point number [3], improves the quality.
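The sampling rule above can be illustrated in software. The following is a minimal sketch of a common textbook construction of the F(·) and A(·) tables, not the poster's HDL or MATLAB code. The function names `build_alias_table`, `sample`, and `implied_distribution` are hypothetical, and thresholds are kept as floating-point values in [0, 1].

```python
import random

def build_alias_table(probs):
    """Build threshold table F and alias table A for a discrete distribution."""
    n = len(probs)
    total = sum(probs)
    # Scale so that the average entry equals 1.
    scaled = [p * n / total for p in probs]
    F = [1.0] * n            # default: always keep index X
    A = list(range(n))       # default: alias points to itself
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        F[s] = scaled[s]     # keep s with probability scaled[s]
        A[s] = l             # otherwise emit the donor entry l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    # Entries left over (round-off) keep F = 1.0 and A = own index.
    return F, A

def sample(F, A, rng):
    """Draw one sample: Y = X if U <= F(X), else A(X)."""
    x = rng.randrange(len(F))   # uniform random integer X
    u = rng.random()            # uniform random real U in [0, 1)
    return x if u <= F[x] else A[x]

def implied_distribution(F, A):
    """Exact output distribution realized by the tables (F, A)."""
    n = len(F)
    out = [0.0] * n
    for i in range(n):
        out[i] += F[i] / n
        out[A[i]] += (1.0 - F[i]) / n
    return out
```

For example, `build_alias_table([0.4, 0.3, 0.2, 0.1])` yields tables whose `implied_distribution` reproduces the four target probabilities up to floating-point round-off. In hardware, the precision of the comparison `u <= F[x]` is bounded by the bit size of U and F, which is why the text ties output quality to that precision.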
- We target the Planck distribution (Eq. 1), whose PDF is a function of temperature T, and the Maxwell-Boltzmann distribution (Eq. 2), which is parameterized by the factor a.

  $$f(x) = \frac{2hc^2}{x^5} \exp\left(-\frac{hc}{xkT}\right) \tag{1}$$

  $$f(x) = \sqrt{\frac{2}{\pi}}\, \frac{x^2 \exp\left(-\frac{x^2}{2a^2}\right)}{a^3} \tag{2}$$
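An alias table needs a discrete target distribution E(·), so the continuous PDFs in Eq. 1 and Eq. 2 have to be sampled on a grid first. The sketch below shows one way this could be done, under stated assumptions: x in Eq. 1 is taken to be a wavelength in meters, the grid is uniform with midpoint sampling, and the names `eq1_pdf`, `eq2_pdf`, and `discretize` are hypothetical. The poster does not specify its discretization procedure.

```python
import math

H = 6.62607015e-34   # Planck constant h (J s)
C = 299792458.0      # speed of light c (m/s)
K = 1.380649e-23     # Boltzmann constant k (J/K)

def eq1_pdf(x, T):
    """Eq. 1, parameterized by temperature T (unnormalized as written)."""
    if x <= 0.0:
        return 0.0
    return (2.0 * H * C**2 / x**5) * math.exp(-H * C / (x * K * T))

def eq2_pdf(x, a):
    """Eq. 2, parameterized by the factor a."""
    return math.sqrt(2.0 / math.pi) * x**2 * math.exp(-x**2 / (2.0 * a**2)) / a**3

def discretize(pdf, lo, hi, n):
    """Evaluate pdf at n bin midpoints on [lo, hi] and normalize to sum 1."""
    step = (hi - lo) / n
    weights = [pdf(lo + (i + 0.5) * step) for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]
```

A call such as `discretize(lambda x: eq2_pdf(x, 1.0), 0.0, 6.0, 600)` returns a 600-entry probability vector that can serve as E(·). The range and entry count here are illustrative choices only.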
Integration with MATLAB
[Figure 1 diagram. Block labels: Desired Distribution, Range, Resolution, Sample Count, Walker's Algorithm, Alias Table Sampling, MATLAB/Octave, Python Wrapper, Boilerplate Text, HDL, Tcl, Elaboration, Simulation, Vivado, .csv, χ² Tests.]
Figure 1: An automated flow of customization and testing
PwCLT Architecture
[Figure 2 diagram. Block and signal labels: URNG-119, mixture_pdf_urng[118:0], c0_mixture_sign_flag[0:0], ROM (addr[6:0], data[37:0]), c0_alias_index[6:0], alias_table_urng[85:0], bernoulli_fp_urng[78:0], FP Comparator, cltfx_urng[31:0], FP Cast.]
Figure 2: PwCLT-8 Architecture [3] for LiME [1] integration.
Alias Table Partitioning
- We improve the resource utilization of alias tables by separating the target distribution into multiple subranges (four subranges are shown in Fig. 3).
  - Within each subrange, the standard alias table implementation is used.
  - This separation allows each table to be optimized locally: alias tables whose target distribution is smoother can be configured with fewer threshold bits per entry in the F(·) table.
  - Consequently, the alias tables can be selected based on their relative probability range and lifted accordingly.
- We propose adaptive threshold resolution to adjust the threshold bit size while maintaining statistical accuracy.
  - The quality of the generated samples is determined by the threshold resolution.
  - When alias table partitioning is employed, partitions with higher variance require a larger bit size, while a smaller bit size suffices for partitions with lower variance.
[Figure 3 diagram. Labels: URNG, ROM (×4), > (×4), <c1, <c2, <c3, <c4, +, Encoder, 0, N/4, N/2, 3N/4.]
Figure 3: An illustration of the alias table partitioning scheme, which selectively combines subdistributions by comparing a uniform random variable with CDF values of each distribution at the partition boundaries.
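The partitioning and per-partition threshold resolution described in this section can be sketched in software as follows. This is an illustrative model under stated assumptions, not the generated HDL: the table length is assumed to be divisible by the number of partitions, every subrange is assumed to carry nonzero probability, the per-partition bit sizes are supplied by the caller (the poster does not state how they are chosen), and all function names are hypothetical.

```python
import random

def build_alias_table(probs):
    """Textbook alias table construction: returns thresholds F and aliases A."""
    n = len(probs)
    total = sum(probs)
    scaled = [p * n / total for p in probs]
    F, A = [1.0] * n, list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        F[s], A[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return F, A

def quantize(F, bits):
    """Round each threshold to a multiple of 2**-bits (fixed-point model)."""
    scale = 1 << bits
    return [round(f * scale) / scale for f in F]

def build_partitioned(probs, num_parts, bits_per_part):
    """One alias table per subrange, plus the CDF at partition boundaries."""
    size = len(probs) // num_parts
    total = sum(probs)
    parts, cdf, acc = [], [], 0.0
    for k in range(num_parts):
        sub = probs[k * size:(k + 1) * size]
        F, A = build_alias_table(sub)
        # Each entry: quantized thresholds, aliases, index offset of the subrange.
        parts.append((quantize(F, bits_per_part[k]), A, k * size))
        acc += sum(sub) / total
        cdf.append(acc)
    cdf[-1] = 1.0  # guard against round-off in the last boundary
    return parts, cdf

def sample_partitioned(parts, cdf, rng):
    """Select a partition by comparing a uniform value with the CDF, then sample."""
    u_sel = rng.random()
    k = next(i for i, c in enumerate(cdf) if u_sel < c)
    F, A, offset = parts[k]
    x = rng.randrange(len(F))
    local = x if rng.random() <= F[x] else A[x]
    return offset + local  # shift the local index into its subrange

def implied_distribution(parts, cdf):
    """Exact output distribution realized by the partitioned tables."""
    out, prev = [], 0.0
    for (F, A, _), c in zip(parts, cdf):
        m = len(F)
        local = [0.0] * m
        for i in range(m):
            local[i] += F[i] / m
            local[A[i]] += (1.0 - F[i]) / m
        out.extend((c - prev) * p for p in local)
        prev = c
    return out
```

Comparing `implied_distribution` against the normalized target for different `bits_per_part` settings gives a quick way to see how much accuracy a given subrange loses at a given threshold bit size, which is the trade-off the adaptive threshold resolution is meant to exploit.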
Validation and Evaluation
[Figure panels:
(a) Maxwell-Boltzmann distribution, a=1. Axes: x (0 to 6) vs. normalized sample count (×10^-5). Series: MATLAB alias table, floating-point threshold, size=65536; FPGA simulated alias table, fixed-point threshold, size=65536; ideal double-precision.
(b) Planck distribution, T=700K. Axes: x (0 to 3, ×10^-5) vs. normalized sample count (×10^-5). Series: MATLAB alias table, floating-point threshold, size=65536; FPGA simulated alias table, fixed-point threshold, size=65536; ideal double-precision.
(c) Gaussian latency histogram in LiME.
(d) Memory savings from various partitioning schemes. Axes: output bits (11 to 16) vs. memory saving (%). Series: single alias table, 2-partitioned, 4-partitioned, 8-partitioned.]
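The accuracy check named in this poster is a χ² test. As a rough sketch of what such a goodness-of-fit check computes, the code below evaluates the standard Pearson χ² statistic of a sample histogram against a target distribution. It is not the authors' test script; `histogram` and `chi_square_statistic` are hypothetical names.

```python
def histogram(samples, n_bins):
    """Count how often each integer index 0..n_bins-1 occurs in samples."""
    counts = [0] * n_bins
    for s in samples:
        counts[s] += 1
    return counts

def chi_square_statistic(counts, probs):
    """Pearson statistic: sum over bins of (observed - expected)^2 / expected."""
    total = sum(counts)
    stat = 0.0
    for observed, p in zip(counts, probs):
        expected = total * p
        if expected > 0.0:   # skip bins the target distribution never produces
            stat += (observed - expected) ** 2 / expected
    return stat
```

For instance, `chi_square_statistic([50, 30, 10, 10], [0.4, 0.3, 0.2, 0.1])` evaluates to 7.5. The statistic is then compared with a critical value of the χ² distribution with (number of bins − 1) degrees of freedom; for 3 degrees of freedom at the 5% level that value is about 7.81.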
Conclusion
- We introduced a resource-efficient hardware RNG whose accuracy is validated by a χ² test.
- We proposed an alias table partitioning technique for optimizing resource utilization.
- Our RNG is evaluated in three use cases for memory emulations and scientific simulations.
References
[1] A. K. Jain, S. Lloyd, and M. Gokhale. Microscope on memory: MPSoC-enabled computer memory system assessments. In 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 173–180, 2018.
[2] A. J. Walker. An efficient method for generating discrete random variables with general distributions. ACM Transactions on Mathematical Software (TOMS), 3(3):253–256, 1977.
[3] D. B. Thomas. FPGA Gaussian random number generators with guaranteed statistical accuracy. In 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines, pages 149–156, 2014.
Acknowledgments
This work was supported by LLNL LDRD 19-ERD-004. LLNL-ABS-813772.