Interconnect Length Estimation in VLSI Designs: A Retrospective
Massoud Pedram, University of Southern California
Motivation and Problem Definition
Interconnect represents an increasingly significant part of total circuit delay, and longer interconnect is more significant.
Interconnect length is accurately known only after place and route. This leads to timing closure problems, and logic design is now coupled with physical design.
Interconnect must be considered during floorplanning, synthesis, and timing verification.
We need to be able to predict the length of individual wires before layout, for example during technology mapping.
Previous Work
Previous work in this area:
Pedram and Preas, ICCD-89: average wire length for a given pin count
Heineken and Maly, CICC-96: wire-length distribution
Hamada, Cheng, and Chau, TCAD 1996: average wire length for a given pin count
Srinivas Bodapati and Farid N. Najm, TVLSI 2001
Andrew Kahng and Sherief Reda, SLIP 2006
Dirk Stroobandt
Others …
Key Ideas
The number of pins on a net (denoted Pnet) is known to affect net length.
The first-level neighborhood of a given net i (denoted Nh1(i)) is defined as the set of all other nets connected to cells to which this net is also connected.
The second-level neighborhood of a given net i (denoted Nh2(i)) is defined as the union of all first-level neighborhoods of the nets that are in the first-level neighborhood of this net.
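The two neighborhood definitions can be illustrated with a small sketch. The netlist representation (a dictionary from net name to the set of cells it touches), the function names, and the toy data are all assumptions made for illustration; the pin count is approximated as one pin per connected cell.

```python
from collections import defaultdict


def first_level_neighborhoods(net_to_cells):
    """Nh1(i): all other nets that share at least one cell with net i."""
    cell_to_nets = defaultdict(set)
    for net, cells in net_to_cells.items():
        for cell in cells:
            cell_to_nets[cell].add(net)
    nh1 = {}
    for net, cells in net_to_cells.items():
        neighbors = set()
        for cell in cells:
            neighbors |= cell_to_nets[cell]
        neighbors.discard(net)  # "other nets" only
        nh1[net] = neighbors
    return nh1


def second_level_neighborhoods(nh1):
    """Nh2(i): union of Nh1(j) over every net j in Nh1(i).

    Taken literally, this union can contain net i itself; the sketch
    keeps it rather than guessing at an exclusion rule.
    """
    nh2 = {}
    for net, neighbors in nh1.items():
        union = set()
        for other in neighbors:
            union |= nh1[other]
        nh2[net] = union
    return nh2


# Hypothetical toy netlist: net name -> cells it connects.
netlist = {
    "n1": {"c1", "c2"},
    "n2": {"c2", "c3"},
    "n3": {"c3", "c4"},
    "n4": {"c5"},
}
nh1 = first_level_neighborhoods(netlist)
nh2 = second_level_neighborhoods(nh1)
# Assumption: one pin per connected cell, so Pnet is the cell count.
pin_counts = {net: len(cells) for net, cells in netlist.items()}
```

Here net n2 shares cell c2 with n1 and cell c3 with n3, so its first-level neighborhood is {n1, n3}, while the isolated net n4 has empty neighborhoods at both levels.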
LEQA: Latency Estimation for a Quantum Algorithm Mapped to a Quantum Circuit Fabric
Mohammad Javad Dousti and Massoud Pedram (DAC 2013 paper)
Related Papers
M. Pedram and B. T. Preas, "Accurate prediction of physical design characteristics of random logic," Proc. of Int'l Conference on Computer Design: VLSI in Computers and Processors, Oct. 1989, pp. 100-108.
M. Pedram and B. T. Preas, "Interconnection length estimation for optimized standard cell layouts," Proc. of Int'l Conference on Computer Aided Design, Nov. 1989, pp. 390-393.
Overview
Introduction & Motivation
Problem Statement
Preliminaries: Quantum Operation Dependency Graph (QODG), Universal Logic Blocks (ULBs)
Estimating the Latency of a Quantum Algorithm: Average Routing Latency for CNOT Gate
LEQA Performance
Experimental Results
Conclusion
Introduction & Motivation
The total execution time of a program depends on: 1. processor architecture, 2. circuit design, 3. place and route.
Several methods have been proposed for estimating a program's execution time without running it on a specific processor or processor simulator.
The same paradigm exists for quantum computers: calculating the exact latency of a quantum algorithm is an expensive proposition, since it requires scheduling and placement of quantum operations and routing of qubits. The exact answer has no use, since there is no real-size quantum computer out there!
However, latency estimation of the mapped quantum circuit still has many applications: early algorithm/program analysis, and helping quantum error correction code (QECC) designers budget enough resources for QECCs.
Problem Statement
Given:
A quantum circuit
Size of the fabric (width×height)
Logical gate delays
The capacity of routing channels
Speed of a logical qubit through the routing channels
Estimate the latency of the quantum circuit when it is mapped to the quantum circuit fabric.
Preliminaries (1): Quantum Operation Dependency Graph (QODG)
In QODG, nodes represent quantum operations and edges capture data dependencies.
[Figure: a 3-input Toffoli gate; the synthesized ham3 circuit on qubits q1, q2, q3, with operations numbered 1 to 19 (including H, T, and T† gates); and the QODG of the ham3 circuit, with nodes start, 1 to 19, and end.]
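Since the QODG is a directed acyclic graph from a start node to an end node, its critical path is the longest delay-weighted path between the two. A minimal sketch of that computation follows; the edge-list representation, the function name, and the tiny example graph with its delays are hypothetical and are not the ham3 circuit.

```python
from collections import defaultdict, deque


def critical_path(edges, delay, end="end"):
    """Longest delay-weighted path to `end` in a DAG (Kahn's algorithm).

    edges: iterable of (u, v) dependency pairs, u must finish before v.
    delay: dict mapping every node to its own delay.
    Returns (total delay, list of nodes on the path).
    """
    succ = defaultdict(list)
    indeg = {n: 0 for n in delay}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    ready = deque(n for n in delay if indeg[n] == 0)
    finish = {n: delay[n] for n in ready}  # earliest finish time per node
    pred = {n: None for n in delay}
    visited = 0
    while ready:
        u = ready.popleft()
        visited += 1
        for v in succ[u]:
            candidate = finish[u] + delay[v]
            if candidate > finish.get(v, float("-inf")):
                finish[v] = candidate
                pred[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if visited != len(delay):
        raise ValueError("dependency graph contains a cycle")

    path, node = [], end
    while node is not None:
        path.append(node)
        node = pred[node]
    return finish[end], path[::-1]


# Hypothetical dependency graph with dummy start/end nodes of zero delay.
delays = {"start": 0, "op1": 1, "op2": 2, "op3": 3, "end": 0}
deps = [("start", "op1"), ("start", "op2"),
        ("op1", "op3"), ("op2", "op3"), ("op3", "end")]
length, path = critical_path(deps, delays)
```

In this toy graph the path through op2 dominates the one through op1, so the critical path is start, op2, op3, end with a total delay of 5.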
Preliminaries (2): Universal Logic Blocks (ULBs)
To avoid dealing with complexity, a Tiled Quantum Architecture (TQA) is used, which is composed of a regular two-dimensional array of ULBs.
[Figure: A 3×3 Tiled Quantum Architecture (TQA).]
Each ULB can perform any fault-tolerant (FT) quantum operation.
ULBs are separated by the routing channels, which are needed to move logical qubits from some source ULBs to a target ULB in the TQA.
[Figure: example circuit on qubits q1, q2, q3 with H, CNOT, T, and T† operations.]
Estimating the Latency of a Quantum Algorithm
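Delay of a quantum algorithm can be formulated as follows:
$$D = N^{cr}_{CNOT}\,\bigl(d_{CNOT} + L^{avg}_{CNOT}\bigr) + \sum_{g \in G} N^{cr}_{g}\,\bigl(d_{g} + L^{avg}_{g}\bigr)$$
where $G$ is the set of one-qubit FT operations (such as H, T, S, etc.); $N^{cr}_{CNOT}$ and $N^{cr}_{g}$ are the numbers of CNOTs and of operations of type $g$ on the critical path; $d_{CNOT}$ and $d_{g}$ are the delays of a CNOT and of an operation of type $g$, respectively; and $L^{avg}_{CNOT}$ and $L^{avg}_{g}$ capture the average routing latency for the input qubits of a CNOT and for the input qubit of an operation of type $g$.
$d_{CNOT}$ and $d_{g}$: technology-, QECC-, and QC-dependent values.
$L^{avg}_{g}$: easy; empirically set to 2×Tmove.
$L^{avg}_{CNOT}$: main challenge!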
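The delay formulation charges every operation on the critical path its gate delay plus its average routing latency. A minimal sketch of that sum is given below; the function name, the tuple layout, and all numeric values are hypothetical, and only the 2×Tmove setting for one-qubit operations comes from the slide.

```python
def estimate_delay(n_cnot, d_cnot, l_cnot_avg, one_qubit_ops):
    """Critical-path delay: CNOT term plus one term per one-qubit type.

    one_qubit_ops maps an operation type to a tuple
    (count on critical path, gate delay, average routing latency).
    """
    total = n_cnot * (d_cnot + l_cnot_avg)
    for count, gate_delay, routing_latency in one_qubit_ops.values():
        total += count * (gate_delay + routing_latency)
    return total


t_move = 1.0  # hypothetical time for one qubit move, in arbitrary units
ops = {
    # type: (count, delay, routing latency = 2 x Tmove as on the slide)
    "H": (2, 5.0, 2 * t_move),
    "T": (4, 10.0, 2 * t_move),
}
delay = estimate_delay(n_cnot=6, d_cnot=8.0, l_cnot_avg=3.5, one_qubit_ops=ops)
```

With these made-up numbers the CNOT term contributes 6 × 11.5 = 69, the H term 14, and the T term 48, for a total of 131 time units.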
Average Qubit Routing Latency for CNOT Gate
A computationally efficient model for estimating the average qubit routing latency for CNOT gates is developed.
The model comprises a number of sub-models dealing with:
Possible placement locations of each qubit, captured as a “presence zone”
Congestion in the routing channels, captured by “zone overlaps”
Intra-zone routing, modelled as a “shortest Hamiltonian path”
A procedural method combines the sub-models to estimate the qubit routing latency for CNOT gates.
[Figure: 5 presence zones, numbered 1 to 5, on the fabric; the area where they overlap is highly congested.]
Estimating Average Routing Latency for CNOT ($L^{avg}_{CNOT}$)
Since the result of the placement is not known a priori, the presence zones are assumed to be placed randomly (uniformly and independently) on the fabric. $L^{avg}_{CNOT}$ can then be estimated from the following quantities:
$Q$ is the total number of logical qubits in the target quantum circuit;
$E[S_q]$ is the expected area of the quantum circuit fabric that is covered by exactly $q$ overlapping presence zones;
$d_q$ is the average routing latency of a qubit when the routing channels are occupied by $q$ qubits; and
$A$ is the area of the circuit fabric, which is equal to the total number of ULBs assuming that each ULB is a square.
$E[S_q]$ and $d_q$ must themselves be estimated.
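Estimating the Expected Covered Surface ($E[S_q]$)
$E[S_q]$ is computed from $P_{x,y}$, the probability that the ULB at position $(x,y)$ on the fabric is covered by a qubit’s presence zone, which is itself randomly positioned on the fabric.
$P_{x,y}$ in turn depends on $B$, the average area of presence zones.
[Figure: fabric with origin (0,0), axes x and y, and dimensions a and b, annotated with the lengths a-x+1 and b-y+1.]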
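One way to make the coverage probability and the expected covered area concrete is sketched below. It rests on assumptions that are not taken from the slides: every presence zone is a square with integer side `s` (for example the rounded square root of the average zone area), it is placed uniformly at random among all positions that lie fully inside an a×b fabric, and zones of different qubits are independent, so the number of zones covering a given ULB follows a binomial distribution. All function and parameter names are hypothetical.

```python
from math import comb


def coverage_probability(x, y, a, b, s):
    """Probability that ULB (x, y), 1-indexed, lies inside a random s x s zone."""
    def axis_count(pos, length):
        # Zone start positions along one axis that cover coordinate `pos`.
        lo = max(1, pos - s + 1)
        hi = min(pos, length - s + 1)
        return max(0, hi - lo + 1)

    placements = (a - s + 1) * (b - s + 1)  # all valid zone positions
    return axis_count(x, a) * axis_count(y, b) / placements


def expected_covered_area(q, num_qubits, a, b, s):
    """Expected number of ULBs covered by exactly q of num_qubits zones."""
    total = 0.0
    for x in range(1, a + 1):
        for y in range(1, b + 1):
            p = coverage_probability(x, y, a, b, s)
            total += comb(num_qubits, q) * p**q * (1 - p) ** (num_qubits - q)
    return total


# Hypothetical 6 x 6 fabric, 4 logical qubits, 2 x 2 presence zones.
a, b, s, num_qubits = 6, 6, 2, 4
areas = [expected_covered_area(q, num_qubits, a, b, s)
         for q in range(num_qubits + 1)]
single_zone_area = sum(coverage_probability(x, y, a, b, s)
                       for x in range(1, a + 1) for y in range(1, b + 1))
```

Two sanity checks follow from the construction: the areas for all values of q, including q = 0, add up to the fabric area, and the coverage probabilities of a single zone add up to the zone's own area.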
Estimating Average Area of Presence Zones ($B$)
A weighted graph called the interaction intensity graph, IIG(V,E), is built as follows:
Nodes of this graph are logical qubits, which are denoted by $n_i$.
An edge $e_{ij}$ is added between nodes $n_i$ and $n_j$ if these two qubits interact with each other.
The edge weight $w(e_{ij})$ is equal to the number of two-qubit operations between $n_i$ and $n_j$.
Let $M_i$ denote the number of neighbors of node $n_i$ in the IIG(V,E). Clearly, $M_i \le |V| - 1$.
$B$ can be calculated by using a weighted average over the sizes of the presence zones of all logical qubits.
The area of the presence zone associated with $n_i$ is denoted by $B_i$.
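The interaction intensity graph described above can be built in one pass over the two-qubit operations. The sketch below constructs the edge weights and the neighbor counts only; it does not attempt the presence-zone area or the weighted average. The input format (a list of qubit-name pairs), the function names, and the example operations are hypothetical.

```python
from collections import defaultdict


def build_iig(two_qubit_ops):
    """Edge weights of the IIG: number of two-qubit operations per qubit pair."""
    weights = defaultdict(int)
    for first, second in two_qubit_ops:
        if first == second:
            continue  # not a two-qubit interaction
        weights[frozenset((first, second))] += 1
    return dict(weights)


def neighbor_counts(weights, qubits):
    """Number of distinct IIG neighbors of every qubit."""
    neighbors = {qubit: set() for qubit in qubits}
    for edge in weights:
        first, second = tuple(edge)
        neighbors[first].add(second)
        neighbors[second].add(first)
    return {qubit: len(adjacent) for qubit, adjacent in neighbors.items()}


# Hypothetical list of two-qubit operations, e.g. (control, target) of CNOTs.
qubits = ["q1", "q2", "q3", "q4"]
operations = [("q1", "q2"), ("q2", "q1"), ("q2", "q3"),
              ("q1", "q3"), ("q3", "q4")]
weights = build_iig(operations)
counts = neighbor_counts(weights, qubits)
```

The edge between q1 and q2 gets weight 2 because the pair interacts twice, regardless of which qubit is listed first, and no qubit has more than three neighbors in this four-qubit example.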
Average Routing Latency of a Qubit ($d_q$)
$d_q$ is expressed in terms of $N_c$, the capacity of the routing channels, and $d_{uncong}$, the average routing latency of a qubit when all routing channels are uncongested.
The derivation of $d_q$ comes next.
Derivation of the Average Routing Latency of a Qubit ($d_q$)
Routing latency when the number of qubits $q$ exceeds the channel capacity $N_c$ can be modeled by an M/M/1/∞ queue ($\lambda$ is the arrival rate).
Given the arrival rate and the average queue length, Little’s formula gives the average waiting time in the queue.
[Figure: queue with arrival rate λ and service rate μ, labeled with $N_c$ and $q - N_c$.]
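For reference, the textbook steady-state results for an M/M/1 queue and Little's formula can be written down in a few lines. These are the standard expressions and not necessarily the exact ones used on the slide; the function name and the example rates are hypothetical.

```python
def mm1_waiting_time(arrival_rate, service_rate):
    """Textbook M/M/1 results: utilization, mean queue length, mean wait.

    Requires 0 < arrival_rate < service_rate for a stable queue.
    """
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate")
    rho = arrival_rate / service_rate          # server utilization
    avg_queue_length = rho**2 / (1 - rho)      # customers waiting, not in service
    # Little's formula: mean wait = mean queue length / arrival rate.
    avg_wait = avg_queue_length / arrival_rate
    return rho, avg_queue_length, avg_wait


# Hypothetical rates: 2 arrivals and 4 services per time unit.
rho, queue_length, wait = mm1_waiting_time(arrival_rate=2.0, service_rate=4.0)
```

With these rates the utilization is 0.5, on average half a customer is waiting, and the mean wait is 0.25 time units, which matches the closed form λ / (μ(μ − λ)).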
Estimating $d_{uncong}$
$d_{uncong}$ represents the average routing latency of a qubit in an average-size presence zone when the routing channels are uncongested.
One way to estimate $d_{uncong}$ is to randomly place qubits in the presence zone of a qubit and calculate the expected length of the shortest Hamiltonian path ($L_{Ham}$) that goes through these qubits.
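The random-placement idea can be tried numerically with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the estimation procedure of the paper: the presence zone is taken to be a square of a given side, qubits are dropped uniformly inside it, distances are Manhattan distances (on the assumption that qubits travel along a grid of routing channels), and the shortest Hamiltonian path is found by brute force, which is only practical for a handful of points. All names are hypothetical.

```python
import itertools
import random


def shortest_hamiltonian_path(points):
    """Length of the shortest path visiting every point once (brute force)."""
    if len(points) < 2:
        return 0.0
    best = float("inf")
    for order in itertools.permutations(points):
        length = sum(abs(p[0] - q[0]) + abs(p[1] - q[1])  # Manhattan distance
                     for p, q in zip(order, order[1:]))
        best = min(best, length)
    return best


def expected_path_length(num_points, side, trials=200, seed=0):
    """Monte Carlo mean of the shortest path through random points in a square."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        points = [(rng.uniform(0, side), rng.uniform(0, side))
                  for _ in range(num_points)]
        total += shortest_hamiltonian_path(points)
    return total / trials


# Hypothetical zone: 4 qubits in a 3 x 3 square presence zone.
estimate = expected_path_length(num_points=4, side=3.0)
collinear = shortest_hamiltonian_path([(0, 0), (2, 0), (1, 0)])
```

The brute-force search examines every ordering of the points, so its cost grows factorially; for larger zones a closed-form or asymptotic estimate of the path length would be needed instead.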
Estimating $d_{uncong}$ (continued)
The expected length of the shortest Hamiltonian path, $L_{Ham}$, can be estimated.
Knowing the value of $L_{Ham}$, $d_{uncong}$ can be calculated from it using a tuning parameter and a parameter that depends on the physical characteristics of the fabric technology, mostly the speed of moving a logical qubit through the channels. A term is added to the denominator to give the average routing latency of an operation (i.e., a single edge length).
LEQA Performance
The runtime complexity of LEQA can be expressed in terms of the following quantities:
$|V|$ is the number of vertices in the given QODG, which is equal to the number of operations plus two (including two dummy nodes);
$|E|$ is the number of edges in the given QODG;
$Q$ is the number of qubits in the input circuit;
$A$ is the area of the TQA fabric.
The complexity is polynomial in terms of input size (operation count, qubit count, and fabric size).
Experimental Results (1)
LEQA is compared with a modified version of our previous work, QSPR (DATE'12).
[Figure: bar chart of estimation error (%), on a scale of 0 to 9, per benchmark. Benchmarks: 8bitadder, gf2^16mult, hwb15ps, hwb16ps, gf2^18mult, gf2^19mult, gf2^20mult, ham15, hwb20ps, hwb50ps, gf2^50mult, mod1048576…, gf2^64mult, hwb100ps, gf2^100mult, hwb200ps, gf2^128mult, gf2^256mult.]
Average error is 2.11%.
Worst-case error is 8.29%; still low enough.
Experimental Results (2)
Shor’s factorization algorithm for a 1024-bit integer has ~1.35×10^10 logical operations. Using extrapolation, QSPR would compute the latency in ~2 years whereas LEQA needs only 16.5 hours!!
[Figure: Runtime Comparison between LEQA and QSPR (baseline). X-axis: operation count, 0 to 1,000,000. Y-axis: runtime (s), 0 to 600. QSPR runtime fit: y = 2E-10x² + 0.0004x + 2.1123, R² = 0.9997. LEQA runtime fit: y = 4E-06x + 0.2347, R² = 0.9973. Annotated speedups: 115X and 78X.]
Conclusion
Persistence of Ideas
The method developed some 25 years ago applies today not only to classical computing but also to quantum computing fabrics.
Gratitude of Scholars
We are who we are because of what we have learned from whom and what we have done since
Voice of Hearts
Friendship and collegiality are key