972 IEEE COMMUNICATIONS LETTERS, VOL. 14, NO. 10, OCTOBER 2010
Queueing Analysis of Multi-Layer Contention-Tolerant Crossbar Switch
Guannan Qu, Hyung Jae Chang, Jianping Wang, Zhiyi Fang, and S. Q. Zheng
Abstract: We recently proposed the Contention-Tolerant Crossbar (CTC(N)) and multi-layer CTC(N) (MCTC(N)) switch architectures. By developing a queueing network model of CTC(N) for a tagged output, we proved that MCTC(N) achieves steady state with layer number $k \ge 2$ under Bernoulli i.i.d. uniform traffic. In this letter, we extend the queueing network model to evaluate the mean number of cells in each input queue and the mean waiting time of MCTC(N) operating in steady state under Bernoulli i.i.d. uniform traffic. The model is validated by simulation results.
Index Terms: Contention-tolerant, switch, queueing analysis.
I. INTRODUCTION
In a switch, output contention occurs when more than one input port has cells to be transmitted to the same output port during the same time slot. Conventional crossbar switches, including crossbars with crosspoint buffers, require complex hardware to resolve output contention.
In our previous work [1], we proposed a new switch architecture called the Contention-Tolerant Crossbar, denoted by CTC(N), where $N$ is the number of input/output ports. As in a conventional crossbar, the fabric of CTC(N) comprises $N^2$ crosspoints (Switching Elements, SEs) arranged as an $N \times N$ array. Each SE has three inputs, three outputs and two states, as shown in Fig. 1 (a). Each input port is equipped with a scheduler $S_i$. In one time slot, if input port $i$ ($0 \le i \le N-1$) wants to transmit a cell to output port $j$ ($0 \le j \le N-1$), $S_i$ sets the corresponding $SE_{i,j}$ to the receive-and-transmit (RT) state. The remaining SEs in the same row are kept in the cross (CR) state. If more than one input port sets its SE to RT in the same output line (column), the output line is configured as a pipeline, as shown in Fig. 1 (b). Cells transmitted from upstream input ports are intercepted and buffered in downstream input ports. In this way, output contention is tolerated in CTC(N). Because output contention need not be resolved, the schedulers distributed over the inputs operate independently and in parallel. Compared to conventional crossbar switches, CTC(N) is simpler and more scalable.
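The pipeline behaviour of a single output line can be illustrated with a minimal Python sketch. It is not taken from [1]: the function name `resolve_column` and the `OUTPUT` marker are hypothetical, and the sketch assumes that rows are numbered from upstream to downstream, so that a larger row index is further down the column.

```python
from typing import List, Tuple

OUTPUT = -1  # hypothetical marker meaning "delivered to the output port"


def resolve_column(rt_rows: List[int]) -> List[Tuple[int, int]]:
    """Return (source_row, destination) pairs for one output column in one slot.

    rt_rows lists the input ports whose SE in this column is in the RT state.
    Assumption: each RT row sends its own head-of-line cell down the column,
    the next RT row below intercepts it, and only the lowest RT row reaches
    the output port.
    """
    rows = sorted(rt_rows)
    hops = []
    for pos, src in enumerate(rows):
        # The nearest downstream RT row intercepts the cell, if there is one.
        dst = rows[pos + 1] if pos + 1 < len(rows) else OUTPUT
        hops.append((src, dst))
    return hops
```

With inputs 1, 3 and 5 contending for the same column, the cell of input 1 is buffered at input 3, the cell of input 3 at input 5, and only the cell of input 5 leaves the switch in that slot.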
The switching throughput of CTC(N) under Bernoulli i.i.d. uniform traffic was analyzed by modeling CTC(N), with all inputs and a tagged output, as an open queueing network. Theoretically and experimentally, we proved that, with a single FIFO queue in each input and without speedup, the throughput of CTC(N) is bounded by 63%. To improve throughput, we proposed the multi-layer CTC(N) switch [2], as shown in Fig. 1 (c). MCTC(N) comprises $k$ parallel CTC(N) layers. Each CTC(N) layer has its own input and output buffers that operate independently. Traffic from outside MCTC(N) can be evenly distributed over all layers. Once a cell is injected into a given layer, it remains in that layer until it arrives at its output port. Assuming Bernoulli i.i.d. uniform traffic arrivals, we proved that the minimum value of $k$ that enables each CTC(N) layer in MCTC(N) to reach steady state is 2.
[Figure 1: (a) an SE with top, left and right inputs and bottom, left and right outputs, in the CR (cross) and RT (receive-and-transmit) states; (b) output port $j$ with the rows of input ports $u$ and $i$; (c) $k$ CTC(N) layers between input ports 0 to $N-1$ and output ports 0 to $N-1$.]
Fig. 1. (a) A crosspoint SE and its two states; (b) each output line of CTC(N) is configured as a pipeline; (c) MCTC(N).
Manuscript received January 8, 2010. The associate editor coordinating the review of this letter and approving it for publication was V. Vokkarane.
G. Qu and Z. Fang are with the College of Computer Science and Technology, Jilin University, Changchun, 130012, P. R. China (e-mail: [email protected]).
H. J. Chang and S. Q. Zheng are with the Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083, USA.
J. Wang is with the Department of Computer Science, City University of Hong Kong, Hong Kong, P. R. China.
Digital Object Identifier 10.1109/LCOMM.2010.081910.100036
In this letter, we extend the queueing network model of the $k$-layer MCTC(N) ($k \ge 2$) to evaluate the mean number of cells in each input queue and the mean waiting time. By solving the Discrete-Time Markov Chain state transitions for each input queue, we obtain analytical results. Simulation results are used to validate the theoretical results.
II. QUEUEING MODEL AND DISCRETE TIME MARKOV CHAIN
Without considering output buffers, we model the CTC(N) in each layer of the $k$-layer MCTC(N) as an open queueing network, as shown in Fig. 2. Each input buffer (i.e., queue) is organized as a FIFO queue denoted by $Q_i$. The head-of-line cell (if one exists) of an input queue is transmitted to the corresponding output line within one time slot.
To simplify our work, we adopt the following assumptions:
1) The traffic model is Bernoulli i.i.d. uniform.
2) When a cell arrives at an empty input queue in a time slot, it cannot be transmitted during the same slot.
3) Each time slot is divided into two contiguous phases. Cells arrive at input queues from outside during the first phase. During the second phase, cells transmitted to their output lines either arrive at a downstream input (are intercepted) or arrive at their destination outputs.
4) The input queue length is infinite.
[Figure 2: input queues $Q_0, Q_1, \ldots, Q_{N-1}$ with outside arrivals $a_i^o$ and upstream arrivals $a_i^u$, feeding outputs $O_0, O_1, \ldots, O_{N-1}$.]
Fig. 2. Queueing network model of CTC(N) in each layer of MCTC(N).
1089-7798/10$25.00 © 2010 IEEE
Let $\lambda_i$ be the average throughput of $Q_i$ when the queueing network operates in steady state. The traffic equation that holds for $\lambda_i$ [3] is
$$\lambda_i = \begin{cases} a_i^o & \text{if } i = 0; \\ a_i^o + \sum_{j=0}^{N-1} \left( \sum_{u=0}^{i-1} r_{u,i,j} \lambda_u \right) & \text{if } 0 < i \le N-1, \end{cases} \tag{1}$$
where $r_{u,i,j}$ is the probability of a cell leaving $Q_u$ for $Q_i$ by output line $j$, and $a_i^o$ is the arrival rate of $Q_i$ from outside. Let $\lambda$ be the offered load from outside. For $k$ layers, $a_i^o = \lambda / k$.
From the property of CTC(N) described in [1], a cell leaves $Q_u$ for its downstream $Q_i$ if and only if both transmit their cells to the same output line within the same time slot and no input between them transmits to that line. Thus
$$r_{u,i,j} = \begin{cases} q_{u,j} (q_{i,j} \lambda_i) & \text{if } u = i-1; \\ q_{u,j} \left[ \prod_{m=u+1}^{i-1} (1 - q_{m,j} \lambda_m) \right] q_{i,j} \lambda_i & \text{if } 0 \le u < i-1, \end{cases} \tag{2}$$
where $q_{i,j}$ is the probability of a cell being transmitted to output line $j$. For uniform traffic, $q_{i,j} = 1/N$.
Combining Equations (1) and (2), we obtain
$$\lambda_i = \frac{N a_i^o}{N - i a_i^o}. \tag{3}$$
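For $Q_i$, the arrival rate from upstream queues, $a_i^u$, is
$$a_i^u = \begin{cases} 0 & \text{if } i = 0; \\ \sum_{j=0}^{N-1} \left( \sum_{u=0}^{i-1} r_{u,i,j} \lambda_u \right) & \text{if } 0 < i \le N-1. \end{cases} \tag{4}$$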
Combining Equations (2), (3) and (4), we obtain
$$a_i^u = \frac{i (a_i^o)^2}{N - i a_i^o}.$$
Let the steady-state probability vector of $Q_i$ be $\Pi_i = [\pi_0, \pi_1, \pi_2, \ldots, \pi_n, \ldots]$. To solve for $\Pi_i$, we must consider two cases:
1) $Q_i$ ($i \ne 0$): $Q_i(t)$ is defined as the state of $Q_i$, i.e., the number of cells in $Q_i$ at the end of time slot $t$. $Q_i$ intercepts a cell from its upstream nodes if and only if it is busy. Therefore, the probability of a cell arriving at $Q_i$ from upstream, given that $Q_i$ is not idle, is
$$p_i = \frac{a_i^u}{\lambda_i} = \frac{i}{N} a_i^o.$$
Analyzing this queueing model requires the construction of a state-dependent Discrete-Time Markov Chain (DTMC) for $Q_i$. The state transition diagram is shown in Fig. 3.
[Figure 3: chain over states 0, 1, 2, ...]
Fig. 3. The DTMC state transition diagram for $Q_i$ ($i \ne 0$).
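The closed forms for $\lambda_i$, $a_i^u$ and $p_i$ can be checked numerically. The following Python sketch is not part of the original letter; it solves the traffic equations (1) and (2) input by input under uniform traffic ($q_{i,j} = 1/N$) and returns the closed forms for comparison. The function names `solve_traffic` and `closed_forms` are illustrative.

```python
def solve_traffic(N, k, load):
    """Solve Eqs. (1)-(2) for lambda_i, assuming uniform traffic q = 1/N."""
    a = load / k  # a_i^o, identical for every input
    q = 1.0 / N
    lam = []
    for i in range(N):
        # Eq. (2) is linear in lambda_i, so Eq. (1) reads lambda_i = a + s * lambda_i,
        # where s sums q_u * prod(1 - q * lambda_m) * q * lambda_u over u and j.
        s = 0.0
        for u in range(i):
            clear = 1.0
            for m in range(u + 1, i):
                clear *= 1.0 - q * lam[m]
            s += N * q * clear * q * lam[u]  # factor N: identical term for each j
        lam.append(a / (1.0 - s))
    return lam


def closed_forms(N, k, load):
    """Closed forms: Eq. (3), the upstream arrival rate a_i^u, and p_i."""
    a = load / k
    lam = [N * a / (N - i * a) for i in range(N)]
    a_up = [i * a * a / (N - i * a) for i in range(N)]
    p = [i * a / N for i in range(N)]
    return lam, a_up, p
```

For example, `solve_traffic(16, 2, 0.9)` and the first list returned by `closed_forms(16, 2, 0.9)` should agree up to floating-point rounding.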
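The state transition probabilities $P_{x,y} = \Pr[\, Q_i(t) = y \mid Q_i(t-1) = x \,]$ are
$$P_{x,y} = \begin{cases} 1 - a_i^o & x = y = 0; \\ a_i^o & x = 0,\ y = 1; \\ (1 - a_i^o)(1 - p_i) & y = x - 1,\ x \ge 1; \\ a_i^o (1 - p_i) + p_i (1 - a_i^o) & x = y,\ x \ge 1,\ y \ge 1; \\ a_i^o p_i & y = x + 1,\ x \ge 1; \\ 0 & \text{otherwise}. \end{cases} \tag{5}$$
To simplify the expressions, we define
$$\begin{cases} \rho_0 = \dfrac{P_{0,1}}{P_{1,0}} = \dfrac{a_i^o}{(1 - a_i^o)(1 - p_i)}; \\ \rho = \dfrac{P_{n-1,n}}{P_{n,n-1}} = \dfrac{a_i^o p_i}{(1 - a_i^o)(1 - p_i)}. \end{cases} \tag{6}$$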
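From the global balance equations, the steady-state solution can be obtained recursively as
$$\pi_n = \begin{cases} \dfrac{1 - \rho}{1 - \rho + \rho_0} & n = 0; \\ \dfrac{(1 - \rho) \rho_0 \rho^{n-1}}{1 - \rho + \rho_0} & n > 0. \end{cases} \tag{7}$$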
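One way to gain confidence in (7) is to substitute it back into the global balance equations built from (5). The Python sketch below does this on a truncated state space; it is an illustrative check rather than the authors' procedure, and the names `transition`, `stationary` and `balance_residual` are hypothetical. The arguments `a` and `p` stand for $a_i^o$ and $p_i$.

```python
def transition(x, y, a, p):
    """P_{x,y} from Eq. (5) for an input queue Q_i with i != 0."""
    if x == 0:
        return {0: 1 - a, 1: a}.get(y, 0.0)
    if y == x - 1:
        return (1 - a) * (1 - p)
    if y == x:
        return a * (1 - p) + p * (1 - a)
    if y == x + 1:
        return a * p
    return 0.0


def stationary(n, a, p):
    """pi_n from Eq. (7), with rho_0 and rho as defined in Eq. (6)."""
    rho0 = a / ((1 - a) * (1 - p))
    rho = a * p / ((1 - a) * (1 - p))
    if n == 0:
        return (1 - rho) / (1 - rho + rho0)
    return (1 - rho) * rho0 * rho ** (n - 1) / (1 - rho + rho0)


def balance_residual(a, p, states=60):
    """Largest violation of pi_y = sum_x pi_x P_{x,y} over y = 0..states-1."""
    pi = [stationary(n, a, p) for n in range(states + 1)]
    worst = 0.0
    for y in range(states):
        # Only the neighbouring states x = y-1, y, y+1 can reach y in one slot.
        inflow = sum(pi[x] * transition(x, y, a, p)
                     for x in range(max(0, y - 1), y + 2))
        worst = max(worst, abs(inflow - pi[y]))
    return worst
```

A residual at the level of floating-point rounding, for instance with `a = 0.45` and `p = 15 * 0.45 / 16`, indicates that (7) satisfies the balance equations for that parameter choice.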
2) $Q_0$: Cells arrive at $Q_0$ only from outside; thus, $a_0^u = 0$. The DTMC state transition diagram of $Q_0$ is illustrated in Fig. 4.
[Figure 4: two-state chain over states 0 and 1.]
Fig. 4. The DTMC state transition diagram for $Q_0$.
The transition probabilities are
$$P_{x,y} = \begin{cases} 1 - a_i^o & y = 0; \\ a_i^o & y = 1. \end{cases} \tag{8}$$
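According to the definition in Equation (6), we have
$$\begin{cases} \rho_0 = \dfrac{P_{0,1}}{P_{1,0}} = \dfrac{a_i^o}{1 - a_i^o}; \\ \rho = 0. \end{cases}$$
Thus, the solutions are
$$\pi_{n'} = \begin{cases} \dfrac{1}{1 + \rho_0} & n' = 0; \\ \dfrac{\rho_0}{1 + \rho_0} & n' = 1. \end{cases} \tag{9}$$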
Comparing Equations (5) and (8), we can see that $Q_0$ is a special case of $Q_i$ that has only states 0 and 1. Let $\rho^{n-1} = 1$ when $\rho = 0$ and $n = 1$. Then Solution (9) is a special case of (7).
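As a quick check of this reduction, the substitution can be written out explicitly. The steps below are a worked example added for illustration, using only (7), (9) and the value of $\rho_0$ for $Q_0$ stated above.

```latex
% Set \rho = 0 in (7), with the convention \rho^{0} = 1:
\begin{align*}
\pi_0 &= \frac{1 - 0}{1 - 0 + \rho_0} = \frac{1}{1 + \rho_0}, \\
\pi_1 &= \frac{(1 - 0)\,\rho_0 \cdot 1}{1 - 0 + \rho_0} = \frac{\rho_0}{1 + \rho_0}, \\
\pi_n &= 0 \quad (n \ge 2),
\end{align*}
% which matches (9). With \rho_0 = a_i^o / (1 - a_i^o):
\begin{align*}
\pi_0 = 1 - a_i^o, \qquad \pi_1 = a_i^o .
\end{align*}
```

The last line agrees with (8): both rows of the two-state chain are identical, so the stationary distribution equals that common row.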
[Figure 5: mean cell number plotted against input number ($N = 16$) and offered load ($\lambda$); theoretical results and simulation results.]
Fig. 5. Mean cell number of each input queue ($N = 16$).
[Figure 6: $Q_i$ with outside arrivals $a_i^o$ and upstream arrivals $a_i^u$, feeding outputs $O_0, \ldots, O_{N-1}$.]
Fig. 6. Cells arrive at $Q_i$ from outside and from $\Omega_u$.
III. NUMERICAL RESULTS
A. Mean Cell Number
The steady-state probability vector of each input queue has been solved as a function of the layer number $k$, the input number $i$ and the offered load $\lambda$. Let $L_i$ be the mean number of cells in $Q_i$ at the end of a time slot. We obtain
$$L_i = \sum_{n=0}^{\infty} n \pi_n = \frac{\rho_0}{(1 - \rho)(1 - \rho + \rho_0)}.$$
Figure 5 shows $L_i$ as a function of the offered load $\lambda$ and the input number $i$ for switch size $N = 16$ and layer number $k = 2$. The simulation results validate the theoretical results.
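The closed form for $L_i$ is straightforward to evaluate. The following Python sketch is an illustration, not the authors' code: it computes $L_i$ from the definitions of $a_i^o$, $p_i$, $\rho_0$ and $\rho$ given above, and also sums the series directly as a cross-check. The function names and the truncation length `terms` are assumptions.

```python
def mean_cells(i, N, k, load):
    """Closed-form L_i for input i, switch size N, k layers, offered load."""
    a = load / k            # a_i^o = lambda / k
    p = i * a / N           # p_i = (i / N) a_i^o
    d = (1 - a) * (1 - p)
    rho0 = a / d
    rho = a * p / d
    return rho0 / ((1 - rho) * (1 - rho + rho0))


def mean_cells_series(i, N, k, load, terms=5000):
    """L_i as the truncated sum of n * pi_n, with pi_n taken from Eq. (7)."""
    a = load / k
    p = i * a / N
    d = (1 - a) * (1 - p)
    rho0 = a / d
    rho = a * p / d
    pi0 = (1 - rho) / (1 - rho + rho0)
    total = 0.0
    term = rho0 * pi0       # pi_1
    for n in range(1, terms):
        total += n * term
        term *= rho         # pi_{n+1} = rho * pi_n for n >= 1
    return total
```

For $Q_0$ the formula reduces to $L_0 = a_i^o$, and for $N = 16$, $k = 2$, $\lambda = 1$ the last input gives $L_{15} = 8$.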
B. Mean Waiting Time
We introduce a recursive method to compute the mean waiting time experienced by cells in the queueing network. First we define the queueing subnetwork of $Q_i$ as
$$\Omega_i = \{ Q_m : 0 \le m \le i,\ 0 \le i \le N-1 \}.$$
The cells passing through $Q_i$ ($0 < i \le N-1$) come from two sources: from outside and from its upstream subnetwork $\Omega_u$ ($0 \le u \le i-1$), as shown in Fig. 6. All cells spend the mean waiting time in $Q_i$, yet cells from upstream $Q_u$ have already experienced additional mean waiting time in passing through $\Omega_u$ when they arrive at $Q_i$.
From Little's law, the mean waiting time of cells in $Q_i$ is
$$W_i = \frac{L_i}{\lambda_i}.$$
The mean waiting time for $\Omega_i$ can be solved recursively as follows:
$$W'_i = \begin{cases} W_i & \text{if } i = 0; \\ W_i + \sum_{j=0}^{N-1} \left( \sum_{u=0}^{i-1} W_u \times \dfrac{r_{u,i,j} \lambda_u}{\lambda_i} \right) & \text{if } 0 < i \le N-1. \end{cases}$$
[Figure 7: three panels ($N = 16$, $N = 64$, $N = 128$) plotting mean waiting time (time slots) against offered load ($\lambda$) from 0 to 1; theoretical results and simulation results.]
Fig. 7. Theoretical results and simulation results of mean waiting time.
A cell completes its journey when it arrives at its destination output in its CTC(N) layer, regardless of which queueing subnetwork it comes from or which output line it uses. Under the uniform traffic assumption, we can consider a single output, e.g., $O_j$. The probability of a cell arriving at $O_j$ within one time slot is
$$\delta_j = 1 - \prod_{i=0}^{N-1} (1 - q_{i,j} \lambda_i).$$
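If a cell arrives at $O_j$ at time slot $t$, the probability that it leaves the queueing network from $\Omega_i$ is
$$\gamma_i = \begin{cases} \dfrac{1}{\delta_j} q_{i,j} \lambda_i \prod_{m=i+1}^{N-1} (1 - q_{m,j} \lambda_m) & \text{if } 0 \le i < N-1; \\ \dfrac{1}{\delta_j} q_{i,j} \lambda_i & \text{if } i = N-1. \end{cases}$$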
Thus, the mean waiting time of cells passing through the whole queueing network is
$$W = \sum_{i=0}^{N-1} W'_i \gamma_i.$$
Figure 7 shows the theoretical and simulation results for layer number $k = 2$ and switch sizes $N = 16$, $N = 64$ and $N = 128$; the simulation results validate our theoretical results.
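The whole computation of $W$ can be sketched end to end in Python. This is a minimal illustration under the stated assumptions (uniform traffic, $q_{i,j} = 1/N$, $a_i^o = \lambda / k$), not the authors' implementation. The function name `mean_waiting_time` is hypothetical, and the recursion for $W'_i$ is coded with the per-queue waiting time $W_u$ inside the sum, following the expression as it appears in this text.

```python
def mean_waiting_time(N, k, load):
    """Return (W, gamma) for a k-layer switch of size N at the given load."""
    a = load / k                                    # a_i^o = lambda / k
    q = 1.0 / N                                     # uniform traffic
    lam = [N * a / (N - i * a) for i in range(N)]   # Eq. (3)

    def mean_cells(i):                              # closed-form L_i
        p = i * a / N
        d = (1 - a) * (1 - p)
        rho0, rho = a / d, a * p / d
        return rho0 / ((1 - rho) * (1 - rho + rho0))

    W = [mean_cells(i) / lam[i] for i in range(N)]  # Little's law

    def r(u, i):                                    # Eq. (2), identical for every j
        clear = 1.0
        for m in range(u + 1, i):
            clear *= 1 - q * lam[m]
        return q * clear * q * lam[i]

    # W'_i: own waiting time plus the weighted waiting time of upstream queues.
    W_sub = []
    for i in range(N):
        extra = sum(N * W[u] * r(u, i) * lam[u] / lam[i] for u in range(i))
        W_sub.append(W[i] + extra)

    # delta_j: probability that some cell reaches the tagged output in a slot.
    idle = 1.0
    for i in range(N):
        idle *= 1 - q * lam[i]
    delta = 1 - idle

    # gamma_i: probability that a delivered cell leaves from Q_i.
    gamma = []
    for i in range(N):
        clear = 1.0
        for m in range(i + 1, N):
            clear *= 1 - q * lam[m]
        gamma.append(q * lam[i] * clear / delta)

    return sum(w * g for w, g in zip(W_sub, gamma)), gamma
```

A call such as `mean_waiting_time(16, 2, 0.9)` returns the network-wide mean waiting time in time slots together with the weights $\gamma_i$, which sum to 1. For $N = 1$ the result is exactly one time slot, as expected from assumption 2).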
REFERENCES
[1] G. Qu, H. J. Chang, J. Wang, Z. Fang, and S. Q. Zheng, "Contention-tolerant crossbar packet switches without and with speedup," in IEEE International Conference on Communication (ICC), May 2010.
[2] G. Qu, H. J. Chang, J. Wang, Z. Fang, and S. Q. Zheng, "Contention-tolerant crossbar packet switches," to appear in International J. Commun. Syst., 2010.
[3] T. G. Robertazzi, Computer Networks and Systems: Queueing Theory and Performance Evaluation, 3rd edition, p. 103. Springer-Verlag, 2000.