Universidade Federal do Rio Grande do Norte
Centro de Ciências Exatas e da Terra
Departamento de Informática e Matemática Aplicada
Bachelor in Computer Science
Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search
Gustavo Alves Bezerra
Natal-RN
June 2019
Gustavo Alves Bezerra
Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search
Undergraduate thesis submitted to the Departamento de Informática e Matemática Aplicada of the Centro de Ciências Exatas e da Terra of the Universidade Federal do Rio Grande do Norte as a partial requirement for obtaining the bachelor’s degree in Computer Science.
Advisor
PhD Monica Magalhães Pereira
Universidade Federal do Rio Grande do Norte – UFRN
Departamento de Informática e Matemática Aplicada – DIMAp
Natal-RN
June 2019
Bezerra, Gustavo Alves. Generation of application specific fault tolerant irregular NoC topologies using tabu search / Gustavo Alves Bezerra. - 2019. 119f.: il.
Monografia (Bacharelado em Ciência da Computação) - Universidade Federal do Rio Grande do Norte, Centro de Ciências Exatas e da Terra, Departamento de Informática e Matemática Aplicada. Natal, 2019. Orientadora: Monica Magalhães Pereira. Coorientadora: Sílvia Maria Diniz Monteiro Maia.
1. Computação - Monografia. 2. Redes em chip - Monografia. 3. Topologias irregulares - Monografia. 4. Aplicação específica - Monografia. 5. Tolerância a falhas - Monografia. 6. Busca Tabu - Monografia. I. Pereira, Monica Magalhães. II. Maia, Sílvia Maria Diniz Monteiro. III. Título.
RN/UF/CCET CDU 004
Universidade Federal do Rio Grande do Norte - UFRN
Sistema de Bibliotecas - SISBI
Catalogação de Publicação na Fonte. UFRN - Biblioteca Setorial Prof. Ronaldo Xavier de Arruda - CCET
Elaborado por Joseneide Ferreira Dantas - CRB-15/324
Undergraduate thesis under the title Generation of Application Specific Fault Tolerant
Irregular NoC Topologies Using Tabu Search presented by Gustavo Alves Bezerra and
accepted by the Departamento de Informática e Matemática Aplicada of the Centro de
Ciências Exatas e da Terra of the Universidade Federal do Rio Grande do Norte, being
approved by all members of the examining board specified below:
PhD Monica Magalhães Pereira
Advisor
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte
PhD Sílvia Maria Diniz Monteiro Maia
Co-advisor
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte
PhD Márcio Eduardo Kreutz
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte
Natal-RN, June 2019.
To my family and friends who supported me throughout this journey.
Acknowledgements
It would be impossible to conceive this work without the support provided by the
professors and UFRN’s Programa de Educação Tutorial - Ciência da Computação. Thus,
special thanks to Monica Magalhães Pereira, Sílvia Maria Diniz Monteiro Maia, and Um-
berto Souza Da Costa.
Thanks to my family for all the love and support, and for withstanding all the diffi-
culties encountered – Erbena Sales Alves Bezerra, José Guilardo Gonçalves Bezerra, and
Juliana Alves Bezerra. In addition, thanks to Iria de Fátima Bezerra Pinho for the long
distance support.
Thanks to Breno “Blinn” Viana “Phong”, “Deba” Emili Costa, Felipe “Barba-lho”,
Jhonattan “Johnson” Cabral, “Pratíxia” Pontes Cruz, Raul “Dalinda” Silva, “Showzivan”
Medeiros da Silva Gois, and Vitor “God”eiro for all the discussions, conversations, memes
and for making the last semesters of the Computer Science course one of the most memorable
times of my life.
Thanks to Giorgio Brito, “Juhauare” Jales, Larissa “Lucy”ano, Misa Uehara, and Paola
Gessy for being present in some key moments, helping me to keep my sanity. Thanks to
Joel Felipe, Vitor “God”eiro (again), and Vitor Greati for directly and indirectly inspiring
me to focus and persist in my studies.
Last but not least, thanks to “the dudes” Victor “Polar” Santos, and Yuri “Kbelo”
Messias for being present since 2010; and for all the games, CiViKs, defeats, achievements,
and coffees shared.
“But now that it’s over
I’ll see you the next time
Remember the future is yours”
Nektar,
remember the future
Geração de Topologias Irregulares para Aplicação Específica e Tolerantes a Falhas Utilizando Busca Tabu
Autor: Gustavo Alves Bezerra
Orientador(a): Doutora Monica Magalhães Pereira
Resumo
As redes em Chip (NoC) foram propostas para aprimorar o desempenho de computa-
dores. As primeiras topologias sugeridas tendiam a possuir uma estrutura regular, vis-
ando flexibilidade – desempenho razoável para diversas aplicações e múltiplos caminhos
entre roteadores. Topologias regulares são piores em desempenho se comparadas a to-
pologias geradas para aplicações específicas, normalmente irregulares. Por outro lado,
topologias irregulares podem possuir baixa flexibilidade. Na era dos bilhões de transist-
ores, componentes de circuitos são mais suscetíveis a falhas, sejam causadas por radiação,
interferência eletromagnética ou efeitos similares. Devido ao custo de produção de tais
circuitos, deseja-se aumentar a durabilidade (vida útil), desempenho e flexibilidade dos
mesmos. Durabilidade pode ser obtida ao se adicionar tolerância a falhas num circuito.
Portanto, ao adicionar-se componentes redundantes numa NoC (roteadores e conexões),
é possível que sua durabilidade e flexibilidade (caminhos alternativos) sejam melhoradas,
embora o consumo de energia piore. Este trabalho propõe a geração de topologias irregulares
utilizando Busca Tabu. Por conseguinte, gerando topologias intermediárias: flexíveis
se comparadas com a maioria das NoCs irregulares (possuindo certo grau de tolerância
a falhas e caminhos alternativos entre roteadores), porém obtendo alto desempenho para
aplicações específicas se comparadas com NoCs regulares.
Palavras-chave: Redes em Chip, Topologias Irregulares, Aplicação Específica, Tolerância
a Falhas, Busca Tabu.
Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search
Author: Gustavo Alves Bezerra
Advisor: Monica Magalhães Pereira, PhD
Abstract
Networks on Chip (NoCs) were proposed to enhance computer performance. The first topologies
conceived tended to have a regular structure, aiming at flexibility: reasonable performance
for different applications and multiple paths between routers. Regular topologies perform
worse than topologies generated for specific applications, which are often irregular. On the
other hand, irregular topologies may lack flexibility. In the billion-transistor era, circuit
components are more susceptible to faults, whether caused by radiation, electromagnetic
interference or similar effects. Due to the cost of producing such circuits, it is desirable
to increase their durability (lifespan), performance, and flexibility. Durability may be
achieved by adding fault tolerance to the circuit. Therefore, by adding redundant components
(e.g. routers or links) to an irregular NoC, it may be possible to increase its durability
and flexibility (multiple communication paths), though energy consumption may worsen. This
work proposes the generation of irregular topologies using Tabu Search, thus generating
intermediate topologies: flexible compared to most irregular ones (offering some fault
tolerance), yet achieving high application-specific performance compared to regular NoCs.
Keywords: Network on Chip, Irregular Topologies, Application-Specific, Fault-Tolerance,
Tabu Search.
List of Figures
1 Graph examples.
2 Example of regular NoC topologies.
3 Examples of irregular NoC topologies.
4 Examples of areas isolated in NoCs after faults.
5 A Task Graph.
6 Example of Task Graph edge conversion.
7 Example of unfeasible solution – UNF
8 Example of Delete Edges Until Epsilon.
9 Example of Add Edges Until Epsilon.
10 Example of making an unfeasible solution feasible.
11 Tabu List examples.
12 Examples of valid edges’ node swapping process
13 Example of invalid edge node swapping operation.
14 Example of spin operation with a minimum degree node.
15 Examples of default scenario.
16 Example of successful spin operation with a maximum degree node.
17 Examples of unsuccessful spin operation with a maximum degree node.
18 Example of successful double spin operation.
19 Example of Fault Injection Algorithm.
20 Chosen TGs
21 Influence of tabuListSize arguments for Latency Estimation in chosen TGs’ generated solutions.
22 Influence of terminationCrit arguments for Latency Estimation in chosen TGs’ generated solutions.
23 Overall median latency for all benchmarked TGs.
24 Box plots of chosen TGs solutions’ latency estimation.
25 Examples of AP2TG solutions.
26 Examples of MPEGTG solutions.
27 Fault injection on median chosen TGs solutions.
28 S_{AP2TG,15} behaviour during fault injection.
29 S_{MPEGTG,19} behaviour during fault injection.
30 Fault injection on median solutions with median ε of the chosen TGs.
31 Influence of tabuListSize on AP1TG solutions.
32 Influence of tabuListSize on AP2TG solutions.
33 Influence of tabuListSize on AP3TG solutions.
34 Influence of tabuListSize on AP4TG solutions.
35 Influence of tabuListSize on INTEGRALTG solutions.
36 Influence of tabuListSize on MPEGTG solutions.
37 Influence of tabuListSize on MWDTG solutions.
38 Influence of tabuListSize on VOPDTG solutions.
39 Influence of terminationCriterion on AP1TG solutions.
40 Influence of terminationCriterion on AP2TG solutions.
41 Influence of terminationCriterion on AP3TG solutions.
42 Influence of terminationCriterion on AP4TG solutions.
43 Influence of terminationCriterion on INTEGRALTG solutions.
44 Influence of terminationCriterion on MPEGTG solutions.
45 Influence of terminationCriterion on MWDTG solutions.
46 Influence of terminationCriterion on VOPDTG solutions.
47 Fitness (latency estimation) box plots of AP1TG generated solutions.
48 Fitness (latency estimation) box plots of AP2TG generated solutions.
49 Fitness (latency estimation) box plots of AP3TG generated solutions.
50 Fitness (latency estimation) box plots of AP4TG generated solutions.
51 Fitness (latency estimation) box plots of INTEGRALTG generated solutions.
52 Fitness (latency estimation) box plots of MPEGTG generated solutions.
53 Fitness (latency estimation) box plots of MWDTG generated solutions.
54 Fitness (latency estimation) box plots of VOPDTG generated solutions.
55 Fault injection on median AP1TG solutions.
56 Fault injection on median AP2TG solutions.
57 Fault injection on median AP3TG solutions.
58 Fault injection on median AP4TG solutions.
59 Fault injection on median INTEGRALTG solutions.
60 Fault injection on median MPEGTG solutions.
61 Fault injection on median MWDTG solutions.
62 Fault injection on median VOPDTG solutions.
63 Median ε AP1 with median fitness.
64 Median ε AP2 with median fitness.
65 Median ε AP3 with median fitness.
66 Median ε AP4 with median fitness.
67 Median ε INTEGRAL with median fitness.
68 Median ε MPEG with median fitness.
69 Median ε MWD with median fitness.
70 Median ε VOPD with median fitness.
71 Median ε AP1 solution with median fitness after 10% fault injection.
72 Median ε AP2 solution with median fitness after 10% fault injection.
73 Median ε AP3 solution with median fitness after 10% fault injection.
74 Median ε AP4 solution with median fitness after 10% fault injection.
75 Median ε INTEGRAL solution with median fitness after 10% fault injection.
76 Median ε MPEG solution with median fitness after 10% fault injection.
77 Median ε MWD solution with median fitness after 10% fault injection.
78 Median ε VOPD solution with median fitness after 10% fault injection.
79 Median ε AP1 solution with median fitness after 20% fault injection.
80 Median ε AP2 solution with median fitness after 20% fault injection.
81 Median ε AP3 solution with median fitness after 20% fault injection.
82 Median ε AP4 solution with median fitness after 20% fault injection.
83 Median ε INTEGRAL solution with median fitness after 20% fault injection.
84 Median ε MPEG solution with median fitness after 20% fault injection.
85 Median ε MWD solution with median fitness after 20% fault injection.
86 Median ε VOPD solution with median fitness after 20% fault injection.
87 Median ε AP1 solution with median fitness after 30% fault injection.
88 Median ε AP2 solution with median fitness after 30% fault injection.
89 Median ε AP3 solution with median fitness after 30% fault injection.
90 Median ε AP4 solution with median fitness after 30% fault injection.
91 Median ε INTEGRAL solution with median fitness after 30% fault injection.
92 Median ε MPEG solution with median fitness after 30% fault injection.
93 Median ε MWD solution with median fitness after 30% fault injection.
94 Median ε VOPD solution with median fitness after 30% fault injection.
95 Fault injection on median S_{AP1TG,15} solution.
96 Fault injection on median S_{AP2TG,15} solution.
97 Fault injection on median S_{AP3TG,16} solution.
98 Fault injection on median S_{AP4TG,16} solution.
99 Fault injection on median S_{INTEGRALTG,15} solution.
100 Fault injection on median S_{MPEGTG,19} solution.
101 Fault injection on median S_{MWDTG,18} solution.
102 Fault injection on median S_{VOPDTG,19} solution.
103 AP1TG
104 AP2TG
105 AP3TG
106 AP4TG
107 INTEGRALTG
108 MPEGTG
109 MWDTG
110 VOPDTG
List of abbreviations and initials
NoC – Network on Chip
QAP – Quadratic Assignment Problem
TG – Task Graph
MPSoC – Multi-Processor System-on-Chip
CVRP – Classical Vehicle Routing Problem
SEA – Set of Edges to Add
List of Symbols
∅ – Empty Set
∪ – Set Union
¬ – Logical negation
∀ – For all
∈ – In
∧ – Logical conjunction
ε – A fixed number of edges
← – Assignment
∉ – Not in
∨ – Logical disjunction
∃ – Exists
⊆ – Is contained in
⊈ – Is not contained in
S^C – The complement of set S
∩ – Set intersection
ℕ – Set of natural numbers
List of Algorithms
1 Tabu Search Algorithm Skeleton.
2 Methodology.
3 Generate Initial Solution Graph.
4 Fit Solution’s number of edges to ε.
5 Deletes edge with largest degree incident nodes possible.
6 Adds edge between the two nodes with smallest degree possible.
7 Make a Solution Feasible.
8 Implemented Tabu Search.
9 Fitness Function implementation.
10 Neighbourhood Search
11 Special Neighbourhood Search Deletions
12 Special Neighbourhood Search Additions
13 Swaps the nodes incident to two distinct edges.
14 Spin edge.
15 Spin edge incident to one maximum degree node
16 Double spin
17 Fault Injection
Contents
1 Introduction
2 Theoretical Framework
2.1 Graph Theory
2.2 Quadratic Assignment Problem Function
2.3 Network On Chip
2.3.1 Topologies
2.3.2 Fault Tolerance
2.4 Task Graphs
2.5 Metaheuristics
2.5.1 Tabu Search
3 Related Works
3.1 Performance
3.2 Fault Tolerance
4 Methodology
4.1 Definitions and Assumptions
4.1.1 Solution Representation
4.1.2 Feasible Solution
4.2 Initial Topology Generation
4.2.1 Fitting to Epsilon
4.2.2 Making Feasible
4.2.2.1 Deleting Edges - DEL_EDGE()
4.2.2.2 Adding Edges - ADD_EDGE()
4.2.2.3 Make Feasible Algorithm
4.3 Best Solution Search – Tabu Search
4.3.1 Fitness Function
4.3.2 Tabu List
4.3.3 Neighbourhood Search
4.3.3.1 Delete Edge Between Two Minimum Degree Nodes
4.3.3.2 Delete Edge Incident to One Minimum Degree Node
4.3.3.3 Default Scenario
4.3.3.4 Add Edge Incident to One Maximum Degree Node
4.3.3.5 Add Edge Between Two Maximum Degree Nodes
4.4 Fault Injection
5 Results
5.1 Performance
5.2 Latency Estimation After Fault Injection
6 Concluding Remarks
References
Appendix A -- Influence of tabuListSize Arguments
Appendix B -- Influence of terminationCriterion Arguments
Appendix C -- Latency Box Plots
Appendix D -- Fault Injection in Median Fitness Solutions
Appendix E -- Examples of Median Epsilon Solutions
Appendix F -- Median Epsilon Solutions After 10% Fault Injection
Appendix G -- Median Epsilon Solutions After 20% Fault Injection
Appendix H -- Median Epsilon Solutions After 30% Fault Injection
Appendix I -- Detailed Fault Injection in Some Median Fitness Solutions
Annex A -- Mesquita’s Work TGs
1 Introduction
Technological advances in computers, especially in transistors, have led to an increase in
the number of components that fit in a single chip. Consequently, the need to improve
communication between chip components has also increased (WANG et al., 2013). There are
several ways to achieve this. It is possible to reduce the size of circuit components, thereby
shortening the physical distance between them. Component density then increases alongside
the computational power in a fixed area (SCHALLER, 1997). It is also possible to increase
the number of bits transmitted per second by increasing the clock frequency or the number of
channels for parallel communication (STALLINGS, 2003; PATTERSON; HENNESSY, 2013).
Alongside these techniques, changing the communication protocol may also decrease latency.
Network on Chip (NoC) is one such example.
Bits were traditionally transmitted between computer components via a communication
bus (STALLINGS, 2003; PATTERSON; HENNESSY, 2013). This solution was satisfactory in
the early Computer Age. However, the number of components per chip and the communication
demand between components rose over the decades. As the complexity of applications
increased, bus communication proved to lack flexibility and efficiency. The idea of the NoC
was conceived to solve this problem.
NoCs take advantage of a well-established field of Computer Science: Computer
Networks (HEMANI et al., 2000). This branch of Computer Science has been evolving since
the 1960s (KUROSE; ROSS, 2013). Its theory and systems have become so sophisticated that
the initially limited networks evolved into a worldwide network involving several security
techniques (KUROSE; ROSS, 2013).
In order to improve chips by introducing network features, it is necessary to add new
components to them: routers. Routers are responsible for transmitting information received
from one component to another (ZEFERINO; SUSIN, 2003). Additionally, these components
determine the course a message will take through the network. As a result, the flexibility
of a chip is greatly increased (ZEFERINO; SUSIN, 2003). The chip’s area, however, is also
increased (MORAES et al., 2004). Furthermore, due to the processing time required by a
router to determine the appropriate action, the communication overhead is also affected
(BEIGNÉ et al., 2005).
Initially, NoCs tended to be regularly structured. A few examples are Mesh-2D, Torus
and Honeycomb NoCs (ZEFERINO; SUSIN, 2003; HEMANI et al., 2000). A regular router
distribution tends to offer greater flexibility than an irregular one: reasonable performance
for different applications and multiple paths between routers. In exchange, regular
topologies lack efficiency for specific applications (ASCIA; CATANIA; PALESI, 2004).
On the other hand, irregular NoCs attempt to improve application specific efficiency
and network performance compared to regular topologies (CHOUDHARY; GAUR; LAXMI,
2011; CHOUDHARY et al., 2010). Regular NoCs may also become irregular during the
circuit’s lifespan due to faults on either routers or links (CHOUDHARY; GAUR; LAXMI,
2011).
Efforts are being made to generate irregular topologies with lower energy consumption
(JAIN; CHOUDHARY; SINGH, 2014). In addition, classical routing algorithms such as
XY (ZEFERINO; SUSIN, 2003) tend to perform unsatisfactorily when applied to
irregular NoCs (RODRIGO et al., 2011). Moreover, such approaches are guaranteed to be
deadlock-free only for regular NoCs (RODRIGO et al., 2011). Therefore, many routing
algorithms are being conceived to improve network performance while remaining deadlock-free
(MILFONT et al., 2017; GABIS; KOUDIL, 2016; LEE; PARIKH; BERTACCO, 2015). Some of these
algorithms focus not only on deadlock freedom, but also on fault tolerance, congestion
management, and livelock freedom.
Like any circuit, NoCs are susceptible to faults. The operation of a chip may be
severely compromised depending on the fault location (CHANG et al., 2011). Moreover, it is
desirable to increase the lifespan of such circuits due to their fabrication cost. A longer
lifespan may be achieved by introducing redundant components into the NoC, although care
must be taken not to generate a regular topology (WANG et al., 2013; MESQUITA, 2016; SHAH;
KANNIGANTI; SOUMYA, 2017). During the generation of a fault tolerant irregular topology,
constant monitoring is necessary to avoid an increase in energy consumption. Otherwise,
the main advantages of irregular topologies would be lost.
In this scenario, this work focuses on generating fault tolerant irregular NoC
topologies. The aim is to obtain long-lasting and efficient circuits for specific applications.
Nevertheless, the circuits should remain suitable for multiple applications. The generated
topologies will be evaluated regarding fault tolerance capacity and latency.
2 Theoretical Framework
The purpose of this work is to generate task graph based irregular NoCs targeting low
latency and reliability via metaheuristics. Some topics require a more solid background and
are thus explored in this section: NoC, Task Graph, Metaheuristics and Fault Tolerance.
2.1 Graph Theory
A graph G(V, E) is defined as a tuple of a set of vertices V and a set of edges E
(WILSON, 1979). V is often represented as a set of integers. There are multiple
representations for E, but in any representation an edge connects two nodes. Throughout
this work, G_V means “the set containing G’s vertices” and G_E means “the set containing
G’s edges”. A graph may be weighted or unweighted, directed or undirected.
Figure 1: Graph examples. (a) G0. (b) G1.
In a weighted graph, a value is associated with each edge (graph G0, Figure 1a).
Unweighted edges, on the other hand, have no value associated with them (G1, Figure 1b).
A directed unweighted graph may represent edges as tuples, because (0, 1) ≠ (1, 0).
Undirected unweighted graphs, in turn, may represent edges as sets, since {0, 1} = {1, 0}
(G1). Hence, a self-loop would be represented as a set of one element ({0, 0} = {0}). For
directed weighted graphs (G0), edges may be represented as triples, i.e. (v_1, v_2, w), while
undirected weighted edges may be represented as a tuple of a set and a value, i.e.
({v_1, v_2}, w). In this work, directed weighted graphs are used to represent task graphs,
while undirected unweighted graphs represent solutions or NoCs. Since no self-loops are
allowed in solutions, every edge of an undirected unweighted graph has exactly two elements:
∀e ∈ G_E (|e| = 2).
Another concept used throughout this work is the Null Graph. The Null Graph is the only
graph containing 0 nodes and, consequently, no edges. Specifically, the undirected
unweighted Null Graph will represent an invalid solution or graph. In other words, it is a
tuple of empty sets, i.e. (∅, ∅).
2.2 Quadratic Assignment Problem Function
The Quadratic Assignment Problem (QAP) is an NP-hard problem that arises when
assigning facilities to locations. The problem is defined by Bokhari (BOKHARI, 1981) and
adapted to the current work as follows. The following are given: an affinity measure between
two objects i and j (in the current work, the edge weight w_{ij} between nodes i and j);
n locations (in this work, n = |TG_V|, where TG is a Task Graph); the distance dist_{st}
between locations s and t in a graph G (in this work, the number of hops in G’s shortest
path from node s to node t, calculated by Dijkstra’s Algorithm (DIJKSTRA, 1959)); and a
function that maps objects to locations (in this work, the identity function, i.e. i = s and
j = t). The goal is then to minimise the function

$$\sum_{ij} w_{ij} \, dist_{st}. \tag{2.1}$$

It is important to highlight that, in the current work, dist_{ij} = 1 in TG, since each
communicating pair is joined directly by a task graph edge. In G, however,
dist_{ij} = dist_{st} ≥ 1.
Throughout this work, the QAP function is used as the Tabu Search fitness function
for latency estimation. This is possible because the smaller the value of the QAP function,
the smaller the overall latency in the network is expected to be.
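The QAP function of Equation 2.1 can be illustrated with a short Python sketch. It is written under a few assumptions that are not taken from this work: the function names are hypothetical, hop distances are computed with breadth-first search (which yields the same hop counts as Dijkstra’s Algorithm on an unweighted graph), and an unreachable pair is arbitrarily given an infinite cost.

```python
from collections import deque


def hop_distances(nodes, edges, source):
    """Breadth-first search: hop count from source to every reachable node."""
    adjacency = {v: set() for v in nodes}
    for edge in edges:
        u, v = tuple(edge)  # each edge is a two-element set {u, v}
        adjacency[u].add(v)
        adjacency[v].add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def qap_fitness(task_edges, nodes, solution_edges):
    """Sum of w_ij * dist_ij over all task graph edges (i, j, w_ij)."""
    total = 0.0
    cache = {}  # hop distances already computed, keyed by source node
    for i, j, w in task_edges:
        if i not in cache:
            cache[i] = hop_distances(nodes, solution_edges, i)
        if j not in cache[i]:
            return float("inf")  # assumption: unreachable pair is worst case
        total += w * cache[i][j]
    return total


# Hypothetical task graph with three weighted communications.
nodes = {0, 1, 2, 3}
task_edges = [(0, 1, 10.0), (0, 2, 4.0), (2, 3, 1.0)]

# Topology A mirrors the task graph edges; topology B is a simple path.
topology_a = {frozenset((0, 1)), frozenset((0, 2)), frozenset((2, 3))}
topology_b = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}

fitness_a = qap_fitness(task_edges, nodes, topology_a)
fitness_b = qap_fitness(task_edges, nodes, topology_b)
```

In topology A every communicating pair is one hop apart, so the value is 10 + 4 + 1 = 15. In topology B nodes 0 and 2 are two hops apart, so the value rises to 10 + 8 + 1 = 19, which corresponds to a higher estimated latency.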
2.3 Network On Chip
The main goal of a NoC is to improve communication between the components of a
chip, especially when compared to a traditional communication bus (YESIL; TOSUN; OZTURK,
2016). NoCs also provide a more scalable communication method than traditional ones
(YESIL; TOSUN; OZTURK, 2016).
A NoC consists of two major components: routers and links (ZEFERINO; SUSIN, 2003).
Routers are responsible for transmitting information (packets) to each other through
the links (SOTERIOU et al., 2009). A packet may pass through multiple links and routers
before reaching its destination. A router’s behaviour is described by routing algorithms,
which define the path to be travelled by the packets (SOTERIOU et al., 2009). Four core
NoC features describe message transfers: routing algorithms, switching, flow control, and
arbitration.
Routing algorithms describe the path to be taken by a packet. According to Zeferino,
a routing algorithm affects a NoC’s connectivity, deadlock and livelock freedom,
adaptability, and fault tolerance (ZEFERINO; SUSIN, 2003). Connectivity is the capacity
to send packets from and to any core. Deadlock and livelock freedom guarantee that all
packets will arrive at their destinations. Adaptability is related to flexibility: the
capacity to adapt to different topologies. Fault-tolerant routing algorithms attempt to
guarantee connectivity even when the NoC has faulty components (ZEFERINO, 2003).
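Connectivity in the presence of faults can be illustrated at the topology level: a routing algorithm can only deliver packets between two routers if some path between them survives the fault. The sketch below is a reachability check only, not a routing algorithm, and the helper names `remove_router` and `is_connected` are hypothetical.

```python
def remove_router(nodes, edges, faulty):
    """Return the topology left after a router and its links have failed."""
    remaining_nodes = set(nodes) - {faulty}
    remaining_edges = {e for e in edges if faulty not in e}
    return remaining_nodes, remaining_edges


def is_connected(nodes, edges):
    """True if every router can still reach every other router."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {v: set() for v in nodes}
    for edge in edges:
        u, v = tuple(edge)  # each edge is a two-element set {u, v}
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:  # depth-first traversal from an arbitrary router
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


routers = {0, 1, 2, 3}
ring = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3)), frozenset((3, 0))}
line = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}

# Router 1 fails in both topologies.
ring_survives = is_connected(*remove_router(routers, ring, 1))
line_survives = is_connected(*remove_router(routers, line, 1))
```

The ring keeps an alternative path around the faulty router, so the remaining routers stay connected. The line loses its only path to router 0, which becomes isolated.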
Switching describes how packets are transferred from the input to the output of a
router. Some switching methods are circuit switching, store-and-forward, and wormhole.
Circuit switching reserves a path until the entire message is transmitted. In store-and-forward
switching, packets carry a header with information about their destination and are stored in a
buffer at every router until their next hop is decided. Wormhole switching divides packets
into flits and, if a flit’s output path is free, the flit is not stored in a buffer but
transmitted directly to the communication channel (ZEFERINO, 2003).
Flow control describes what shall be done with packets unable to acquire some re-
source. This may happen, for example, if there are numerous packets travelling through
the NoC, overloading it. Depending on the flow control, a packet may be discarded, tem-
porarily stored, or have its route changed (ZEFERINO, 2003).
On the other hand, arbiters are responsible for redirecting packets inside a router, i.e.
input path. This scenario may occur when a router simultaneously receives multiple pack-
ets competing for the same output path. Thus, the arbiter will be responsible for deciding
which packets will have access to the resources first. There are centralised – one per router
–, and distributed – one per path – arbiters. Some examples of arbitration mechanisms are
round-robin, first-come-first-served, and least recently served (ZEFERINO, 2003).
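To make the arbitration step concrete, the following sketch shows one possible round-robin arbiter for a single output path. The function name round_robin_arbiter and the representation of requests as a set of input port indices are assumptions of this example, not part of any cited router design.

```python
def round_robin_arbiter(num_inputs):
    """Return a grant function that serves requesting inputs in circular order."""
    last = num_inputs - 1  # so that input 0 has the highest priority at start

    def grant(requests):
        nonlocal last
        # Scan the inputs starting right after the most recently served one.
        for offset in range(1, num_inputs + 1):
            candidate = (last + offset) % num_inputs
            if candidate in requests:
                last = candidate
                return candidate
        return None  # no input is requesting the output

    return grant


grant = round_robin_arbiter(4)
# Inputs 0 and 2 compete for the same output in three consecutive cycles.
print([grant({0, 2}) for _ in range(3)])  # [0, 2, 0]
```

Because the scan always starts after the last served input, no requesting input can be starved by another one.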
The next two aspects of NoCs are the focus of this work and are explored hereafter:
topologies and fault tolerance.
2.3.1 Topologies
The NoC components may be distributed in a chip regularly or irregularly. Regular
topologies tend to be used for general purpose applications and have reduced design time
(SRINIVASAN; CHATHA; KONJEVOD, 2006). Mesh (Figure 2a), Torus (Figure 2b), Ring
(Figure 2c), and Honeycomb (Figure 2d) are examples of regular NoC topologies (ZE-
FERINO; SUSIN, 2003; HEMANI et al., 2000; BONONI; CONCER, 2006). Routing algorithms
for regular NoCs are often simple, since they are based on the regular distribution of
resources. Some examples of routing algorithms for regular NoCs are XY (DEHYADGARI
et al., 2005), and DyXY (LI; ZENG; JONE, 2006).
Figure 2: Example of regular NoC topologies: (a) Mesh, (b) Torus, (c) Ring, (d) Honeycomb.
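To illustrate why routing on regular topologies can be simple, the following sketch implements the basic idea of XY routing on a mesh: a packet first travels along the X axis and then along the Y axis. The row-by-row router numbering (as in the 3×3 Mesh of Figure 2a) and the function name xy_route are assumptions of this example.

```python
def xy_route(src, dst, width):
    """Routers visited from src to dst on a 2D mesh using XY routing.

    Routers are assumed to be numbered row by row, so router r sits at
    column r % width and row r // width.
    """
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:  # first correct the X coordinate
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:  # then correct the Y coordinate
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


print(xy_route(0, 8, 3))  # [0, 1, 2, 5, 8]
```

The route depends only on the coordinates of the source and destination, which is exactly the regularity that irregular topologies cannot rely on.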
On the other hand, irregular topologies tend to be tailored for specific-purpose applic-
ations (CHOUDHARY; GAUR; LAXMI, 2011). Notwithstanding, irregular topologies may be
obtained from regular NoCs for which one or more components have a permanent failure
(ZHANG et al., 2009). Therefore, the study of fault-tolerance in irregular NoCs is interesting
even for the regular topology scenario. Irregular topologies can potentially improve area,
energy consumption, and performance if compared to regular ones (SRINIVASAN; CHATHA;
KONJEVOD, 2006). However, their routing algorithms cannot depend on the components’
regular distribution and are thus not as simple. Even so, multiple algorithms have been developed and
benchmarked considering high-performance improvements (MILFONT et al., 2017). Graphs
IG0, IG1, and IG2 (Figure 3) are examples of irregular NoCs.
Figure 3: Examples of irregular NoC topologies: (a) IG0, (b) IG1, (c) IG2.
There are several ways to generate specific-purpose irregular NoCs. For example,
Srinivasan, Chatha, and Konjevod proposed to use slicing tree and linear programming
(SRINIVASAN; CHATHA; KONJEVOD, 2005). Pinto, Carloni, and Sangiovanni-Vincentelli
applied a heuristic to a previously proposed Constraint-Driven Communication Synthesis
(PINTO; CARLONI; SANGIOVANNI-VINCENTELLI, 2003). Ho and Pinkston’s work is based
on a recursive bisection technique (HO; PINKSTON, 2003). Metaheuristics are also commonly
used for generation; a few examples are the works of (KREUTZ et al., 2005), (NEEB;
WEHN, 2008), (MESQUITA, 2016), and (CHOUDHARY et al., 2010).
2.3.2 Fault Tolerance
Faults may occur in both regular and irregular topologies. They may affect links,
routers or even cores (AZAD et al., 2016). There are two types of faults: transient and
permanent. Transient faults may be the result of noise or interference (MILFONT et al.,
2017). Transient faults are hard to correct and do not compromise the behaviour of
the circuit for a long period (MILFONT et al., 2017). On the contrary, permanent faults
may happen due to physical damage or fabrication problems (MILFONT et al., 2017). NoCs
are jeopardised by permanent faults, and many works focus on dealing with them (AZAD
et al., 2016), increasing the circuit’s lifespan.
Faults may render a topology unfeasible, i.e. create two isolated (incommunicable)
areas, which is clearly an undesirable scenario (CHANG et al., 2011). Some examples are
illustrated by Graphs DISCG0, DISCG1, DISCG2, and DISCG3 (Figures 4a, 4b, 4c,
and 4d, respectively). In Figures 4a and 4c, the dotted nodes represent faulty routers,
while in Figures 4b and 4d, the dotted lines represent faulty links. Graphs DISCG0
and DISCG1 illustrate faults turning regular NoCs unfeasible. Similarly, DISCG2 and
DISCG3 represent NoCs that become disconnected after the failures.
Figure 4: Examples of areas isolated in NoCs after faults: (a) DISCG0, (b) DISCG1, (c) DISCG2, (d) DISCG3.
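Whether a set of faults isolates part of a NoC can be checked with a graph traversal. The sketch below is one way to do so, assuming routers are integers and links are pairs of routers; the name is_connected and the 3×3 mesh used as input are illustrative choices, not the graphs of Figure 4.

```python
def is_connected(routers, links):
    """True if every working router can reach every other working router."""
    routers = set(routers)
    if not routers:
        return True
    adj = {r: set() for r in routers}
    for u, v in links:
        # Links attached to a failed (absent) router are ignored.
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(routers))
    seen = {start}
    stack = [start]
    while stack:  # depth-first traversal from an arbitrary router
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == routers


# 3x3 mesh numbered row by row: horizontal links, then vertical links.
mesh_links = [(r, r + 1) for r in range(9) if r % 3 != 2] + [(r, r + 3) for r in range(6)]
faulty_links = {(0, 1), (0, 3)}  # both links of corner router 0 fail
remaining = [link for link in mesh_links if link not in faulty_links]
print(is_connected(range(9), mesh_links), is_connected(range(9), remaining))
```

Router faults can be modelled the same way by leaving the failed routers out of the first argument.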
There are two approaches to mitigate the impact of a fault: architecture level, and
system and application level approaches (AZAD et al., 2016). Architecture level approaches
tackle fault-tolerance by adding redundant components, whether routers, links, or cores
(AZAD et al., 2016; CHANG et al., 2011; ZHANG et al., 2009). System and application level
approaches tackle the problem by adding software flexibility, e.g. routing algorithms (AZAD
et al., 2016).
2.4 Task Graphs
A Task Graph (TG) describes an application subdivided into tasks. Tasks may depend
on each other. A TG is commonly modelled as a directed graph, where the vertices
and edges represent tasks and the dependencies between them, respectively. Edges are often
weighted, possibly representing communication cost or duration. Figure 5 illustrates a TG
generated with Task Graphs For Free (DICK; RHODES; WOLF, 1998). In this TG, task 3
depends on task 2, and the communication cost from 2 to 3 is 18.
Figure 5: A Task Graph.
In the Multi-Processor System-on-Chip (MPSoC) context, NoCs may be used for
MPSoC design, while TG tasks are mapped to MPSoC cores. There is not necessarily a
bijection between TG and NoC edges. Mapping a TG to a NoC falls into the QAP category
(BOKHARI, 1981; ROCHA, 2017). Irregular NoC topologies may be generated according to
TGs using metaheuristics, such as Simulated Annealing (NEEB; WEHN, 2008), and Genetic
Algorithm (CHOUDHARY et al., 2010; MESQUITA, 2016).
2.5 Metaheuristics
The core of Computer Science is to model problems mathematically so that their solutions
can be calculated by a computer. Problems may be classified into various categories
according to different aspects, e.g. running time and memory usage. Regarding running
time, there exist problems known to be efficiently solvable on a computer (P class), i.e.
problems that require a polynomial number of operations. On the other hand, among
other characteristics, the NP class consists of decision problems for which a solution can
be verified in polynomial time (CORMEN et al., 2009). Some examples of NP problems
are the decision versions of the Quadratic Assignment Problem (BOKHARI, 1981), the
Classical Vehicle Routing Problem (CVRP) (GENDREAU; POTVIN et al., 2010), and the
Vertex Cover Problem (GAREY; JOHNSON; STOCKMEYER, 1974).
Although P ⊆ NP , it is unknown if P = NP . Thus, there are NP problems for
which no polynomial solution is known. Nevertheless, it is desirable to find efficient solu-
tions even for these problems. There are techniques capable of finding the best solution
(exact Algorithms). For instance, it is possible to perform exhaustive searches, branch-
and-bound, et cetera. Exhaustive searches visit all the solutions looking for the optimal
one. Branch-and-bound visits some solutions while pruning part of the search space. This
occurs only if it can be mathematically proved that no solution of the pruned space is
better than the current best one (BALAS; TOTH, 1983). However, computational time for
non-small problem instances is often unfeasible.
Depending on the desired results, some non-optimal solution may be sufficient (solu-
tions different from the best one). The CVRP is an example of such a problem because it
may be desirable to obtain the best solution possible in a limited period of time (GENDREAU;
POTVIN, 2005). These solutions are called local optima, in contrast to the
global optimum. Thus, strategies for finding local optima while still searching for the
global optimum were developed. Metaheuristics are an example of such strategies: controlled
local searches capable of finding multiple local optima, and often analogous to some
natural phenomena (GENDREAU; POTVIN et al., 2010). Some metaheuristic examples are
Simulated Annealing, Tabu Search, Genetic Algorithms, Memetic Algorithms, et cetera.
Hybridisation is also possible.
Metaheuristics are used for generating irregular topologies because this generation is an NP-hard
problem – a variation of the Steiner Tree Problem (RAVI et al., 2001; MESQUITA, 2016).
2.5.1 Tabu Search
The main idea of the Tabu Search is to find local optima, while escaping from recently
found solutions (GENDREAU; POTVIN, 2005). One solution can be obtained from another
by performing a neighbourhood step operation on the Search Space using the Neighbour-
hood Structure (GENDREAU; POTVIN, 2005). A neighbourhood step operation depends on
how the problem is modelled, and on how a movement is defined (GENDREAU; POTVIN,
2005). A movement may be described as swapping, removing, or adding Neighbourhood
Structure elements, et cetera.
One of the key factors of Tabu Search is to prevent the search from visiting a solution
multiple times. This is achieved with a short term memory – called Tabu List – that stores
recently performed neighbourhood movements (GENDREAU; POTVIN, 2005). Essentially,
if a Neighbour Solution contains a Tabu Movement, it is not considered in the Search
Space. Tabu Lists are often implemented as circular queues and their size depends on
the problem and the performed experiments (GENDREAU; POTVIN, 2005). In this case,
the union operation may remove the oldest inserted element. A Tabu List may be too
restrictive or too permissive, depending on how the problem is modelled or even on the
desired results.
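The circular-queue behaviour of a Tabu List can be sketched with a bounded double-ended queue. The move encoding (tuples describing a swap) and the list size of three are assumptions chosen only for illustration.

```python
from collections import deque

# A bounded deque drops its oldest element when a new one is appended
# to a full queue, which mimics a circular Tabu List.
tabu_list = deque(maxlen=3)

moves = [("swap", 0, 1), ("swap", 2, 3), ("swap", 1, 4), ("swap", 0, 2)]
for move in moves:
    tabu_list.append(move)

print(("swap", 0, 1) in tabu_list)  # False: the oldest move was evicted
print(("swap", 0, 2) in tabu_list)  # True: the newest move is still tabu
```

A larger maxlen makes the search more restrictive, while a smaller one makes it more permissive.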
The Tabu Search skeleton is described in Algorithm 1. It is the same algorithm
as the one described by Gendreau and Potvin (GENDREAU; POTVIN, 2005), but with
a different notation. It is important to emphasise some points also highlighted by the
authors. The termination criterion depends on the problem, though it is usually defined
in number of iterations. The fitness() function is used to rank and evaluate different
solutions. The selectBestNeighbour() function returns the best neighbour of the current
solution considering movements not in the Tabu List. Gendreau and Potvin also state that
an Aspiration Criterion may be necessary while searching the neighbourhood (GENDREAU;
POTVIN, 2005). For instance, if the fitness of a solution containing a Tabu Movement is
better than the best solution found, the movement should be performed even though it is
Tabu.
Algorithm 1 Tabu Search Algorithm Skeleton.
1: function Tabu Search skeleton
2:   S ← S0 ▷ creates initial solution and sets current solution
3:   BS ← S0 ▷ sets the initial solution as the best found
4:   TL ← ∅ ▷ initialises Tabu List
5:   while ¬ termination criterion do
6:     S ← selectBestNeighbour(S, TL)
7:     if fitness(S) < fitness(BS) then
8:       BS ← S ▷ saves current solution as the best
9:     end if
10:    TL ← TL ∪ {performedMovement}
11:  end while
12: end function
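The skeleton can be turned into a small runnable sketch. The version below follows the structure of Algorithm 1 and adds the aspiration criterion mentioned in the text; the toy problem (choosing weights whose sum is as close as possible to a target) and all names are assumptions made for illustration and are unrelated to the topology problem.

```python
from collections import deque


def tabu_search(initial, neighbours, fitness, tabu_size, iterations):
    """Generic Tabu Search; neighbours(s) yields (move, solution) pairs."""
    current = best = initial
    tabu = deque(maxlen=tabu_size)  # short-term memory of performed moves
    for _ in range(iterations):  # termination criterion: iteration count
        candidates = [
            (fitness(s), move, s)
            for move, s in neighbours(current)
            # Aspiration: a tabu move is accepted if it beats the best solution.
            if move not in tabu or fitness(s) < fitness(best)
        ]
        if not candidates:
            break
        _, move, current = min(candidates, key=lambda c: c[0])
        if fitness(current) < fitness(best):
            best = current
        tabu.append(move)
    return best


WEIGHTS = (3, 5, 8, 13)
TARGET = 16


def fitness(bits):
    """Distance between the selected weights' sum and the target."""
    return abs(sum(w for w, b in zip(WEIGHTS, bits) if b) - TARGET)


def neighbours(bits):
    """A move flips one bit; the move is identified by the bit index."""
    for i in range(len(bits)):
        yield i, bits[:i] + (1 - bits[i],) + bits[i + 1:]


best = tabu_search((0, 0, 0, 0), neighbours, fitness, tabu_size=2, iterations=10)
print(best, fitness(best))
```

Note that the current solution may get worse between iterations; only the best solution found is guaranteed not to degrade.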
3 Related Works
There are several works in the literature regarding irregular NoC topologies. Two
major areas are of interest: performance and fault tolerance. The performance area focuses
on improvement through topology generation, routing algorithms, physical simulations, etc.
On the other hand, fault-tolerance embraces topics such as maintenance of a regular NoC
using spare routers and virtual topologies; topology reconfiguration; built-in router self-
diagnosis; detection and handling of transient and permanent faults; routing algorithms,
fault tolerance on routers and links; et cetera (SALMINEN; KULMALA; HAMALAINEN, 2008;
RADETZKI et al., 2013).
3.1 Performance
Two possible ways to enhance performance are to improve the routing algorithms, and
to generate application-specific topologies. Two works about routing algorithms and four
about topology generation are henceforth mentioned: routing table minimisation (MOTA et
al., 2016), fault-tolerant enhanced odd-even XY routing algorithm (ABEDNEZHAD; ALAVI,
2017), design of irregular topologies for heterogeneous NoCs (NEEB; WEHN, 2008), lin-
ear programming (SRINIVASAN; CHATHA; KONJEVOD, 2006), ant lion optimisation (VEN-
KATARAMAN; KUMAR, 2019), and the genetic algorithm (MESQUITA, 2016).
There are multiple works about routing algorithms for irregular NoCs. Most of the
works focus on ensuring deadlock free algorithms, and the method used to guarantee it
directly affects performance. In addition, routing algorithms are often applied to irregular
NoCs to increase their lifespan. For instance, (MOTA et al., 2016) focuses on reducing the size
of the routing table to improve performance. As another example, (ABEDNEZHAD; ALAVI,
2017) uses a hybrid approach to obtain a fault-tolerant, deadlock-free routing algorithm.
Their solution acts as an XY routing algorithm by default and uses the enhanced odd-even
model when a faulty link is found.
The work of Neeb and Wehn uses Simulated Annealing to map tasks to a bidirectional
chain topology; then, edges are added to it by a greedy algorithm (NEEB; WEHN, 2008).
The obtained graphs are compared to mesh, torus, and spidergon topologies.
(SRINIVASAN; CHATHA; KONJEVOD, 2006) generate application-specific NoCs by using
Linear Programming. Their objective is to minimise power consumption while maximising
performance. Thus, the physical size of and distance between components are considered
throughout the process.
Venkataraman and Kumar also proposed to decrease power consumption for applic-
ation specific topologies. Their work uses an ant lion optimisation technique to generate
the topologies (VENKATARAMAN; KUMAR, 2019). In addition, redesigning the router ar-
chitecture helped to improve the obtained results (VENKATARAMAN; KUMAR, 2019).
The Genetic Algorithm proposed in (MESQUITA, 2016) generates irregular topologies
from a 2D-Mesh population. The population is submitted to mutations, where links circuit
components may be removed. The implemented algorithm uses single-point crossover.
Due to the nature of the problem, however, single-point crossover may not contribute to
multiple neighbourhoods exploration. In addition, improved results may be achieved if
the initial population also contains different individuals, such as Torus, and Honeycomb.
However, for a few works, it is necessary to review some implementations details in
order to explore a wider range of solutions. In addition, the presented works focus solely on
performance, while fault-tolerance is mentioned as a desired feature for future projects. On
the contrary, the proposed work focuses on generating topologies that are simultaneously
efficient and fault-tolerant.
3.2 Fault Tolerance
Two usual ways to add fault tolerance to NoCs are through routing algorithms, or com-
ponent redundancy. For component redundancy, one may simply duplicate the resources
(links, routers or PEs); create alternative paths inside routers; etc.
The proposed work focuses on topology link redundancy, ensuring alternative paths
between two routers. Six complementary works are thus highlighted: the lightweight fault-tolerant
mechanism (KOIBUCHI et al., 2008), fault-isolation circuits (LIN et al., 2009), a
fault-tolerant honeycomb model (YANG et al., 2016), De Bruijn’s algorithm (HOSSEINABADY
et al., 2007), Bio-inspired algorithms (BECKER; KRÖMKER; SZCZERBICKA, 2015),
and the Poorest Neighbour approach (SHAH; KANNIGANTI; SOUMYA, 2017).
The lightweight fault-tolerant mechanism achieves fault tolerance by adding redundant
components (KOIBUCHI et al., 2008). However, the work’s premise is to duplicate
simple components since they are less susceptible to failure (KOIBUCHI et al., 2008). In
summary, it prevents failures on routers by adding alternative paths bypassing the crossbar
(KOIBUCHI et al., 2008). The work of Lin et al. uses a similar strategy (LIN et al.,
2009).
In the fault-tolerant honeycomb model (YANG et al., 2016), tolerance is achieved by
adding one extra input/output link per processing element. This is achieved by adding
a spare router in the centre of each hexagon. The spare router is therefore connected
to the six processing elements. Hence, this approach handles faults in links and routers.
A message will move around a faulty link by passing through the spare router, though
there is an overhead increase. If a router fails, the corresponding processing element would
normally become inaccessible. The honeycomb model solves this problem by connecting
two routers to a processing element. Thus, a new hexagon is simulated by nearby spare
routers. In the proposed work, it was decided not to use this technique since a honeycomb
model tends to occupy larger areas.
(SHAH; KANNIGANTI; SOUMYA, 2017) state that De Bruijn’s graph is widely applied
in Bioinformatics. De Bruijn’s algorithm uses mathematical formulae to determine
if two nodes should be connected (HOSSEINABADY et al., 2007). The binary version of
the algorithm focuses on associating every node possible to four edges, with only a few
exceptions (HOSSEINABADY et al., 2007). De Bruijn’s algorithm achieves 100% fault tolerance
for links (SHAH; KANNIGANTI; SOUMYA, 2017). However, this approach is unfeasible
because nonplanar graphs may be generated.
The work developed by (BECKER; KRÖMKER; SZCZERBICKA, 2015) seems promising
since it evaluates heuristics and Bio-inspired algorithms to generate fault-tolerant graphs.
However, the work is superficial and lacks core information such as algorithm
details.
The Poorest Neighbour Algorithm (SHAH; KANNIGANTI; SOUMYA, 2017) is a determ-
inistic algorithm that adds link fault tolerance to a NoC given its application graph.
Compared to De Bruijn’s algorithm, the generated topology considerably reduces the
number of necessary links. Additionally, the authors claim that the Poorest Neighbour
achieves 100% link fault-tolerance.
While simulating the algorithm provided by the authors, problems were found. The
algorithm was tested on only slightly more than forty graphs. Of these graphs, just a small
subset represents applications for which fault-tolerance could not be added manually and
effortlessly. In addition, the algorithm has three different implementations not mentioned in
the paper, and given the same graph, these implementations may have different outputs. However,
the most compromising problem is that building a NoC directly from an outputted graph
may be unfeasible. This is due to the algorithm’s nature. Routers are simulated with
unlimited ports and links are never removed from the topology, only inserted.
Some of the presented works tackle the problem of adding fault-tolerance, focusing
either on regular or on irregular topologies. The works that generate fault-tolerant topologies
need to be enhanced to consider necessary limitations (such as port limit per router).
Without such limitations, inapplicable (nonplanar) graphs are more likely to be obtained. On the other
hand, the proposed work focuses on generating irregular topologies for which the routers
have a maximum of four ports. In addition, a limit for the number of links is required.
Together, these restrictions increase the odds of obtaining planar solutions.
4 Methodology
Irregular topologies outperform regular topologies for specific applications, since their
structure tends to be more similar to some TG. Adding fault tolerance is desirable, and
in many cases, essential to increase the lifespan of a NoC, whether its topology is regular
or irregular. Thus, the proposed work focuses on generating high-performance irregular
NoC topologies with link redundancy to increase fault tolerance.
The proposal is to generate irregular NoC topologies with redundant links for fault
tolerance. The topologies generated are evaluated in order to estimate how latency can
be affected by the approach.
To generate the topologies, software was implemented in C++. The choice of a
high level abstraction design tool was made based on the fact that this work attempts to
explore different topologies through heuristic algorithms.
Therefore, latency will be estimated by the QAP function, i.e. number of hops weighted
by the TG weight, as detailed in section 2.2. For a given TG, the number of routers is
fixed; and the number of links is used to classify different topologies. It is also desirable to
evaluate if there exists a (number of links) limit for significant performance improvements.
Nevertheless, topologies for efficient circuits with long lifespan are expected to be
outputted. The proposed algorithm has three main stages: initial topology generation,
best solution search and fault injection. These stages are listed in Algorithm 2.
Algorithm 2 Methodology.
1: function main(graph: TG; int: ε, tabuListSize, terminationCriterion)
2:   S ← GENERATE_INITIAL_SOLUTION(TG, ε)
3:   S ← TABU_SEARCH(TG, S, ε, tabuListSize, terminationCriterion)
4:   FAULT_INJECTION(S, TG)
5: end function
The first step randomly generates an initial feasible solution given a TG, and an ε
value. The second step uses Tabu Search and the QAP function to generate a local op-
timal solution. The third step stresses the generated topology multiple times by randomly
choosing links to fail.
4.1 Definitions and Assumptions
A set of definitions and assumptions will be used throughout the remaining
sections. The topics to be discussed are solution representation and feasible
solution.
4.1.1 Solution Representation
Any NoC (consequently any solution) can be represented as a graph. Thus, there
is a bijection between nodes (vertices) of a graph and routers of a topology, and edges
of a graph and links of a topology. These terms are hereafter used interchangeably. For
the current problem, a TG is read as an adjacency, while a solution is represented as a
triangular adjacency matrix with no main diagonal.
Due to problem restrictions, the links on solutions are bidirectional. Thus, solutions
can be represented as symmetric adjacency matrices. Since self-loops are not allowed, it is
not necessary to store the matrix’s main diagonal. Hence, in order to save memory, only
the elements below the main diagonal are stored. For example, the graph represented by
the matrix on the left would be stored as the matrix on the right,
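\[
\begin{pmatrix}
0 & 10 & 0 & 7 & 5 & 0\\
10 & 0 & 1 & 0 & 2 & 0\\
0 & 1 & 0 & 4 & 0 & 0\\
7 & 0 & 4 & 0 & 2 & 2\\
5 & 2 & 0 & 2 & 0 & 0\\
0 & 0 & 0 & 2 & 0 & 0
\end{pmatrix}
\rightarrow
\begin{pmatrix}
10\\
0 & 1\\
7 & 0 & 4\\
5 & 2 & 0 & 2\\
0 & 0 & 0 & 2 & 0
\end{pmatrix}. \tag{4.1}
\]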
Thus, by Arithmetic Progression, instead of storing $|V|^2$ edges, only
\[ \frac{n(a_1 + a_n)}{2} = \frac{|V|(0 + |V| - 1)}{2} \tag{4.2} \]
\[ = \frac{|V|^2 - |V|}{2} \tag{4.3} \]
edges are stored.
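One way to realise this storage scheme is to keep the elements below the main diagonal in a flat list and compute each position arithmetically. The sketch below is an assumption about how such a structure could look (the class TriangularMatrix and the function tri_index are hypothetical names); it reuses the matrix of Equation 4.1 as input.

```python
def tri_index(i, j):
    """Position of element (i, j), i != j, in the flattened lower triangle."""
    if i == j:
        raise ValueError("the main diagonal is not stored")
    if i < j:
        i, j = j, i  # the matrix is symmetric, so (i, j) and (j, i) coincide
    return i * (i - 1) // 2 + j


class TriangularMatrix:
    """Symmetric matrix without main diagonal, stored in (n^2 - n) / 2 cells."""

    def __init__(self, n):
        self.n = n
        self.data = [0] * (n * (n - 1) // 2)

    def get(self, i, j):
        return self.data[tri_index(i, j)]

    def set(self, i, j, value):
        self.data[tri_index(i, j)] = value


full = [
    [0, 10, 0, 7, 5, 0],
    [10, 0, 1, 0, 2, 0],
    [0, 1, 0, 4, 0, 0],
    [7, 0, 4, 0, 2, 2],
    [5, 2, 0, 2, 0, 0],
    [0, 0, 0, 2, 0, 0],
]
m = TriangularMatrix(6)
for i in range(6):
    for j in range(i):
        m.set(i, j, full[i][j])

print(len(m.data))  # 15 cells instead of 36
print(m.get(3, 5), m.get(5, 3))  # the same cell is read in both cases
```

Replacing the stored weights by booleans gives the edge-existence representation used during the Tabu Search.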
Throughout the Tabu Search, the only information used for solution edges is whether
they exist or not. Thus, the solution is stored as a boolean matrix with no main diagonal.
For instance, the previous graph would be represented as the following matrix:
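\[
\begin{pmatrix}
10\\
0 & 1\\
7 & 0 & 4\\
5 & 2 & 0 & 2\\
0 & 0 & 0 & 2 & 0
\end{pmatrix}
\rightarrow
\begin{pmatrix}
1\\
0 & 1\\
1 & 0 & 1\\
1 & 1 & 0 & 1\\
0 & 0 & 0 & 1 & 0
\end{pmatrix}. \tag{4.4}
\]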
This matrix representation was implemented to represent graphs. However, the defin-
itions of Section 2.1 will be used in the pseudocodes. Henceforth, a solution will be rep-
resented as an undirected and unweighted graph.
4.1.2 Feasible Solution
Henceforth, a solution is considered feasible if it meets the following restriction. Assume
that the corresponding solution graph is S(V,E), and that degree(v) is a function
that returns the degree of vertex v, i.e. the number of edges incident to it. Thus, the restriction
is described by
\[ \forall v \in S_V : 2 \le degree(v) \le 4. \tag{4.5} \]
This restriction guarantees that the routers have a standard design, with a maximum
of four output ports. This router architecture is commonly found in regular NoCs, such
as Mesh-2D, and Torus (JANTSCH; TENHUNEN et al., 2003). Thus, the obtained solutions
are more likely to have lower power consumption, and simpler design if compared to the
solutions generated by the Poorest Neighbour (SHAH; KANNIGANTI; SOUMYA, 2017), and
De Bruijn’s algorithm (HOSSEINABADY et al., 2007). In addition, this restriction aims to
increase reliability through redundancy since there are at least two edges through which
it is possible to reach a node, i.e. at least one alternative path. Therefore, link redundancy
is achieved; potentially improving reliability.
It is important to highlight that Equation 4.5 is not sufficient to guarantee graph
planarity. Therefore, non-planar graphs may be generated by the Algorithm and classified
as feasible solutions. In other words, some solutions may not be used in 2D-MPSoC design.
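The restriction of Equation 4.5 translates directly into a degree check. In the sketch below, the edge representation (pairs of vertices) and the function names degrees and is_feasible are assumptions of the example; planarity is deliberately not tested, in line with the remark above.

```python
def degrees(nodes, edges):
    """Number of incident edges per node."""
    deg = {v: 0 for v in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_feasible(nodes, edges, low=2, high=4):
    """Equation 4.5: every node has at least `low` and at most `high` edges."""
    return all(low <= d <= high for d in degrees(nodes, edges).values())


ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
print(is_feasible(range(4), ring))  # True: every node has degree 2
print(is_feasible(range(4), star))  # False: the leaves have degree 1
```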
4.2 Initial Topology Generation
In order to generate an initial feasible solution for a given combination of TG and
ε, there are two not necessarily distinct possible scenarios. First, the number of edges in
the original TG may be different from the ε value; thus it is necessary to remove or add
edges until both values match. Second, the TG may not be feasible; thus, it is necessary
to move some existent edges until the condition of Equation 4.5 is met.
For some values of ε, no feasible solution is possible. Thus, it is necessary to assert its
value before initiating the process of generating an initial topology. This condition can be
easily verified using the Handshaking Lemma (WILSON, 1979),
\[ \sum_{v \in V} degree(v) = 2|E|. \tag{4.6} \]
Summing the bounds of Equation 4.5 over all vertices and applying Equation 4.6 shows that
the condition described by Equation 4.5 can only be satisfied if, for the desired solution,
\[ 2 \le \frac{2\varepsilon}{|V|} = \frac{2|E|}{|V|} \le 4. \tag{4.7} \]
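The check of Equation 4.7 can be written without division by multiplying all terms by |V|. The function name epsilon_is_valid is an assumption of this sketch.

```python
def epsilon_is_valid(num_routers, epsilon):
    """Equation 4.7 multiplied by |V|: 2|V| <= 2*epsilon <= 4|V|."""
    return 2 * num_routers <= 2 * epsilon <= 4 * num_routers


# For 9 routers, the number of links must lie between 9 and 18.
print([e for e in range(7, 21) if epsilon_is_valid(9, e)])
```

Using integer arithmetic avoids rounding issues that a floating-point division could introduce at the interval bounds.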
In summary, the algorithm for generating the initial topology (Algorithm 3) is divided
into four steps: asserting that ε is valid, converting the directed edges TG_E to undirected edges,
fitting |E| to ε, and making the solution feasible. The last two steps are discussed in
Sections 4.2.1 and 4.2.2, respectively.
Algorithm 3 Generate Initial Solution Graph.
1: function GENERATE_INITIAL_SOLUTION(graph: TG, int: ε)
2:   if 2 ≤ 2ε/|TG_V| ≤ 4 then
3:     S_E ← {{v1, v2} | ∃e ∃w (e ∈ TG_E ∧ (e = (v1, v2, w) ∨ e = (v2, v1, w)))}
4:     S ← (TG_V, S_E)
5:     S ← FIT_TO_EPSILON(S, ε)
6:     S ← MAKE_FEASIBLE(S)
7:     return S
8:   end if
9:   return (∅, ∅)
10: end function
The solution is represented as an undirected, unweighted graph. Therefore, the conversion
process (Algorithm 3, line 3) “removes” the direction and weight of each edge before adding it to the
graph. This process can be visualised in Figure 6. Suppose that Graph TG = GISEC0
(Figure 6a). Then, both edges (5, 2, 6) – i.e. edge from node 5 to 2 with weight 6 –, and
(2, 5, 2) would be converted to {2, 5}; thus, two edges were collapsed to one. In addition,
edge (1, 0, 4) would be converted to {0, 1}. After all conversions are performed, the res-
ulting Graph S is represented in Figure 6b (Graph GISEC1). Note that the obtained
Graph is undirected and unweighted.
Figure 6: Example of Task Graph edge conversion: (a) GISEC0, (b) GISEC1.
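The conversion step can be sketched with sets of unordered pairs, so that opposite directed edges collapse automatically. The triple format (source, target, weight) and the name to_undirected are assumptions of this example; discarding self-loops is likewise an assumption, motivated by the solution representation of Section 4.1.1.

```python
def to_undirected(tg_edges):
    """Drop direction and weight of (source, target, weight) triples."""
    # frozenset({u, v}) is the same object for (u, v) and (v, u),
    # so opposite edges are merged by the set comprehension.
    return {frozenset((u, v)) for u, v, _weight in tg_edges if u != v}


tg_edges = [(5, 2, 6), (2, 5, 2), (1, 0, 4)]
print(to_undirected(tg_edges))  # two undirected edges: {2, 5} and {0, 1}
```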
4.2.1 Fitting to Epsilon
This stage guarantees that the initial solution will have a number of edges equal
to the epsilon restriction parameter (ε = |E|). For example, if ε < |E|, then edges
need to be removed from the graph. On the other hand, if ε > |E|, then edges will be
added to the graph. In order to explore a wider range of solutions in multiple executions,
these edges are randomly deleted and inserted, as detailed in Algorithm 4.
Algorithm 4 Fit Solution’s number of edges to ε.
1: function FIT_TO_EPSILON(graph: S, int: ε)
2:   while ε < |S_E| do
3:     S_E ← S_E − {random(S_E)}
4:   end while
5:   while ε > |S_E| do
6:     S_E ← S_E ∪ {random(S_E^C)}
7:   end while
8:   return S
9: end function
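A minimal sketch of Algorithm 4 follows, assuming edges are stored as a set of frozenset pairs and that the complement of the edge set contains every missing pair of distinct nodes. The function name and the optional rng parameter (any object offering choice and shuffle, such as random.Random) are assumptions of this example.

```python
import random
from itertools import combinations


def fit_to_epsilon(nodes, edges, epsilon, rng=random):
    """Randomly remove or add edges until exactly epsilon edges remain."""
    edges = set(edges)
    while len(edges) > epsilon:
        # Sorting gives a reproducible order before the random choice.
        edges.remove(rng.choice(sorted(edges, key=sorted)))
    # Complement of the edge set: all absent pairs of distinct nodes.
    missing = [
        frozenset(pair)
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in edges
    ]
    rng.shuffle(missing)
    while len(edges) < epsilon and missing:
        edges.add(missing.pop())
    return edges


rng = random.Random(0)
path = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3)]}
grown = fit_to_epsilon(range(5), path, 6, rng)
shrunk = fit_to_epsilon(range(5), path, 2, rng)
print(len(grown), len(shrunk))
```

The result is not necessarily feasible in the sense of Equation 4.5; that is handled by the next stage.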
4.2.2 Making Feasible
This stage guarantees that the initial solution is feasible. For instance, Graph UNF
does not represent a feasible solution because nodes 4 and 7 have degree 5, and nodes
0 and 2 have degree 1 (Figure 7). In addition, the condition stated in Equation 4.7
would be true for Graph UNF if ε = 13. Therefore, it could have been outputted
from the FIT_TO_EPSILON function (Section 4.2.1). In order to properly under-
stand the MAKE_FEASIBLE() function, it is necessary to comprehend the behaviour of
DEL_EDGE() and ADD_EDGE() functions.
Figure 7: Example of unfeasible solution – UNF.
4.2.2.1 Deleting Edges - DEL_EDGE()
To select an edge to be deleted, this function simply selects the node with largest
degree (ldn). Then, it selects the neighbour of ldn with largest degree (ldneigh). Afterwards, it
removes the edge between these two nodes. If multiple vertices have
the largest degree, one of them is chosen randomly. The algorithm’s pseudo-code is
described in Algorithm 5.
Algorithm 5 Deletes the edge whose incident nodes have the largest possible degrees.
1: function DEL_EDGE(graph: S)
2:   ldn ← random(argmax(degrees(S_V))) ▷ largest degree node
3:   edges ← {e | e ∈ S_E ∧ ldn ∈ e} ▷ edges incident to ldn
4:   neighs ← {v | v ∈ e ∧ v ≠ ldn ∧ e ∈ edges} ▷ nodes adjacent to ldn
5:   ldneigh ← random(argmax(degrees(neighs))) ▷ largest degree neighbour
6:   edge ← {ldn, ldneigh}
7:   S_E ← S_E − {edge}
8:   return S
9: end function
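Algorithm 5 can be sketched as follows, assuming edges are frozenset pairs and the graph has at least one edge. The names del_edge and rng are assumptions of this example, and the small input graph is illustrative rather than one of the graphs in the figures.

```python
import random


def del_edge(nodes, edges, rng=random):
    """Remove the edge between a largest-degree node and its
    largest-degree neighbour (ties are broken randomly)."""
    edges = set(edges)
    deg = {v: 0 for v in nodes}
    for e in edges:
        for v in e:
            deg[v] += 1
    top = max(deg.values())
    ldn = rng.choice([v for v in sorted(nodes) if deg[v] == top])
    # Nodes adjacent to ldn (assumes ldn has at least one neighbour).
    neighs = sorted(v for e in edges if ldn in e for v in e if v != ldn)
    top_neigh = max(deg[v] for v in neighs)
    ldneigh = rng.choice([v for v in neighs if deg[v] == top_neigh])
    edges.remove(frozenset((ldn, ldneigh)))
    return edges


fs = frozenset
graph = {fs((0, 1)), fs((0, 2)), fs((0, 3)), fs((1, 2))}
# Node 0 has degree 3; its neighbours 1 and 2 have degree 2, node 3 has degree 1.
removed = graph - del_edge(range(4), graph, random.Random(0))
print(removed)  # either {0, 1} or {0, 2}
```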
To illustrate this procedure, suppose that S = DUE0 (Figure 8a). In Graph DUE0,
degree(v) = 4 for every v ∈ {0, 1, 3, 5}. Hence, any of these nodes can be selected. Let
ldn = 0; since degree(1) = degree(5) = 4, it is possible to delete one of two edges – {0, 1} or
{0, 5}. This scenario is illustrated by Graph DUE1 (Figure 8b), where removable edges
are dotted. If {0, 1} is chosen, Graph DUE2 (Figure 8c) is obtained. If the function were
called for Graph DUE2 (S = DUE2), edge {3, 5} would be deleted. The resulting Graph
is illustrated in Figure 8d, where the removed edge is dotted.
Figure 8: Example of Delete Edges Until Epsilon: (a) DUE0, (b) DUE1, (c) DUE2, (d) DUE3.
4.2.2.2 Adding Edges - ADD_EDGE()
This function simply selects the nodes with smallest degrees (sdn, and sdn2), and
adds an edge between them. The Algorithm expects a set of prohibited edges, a “Tabu
List” named TL. If the Algorithm attempts to add edge {sdn, sdn2}, and it already exists
({sdn, sdn2} ∈ S_E), or it is tabu ({sdn, sdn2} ∈ TL), it is necessary to change vertex
sdn2 to a previously unvisited smallest degree node. The process is repeated until an edge
can be inserted into the Graph. This Algorithm’s behaviour is detailed in Algorithm 6.
The random function is necessary because there may exist multiple smallest degree nodes
in a Graph.
For example, suppose that S = AUE0 (Figure 9a), and TL = ∅. The possible values
for sdn are 1, 2, 3, 4, or 5. Suppose that sdn = 3. Then, the possible values for sdn2 are 1, 2,
4, or 5. If sdn2 = 4, the edge {sdn, sdn2} cannot be added to the Graph since {3, 4} ∈ S_E.
This scenario is illustrated by Graph AUE1 (Figure 9b), where the dashed edges are
addable, and the dotted edge cannot be inserted. If sdn2 = 5, then the Graph S obtained
is represented by AUE2 in Figure 9c. If the function is called for AUE2 and TL =
{ {1, 2} }, the possible values for sdn and sdn2 are sdn = random({1, 2, 4}) and sdn2 =
random({1, 2, 4} − {sdn}). In other words, one random edge in the { {1, 2}, {1, 4}, {2, 4} }
set will be inserted. However, {1, 2} ∈ TL, thus it cannot be inserted into the Graph.
Graph AUE3 (Figure 9d) illustrates this scenario, where dashed links can be added,
while the dotted link cannot.
Algorithm 6 Adds an edge between the two nodes with the smallest possible degrees.
1: function ADD_EDGE(graph: S, tabulist: TL)
2:     sdn ← random(argmin(degrees(SV)))    ▷ smallest degree node
3:     V ← {sdn}    ▷ set of visited vertices
4:     repeat
5:         sdn2 ← random(argmin(degrees(SV − V)))
6:         edge ← {sdn, sdn2}
7:         V ← V ∪ {sdn2}
8:     until edge ∉ SE ∧ edge ∉ TL
9:     SE ← SE ∪ {edge}
10:    return S
11: end function
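The behaviour of Algorithm 6 can be sketched in Python as follows. This is only an illustrative sketch, not the thesis implementation: edges are assumed to be stored as a set of frozenset pairs, the names add_edge, pick_min and rng are hypothetical, and returning None when every candidate is exhausted is an assumption, since the pseudocode does not define that case.

```python
import random


def add_edge(vertices, edges, tabu_edges, rng=random):
    """Add one edge between two smallest-degree nodes.

    edges and tabu_edges are sets of frozenset({u, v}) pairs.
    Returns a new edge set, or None if no edge can be added
    (assumption: Algorithm 6 leaves this case undefined).
    """
    degree = {v: 0 for v in vertices}
    for e in edges:
        for v in e:
            degree[v] += 1

    def pick_min(candidates):
        # random choice among the candidates of smallest degree
        low = min(degree[v] for v in candidates)
        return rng.choice(sorted(v for v in candidates if degree[v] == low))

    sdn = pick_min(set(vertices))
    unvisited = set(vertices) - {sdn}
    while unvisited:
        sdn2 = pick_min(unvisited)
        unvisited.discard(sdn2)  # mark sdn2 as visited
        edge = frozenset((sdn, sdn2))
        if edge not in edges and edge not in tabu_edges:
            return edges | {edge}
    return None
```

Passing a seeded random.Random instance as rng makes a run reproducible, which may help when comparing against a hand-worked example such as the one in Figure 9.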
Figure 9: Example of Add Edges Until Epsilon. Panels: (a) AUE0, (b) AUE1, (c) AUE2, (d) AUE3.
4.2.2.3 Make Feasible Algorithm
The algorithm presented in this section moves edges until the obtained
solution is feasible (Equation 4.5). It repeatedly deletes an edge incident to the
nodes with the largest degrees and adds an edge between the nodes with the smallest
degrees. The last deleted edge cannot be added in the same iteration;
otherwise, an infinite loop may occur. A disconnected graph may be generated
during this process. The MAKE_FEASIBLE() function is detailed in Algorithm 7.
Algorithm 7 Make a Solution Feasible.
1: function MAKE_FEASIBLE(graph: S)
2:     while min(degrees(SV)) < 2 ∨ max(degrees(SV)) > 4 do
3:         S2 ← DEL_EDGE(S)
4:         TL ← SE − S2E    ▷ identifies deleted edge and creates Tabu List
5:         S ← ADD_EDGE(S2, TL)
6:     end while
7:     return S
8: end function
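The loop guard of Algorithm 7 amounts to a degree check on every node. A minimal sketch of that check is given below; the function names degrees and is_feasible and the tuple-based edge representation are assumptions made for illustration only.

```python
def degrees(vertices, edges):
    """Return a dict mapping each vertex to its degree; edges are (u, v) tuples."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_feasible(vertices, edges, min_degree=2, max_degree=4):
    """True when every node degree lies in [min_degree, max_degree]."""
    deg = degrees(vertices, edges)
    return all(min_degree <= d <= max_degree for d in deg.values())
```

With this predicate, the while condition of Algorithm 7 corresponds to "not is_feasible(...)".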
Suppose, for example, that S = MFG0 (Figure 10a). Then, during the first iteration,
either edge {0, 3} or {0, 5} is deleted, and the result is stored in S2. If {0, 5} is randomly
chosen, then S2 = MFG1 (Figure 10b), where the deleted edge is dotted.
Afterwards, the Algorithm randomly adds one of the edges {1, 4}, {2, 4}, or {5, 4}. If
edge {2, 4} is chosen, then the Graph MFG2 (Figure 10c) is obtained. The removed and
inserted edges are drawn as dotted and dashed lines, respectively. After this
step, the Algorithm stops since MFG2 is a feasible solution.
Figure 10: Example of making an unfeasible solution feasible. Panels: (a) MFG0, (b) MFG1, (c) MFG2.
4.3 Best Solution Search – Tabu Search
Given an initial feasible solution, a local search (Tabu Search) is performed to optimise
the QAP function (Section 2.2). The Tabu Search skeleton was discussed in Section 2.5.1.
The Tabu Search per se is not complicated. However, as stated by Gendreau and Potvin,
it is necessary to properly analyse the problem at hand in order to efficiently represent
a solution and the neighbourhood step. The Tabu List contains recently deleted edges,
preventing them from being added back to the solution in the next few iterations. The major
efforts of the proposed Tabu Search focus on the Neighbourhood Search steps, so as to explore
as many feasible solutions as possible. Section 4.3.2 details the Tabu List implementation,
while Section 4.3.3 describes the Neighbourhood Search Algorithms. The behaviour of the
proposed Tabu Search is described by Algorithm 8.
Algorithm 8 Implemented Tabu Search.
1: function TABU_SEARCH(graph: TG, S0; int: ε, tabuListSize, termCrit)
2:     S ← S0
3:     BS ← S0    ▷ sets the initial solution as the best found
4:     TL ← ∅    ▷ empty circular queue
5:     count ← 0
6:     while count < termCrit do
7:         NS ← NEIGHBOURHOOD_SEARCH(S, ∅, n)    ▷ set of n random neighbours
8:         BN ← SELECT_BEST_NEIGHBOUR(NS, TG)
9:         if FITNESS(BN, TG) ≥ FITNESS(BS, TG) then
10:            REMOVE_TABU_SOLUTIONS(NS, TL)
11:            if NS = ∅ then
12:                NS ← NEIGHBOURHOOD_SEARCH(S, TL, n)
13:                if NS = ∅ then
14:                    return BS    ▷ no non-tabu neighbour obtainable
15:                end if
16:            end if
17:            BN ← SELECT_BEST_NEIGHBOUR(NS, TG)
18:        end if
19:        count ← count + 1
20:        S ← BN
21:        if FITNESS(S, TG) < FITNESS(BS, TG) then
22:            BS ← S    ▷ saves current solution as the best
23:            count ← 0
24:        end if
25:        if |TL| = tabuListSize then
26:            REMOVE_OLDEST_ELEMENT(TL)
27:        end if
28:        TL ← TL ∪ {S_performedDels}
29:    end while
30:    return BS
31: end function
The termination criterion (termCrit) is defined as the number of iterations without
improvement of the best solution. Each iteration begins with a Neighbourhood Search (NS),
returning n solutions, regardless of whether they are tabu. In the implemented code, n = ε.
Afterwards, it is verified whether the best neighbour is better than the best solution; the
FITNESS() function is defined as the QAP stated in Section 2.2. Since NS may contain tabu
neighbours, this condition simulates an Aspiration Criterion. If the condition is not met,
tabu neighbours are removed from NS, and the best of the remaining solutions
is chosen as the current solution. However, this set may be empty, i.e. only tabu
neighbours were generated. In this case, a maximum of n non-tabu neighbours are
generated, and the best solution of this new set is chosen as the current solution. If the
new set is also empty, then a non-tabu neighbour of S does not exist, and the Tabu Search
returns the best solution found even though the termination criterion is not yet met. Such
a scenario may happen if the Tabu List size is too large, i.e. too restrictive. The deletions
performed during the Neighbourhood Search to obtain BN (Best Neighbour) are added to
the Tabu List.
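The Tabu List update at the end of each iteration (Algorithm 8, lines 25 to 28) behaves like a bounded first-in, first-out queue. The sketch below illustrates this with Python's collections.deque; the function name push_tabu and the frozenset representation of edges are assumptions, not part of the original implementation.

```python
from collections import deque


def push_tabu(tabu_list, performed_dels, tabu_list_size):
    """Append the set of edges deleted in one movement, evicting the oldest
    element first when the list is already full."""
    if len(tabu_list) == tabu_list_size:
        tabu_list.popleft()  # REMOVE_OLDEST_ELEMENT
    tabu_list.append(frozenset(performed_dels))


# Usage: a Tabu List of size 2 keeps only the two most recent movements.
tabu = deque()
push_tabu(tabu, {frozenset((0, 1))}, 2)
push_tabu(tabu, {frozenset((1, 2)), frozenset((2, 3))}, 2)
push_tabu(tabu, {frozenset((3, 4))}, 2)
```

Constructing the queue as deque(maxlen=tabu_list_size) would give the same eviction behaviour without the explicit length test.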
4.3.1 Fitness Function
The fitness function is defined as the QAP function. The better the solution, the
smaller its fitness value. The function behaviour is described by Algorithm 9.
Dijkstra’s Algorithm is used to calculate the shortest path between two nodes.
Then, the number of edges (hops) is multiplied by the respective communication cost of
the TG. This process is repeated for all edges in the TG.
Algorithm 9 Fitness Function implementation.
1: function FITNESS(graph: S, TG)
2:     sum ← 0
3:     for all (v1, v2, w) ∈ TGE do
4:         sum ← sum + w · |SHORTEST_PATH(S, v1, v2)|
5:     end for
6:     return sum
7: end function
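A possible Python rendering of the fitness computation is sketched below. It assumes that the shortest path is measured in hops on the topology S, that communication costs are positive, and that a disconnected pair yields an infinite fitness, as described later for disconnected graphs. Breadth-first search is used here instead of Dijkstra's Algorithm because all links have unit length, so both give the same hop counts; the names hops and fitness are hypothetical.

```python
from collections import deque
import math


def hops(adjacency, source, target):
    """Breadth-first search; returns the hop count, or math.inf if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


def fitness(topology_edges, task_edges):
    """Sum of w * hops(v1, v2) over all task-graph edges (v1, v2, w).

    Assumes positive weights w; topology_edges are undirected (u, v) tuples.
    """
    adjacency = {}
    for u, v in topology_edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return sum(w * hops(adjacency, v1, v2) for v1, v2, w in task_edges)
```

For a four-node ring and task edges (0, 2) with cost 5 and (0, 1) with cost 3, the sum is 5 · 2 + 3 · 1.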
4.3.2 Tabu List
The Tabu List stores all the edges deleted in a single movement. Since multiple edges may
be removed in a single movement, each Tabu List element is a set of edges, i.e. the Tabu
List is a set of sets of edges. For the current proposal, implementing the Tabu List simply
as a set of edges would be too restrictive, while storing all edges deleted and added in a
single movement would be too loose a restriction.
Unless the Aspiration Criterion is being evaluated, the elements of the Tabu List will
neither be present in the current solution, nor in the generated neighbours. Formally, a
graph S is considered Tabu if, for a Tabu List TL,
∃te (te ∈ TL ∧ te ⊆ SE). (4.8)
For instance, if a Tabu List element contains two edges, then they cannot be simultaneously
in the graph.
As a concrete example, suppose that Graph TLG0 corresponds to some iteration’s
current solution (Figure 11a). Also, assume that
TL = { {{5, 2}}, {{5, 0}, {4, 1}}, {{3, 0}, {5, 3}} } (4.9)
is the Tabu List for the same iteration. Then, using the movements described in Section
4.3.3, it is possible to generate the neighbour Graphs TLG1, TLG2, and TLG3 (Figures 11b,
11c, and 11d, respectively).
Figure 11: Tabu List examples. Panels: (a) TLG0, (b) TLG1, (c) TLG2, (d) TLG3.
In Figure 11, dotted lines represent tabu edges, while the dashed line is part of a tabu
element te, though te ⊈ SE. TLG1 and TLG2 will not be visited since they correspond to
Tabu Solutions. TLG1 is Tabu because it contains edge {5, 2}, i.e. {{5, 2}} ⊆ TLG1E.
TLG2 is Tabu because it contains both edges {5, 0} and {4, 1}, i.e. {{5, 0}, {4, 1}} ⊆ TLG2E.
On the contrary, TLG3 is not Tabu because, although it contains edge {3, 0},
it does not contain edge {5, 3}, i.e. {{3, 0}, {5, 3}} ⊈ TLG3E.
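The tabu test of Equation 4.8 can be sketched as a subset check. In the sketch below, the Tabu List is the one from Equation 4.9, while the three edge sets are hypothetical stand-ins that contain only the edges named in the text (the full edge sets of TLG1, TLG2, and TLG3 are not reproduced here). The names edge and is_tabu are assumptions.

```python
def edge(u, v):
    """Undirected edge as a frozenset, so {5, 2} and {2, 5} compare equal."""
    return frozenset((u, v))


def is_tabu(solution_edges, tabu_list):
    """Equation 4.8: a graph is tabu if some tabu element is a subset of its edges."""
    return any(te <= solution_edges for te in tabu_list)


# Tabu List from Equation 4.9
TABU_LIST = [
    frozenset({edge(5, 2)}),
    frozenset({edge(5, 0), edge(4, 1)}),
    frozenset({edge(3, 0), edge(5, 3)}),
]

# Hypothetical partial edge sets, for illustration only
contains_52 = {edge(5, 2), edge(0, 1)}
contains_50_and_41 = {edge(5, 0), edge(4, 1), edge(1, 2)}
contains_30_only = {edge(3, 0), edge(1, 2)}

print(is_tabu(contains_52, TABU_LIST))
print(is_tabu(contains_50_and_41, TABU_LIST))
print(is_tabu(contains_30_only, TABU_LIST))
```

Storing each tabu element as a set, rather than flattening all edges into one set, is what allows the third case to remain non-tabu.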
4.3.3 Neighbourhood Search
Although Gendreau and Potvin state that visiting unfeasible solutions may contribute
to finding the global optimum, this is not desirable in the proposed work
due to the large number of unfeasible solutions. For example, the simplest scenario is to
generate ring topologies¹, i.e. topologies for which all nodes have degree 2. Considering
a TG with 10 vertices, a ring topology would have 10 edges as well. There
are $\frac{|V|^2 - |V|}{2} = 45$ possible edges (Section 4.1.1). Then, by simple Combinatorics,
there are $\frac{45!}{10!\,(45-10)!} \approx 3.19 \cdot 10^{9}$ possible graphs with 10 edges. However, there
are only $\frac{10!}{10} \approx 3.6 \cdot 10^{5}$ possible ring topologies, i.e. approximately 0.011% of the
total combinations to be explored. Consequently, allowing Tabu Search to explore unfeasible
solutions in this scenario is impracticable.
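The counts in this argument can be checked with a few lines of Python. The snippet simply evaluates the three expressions from the text for |V| = 10; the variable names are illustrative.

```python
from math import comb, factorial

n = 10                                # number of vertices in the TG
possible_edges = (n * n - n) // 2     # (|V|^2 - |V|) / 2
graphs = comb(possible_edges, n)      # graphs with exactly 10 of the possible edges
rings = factorial(n) // n             # ring count as estimated in the text, 10!/10
share = rings / graphs                # fraction of 10-edge graphs that are rings

print(possible_edges, graphs, rings, f"{share:.4%}")
```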
Therefore, the neighbourhood step focuses on visiting solely feasible solutions. This
process is described by Algorithm 10. It randomly selects an existent edge to be
deleted (edel ∈ SE) and randomly selects a non-existent edge to be added (eadd ∈ SCE).
Once the values of edel and eadd are chosen, the Algorithm attempts to generate a
non-tabu solution using these edges, adding it to the set of obtained neighbours NS. If this is
not possible, the edges are added to a set of non-selectable edges: Tdel for edel, and Tadd
for eadd. Then, edges not yet explored are selected (edel ∉ Tdel ∨ eadd ∉ Tadd). The process
is repeated until n neighbours are generated (|NS| = n) or all combinations of edel and
eadd are explored (Tdel = SE). It is important to highlight that in the implemented code,
n = ε.
The condition in lines 8 and 19 guarantees that the obtained neighbour is valid. It
must not be the Null Graph, N ≠ (∅, ∅), nor an already generated neighbour,
N ∉ NS, nor a Tabu Neighbour after the special operations were performed,
∄te (te ∈ TL ∧ te ⊆ NE).
¹ Even though ring topologies are regular, they are the minimum possible fault tolerant topology
conceivable. Therefore, they are still eligible study objects. Notwithstanding, for a given TG, it may be
interesting to analyse what would be the solution (smallest latency) for the worst case scenario (minimum
fault tolerance).
Algorithm 10 Neighbourhood Search
1: function NEIGHBOURHOOD_SEARCH(graph: S, tabulist: TL, int: n)
2:     Tdel ← ∅    ▷ tabu edges to del
3:     NS ← ∅    ▷ set of generated neighbours
4:     while |Tdel| < |SE| ∧ |NS| < n do
5:         edel ← random(SE − Tdel)
6:         N ← SPECIAL_DELS(S, edel, TL)
7:         if NE ≠ SE − {edel} then    ▷ some special deletion was performed
8:             if N ≠ (∅, ∅) ∧ N ∉ NS ∧ ∄te (te ∈ TL ∧ te ⊆ NE) then
9:                 NS ← NS ∪ {N}    ▷ non-tabu neighbour generated
10:            else
11:                Tdel ← Tdel ∪ {edel}    ▷ edel cannot be deleted
12:            end if
13:            continue
14:        end if
15:        Tadd ← {edel}    ▷ tabu edges to add
16:        while Tadd < NCE do
17:            eadd ← random(NCE − Tadd)
18:            N ← SPECIAL_ADDS(N, eadd, TL ∪ {edel})
19:            if N ≠ (∅, ∅) ∧ N ∉ NS ∧ ∄te (te ∈ TL ∧ te ⊆ NE) then
20:                NS ← NS ∪ {N}
21:                break
22:            end if
23:            Tadd ← Tadd ∪ {eadd}
24:        end while
25:        if Tadd = NCE then    ▷ if no possible edge can be added
26:            Tdel ← Tdel ∪ {edel}    ▷ then edel cannot be deleted
27:        end if
28:    end while
29:    return NS
30: end function
Throughout Algorithm 10, it is possible to generate unfeasible solutions by either
deleting or adding edges. Thus, five possible scenarios may arise depending on the
chosen edges:
1. Deleting an edge between two nodes with minimum degree (Section 4.3.3.1);
2. Deleting an edge between two nodes of which one has minimum degree (Section
4.3.3.2);
3. Default scenario (Section 4.3.3.3);
4. Adding an edge between two nodes of which one has maximum degree (Section
4.3.3.4);
5. Adding an edge between two nodes with maximum degree (Section 4.3.3.5).
The first two scenarios are handled by Algorithm 11, invoked by the Neighbourhood
Search in line 6. Similarly, the last two scenarios are handled by Algorithm 12, invoked
by the Neighbourhood Search in line 18. The default scenario is implicitly handled by
both Algorithms 11 and 12.
Algorithm 11 is responsible for deleting a randomly selected edge from the Graph, edel.
If no special operation is needed, the Graph (SV, SE − {edel}) is returned and it is necessary
to select an edge to be added (Algorithm 10, line 7). On the other hand, if a special
operation is performed, a distinct Graph is returned. If the special operations cannot
be performed (probably due to the Tabu List), the Null Graph is returned. Therefore, it
is necessary to verify that N ≠ (∅, ∅) in the condition of Algorithm 10, line 8. To prevent
edel from being reinserted into the Graph by special deletions or additions, it is temporarily
added to the Tabu List (Algorithm 11, lines 6 and 9, and Algorithm 10, line 18).
Algorithm 11 Special Neighbourhood Search deletions
1: function SPECIAL_DELS(graph: S, edge: edel, tabulist: TL)
2:     SE ← SE − {edel}
3:     performedDels ← {edel}
4:     if ∃v(v ∈ SV ∧ degree(v) < 2) then    ▷ if true, then special deletion is needed
5:         if ∀v(v ∈ SV ∧ v ∈ edel ∧ degree(v) < 2) then
6:             return SWAP_EDGES_NODES(S, edel, TL ∪ {edel})
7:         else
8:             {centre} ← {v|v ∈ SV ∧ v ∈ edel ∧ degree(v) < 2}
9:             return SPIN_EDGE(S, edel, centre, TL ∪ {edel})
10:        end if
11:    end if
12:    return S    ▷ no special deletion is needed
13: end function
Algorithm 12 is responsible for adding a randomly selected non-existent edge to the
Graph, eadd. Before inserting eadd into the Graph, some special operations may be needed:
spin max degree (Section 4.3.3.4), or double spin (Section 4.3.3.5). If these special operations
cannot be performed (probably due to the Tabu List), the Null Graph is returned.
This situation is verified by Algorithm 10, line 19. If no special operation is necessary,
or if it is successfully performed, eadd is added to the Graph. In addition, to
prevent eadd from being inserted during the special operations, it is temporarily added
to the Tabu List.
Algorithm 12 Special Neighbourhood Search Additions
1: function SPECIAL_ADDS(graph: S, edge: eadd, tabulist: TL)
2:     if ∀v(v ∈ SV ∧ v ∈ eadd ∧ degree(v) = 4) then
3:         S ← DOUBLE_SPIN(S, eadd, TL ∪ {eadd})
4:     else if ∃v(v ∈ SV ∧ v ∈ eadd ∧ degree(v) = 4) then
5:         {mdn} ← {v|v ∈ SV ∧ v ∈ eadd ∧ degree(v) = 4}    ▷ Max Degree Node
6:         S ← SPIN_MAX_DEGREE(S, eadd, mdn, TL ∪ {eadd})
7:     end if
8:     if S = (∅, ∅) then
9:         return (∅, ∅)
10:    end if
11:    SE ← SE ∪ {eadd}
12:    return S
13: end function
4.3.3.1 Delete Edge Between Two Minimum Degree Nodes
This scenario arises in situations similar to attempting to remove edge {1, 2} (dotted
line) from the graph SWG0 (Figure 12a). The resulting solution would be unfeasible
because degree(1) = degree(2) = 1 < 2.
In order to keep the solution feasible, both nodes need to have degree 2. This can be
done by swapping a node incident to one edge with a node incident to another edge. The
swapping process is detailed in Algorithm 13. It searches all valid swap movements until
a non-tabu solution is found. If no movement is possible, the Null Graph is returned.
Algorithm 13 Swaps the nodes incident to two distinct edges.
1: function SWAP_EDGES_NODES(graph: S; edge: edel, tabulist: TL)
2:     VS ← all valid swap movements
3:     while VS ≠ ∅ do
4:         e2del, SEA ← random(VS)
5:         SE ← (SE − {e2del}) ∪ SEA    ▷ does swap
6:         if ∄te (te ∈ TL ∧ te ⊆ SE) then    ▷ if not tabu
7:             performedDels ← performedDels ∪ {e2del}
8:             return S
9:         end if
10:        SE ← (SE − SEA) ∪ {e2del}    ▷ undoes swap
11:        VS ← VS − {(e2del, SEA)}
12:    end while
13:    return (∅, ∅)
14: end function
A swap is represented as a tuple. The first element is an existent edge, and the second
is a set of two non-existent edges. A swap tuple is thus represented as (e2del, SEA), where
SEA = {eadd, e2add} stands for Set of Edges to Add. Given a deleted edge edel, a swap
tuple has the following restrictions.
1. e2del ∩ edel = ∅;
2. eadd ∩ edel ≠ ∅ ∧ eadd ∩ e2del ≠ ∅;
3. e2add ∩ edel ≠ ∅ ∧ e2add ∩ e2del ≠ ∅.
If these restrictions are not met, an unfeasible solution would be generated.
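One way to enumerate the valid swap movements is sketched below. It is an illustration under stated assumptions rather than the thesis code: each added edge is assumed to join one node of edel with one node of e2del, the two added edges are assumed to cover all four nodes, and both must be absent from the graph. The function name valid_swaps and the frozenset edge representation are hypothetical.

```python
def valid_swaps(edges, e_del):
    """List swap tuples (e2_del, SEA) for a graph whose edge e_del was removed.

    edges: set of frozenset({u, v}) pairs, already without e_del.
    """
    a, b = sorted(e_del)
    swaps = []
    for e2_del in sorted(edges, key=sorted):
        if e2_del & e_del:
            continue  # restriction 1: the two deleted edges share no node
        c, d = sorted(e2_del)
        # restrictions 2 and 3: each new edge joins a node of e_del
        # with a node of e2_del
        for pairing in (((a, c), (b, d)), ((a, d), (b, c))):
            sea = frozenset(frozenset(p) for p in pairing)
            if all(e not in edges and e != e_del for e in sea):
                swaps.append((e2_del, sea))
    return swaps
```

On a six-node ring with edel = {1, 2}, choosing e2del = {4, 5} yields two candidate sets of edges to add, {{1, 4}, {2, 5}} and {{1, 5}, {2, 4}}, mirroring the two outcomes described for the swap example. Every node keeps its degree after either swap.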
Suppose it is desired to perform a swap operation on Graph SWG0. For this Graph,
edel = {1, 2}. The dashed lines in Graph SWG1 indicate possible values for e2del (Figure
12b). Suppose that e2del = {4, 5}. Then, it is possible to obtain two Graphs after the swap
operation successfully finishes: SWG2 (Figure 12c), and SWG3 (Figure 12d). In these
figures, the edge corresponding to edel is dotted, while e2del is dashed. Both edges were
deleted. In addition, SEA edges are represented as dash-dotted lines and were added to
the Graphs. It is interesting to note that Graph SWG3 is feasible, though undesirable
because it is disconnected. The Tabu Search will move away from disconnected graphs by
evaluating their fitness to infinity, e.g. FITNESS(SWG3) = ∞.
Figure 12: Examples of valid edges’ node swapping process. Panels: (a) SWG0, (b) SWG1, (c) SWG2, (d) SWG3.
The established restrictions prevent unfeasible solutions from being generated. For instance,
if restriction 1 were loosened, it would be possible that edel = {0, 1} and e2del = {0, 5},
as in Graph SWG4 (Figure 13a). If a swap operation were then performed, either Graph SWG4
would be re-obtained, or the unfeasible Graph SWG5 would be generated (Figure 13b). If
either restriction 2 or 3 were loosened instead, it would be possible to obtain Graph SWG6
(Figure 13c) from SWG1, with edel = {1, 2}, e2del = {4, 5}, and SEA = { {2, 5}, {0, 3} }.
This solution is unfeasible since degree(1) = degree(4) < 2.
Figure 13: Example of invalid edge node swapping operation. Panels: (a) SWG4, (b) SWG5, (c) SWG6.
4.3.3.2 Delete Edge Incident to One Minimum Degree Node
This scenario arises in situations similar to the previous one. The core difference is
that only one of the selected edge’s incident nodes has minimum degree. That is, after
deleting that edge, the respective node degree would be 1. For example, this situation
would be caused by removing edge {0, 1} from Graph MINSG0 (Figure 14a) because
degree(0) = 2, which is the minimum acceptable.
This problem can be solved by, after deleting the edge, adding a random edge
incident to the node with minimum degree. This process is hereafter called spin edge. The
unchanged node will be named spin centre (or simply centre), while the remaining nodes
will be referred to as targets (a set of vertices). The node that was originally incident to the
deleted edge will be named original target, and original target ∉ targets. The process
of spinning an edge is described by Algorithm 14. Its only restriction is that the vertex
centre ∈ e. It is important to recall that edge e was previously deleted from the Graph,
and e ∈ TL (Algorithm 11, lines 2 and 9, respectively).
Algorithm 14 Spin edge.
1: function SPIN_EDGE(graph: S; edge: e, vertex: centre, tabulist: TL)
2:     targets ← SV − e
3:     while targets ≠ ∅ do
4:         target ← random(targets)
5:         targets ← targets − {target}
6:         eadd ← {centre, target}
7:         if eadd ∉ SE ∧ degree(target) < 4 then
8:             SE ← SE ∪ {eadd}    ▷ spins
9:             if ∄te(te ∈ TL ∧ te ⊆ SE) then    ▷ if non-tabu
10:                return S
11:            end if
12:            SE ← SE − {eadd}    ▷ unspins
13:        end if
14:    end while
15:    return (∅, ∅)
16: end function
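Algorithm 14 could be rendered in Python roughly as follows. The sketch assumes edges are frozenset pairs, that the Tabu List is an iterable of edge sets as in Equation 4.8, and that None stands in for the Null Graph; shuffling the targets once is used as an equivalent of drawing random targets without replacement. The name spin_edge and its parameters are hypothetical.

```python
import random


def spin_edge(vertices, edges, e, centre, tabu_list, max_degree=4, rng=random):
    """Try to add an edge {centre, target} for a random target not in e.

    edges: set of frozenset pairs with e already removed.
    Returns the new edge set, or None (standing in for the Null Graph).
    """
    degree = {v: 0 for v in vertices}
    for edge in edges:
        for v in edge:
            degree[v] += 1
    targets = sorted(set(vertices) - set(e))
    rng.shuffle(targets)  # random order, each target visited once
    for target in targets:
        e_add = frozenset((centre, target))
        if e_add in edges or degree[target] >= max_degree:
            continue  # edge exists or target is saturated
        candidate = edges | {e_add}  # spins
        if not any(te <= candidate for te in tabu_list):
            return candidate  # non-tabu solution found
    return None
```

Because the candidate edge set is built as a copy, no explicit "unspin" step is needed when a candidate turns out to be tabu.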
Until all targets are explored, the Algorithm selects a random target and attempts
to add an edge between it and the spin centre. If this edge is already in the Graph, or
the target has maximum degree, it cannot be added (line 7). If the Graph obtained by
adding this edge is not tabu, the solution is returned; otherwise, the addition is undone
and another target is tried. If there is no possible spin, the Null Graph is returned.
For example, deleting edge {0, 1} from Graph MINSG0 would trigger the scenario
represented by MINSG1 (Figure 14b), where the dotted edge was deleted, and any of
the dashed lines may be added ({0, 2}, {0, 3}, or {0, 4}). It is not possible to spin {0, 1}
to {0, 5} because {0, 5} is already in the graph.
Algorithm 14 also verifies whether the target’s degree is the maximum possible, preventing
the edge spin in this case. For instance, consider the extreme scenario described by Graph
MINSG2 (Figure 14c). If edge {0, 1} were removed, there would be no possible spin
because {v | v ≠ 0 ∧ v ≠ 1 ∧ degree(v) = 4} = MINSG2V − {0, 1}. In this case, the
algorithm would return the Null Graph.
Figure 14: Example of spin operation with a minimum degree node. Panels: (a) MINSG0, (b) MINSG1, (c) MINSG2.
4.3.3.3 Default Scenario
The default scenario is the simplest one. No feasibility condition is violated when
deleting or adding edges. Thus, no special operation is needed to guarantee it. It is
implicitly performed by the Neighbourhood Search (Algorithm 11, line 2, and Algorithm 12,
line 11). Therefore, the focus of this Section is solely to illustrate such scenarios.
In the Graphs represented in Figure 15, dotted lines represent deleted edges, while
dashed lines are added edges. Considering Graph DG0 (Figure 15a), if edel =
{1, 5} and eadd = {2, 4}, the resulting Graph is DG1 (Figure 15b). Analogously, if edel =
{1, 5} and eadd = {1, 4}, the Graph DG2 is obtained (Figure 15c). In this case, if edel
had not been previously deleted, an unfeasible solution would have been generated because the
degree of node 1 would be 5.
Figure 15: Examples of default scenario. Panels: (a) DG0, (b) DG1, (c) DG2.
4.3.3.4 Add Edge Incident to One Maximum Degree Node
This scenario is similar to deleting an edge incident to one minimum degree node. The
core difference is that the maximum degree node will not be the spin centre, but rather
its adjacent node. This case occurs, for example, if edge {2, 5} is attempted to be added
to Graph MAXSG0 (Figure 16a).
Algorithm 15 describes how to tackle this problem. Essentially, an already existent
edge incident to the maximum degree node is spun. The algorithm expects to receive
the maximum degree node (mdn) such that mdn ∈ eadd. The Algorithm attempts to
spin all edges incident to mdn (TES). One of these edges is randomly selected (edel), and
the spin centre is set to the node adjacent to the mdn. A temporary tabu list containing
the selected edge is created (TLS). Tabu or invalid solutions obtained after the spinning
process are also added to TLS. A copy of the current solution is created without edel,
and SPIN_EDGE is called. A union operation is performed between TL and TLS to
prevent edel from being added. After the spin process, it is verified whether it is possible to add
eadd to the Graph, i.e. whether no node incident to eadd has degree 4. If that is the case, eadd is
inserted, and it is verified whether the new solution is tabu. If it is not tabu, then a valid
solution was found. If either of the previous two conditions is not met, the obtained solution is
added to TLS to prevent the spin process from generating it again. Eventually, if no solution
is found, SC = (∅, ∅), and the process is repeated for a previously unselected edge from
TES. If it is not possible to perform the max spin operation, the Null Graph is returned.
Algorithm 15 Spin edge incident to one maximum degree node
1: function SPIN_MAX_DEGREE(graph: S, edge: eadd, vertex: mdn, tabulist: TL)
2:     TES ← {e|e ∈ SE ∧ mdn ∈ e}    ▷ target edges to spin
3:     while TES ≠ ∅ do
4:         edel ← random(TES)
5:         {centre} ← edel − {mdn}    ▷ node adjacent to max degree node
6:         TLS ← {edel}    ▷ tabu list of spins
7:         repeat
8:             SC ← (SV, SE − {edel})    ▷ Solution Copy with removed edge to spin
9:             SC ← SPIN_EDGE(SC, edel, centre, TL ∪ TLS)
10:            if ∄v(v ∈ SCV ∧ v ∈ eadd ∧ degree(v) = 4) then
11:                SCE ← SCE ∪ {eadd}
12:                if ∄te(te ∈ TL ∧ te ⊆ SCE) then    ▷ if not tabu
13:                    performedDels ← performedDels ∪ {edel}
14:                    return SC
15:                end if
16:            end if
17:            TLS ← TLS ∪ {SCE − {eadd}}
18:        until SC = (∅, ∅)
19:        TES ← TES − {edel}
20:    end while
21:    return (∅, ∅)
22: end function
Before executing Algorithm 15, some edge was deleted from the graph (Algorithm 11,
line 2). Both this edge and eadd were previously added to TL in order to prevent them
from being inserted into the Graph during the spinning process (Algorithm 10, line 15,
and Algorithm 12, line 6).
As an example of this process, suppose that S = MAXSG0 and eadd = {2, 5};
consequently, mdn = 5. Therefore, it would be necessary to spin either {0, 5}, {1, 5},
{3, 5}, or {4, 5}. This scenario is illustrated by Graph MAXSG1 (Figure 16b), where eadd
is the dashed line and the potential edges to spin are dotted. If edge {0, 5} is chosen
to be spun, then it is possible to add edge {0, 3} to the graph, resulting in MAXSG2
(Figure 16c).
Figure 16: Example of successful spin operation with a maximum degree node. Panels: (a) MAXSG0, (b) MAXSG1, (c) MAXSG2.
There are several conditions to assert feasibility during or after the spin. If eadd is
incident to a node of degree 4 after the spin, that solution cannot be visited. This scenario
could be triggered, for example, from Graph MAXSG1 if edge {0, 5} were spun to {0, 2}
instead of {0, 3}. As illustrated by Graph MAXSG3 in Figure 17a, the spun edge
{0, 2} (dashed) prevents the originally chosen eadd = {2, 5} (dash-dotted) from being
inserted. A problem may also happen even if a non-Tabu solution was obtained after
the spin, because it may become tabu after adding eadd to the graph. For instance, if
TL = {{{0, 3}, {2, 5}}, {{0, 4}}}, the Graph generated after spinning edge {0, 5} to {0, 3}
is not tabu (Graph MAXSG4 in Figure 17b, considering only the solid lines). However,
after inserting eadd (dotted line) into the Graph, a tabu solution would be obtained because
{{0, 3}, {2, 5}} ∈ TL ∧ {{0, 3}, {2, 5}} ⊆ SCE.
Figure 17: Examples of unsuccessful spin operation with a maximum degree node. Panels: (a) MAXSG3, (b) MAXSG4.
4.3.3.5 Add Edge Between Two Maximum Degree Nodes
It may be necessary to spin two edges if an edge incident to two nodes of maximum
degree is chosen to be added. Such a scenario would occur by attempting to insert edge
eadd = {0, 3} (dashed line) into the Graph DSG0 (Figure 18a). Also, suppose that edge
{1, 2} (dotted) was previously removed (Algorithm 10, line 15).
The behaviour of the double spin is described in Algorithm 16. It consists of
spinning one edge incident to each vertex of eadd. The vertices of eadd are randomly
assigned to v1 and v2. An empty set of visited solutions, VS, is created. The
SPIN_MAX_DEGREE() function is called. Note that ∅ is passed as the argument
for eadd; therefore, only a random edge incident to v1 is spun. The edge deleted during
the call is identified as edel, and temporarily added to the Tabu List TL to guarantee that
it is not inserted during the second SPIN_MAX_DEGREE() call. If the solution obtained
is valid, it is returned, and both edges deleted throughout the Algorithm are added to
performedDels. If the Null Graph was obtained from the second spin, then it is not
possible to obtain a solution from SC. Hence, SC is added to VS in order to be considered
a tabu solution in the next iteration. This process is repeated until all possible solutions
are explored, returning the Null Graph if that is the case. Algorithm 16 does not
verify whether the obtained solutions are tabu because this is implicitly done by Algorithm 15.
Likewise, eadd is not added to TL since this was previously performed (Algorithm
12, line 3).
As an example of this Algorithm, suppose that, for Graph DSG0, v1 = {0} and
v2 = {3}. Then, after the first spin, it would only be possible to obtain Graph DSG1
(Figure 18b) by spinning {0, 5} (dotted) to {5, 2} (dashed). Analogously, for the second
spin, the only obtainable solution is to spin edge {3, 4} to {4, 1}, as illustrated in Graph
DSG2 (Figure 18c). The dotted edges ({0, 5} and {3, 4}) were spun during the Algorithm,
while the dashed edges ({2, 5} and {4, 1}) were added. The dash-dotted line ({0, 3}) will
be added subsequently by Algorithm 12 in line 11.
Algorithm 16 Double spin
1: function DOUBLE_SPIN(graph: S, edge: eadd, tabulist: TL)
2:     v1 ← {random(eadd)}
3:     {v2} ← eadd − {v1}
4:     VS ← ∅
5:     repeat
6:         SC ← SPIN_MAX_DEGREE(S, ∅, v1, TL ∪ VS)
7:         {edel} ← SE − SCE
8:         if SC ≠ (∅, ∅) then
9:             SC2 ← SPIN_MAX_DEGREE(SC, ∅, v2, TL ∪ {edel})
10:            if SC2 ≠ (∅, ∅) then
11:                {e2del} ← SCE − SC2E
12:                performedDels ← performedDels ∪ {edel} ∪ {e2del}
13:                return SC2
14:            end if
15:        end if
16:        VS ← VS ∪ {SC}
17:    until SC = (∅, ∅)
18:    return (∅, ∅)
19: end function
Figure 18: Example of successful double spin operation. Panels: (a) DSG0, (b) DSG1, (c) DSG2.
4.4 Fault Injection
Fault injection is a step performed after the topology generation to evaluate
the reliability of each solution.
Given the best solution from the previous algorithm step, this stage stresses the
provided solution by randomly removing links from the topology. The random removal
simulates a faulty link between two routers, which cannot be used. After the links are
removed, the fitness function is recomputed. The experiments were executed
thirty times for each of five scenarios: removing 10%, 15%, 20%, 25%, and 30% of the
total links. The process is detailed in Algorithm 17.
Algorithm 17 Fault Injection
1: function FAULT_INJECTION(graph: BS, TG)
2:     for all perc ∈ {0.1, 0.15, 0.2, 0.25, 0.3} do
3:         qtd ← |BSE| − ⌊|BSE| · perc⌋
4:         for i ← 1 to 30 do
5:             S ← BS
6:             while |SE| > qtd do
7:                 e ← random(SE)
8:                 SE ← SE − {e}
9:             end while
10:            write S
11:            write FITNESS(S, TG)
12:        end for
13:    end for
14: end function
The Algorithm receives the Best Solution calculated by the Tabu Search, and the
Task Graph in order to compute the fitness function. Depending on the number of edges
in the Graph, the number of failed links may not be a natural number. Therefore, the floor
function is used.
For example, consider Graph FI0 (Figure 19a), and assume that perc = 0.2. Since
|FI0E| = 12, qtd = 12 − ⌊12 · 0.2⌋ = 10. Therefore, two edges will be removed from
the Graph. Graph FI1 is an example of an obtainable Graph in this scenario (Figure
19b), where edges {0, 4} and {1, 3} (dotted lines) were randomly deleted. It is expected
that Graph FI1 has lower performance than FI0, since there are fewer possible
paths. Analogously, if BS = FI2 (Figure 19c) and perc = 0.3, then qtd = 5. Therefore, it would
be possible that edges {0, 5} and {4, 5} (dotted lines) were randomly selected, thus
obtaining Graph FI3 (Figure 19d). Note that FI3 is a disconnected graph. Therefore, its
fitness would be evaluated to infinity: FITNESS(FI3, TG) = ∞. In other words, a
router (corresponding to node 5) would be isolated from the system. As a consequence,
a processor that is directly connected to this node would also be isolated from the system.
Figure 19: Example of Fault Injection Algorithm. Panels: (a) FI0, (b) FI1, (c) FI2, (d) FI3, (e) FI4.
However, if edge {0, 3} was chosen to be deleted instead of {4, 5}, the resulting graph
would not be disconnected (Figure 19e).
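The link-removal step of Algorithm 17 can be sketched as follows. To avoid floating-point rounding when taking the floor, the sketch expresses the fault rate as an integer percentage, which is a design choice of this example and not necessarily of the original code. The names edges_to_keep and inject_faults, and the tuple-based edges, are hypothetical.

```python
import random


def edges_to_keep(total_edges, percent):
    """qtd = |E| - floor(|E| * percent / 100), with an integer percentage."""
    return total_edges - (total_edges * percent) // 100


def inject_faults(edges, percent, rng=random):
    """Randomly remove links until only edges_to_keep(...) remain.

    edges: iterable of (u, v) tuples; returns the surviving edges as a set.
    """
    remaining = set(edges)
    keep = edges_to_keep(len(remaining), percent)
    while len(remaining) > keep:
        remaining.remove(rng.choice(sorted(remaining)))
    return remaining
```

For the 12-edge example with a 20% fault rate, edges_to_keep(12, 20) reproduces qtd = 10, i.e. two links are removed. Recomputing the fitness on the returned edge set then corresponds to line 11 of Algorithm 17.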
5 Results
Throughout this Chapter, the obtained latency estimations are discussed.
The latency is estimated via the QAP Function (Section 2.2). Several graphs were
generated and, whenever possible, all of them were studied during the analyses. However,
for a more detailed analysis, a subset of all graphs is chosen: the median latency graphs.
For instance, fault injection analysis is made on this subset, also taking the obtained
latency estimation into consideration.
The TGs used to benchmark the Algorithm are the same discussed in Mesquita’s work
(MESQUITA, 2016). For the sake of clarity, from the eight TGs, two will be the focus of the
forthcoming discussion, and are hereafter referred to as chosen TGs: AP2TG (Figure 20a) and
MPEGTG (Figure 20b). However, all eight TGs are illustrated in Annex A.
Figure 20: Chosen TGs. Panels: (a) AP2TG, (b) MPEGTG.
For each TG, the remaining Algorithm 2 arguments were all possible combinations
between the following set elements.
• tabuListSize ∈ {1, |V |/2, |V |};
• terminationCriterion ∈ {100, 250, 500};
• ε ∈ {|TGV |, |TGV |+ 1, . . . , 2|TGV | − 1, 2|TGV |}.
Algorithm 2 was executed 30 times for each combination.
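The parameter combinations listed above can be enumerated with itertools.product. The sketch below is illustrative only: the function name parameter_grid is hypothetical, and |V|/2 is assumed to be rounded down for odd vertex counts, which the text does not specify.

```python
from itertools import product


def parameter_grid(num_vertices):
    """All (tabuListSize, terminationCriterion, epsilon) combinations for one TG."""
    tabu_sizes = (1, num_vertices // 2, num_vertices)  # assumption: floor of |V|/2
    term_crits = (100, 250, 500)
    epsilons = range(num_vertices, 2 * num_vertices + 1)  # |V| .. 2|V| inclusive
    return list(product(tabu_sizes, term_crits, epsilons))
```

For a TG with |V| vertices this gives 3 · 3 · (|V| + 1) combinations, each of which is run 30 times.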
Figures 21 and 22 illustrate the influence of the parameters tabuListSize and
terminationCriterion, respectively, for the chosen TGs. The x-axes correspond to the
argument values, while the y-axes show the latency estimation of each outputted solution.
Each x-axis value has an associated box plot, from which several values can be read.
The median corresponds to the line inside the box (approximately 1200 for the
first box plot of Figure 21a). The first quartile is the line that delimits the box from
below (slightly smaller than 1200 for the first box plot of Figure 21a). Analogously,
the third quartile is the line that delimits the box from above (approximately 1300 for
the first box plot of Figure 21a). The box plot also shows the maximum and minimum values
of the distribution. For instance, the maximum value for the first box plot of Figure 21a
is approximately 1500, while the minimum value equals the first quartile. In the third box
plot of Figure 21a, the maximum value is approximately 1600, while the minimum value is
slightly smaller than the first quartile. The dots in the box plots represent outliers,
i.e. values outside the range of the expected distribution. For example, the outliers of
the first box plot of Figure 21a lie in the [1750, 1900] range.
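The quantities read from a box plot can be computed as in the sketch below. It assumes the common Tukey convention, in which points farther than 1.5 times the interquartile range from the box are drawn as outliers; the plotting convention actually used for the Figures is not stated here, and the sample values are made up for illustration.

```python
from statistics import quantiles


def box_plot_summary(values, whisker=1.5):
    """Return the box plot values plus the outliers (Tukey convention, assumed)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "min": min(inside),   # lower whisker end
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(inside),   # upper whisker end
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Made-up latency estimations, not taken from the thesis results.
sample = [1190, 1190, 1200, 1200, 1250, 1300, 1500, 1900]
summary = box_plot_summary(sample)
```

With this convention the "maximum" and "minimum" of a box plot are the most extreme values that are not outliers, which is why dots can appear above the maximum.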
(a) Influence on AP2TG solutions. (b) Influence on MPEGTG solutions.
Figure 21: Influence of tabuListSize arguments for Latency Estimation in chosen TGs’ generated solutions.
(a) Influence on AP2TG solutions. (b) Influence on MPEGTG solutions.
Figure 22: Influence of terminationCrit arguments for Latency Estimation in chosen TGs’ generated solutions.
Practically no change is observable between the box plots. The minimum, the median, and
the first and third quartiles are very similar between box plots of the same Figure. Only
the maximum value and the outliers change, and no overall latency estimation improvement
is observable. Therefore, it is possible to conclude that the influence of the parameters
tabuListSize and terminationCriterion is minimal. The influence of the same parameters
for all TGs is compiled in Appendices A and B, where the same behaviour is noticeable.
Therefore, throughout this Chapter, the discussion focuses on the ε parameter.
Two analyses are of interest for the current work: a study of the generated topologies’
performance, and a study of their reliability after fault injection. Thus, this Chapter is
subdivided into two Sections, one for each analysis. Throughout this Chapter, let
$S_{TG,\varepsilon}$ denote a solution for Task Graph TG with ε links. Also, to save space,
let $FITNESS(S_{TG,\varepsilon})$ denote $FITNESS(S_{TG,\varepsilon}, TG)$.
5.1 Performance
In order to perform an overall analysis of the obtained topologies for a given TG,
the median latency estimation among all executions with the same ε value was selected,
i.e. $\arg(\mathrm{median}(FITNESS(S_{TG,\varepsilon})))$. Figure 23a compiles the median
latency for TGs AP1TG, AP4TG, and INTEGRALTG, while Figure 23b illustrates the median
latency for TGs AP2TG, AP3TG, MPEGTG, MWDTG, and VOPDTG. For instance, the median latency
of the solutions obtained for VOPDTG with 15 to 26 links is 3420. The results are split
into two Figures to improve the visualisation.
(a) Median latency estimation for AP1TG, AP4TG, and INTEGRALTG.
(b) Median latency estimation for AP2TG, AP3TG, MPEGTG, MWDTG, and VOPDTG.
Figure 23: Overall median latency for all benchmarked TGs.
It is possible to conclude that the $FITNESS(S_{TG,\varepsilon})$ function behaves similarly
to the rational function $f(x) = 1/x$. Note that if the obtained solution is disconnected
for some ε, then $FITNESS(S_{TG,\varepsilon}) = \infty$; in other words, some message would
not arrive at its destination. This behaviour is similar to
$\lim_{x \to 0^{+}} f(x) = \infty$.
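The infinite fitness of a disconnected solution can be illustrated with the sketch below. The QAP Function itself is defined in Section 2.2 and is not reproduced here, so the code rests on assumptions: the latency estimation is taken to be the sum, over the TG edges, of the communication weight multiplied by the number of hops between the two routers, and task i is assumed to run on router i. All names and weights are hypothetical.

```python
import math
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: hops from source to every reachable router."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def fitness(links, task_graph):
    """Sum of weight * hops over the TG edges; math.inf if a message cannot arrive.

    links: iterable of undirected (u, v) router pairs.
    task_graph: dict mapping (src, dst) to a communication weight.
    Assumption: task i runs on router i.
    """
    adjacency = {}
    for u, v in links:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    total = 0
    for (src, dst), weight in task_graph.items():
        hops = hop_counts(adjacency, src).get(dst)
        if hops is None:  # dst is unreachable: the topology is disconnected
            return math.inf
        total += weight * hops
    return total


toy_tg = {(0, 1): 10, (1, 2): 5, (0, 2): 1}  # made-up weights
ring = [(0, 1), (1, 2), (2, 0)]
line = [(0, 1), (1, 2)]
broken = [(0, 1)]
```

In this toy case the ring costs 16, the line costs 17 because the message from 0 to 2 needs two hops, and the broken topology costs infinity because router 2 is unreachable.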
From Figure 23, it is also possible to infer that, as the number of links in the topology
increases, the latency improvement obtained from additional links decreases. For instance, let
$$\varepsilon < \psi < \varphi, \quad \text{where } \varepsilon, \psi, \varphi \in \mathbb{N}, \tag{5.1}$$
be possible ε values. Then,
$$FITNESS(S_{TG,\varepsilon}) - FITNESS(S_{TG,\psi}) \geq FITNESS(S_{TG,\psi}) - FITNESS(S_{TG,\varphi}). \tag{5.2}$$
As an example, for MPEGTG in Figure 23b,
$$FITNESS(S_{MPEGTG,13}) - FITNESS(S_{MPEGTG,14}) \approx 500, \tag{5.3}$$
$$FITNESS(S_{MPEGTG,14}) - FITNESS(S_{MPEGTG,15}) \approx 250, \tag{5.4}$$
$$500 \geq 250. \tag{5.5}$$
There are some cases in which this difference is zero. For example, in Figure 23b,
$$FITNESS(S_{VOPDTG,21}) - FITNESS(S_{VOPDTG,22}) = 0. \tag{5.6}$$
Such a scenario may happen when the number of links in the solution is greater than or
equal to the number of edges in the TG ($\varepsilon \geq |TG_E|$). It may also occur if no
improvement is possible due to the feasibility restrictions. An example of this situation is
MPEGTG, since degree(1) > 4.
It is interesting to analyse how the latency estimation varies between the different
generated topologies, thus detailing the information provided by Figure 23. To accomplish
this task, a latency estimation box plot is calculated for each ε value of a given TG. Then,
all the obtained box plots are compiled into a single plot. The plots corresponding to
AP2TG and MPEGTG are illustrated in Figures 24a and 24b, respectively. The x-axes
correspond to the number of links in the topology (ε), while the y-axes represent the
latency estimation. The plots for all eight TGs are gathered in Appendix C.
(a) Box plots of AP2TG solutions’ latency estimation.
(b) Box plots of MPEGTG solutions’ latency estimation.
Figure 24: Box plots of chosen TGs solutions’ latency estimation.
From Figure 24a, it is possible to verify the same rational function behaviour previously
mentioned. In addition, the difference between the maximum outlier and the minimum
latency estimation tends to decrease as ε increases. Therefore, choosing whether or not to
connect two routers is more relevant for topologies with fewer links. Also, there exists an
ε value beyond which no major latency improvements are observable, namely 16, and arguably
15 or 14. Hence, it may be possible to obtain an efficient and fault-tolerant MPSoC with
reduced energy consumption. Notwithstanding, after some point, the box plots practically
turn into lines. This may happen when ε is large enough to generate a fault-tolerant
NoC, and yet capable of almost matching the original TG. For AP2TG, this value is 16,
although $|AP2TG_E| = 19$. However, given the feasibility restriction, 1 edge from node 2
and 2 edges from node 5 would be removed, totalling 16. In other words, this value should
be calculated using the following formula,
$$\text{convergence } \varepsilon = \sum_{v \in TG_V} \min(\mathrm{degree}(v), 4). \tag{5.7}$$
Similar conclusions may be inferred from Figure 24b. However, the outliers in this
graph are more unstable than in the remaining ones. There are a couple of probable
reasons for this behaviour. First, it may be a consequence of MPEGTG’s description.
Second, it may be caused by an unlucky initial solution, in which case a much larger number
of iterations would be necessary for the Tabu Search to converge. In addition, all the box
plots for which ε ≥ 16 turn into a line. Note that, given the TG description, convergence ε = 8.
This is the reason why all box plots have a small difference between their maximum and
minimum.
To illustrate some generated topologies, the NoCs with median fitness for both
$$\varepsilon = |TG_V| + \mathrm{round}\left(\frac{2|TG_V| - |TG_V|}{4}\right) \tag{5.8}$$
and
$$\varepsilon = 2|TG_V| - \mathrm{round}\left(\frac{2|TG_V| - |TG_V|}{4}\right) \tag{5.9}$$
were selected. Namely, the solutions $S_{AP2TG,13}$ (Figure 25a), $S_{AP2TG,17}$ (Figure 25b),
$S_{MPEGTG,16}$ (Figure 26a), and $S_{MPEGTG,23}$ (Figure 26b), which correspond to the median
fitness for the respective ε values. It is interesting to note that TG edges with greater
weights were more likely to be present in the resulting topologies.
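The two ε values of Equations 5.8 and 5.9 can be checked with the short sketch below. It assumes that round sends halves upwards and that AP2TG and MPEGTG have 10 and 13 vertices, as the node numbering of Figures 25 and 26 suggests; under these assumptions the ε values named in the text are reproduced. Python’s built-in round is avoided on purpose, because it rounds 2.5 to 2.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with .5 going up (assumed rounding rule)."""
    return math.floor(x + 0.5)


def illustrated_epsilons(num_tasks):
    """Return the epsilon values of Equations 5.8 and 5.9 for |TG_V| = num_tasks."""
    step = round_half_up((2 * num_tasks - num_tasks) / 4)
    return num_tasks + step, 2 * num_tasks - step


ap2_eps = illustrated_epsilons(10)   # 10 vertices assumed for AP2TG
mpeg_eps = illustrated_epsilons(13)  # 13 vertices assumed for MPEGTG
```

The results are (13, 17) for a 10-vertex TG and (16, 23) for a 13-vertex TG, matching the solutions shown in Figures 25 and 26.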
[Topology drawings on nodes 0 to 9.]
(a) Median fitness $S_{AP2TG,13}$. (b) Median fitness $S_{AP2TG,17}$.
Figure 25: Examples of AP2TG solutions.
[Topology drawings on nodes 0 to 12.]
(a) Median fitness $S_{MPEGTG,16}$. (b) Median fitness $S_{MPEGTG,23}$.
Figure 26: Examples of MPEGTG solutions.
5.2 Latency Estimation After Fault Injection
Recall that Algorithm 17 randomly removes 10, 15, 20, 25, and 30 percent of the best
solution’s links to simulate faults. To analyse the impact of faulty links on performance, a
selection procedure similar to that of the performance analysis was executed. In other words,
the solutions selected for analysis have the median fitness value for each ε. Only five ε
values are chosen for each analysis in order to improve readability: the minimum value
($\varepsilon = |TG_V|$), the maximum value ($\varepsilon = 2|TG_V|$), and three intermediate
values such that all values are as evenly spaced as possible. Since 30 fault injections were
performed for each percentage, the topology corresponding to the median fitness after the
injection is chosen for analysis.
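The random link removal recalled above can be sketched as follows. This is not the thesis’s Algorithm 17: the way the number of faulty links is rounded, the uniform choice of links, and all names are assumptions made for illustration.

```python
import random
from collections import deque


def is_connected(nodes, links):
    """Check by breadth-first search whether every node is reachable."""
    nodes = list(nodes)
    adjacency = {n: set() for n in nodes}
    for u, v in links:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        for neighbour in adjacency[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


def inject_faults(links, percentage, rng):
    """Return the links that survive after removing `percentage` percent of them.

    Assumption: the number of faulty links is round(|links| * percentage / 100).
    """
    links = list(links)
    num_faults = round(len(links) * percentage / 100)
    faulty = set(rng.sample(range(len(links)), num_faults))
    return [link for i, link in enumerate(links) if i not in faulty]


ring = [(i, (i + 1) % 6) for i in range(6)]  # minimal acceptable topology
rng = random.Random(0)
after_10 = inject_faults(ring, 10, rng)  # one link removed
after_30 = inject_faults(ring, 30, rng)  # two links removed
```

A 6-node ring survives the removal of any single link, but removing any two links disconnects it, which is consistent with solutions with fewer links being the first to become disconnected.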
The fault injection latency estimation results for AP2TG and MPEGTG are illustrated
in Figures 27a and 27b, respectively. Each line in the Figures corresponds to the
median solution with the specified number of links before fault injection. The x-axes
correspond to the fault injection percentage. The y-axes show the proportion by which
the latency estimation has worsened, e.g. the median latency of $S_{AP2TG,20}$ after 20% fault
injection is approximately 24% larger than the latency of $S_{AP2TG,20}$ before the fault
injection. The maximum y-axis value is ∞, represented by Inf, which means that the topology
became disconnected after the fault injection. The fault injection graphs for all eight TGs
are compiled in Appendix D.
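The y-axis quantity can be expressed as a small helper. The sketch assumes that the plotted value is the relative increase of the latency estimation with respect to the value before fault injection; the exact definition used to draw the Figures is not given in this Section, and the function name and the numbers are hypothetical.

```python
import math


def latency_overhead(before, after):
    """Relative latency increase after fault injection; math.inf if disconnected."""
    if math.isinf(after):
        return math.inf  # plotted as "Inf"
    return (after - before) / before


# Made-up latency estimations: 1000 before and 1240 after fault injection.
example = latency_overhead(1000, 1240)
disconnected = latency_overhead(1000, math.inf)
```

In this made-up case the overhead is 0.24, i.e. the latency estimation after fault injection is 24% larger than before.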
For the AP2TG application, it is possible to verify that the performance after fault
injection behaved as expected. Solutions with fewer links are more susceptible to becoming
disconnected, and less resistant to a larger number of faults. In addition, the latency
overhead tends to increase as the number of faults increases. Notwithstanding, the
solutions with ε ≥ 15 did not become disconnected even after a 30% fault injection. It
is possible to observe that, in some scenarios, solutions with fewer links had smaller
latency overheads than solutions with more links. For instance, considering 10, 15, 20, and
30% fault injection,
$$FITNESS(S_{AP2TG,18}) \leq FITNESS(S_{AP2TG,20}). \tag{5.10}$$
This behaviour probably occurred because of the particular links removed. In other words,
the links randomly deleted from $S_{AP2TG,20}$ were probably responsible for larger
communications (larger edge weight) than the links randomly removed from $S_{AP2TG,18}$.
(a) Fault injection on median AP2TG solutions.
(b) Fault injection on median MPEGTG solutions.
Figure 27: Fault injection on median chosen TGs solutions.
From Figure 27b, similar conclusions are inferable. The latency overhead tends to
increase as the number of faulty links increases. Solutions with fewer links are more likely
to become disconnected. Notwithstanding, there are also scenarios in which solutions with
fewer links resist fault injection better than solutions with more links. Note, however, that
the latency overhead for the same solution is more unstable when compared to the results
of Figure 27a. For instance, $S_{MPEGTG,20}$ resisted 30% fault injection better than
15, 20, and 25% fault injection, while the latency overhead of $S_{MPEGTG,23}$ alternately
increases and decreases. The TG description (Figure 20b) alongside the feasibility
restriction may have a large influence on this effect. Note that there are eight vertices
adjacent to node 1. Thus, if an edge incident to node 1 is removed, the latency may increase
dramatically, e.g. (1, 5). On the other hand, if links with smaller weights are chosen to be
deleted, the latency overhead will not be large, e.g. (1, 2).
To illustrate some generated topologies after the fault injection, consider the solution
represented in Figure 28a. It corresponds to the median $FITNESS(S_{AP2TG,15})$ solution.
Figures 28b, 28c, and 28d represent the median fitness topologies after 10%, 20%, and
30% fault injection, respectively. Analogously, Figure 29a corresponds to the median
$FITNESS(S_{MPEGTG,19})$ solution, and Figures 29b, 29c, and 29d represent the median fitness
topologies after 10%, 20%, and 30% fault injection, respectively. In both Figures, the dotted
lines are the faulty links. The median fitness topologies for all eight TGs can be found in
Appendix E. In addition, the graphs for these topologies after 10, 20, and 30% fault
injection are available in Appendices F, G, and H, respectively.
[Topology drawings on nodes 0 to 9.]
(a) Median fitness $S_{AP2TG,15}$ solution.
(b) Median fitness $S_{AP2TG,15}$ solution with 10% fault injection.
(c) Median fitness $S_{AP2TG,15}$ solution with 20% fault injection.
(d) Median fitness $S_{AP2TG,15}$ solution with 30% fault injection.
Figure 28: $S_{AP2TG,15}$ behaviour during fault injection.
[Topology drawings on nodes 0 to 12.]
(a) Median fitness $S_{MPEGTG,19}$ solution.
(b) Median fitness $S_{MPEGTG,19}$ solution with 10% fault injection.
(c) Median fitness $S_{MPEGTG,19}$ solution with 20% fault injection.
(d) Median fitness $S_{MPEGTG,19}$ solution with 30% fault injection.
Figure 29: $S_{MPEGTG,19}$ behaviour during fault injection.
Nevertheless, it may be desirable to analyse how a given topology behaves across all
fault injections. The topologies corresponding to Figures 28a and 29a are the
objects of study. All the corresponding fitness values were gathered and grouped by
fault injection percentage (10, 15, 20, 25, or 30%). The box plots of Figures 30a and
30b are thus obtained. The x-axes correspond to the fault injection percentage. The y-axes
correspond to the latency overhead, similarly to Figure 27. It is important to emphasise
that values smaller than 1 on the y-axis correspond to disconnected solutions (red dashed
line). Similar plots for the remaining TG solutions are available in Appendix I.
(a) Fault injection on median $S_{AP2TG,15}$ solution.
(b) Fault injection on median $S_{MPEGTG,19}$ solution.
Figure 30: Fault injection on median solutions with median ε of the chosen TGs.
From Figure 30a, it is inferable that the latency overhead worsens as the fault injection
percentage increases, as expected. In addition, the box plots’ range increases alongside the
percentage. Therefore, the latency overhead becomes more dependent on which links are faulty
as the fault injection percentage increases. Disconnected NoCs are considered outliers
until 30% fault injection. In addition, the minimum box plot value for 10, 20, and 25% fault
injection is 1. Therefore, depending on the faulty links, the obtained topology may be
capable of achieving the same latency estimation as before the fault injection. Such a
scenario may occur either if a “spare link” was removed, or if the number of hops through
the alternative path equals the number of hops through the original path.
Figure 30b illustrates the fault injection behaviour on the median solution $S_{MPEGTG,19}$.
Unlike the previous Figure, no significant variation in the box plots is observable, e.g.
the 20% box plot is more similar to the 30% box plot than to the 25% box plot. As previously
discussed, this is probably a consequence of the MPEGTG description (Figure 20b).
In other words, removing a link incident to node 1 may greatly increase the latency
estimation. Notwithstanding, disconnected graphs were never classified as outliers.
6 Concluding Remarks
This work proposed the generation of fault-tolerant irregular NoC topologies. The
importance of this topic involves the fabrication of supercomputers, which need to be as
fast, flexible, and durable as possible.
Moreover, the advances in NoC topologies are independent of the cores’ progress.
Therefore, a more suitable NoC may be capable of extracting the full potential of the
cores by diminishing the communication overhead. On the other hand, faster processing
elements may exploit the NoC resources better by requesting data transmission more
frequently.
The obtained topologies were generated using Tabu Search. Throughout the process,
multiple operations were performed to guarantee minimal fault resistance. The resulting
topologies were benchmarked and shown to be efficient for their specific applications. In
addition, topologies with a few more links than the minimum acceptable (ring topology)
withstood random failures of 10 to 30% of their links. Therefore, these topologies may also
have lower power consumption than regular NoCs, while still achieving high performance
and fault resistance.
Some features may be added to the proposed algorithm to augment its benefits. For
instance, combining the Tabu Search with an Evolutionary Algorithm, i.e. a Memetic
Algorithm, may explore a wider range of solutions, thus increasing the odds of finding
the global optimum. It may also be necessary to add graph planarity restrictions to the
algorithm to guarantee that the generated topologies can be used for 2D-MPSoC design.
Future versions of the Algorithm may also measure fault tolerance throughout the process,
though multi-objective approaches may be necessary. Finally, further studies
may synthesise the generated topologies in order to properly benchmark their performance.
References
ABEDNEZHAD, D.; ALAVI, S. E. A new irregular fault-tolerant routing algorithm in network-on-chip. International Journal of Computer Science and Network Security (IJCSNS), International Journal of Computer Science and Network Security, v. 17, n. 4, p. 166, 2017.
ASCIA, G.; CATANIA, V.; PALESI, M. Multi-objective mapping for mesh-based noc architectures. In: ACM. Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. [S.l.], 2004. p. 182–187.
AZAD, S. P. et al. Holistic approach for fault-tolerant network-on-chip based many-core systems. arXiv preprint arXiv:1601.07089, 2016.
BALAS, E.; TOTH, P. Branch and bound methods for the traveling salesman problem. [S.l.], 1983.
BECKER, M.; KRÖMKER, M.; SZCZERBICKA, H. Evaluating heuristic optimization, bio-inspired and graph-theoretic algorithms for the generation of fault-tolerant graphs with minimal costs. In: Information Science and Applications. [S.l.]: Springer, 2015. p. 1033–1041.
BEIGNÉ, E. et al. An asynchronous noc architecture providing low latency service and its multi-level design framework. In: IEEE. Asynchronous Circuits and Systems, 2005. ASYNC 2005. Proceedings. 11th IEEE International Symposium on. [S.l.], 2005. p. 54–63.
BOKHARI, S. H. On the mapping problem. IEEE Transactions on Computers, IEEE, n. 3, p. 207–214, 1981.
BONONI, L.; CONCER, N. Simulation and analysis of network on chip architectures: ring, spidergon and 2d mesh. In: EUROPEAN DESIGN AND AUTOMATION ASSOCIATION. Proceedings of the conference on Design, automation and test in Europe: Designers’ forum. [S.l.], 2006. p. 154–159.
CHANG, Y.-C. et al. On the design and analysis of fault tolerant noc architecture using spare routers. In: IEEE PRESS. Proceedings of the 16th Asia and South Pacific Design Automation Conference. [S.l.], 2011. p. 431–436.
CHOUDHARY, N.; GAUR, M.; LAXMI, V. Irregular noc simulation framework: Irnirgam. In: IEEE. Emerging Trends in Networks and Computer Communications (ETNCC), 2011 International Conference on. [S.l.], 2011. p. 1–5.
CHOUDHARY, N. et al. Genetic algorithm based topology generation for application specific network-on-chip. In: IEEE. Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on. [S.l.], 2010. p. 3156–3159.
CORMEN, T. H. et al. Introduction to algorithms. [S.l.]: MIT press, 2009.
DEHYADGARI, M. et al. Evaluation of pseudo adaptive xy routing using an object oriented model for noc. In: IEEE. 2005 International Conference on Microelectronics. [S.l.], 2005. p. 5–pp.
DICK, R. P.; RHODES, D. L.; WOLF, W. Tgff: task graphs for free. In: IEEE. Proceedings of the Sixth International Workshop on Hardware/Software Codesign. (CODES/CASHE’98). [S.l.], 1998. p. 97–101.
DIJKSTRA, E. Dijkstra’s algorithm. Dutch scientist Dr. Edsger Dijkstra network algorithm: http://en.wikipedia.org/wiki/Dijkstra’s_algorithm, 1959.
GABIS, A. B.; KOUDIL, M. Noc routing protocols–objective-based classification. Journal of Systems Architecture, Elsevier, v. 66, p. 14–32, 2016.
GAREY, M. R.; JOHNSON, D. S.; STOCKMEYER, L. Some simplified np-complete problems. In: ACM. Proceedings of the sixth annual ACM symposium on Theory of computing. [S.l.], 1974. p. 47–63.
GENDREAU, M.; POTVIN, J.-Y. Tabu search. In: Search methodologies. [S.l.]: Springer, 2005. p. 165–186.
GENDREAU, M.; POTVIN, J.-Y. et al. Handbook of metaheuristics. [S.l.]: Springer, 2010.
HEMANI, A. et al. Network on chip: An architecture for billion transistor era. In: Proceeding of the IEEE NorChip Conference. [S.l.: s.n.], 2000. v. 31, p. 11.
HO, W. H.; PINKSTON, T. M. A methodology for designing efficient on-chip interconnects on well-behaved communication patterns. In: IEEE. The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. [S.l.], 2003. p. 377–388.
HOSSEINABADY, M. et al. Reliable network-on-chip based on generalized de bruijn graph. In: IEEE. 2007 IEEE International High Level Design Validation and Test Workshop. [S.l.], 2007. p. 3–10.
JAIN, K.; CHOUDHARY, N.; SINGH, D. Energy efficient branch and bound based on-chip irregular network design. Global Journal of Computer Science and Technology, 2014.
JANTSCH, A.; TENHUNEN, H. et al. Networks on chip. [S.l.]: Springer, 2003.
KOIBUCHI, M. et al. A lightweight fault-tolerant mechanism for network-on-chip. In: IEEE COMPUTER SOCIETY. Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip. [S.l.], 2008. p. 13–22.
KREUTZ, M. et al. Energy and latency evaluation of noc topologies. In: IEEE. 2005 IEEE International Symposium on Circuits and Systems. [S.l.], 2005. p. 5866–5869.
KUROSE, J. F.; ROSS, K. W. Computer networking: a top-down approach: international edition. [S.l.]: Pearson Higher Ed, 2013.
LEE, D.; PARIKH, R.; BERTACCO, V. Highly fault-tolerant noc routing with application-aware congestion management. In: ACM. Proceedings of the 9th International Symposium on Networks-on-Chip. [S.l.], 2015. p. 10.
LI, M.; ZENG, Q.-A.; JONE, W.-B. Dyxy: a proximity congestion-aware deadlock-free dynamic routing method for network on chip. In: ACM. Proceedings of the 43rd annual Design Automation Conference. [S.l.], 2006. p. 849–852.
LIN, S.-Y. et al. Fault-tolerant router with built-in self-test/self-diagnosis and fault-isolation circuits for 2d-mesh based chip multiprocessor systems. In: IEEE. 2009 International Symposium on VLSI Design, Automation and Test. [S.l.], 2009. p. 72–75.
MESQUITA, J. W. d. Exploração de espaço de projeto para geração de redes em chip de topologias irregulares otimizadas: a rede UTNoC. Dissertação (Mestrado) — Universidade Federal do Rio Grande do Norte, 2016.
MILFONT, R. et al. Analysis of routing algorithms generation for irregular noc topologies. In: IEEE. Test Symposium (LATS), 2017 18th IEEE Latin American. [S.l.], 2017. p. 1–5.
MORAES, F. et al. Hermes: an infrastructure for low area overhead packet-switching networks on chip. INTEGRATION, the VLSI journal, Elsevier, v. 38, n. 1, p. 69–93, 2004.
MOTA, R. G. et al. Efficient routing table minimization for fault-tolerant irregular network-on-chip. In: IEEE. 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS). [S.l.], 2016. p. 632–635.
NEEB, C.; WEHN, N. Designing efficient irregular networks for heterogeneous systems-on-chip. Journal of Systems architecture, Elsevier, v. 54, n. 3-4, p. 384–396, 2008.
PATTERSON, D. A.; HENNESSY, J. L. Computer Organization and Design MIPS Edition: The Hardware/Software Interface. [S.l.]: Newnes, 2013.
PINTO, A.; CARLONI, L. P.; SANGIOVANNI-VINCENTELLI, A. L. Efficient synthesis of networks on chip. In: IEEE. Proceedings 21st International Conference on Computer Design. [S.l.], 2003. p. 146–150.
RADETZKI, M. et al. Methods for fault tolerance in networks-on-chip. ACM Computing Surveys (CSUR), ACM, v. 46, n. 1, p. 8, 2013.
RAVI, R. et al. Approximation algorithms for degree-constrained minimum-cost network-design problems. Algorithmica, Springer, v. 31, n. 1, p. 58–78, 2001.
ROCHA, H. M. G. d. A. O Problema do Mapeamento: Heurísticas de mapeamento de tarefas em MPSoCs baseados em NoC. Monografia (B.S. thesis) — Universidade Federal do Rio Grande do Norte, 2017.
RODRIGO, S. et al. Cost-efficient on-chip routing implementations for cmp and mpsoc systems. IEEE transactions on computer-aided design of integrated circuits and systems, IEEE, v. 30, n. 4, p. 534–547, 2011.
SALMINEN, E.; KULMALA, A.; HAMALAINEN, T. D. Survey of network-on-chip proposals. white paper, OCP-IP, Citeseer, v. 1, p. 13, 2008.
SCHALLER, R. R. Moore’s law: past, present and future. IEEE spectrum, IEEE, v. 34, n. 6, p. 52–59, 1997.
SHAH, P.; KANNIGANTI, A.; SOUMYA, J. Fault-tolerant application specific network-on-chip design. In: IEEE. Embedded Computing and System Design (ISED), 2017 7th International Symposium on. [S.l.], 2017. p. 1–5.
SOTERIOU, V. et al. A high-throughput distributed shared-buffer noc router. IEEE Computer Architecture Letters, IEEE, v. 8, n. 1, p. 21–24, 2009.
SRINIVASAN, K.; CHATHA, K. S.; KONJEVOD, G. An automated technique for topology and route generation of application specific on-chip interconnection networks. In: IEEE COMPUTER SOCIETY. Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design. [S.l.], 2005. p. 231–237.
SRINIVASAN, K.; CHATHA, K. S.; KONJEVOD, G. Linear-programming-based techniques for synthesis of network-on-chip architectures. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, IEEE, v. 14, n. 4, p. 407–420, 2006.
STALLINGS, W. Computer organization and architecture: designing for performance. [S.l.]: Pearson Education India, 2003.
VENKATARAMAN, N.; KUMAR, R. Design and analysis of application specific network on chip for reliable custom topology. Computer Networks, Elsevier, 2019.
WANG, C. et al. An efficient topology reconfiguration algorithm for noc based multi-processor arrays. In: IEEE. High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on. [S.l.], 2013. p. 873–880.
WILSON, R. J. Introduction to graph theory. [S.l.]: Pearson Education India, 1979.
YANG, P. et al. A fault tolerance noc topology and adaptive routing algorithm. In: IEEE. Embedded Software and Systems (ICESS), 2016 13th International Conference on. [S.l.], 2016. p. 42–47.
YESIL, S.; TOSUN, S.; OZTURK, O. Fpga implementation of a fault-tolerant application-specific noc design. In: IEEE. 2016 International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS). [S.l.], 2016. p. 1–6.
ZEFERINO, C. A. Redes-em-chip: arquiteturas e modelos para avaliação de área e desempenho. 2003.
ZEFERINO, C. A.; SUSIN, A. A. Socin: a parametric and scalable network-on-chip. In: IEEE. Integrated Circuits and Systems Design, 2003. SBCCI 2003. Proceedings. 16th Symposium on. [S.l.], 2003. p. 169–174.
ZHANG, L. et al. On topology reconfiguration for defect-tolerant noc-based homogeneous manycore systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, IEEE, v. 17, n. 9, p. 1173–1186, 2009.
APPENDIX A -- Influence of tabuListSize Arguments
Figure 31: Influence of tabuListSize on AP1TG solutions.
Figure 32: Influence of tabuListSize on AP2TG solutions.
Figure 33: Influence of tabuListSize on AP3TG solutions.
Figure 34: Influence of tabuListSize on AP4TG solutions.
Figure 35: Influence of tabuListSize on INTEGRALTG solutions.
Figure 36: Influence of tabuListSize on MPEGTG solutions.
Figure 37: Influence of tabuListSize on MWDTG solutions.
Figure 38: Influence of tabuListSize on VOPDTG solutions.
APPENDIX B -- Influence of terminationCriterion Arguments
Figure 39: Influence of terminationCriterion on AP1TG solutions.
Figure 40: Influence of terminationCriterion on AP2TG solutions.
Figure 41: Influence of terminationCriterion on AP3TG solutions.
Figure 42: Influence of terminationCriterion on AP4TG solutions.
Figure 43: Influence of terminationCriterion on INTEGRALTG solutions.
Figure 44: Influence of terminationCriterion on MPEGTG solutions.
Figure 45: Influence of terminationCriterion on MWDTG solutions.
Figure 46: Influence of terminationCriterion on VOPDTG solutions.
APPENDIX C -- Latency Box Plots
Figure 47: Fitness (latency estimation) box plots of AP1TG generated solutions.
Figure 48: Fitness (latency estimation) box plots of AP2TG generated solutions.
Figure 49: Fitness (latency estimation) box plots of AP3TG generated solutions.
Figure 50: Fitness (latency estimation) box plots of AP4TG generated solutions.
Figure 51: Fitness (latency estimation) box plots of INTEGRALTG generated solutions.
Figure 52: Fitness (latency estimation) box plots of MPEGTG generated solutions.
Figure 53: Fitness (latency estimation) box plots of MWDTG generated solutions.
Figure 54: Fitness (latency estimation) box plots of VOPDTG generated solutions.
APPENDIX D -- Fault Injection in Median Fitness Solutions
Figure 55: Fault injection on median AP1TG solutions.
Figure 56: Fault injection on median AP2TG solutions.
Figure 57: Fault injection on median AP3TG solutions.
Figure 58: Fault injection on median AP4TG solutions.
Figure 59: Fault injection on median INTEGRALTG solutions.
Figure 60: Fault injection on median MPEGTG solutions.
Figure 61: Fault injection on median MWDTG solutions.
Figure 62: Fault injection on median VOPDTG solutions.
APPENDIX E -- Examples of Median Epsilon Solutions
Figure 63: Median ε AP1 with median fitness.
Figure 64: Median ε AP2 with median fitness.
Figure 65: Median ε AP3 with median fitness.
Figure 66: Median ε AP4 with median fitness.
Figure 67: Median ε INTEGRAL with median fitness.
Figure 68: Median ε MPEG with median fitness.
Figure 69: Median ε MWD with median fitness.
Figure 70: Median ε VOPD with median fitness.
APPENDIX F -- Median Epsilon Solutions After 10% Fault Injection
Figure 71: Median ε AP1 solution with median fitness after 10% fault injection.
Figure 72: Median ε AP2 solution with median fitness after 10% fault injection.
Figure 73: Median ε AP3 solution with median fitness after 10% fault injection.
Figure 74: Median ε AP4 solution with median fitness after 10% fault injection.
Figure 75: Median ε INTEGRAL solution with median fitness after 10% fault injection.
Figure 76: Median ε MPEG solution with median fitness after 10% fault injection.
Figure 77: Median ε MWD solution with median fitness after 10% fault injection.
Figure 78: Median ε VOPD solution with median fitness after 10% fault injection.
APPENDIX G -- Median Epsilon Solutions After 20% Fault Injection
Figure 79: Median ε AP1 solution with median fitness after 20% fault injection.
Figure 80: Median ε AP2 solution with median fitness after 20% fault injection.
Figure 81: Median ε AP3 solution with median fitness after 20% fault injection.
Figure 82: Median ε AP4 solution with median fitness after 20% fault injection.
Figure 83: Median ε INTEGRAL solution with median fitness after 20% fault injection.
Figure 84: Median ε MPEG solution with median fitness after 20% fault injection.
Figure 85: Median ε MWD solution with median fitness after 20% fault injection.
Figure 86: Median ε VOPD solution with median fitness after 20% fault injection.
APPENDIX H -- Median Epsilon Solutions After 30% Fault Injection
Figure 87: Median ε AP1 solution with median fitness after 30% fault injection.
Figure 88: Median ε AP2 solution with median fitness after 30% fault injection.
Figure 89: Median ε AP3 solution with median fitness after 30% fault injection.
Figure 90: Median ε AP4 solution with median fitness after 30% fault injection.
Figure 91: Median ε INTEGRAL solution with median fitness after 30% fault injection.
Figure 92: Median ε MPEG solution with median fitness after 30% fault injection.
Figure 93: Median ε MWD solution with median fitness after 30% fault injection.
Figure 94: Median ε VOPD solution with median fitness after 30% fault injection.
APPENDIX I -- Detailed Fault Injection in Some Median Fitness Solutions
Figure 95: Fault injection on median $S_{AP1TG,15}$ solution.
Figure 96: Fault injection on median $S_{AP2TG,15}$ solution.
Figure 97: Fault injection on median $S_{AP3TG,16}$ solution.
Figure 98: Fault injection on median $S_{AP4TG,16}$ solution.
Figure 99: Fault injection on median $S_{INTEGRALTG,15}$ solution.
Figure 100: Fault injection on median $S_{MPEGTG,19}$ solution.
Figure 101: Fault injection on median $S_{MWDTG,18}$ solution.
Figure 102: Fault injection on median $S_{VOPDTG,19}$ solution.
ANNEX A -- Mesquita’s Work TGs
Figure 103: AP1TG
Figure 104: AP2TG
Figure 105: AP3TG
Figure 106: AP4TG
Figure 107: INTEGRALTG
Figure 108: MPEGTG
Figure 109: MWDTG
Figure 110: VOPDTG