![Page 1: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/1.jpg)
Tiling Examples for X86 ISA
Slides Selected from Radu Rugina’s CS412/413 Lecture on Instruction
Selection at Cornell
![Page 2: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/2.jpg)
Instruction Selection
• Translate 3-address code to DAG• Cover DAG with tiles– Disjoint set of tiles cover DAG– Algorithm:• Greedy maximal munch• Dynamic programming
![Page 3: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/3.jpg)
Tiling
• Assume abstract assembly– Infinite registers– Temporary and/or local variables stored in registers– Array, struct, parameter passing use memory
accesses• Translation process from IR:– Convert 3-address code IR to abstract assembly– Build DAG– Perform tiling
![Page 4: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/4.jpg)
Example
• a = a + i where a is a local variable and i is a parameter passed in from a caller
![Page 5: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/5.jpg)
Pentium ISA
• Two-address CISC• Multiple addressing modes– Immediate: $imm– Register: reg– Indirect: (reg), (reg + imm)– Indexed: (reg + reg’), (reg + imm*reg’)
![Page 6: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/6.jpg)
More Tile Examples
![Page 7: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/7.jpg)
Conditional Branches
![Page 8: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/8.jpg)
Load Effective Address
![Page 9: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/9.jpg)
Maximal Munch Algorithm
• A greedy algorithm• Start from top of tree (or DAG)• Find largest tile that matches top node• Tile the sub-trees recursively
![Page 10: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/10.jpg)
Non-Greedy Tiling
![Page 11: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/11.jpg)
Greedy Tiling
![Page 12: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/12.jpg)
ADD Expression and Statement
![Page 13: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/13.jpg)
Designing Tiles
• Only add tiles that are useful to compiler• Many instructions will be too hard to use
effectively or will offer no advantage• Need tiles for all single-node trees to
guarantee that every tree can be tiled
![Page 14: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/14.jpg)
Implementation
• Maximal Munch: start from top node• Find largest tile matching top node and all of
the children nodes• Invoke recursively on all children of tile• Generate code for this tile• Code for children will have been generated
already in recursive calls
![Page 15: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/15.jpg)
Matching Tiles
![Page 16: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/16.jpg)
Finding globally optimum tiling
• Goal: find minimum total cost tiling of DAG• Algorithm: for every node, find minimum total
cost tiling of that node and subgraph below it• Lemma: Given minimum cost tiling of all nodes
in subgraph, we can find minimum cost tiling of the node by trying out all possible tiles matching the node
• Therefore: start from leaves, work upward to top node
![Page 17: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/17.jpg)
Timing Cost Model• Idea: associate cost with each tile (say proportional to number of
cycles to execute)– May not be a good metric on modern architectures
• Total execution time is sum of costs of all tiles
![Page 18: Tiling Examples for X86 ISA Slides Selected from Radu Ruginas CS412/413 Lecture on Instruction Selection at Cornell](https://reader036.vdocuments.site/reader036/viewer/2022062417/55178d015503460e6e8b5817/html5/thumbnails/18.jpg)
Dynamic Progamming• Traverse DAG recursively, and for each node n, record <t,c>, where
– t is the best tile to use for subgraph rooted at n,– c is the total cost of tiling the subgraph rooted at n if t is chosen.
• To compute <t,c> for node n– Consider every tile t’ that matches rooted at n, and compute total cost c’ = cost of tile t’ + sum of the costs of tiling the subgraphs rooted at the leaves of t’ (which costs can be computed recursively and memoized)– Store lowest-cost tile t’ and its total cost c’
• To emit code, traverse least-cost tiles recursively and emit code in postorder