Instruction selection

Latest revision as of 20:14, 3 December 2023

In computer science, instruction selection is the stage of a compiler backend that transforms its middle-level intermediate representation (IR) into a low-level IR. In a typical compiler, instruction selection precedes both instruction scheduling and register allocation; hence its output IR has an infinite set of pseudo-registers (often known as temporaries) and may still be – and typically is – subject to peephole optimization. Otherwise, it closely resembles the target machine code, bytecode, or assembly language.

For example, for the following sequence of middle-level IR code

t1 = a
t2 = b
t3 = t1 + t2
a = t3
b = t1

a good instruction sequence for the x86 architecture is

MOV EAX, a
XCHG EAX, b
ADD a, EAX
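
The equivalence of the two sequences above can be checked with a small simulation. The following Python sketch is only an illustration: the function names `run_ir` and `run_x86` are hypothetical, memory is modelled as a dictionary, and 32-bit overflow is ignored.

```python
def run_ir(a, b):
    """Execute the middle-level IR sequence on concrete values."""
    t1 = a
    t2 = b
    t3 = t1 + t2
    a = t3
    b = t1
    return a, b


def run_x86(a, b):
    """Model MOV / XCHG / ADD with one register and two memory cells.

    Assumption: plain Python integers, so 32-bit wraparound is not modelled.
    """
    mem = {"a": a, "b": b}
    eax = mem["a"]                    # MOV EAX, a
    eax, mem["b"] = mem["b"], eax     # XCHG EAX, b
    mem["a"] = mem["a"] + eax         # ADD a, EAX
    return mem["a"], mem["b"]


print(run_ir(3, 4), run_x86(3, 4))
```

Both functions leave `a` holding the sum of the original values and `b` holding the original value of `a`, which is why three target instructions suffice for five IR statements.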

For a comprehensive survey on instruction selection, see Blindell (2013)^[1] and Blindell (2016).^[2]

Macro expansion

The simplest approach to instruction selection is known as macro expansion^[3] or interpretative code generation.^[4]^[5]^[6] A macro-expanding instruction selector operates by matching templates over the middle-level IR. Upon a match, the corresponding macro is executed with the matched portion of the IR as input and emits the appropriate target instructions. Macro expansion can be done either directly on the textual representation of the middle-level IR,^[7]^[8] or the IR can first be transformed into a graph representation which is then traversed depth-first.^[9] In the latter case, a template matches one or more adjacent nodes in the graph.
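
As a minimal sketch of macro expansion on a textual IR, the Python code below matches each IR line against a list of templates and runs the associated macro. The template table `TEMPLATES`, the function `macro_expand`, and the emitted mnemonics are assumptions made for illustration; operands stay symbolic, so the output is not guaranteed to be legal x86.

```python
import re

# Hypothetical templates: a regex over one IR line paired with a macro
# that emits target instructions as text. Order matters: first match wins.
TEMPLATES = [
    (re.compile(r"^(\w+) = (\w+) \+ (\w+)$"),
     lambda d, x, y: [f"MOV {d}, {x}", f"ADD {d}, {y}"]),
    (re.compile(r"^(\w+) = (\w+)$"),
     lambda d, s: [f"MOV {d}, {s}"]),
]


def macro_expand(ir_lines):
    """Expand each IR line independently using the first matching template."""
    out = []
    for line in ir_lines:
        for pattern, macro in TEMPLATES:
            m = pattern.match(line.strip())
            if m:
                out.extend(macro(*m.groups()))
                break
        else:
            raise ValueError(f"no template matches: {line!r}")
    return out


ir = ["t1 = a", "t2 = b", "t3 = t1 + t2", "a = t3", "b = t1"]
for instr in macro_expand(ir):
    print(instr)
```

Because every IR line is expanded in isolation, the five-statement example from the introduction yields six instructions here, compared with three in the hand-written sequence.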

Unless the target machine is very simple, macro expansion in isolation typically generates inefficient code. To mitigate this limitation, compilers that apply this approach usually combine it with peephole optimization, which replaces combinations of simple instructions with more complex equivalents that increase performance and reduce code size. This is known as the Davidson-Fraser approach and is currently applied in GCC.^[10]
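
The idea of combining simple instructions into a more complex one can be sketched as follows. This is an illustrative Python example, not the mechanism used in GCC: the three-operand instruction tuples, the fused `MADD` instruction, and the function `peephole` are hypothetical, and the liveness check assumes a single straight-line block whose temporaries are not used elsewhere.

```python
def peephole(instrs):
    """Fuse  MUL t, x, y ; ADD d, z, t  into  MADD d, x, y, z.

    Instructions are tuples (op, dst, src1, src2). The fusion is applied
    only when the ADD reads t exactly once and no later instruction in
    the block reads t (assumption: t is not live after the block).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if nxt and cur[0] == "MUL" and nxt[0] == "ADD":
            t = cur[1]
            _, d, s1, s2 = nxt
            reads_t_once = (s1 == t) != (s2 == t)
            t_dead = all(t not in later[2:] for later in instrs[i + 2:])
            if reads_t_once and t_dead:
                z = s2 if s1 == t else s1
                out.append(("MADD", d, cur[2], cur[3], z))
                i += 2          # both instructions were consumed
                continue
        out.append(cur)
        i += 1
    return out


print(peephole([("MUL", "t1", "b", "c"), ("ADD", "d", "a", "t1")]))
```

The window here is just two adjacent instructions; real peephole optimizers typically use richer machine descriptions to decide which combinations are legal.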

Graph covering

Another approach is to first transform the middle-level IR into a graph and then cover the graph using patterns. A pattern is a template that matches a portion of the graph and can be implemented with a single instruction provided by the target machine. The goal is to cover the graph such that the total cost of the selected patterns is minimized, where the cost typically represents the number of cycles it takes to execute the instruction. For tree-shaped graphs, the least-cost cover can be found in linear time using dynamic programming,^[11] but for directed acyclic graphs (DAGs) and general graphs the problem becomes NP-complete and is therefore most often solved using either greedy algorithms or methods from combinatorial optimization.^[12]^[13]^[14]
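
For the tree-shaped case, least-cost covering by dynamic programming can be sketched in Python as below. The tree encoding (nested tuples), the pattern table `PATTERNS`, its costs, and the fused `madd` pattern are all assumptions chosen for illustration; this is a simplified sketch rather than the algorithm of any particular code generator.

```python
from functools import lru_cache

# Trees are nested tuples: ("REG", name), ("ADD", l, r), ("MUL", l, r).
# Patterns have the same shape; "_" is a hole covered by another pattern.
# Hypothetical patterns as (instruction name, shape, cost in cycles).
PATTERNS = [
    ("reg",  ("REG",), 0),                        # value already in a register
    ("add",  ("ADD", "_", "_"), 1),
    ("mul",  ("MUL", "_", "_"), 3),
    ("madd", ("ADD", ("MUL", "_", "_"), "_"), 3),  # fused multiply-add
]


def match(pattern, tree):
    """Return the subtrees bound to holes, or None if the pattern does not fit."""
    if pattern == "_":
        return [tree]
    if tree[0] != pattern[0]:
        return None
    if tree[0] == "REG":
        return []
    if len(pattern) != len(tree):
        return None
    holes = []
    for p, t in zip(pattern[1:], tree[1:]):
        sub = match(p, t)
        if sub is None:
            return None
        holes.extend(sub)
    return holes


@lru_cache(maxsize=None)
def cover(tree):
    """Return (total cost, instruction names) of a least-cost cover of tree."""
    best = None
    for name, pattern, cost in PATTERNS:
        holes = match(pattern, tree)
        if holes is None:
            continue
        total, chosen = cost, ()
        for sub in holes:
            sub_cost, sub_chosen = cover(sub)   # memoized subproblem
            total += sub_cost
            chosen += sub_chosen
        chosen += (name,)
        if best is None or total < best[0]:
            best = (total, chosen)
    if best is None:
        raise ValueError(f"no pattern covers {tree!r}")
    return best


expr = ("ADD", ("MUL", ("REG", "a"), ("REG", "b")), ("REG", "c"))
print(cover(expr))
```

Memoizing `cover` means each distinct subtree is solved once, which is what keeps the work proportional to the tree size for a fixed pattern set. With the costs assumed here, the fused pattern (cost 3) is preferred over a separate multiply and add (cost 4).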

References

[1] Blindell, Gabriel S. Hjort (2013). Survey on Instruction Selection: An Extensive and Modern Literature Review (Report). arXiv:1306.4898. ISBN 978-91-7501-898-0.
[2] Blindell, Gabriel S. Hjort (2016). Instruction Selection: Principles, Methods, & Applications. Springer. doi:10.1007/978-3-319-34019-7. ISBN 978-3-319-34017-3. S2CID 13390131.
[3] Brown, P. (1969). "A Survey of Macro Processors". Annual Review in Automatic Programming. 6 (2): 37–88. doi:10.1016/0066-4138(69)90001-9. ISSN 0066-4138.
[4] Cattell, R. G. G. (1979). "A Survey and Critique of Some Models of Code Generation" (PDF). School of Computer Science, Carnegie Mellon University (Technical report). Archived (PDF) from the original on May 23, 2019.
[5] Ganapathi, M.; Fischer, C. N.; Hennessy, J. L. (1982). "Retargetable Compiler Code Generation". Computing Surveys. 14 (4): 573–592. doi:10.1145/356893.356897. ISSN 0360-0300. S2CID 2361347.
[6] Lunell, H. (1983). Code Generator Writing Systems (Doctoral thesis). Linköping, Sweden: Linköping University.
[7] Ammann, U.; Nori, K. V.; Jensen, K.; Nägeli, H. (1974). "The PASCAL (P) Compiler Implementation Notes". Instituts für Informatik (Technical report).
[8] Orgass, R. J.; Waite, W. M. (1969). "A Base for a Mobile Programming System". Communications of the ACM. 12 (9): 507–510. doi:10.1145/363219.363226. S2CID 8164996.
[9] Wilcox, T. R. (1971). Generating Machine Code for High-Level Programming Languages (Doctoral thesis). Ithaca, New York, USA: Cornell University.
[10] Davidson, J. W.; Fraser, C. W. (1984). "Code Selection Through Object Code Optimization". ACM Transactions on Programming Languages and Systems. 6 (4): 505–526. CiteSeerX 10.1.1.76.3796. doi:10.1145/1780.1783. ISSN 0164-0925. S2CID 10315537.
[11] Aho, A. V.; Ganapathi, M.; Tjiang, S. W. K. (1989). "Code Generation Using Tree Matching and Dynamic Programming". ACM Transactions on Programming Languages and Systems. 11 (4): 491–516. CiteSeerX 10.1.1.456.9102. doi:10.1145/69558.75700. S2CID 1165995.
[12] Wilson, T.; Grewal, G.; Halley, B.; Banerji, D. (1994). "An integrated approach to retargetable code generation". Proceedings of 7th International Symposium on High-Level Synthesis. pp. 70–75. CiteSeerX 10.1.1.521.8288. doi:10.1109/ISHLS.1994.302339. ISBN 978-0-8186-5785-6. S2CID 14384424.
[13] Bashford, Steven; Leupers, Rainer (1999). "Constraint driven code selection for fixed-point DSPS". Proceedings of the 36th ACM/IEEE conference on Design automation conference - DAC '99. pp. 817–822. CiteSeerX 10.1.1.331.390. doi:10.1145/309847.310076. ISBN 978-1581331097. S2CID 5513238.
[14] Floch, A.; Wolinski, C.; Kuchcinski, K. (2010). "Combined Scheduling and Instruction Selection for Processors with Reconfigurable Cell Fabric". Proceedings of the 21st International Conference on Application-Specific Architectures and Processors (ASAP'10): 167–174.

External links

Alternative ways of supporting different generations of computer (permanent dead link)


