Instruction fetch architectures and code layout optimizations

Document type

Article

Access conditions

Open access

Rights

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorisation of the rights holder.

Abstract

The design of higher performance processors has been following two major trends: increasing the pipeline depth to allow faster clock rates, and widening the pipeline to allow parallel execution of more instructions. Designing a higher performance processor implies balancing all the pipeline stages to ensure that overall performance is not dominated by any of them. This means that a faster execution engine also requires a faster fetch engine, to ensure that it is possible to read and decode enough instructions to keep the pipeline full and the functional units busy. This paper explores the challenges faced by the instruction fetch stage for a variety of processor designs, from early pipelined processors to the more aggressive wide-issue superscalars. We describe the different fetch engines proposed in the literature, the performance issues involved, and some of the proposed improvements. We also show how compiler techniques that optimize the layout of the code in memory can be used to improve the fetch performance of the different engines described. Overall, we show how instruction fetch has evolved from fetching one instruction every few cycles, to fetching one instruction per cycle, to fetching a full basic block per cycle, and finally to several basic blocks per cycle: an evolution of the mechanisms surrounding the instruction cache, accompanied by the compiler optimizations used to better exploit these mechanisms.
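
The code-layout techniques the abstract refers to typically reorder basic blocks so that frequently executed fall-through paths become contiguous in memory, improving instruction-cache behaviour and fetch-width utilisation. The following is a minimal, hypothetical sketch of a profile-guided, greedy chain-merging pass in the spirit of Pettis-and-Hansen-style layout (not the authors' own algorithm); the block names, edge counts, and the layout_chains helper are illustrative assumptions.

# Hypothetical sketch: greedy, profile-guided basic-block chaining.
# Block names, edge weights and this helper are illustrative only.

def layout_chains(edges):
    """Merge basic blocks into chains, hottest profile edges first.

    `edges` is a list of (src, dst, execution_count) tuples taken from
    an execution profile. Returns a list of chains (lists of block
    names) so that hot fall-through paths end up contiguous in memory.
    """
    # Every block starts as its own single-element chain.
    chain_of = {}
    for src, dst, _ in edges:
        for block in (src, dst):
            chain_of.setdefault(block, [block])

    # Visit edges from hottest to coldest; join two chains whenever the
    # edge connects the tail of one chain to the head of another.
    for src, dst, _ in sorted(edges, key=lambda e: -e[2]):
        a, b = chain_of[src], chain_of[dst]
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)                  # place dst's chain right after src
            for block in b:
                chain_of[block] = a

    # Each chain object appears once per member block; deduplicate.
    seen, chains = set(), []
    for chain in chain_of.values():
        if id(chain) not in seen:
            seen.add(id(chain))
            chains.append(chain)
    return chains

# Example profile: B0->B1 is the hot path, B0->B2 a rarely taken branch.
profile = [("B0", "B1", 900), ("B0", "B2", 100), ("B1", "B3", 850)]
print(layout_chains(profile))   # [['B0', 'B1', 'B3'], ['B2']]

A production layout pass would additionally consider procedure splitting and cache-conscious placement of the resulting chains; the sketch above only illustrates the chain-formation step.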

Citation

Ramírez, A., Larriba, J., Valero, M. Instruction fetch architectures and code layout optimizations. "Proceedings of the IEEE", November 2001, vol. 89, no. 11, pp. 1588-1609.

ISSN

0018-9219
