ACM Transactions on Architecture and Code Optimization, Volume 12
Volume 12, Number 1, April 2015
- Christopher Zimmer, Frank Mueller:
  NoCMsg: A Scalable Message-Passing Abstraction for Network-on-Chips. 1:1-1:24
- Beayna Grigorian, Glenn Reinman:
  Accelerating Divergent Applications on SIMD Architectures Using Neural Networks. 2:1-2:23
- Anup Holey, Vineeth Mekkat, Pen-Chung Yew, Antonia Zhai:
  Performance-Energy Considerations for Shared Cache Management in a Heterogeneous Multicore Processor. 3:1-3:29
- Jinho Suh, Chieh-Ting Huang, Michel Dubois:
  Dynamic MIPS Rate Stabilization for Complex Processors. 4:1-4:25
- Naghmeh Karimi, Arun Karthik Kanuparthi, Xueyang Wang, Ozgur Sinanoglu, Ramesh Karri:
  MAGIC: Malicious Aging in Circuits/Cores. 5:1-5:25
- Pablo de Oliveira Castro, Chadi Akel, Eric Petit, Mihail Popov, William Jalby:
  CERE: LLVM-Based Codelet Extractor and REplayer for Piecewise Benchmarking and Optimization. 6:1-6:24
- Benedict R. Gaster, Derek Hower, Lee W. Howes:
  HRF-Relaxed: Adapting HRF to the Complexities of Industrial Heterogeneous Memory Models. 7:1-7:26
- Kevin Streit, Johannes Doerfert, Clemens Hammacher, Andreas Zeller, Sebastian Hack:
  Generalized Task Parallelism. 8:1-8:25
Volume 12, Number 2, July 2015
- Hamed Tabkhi, Gunar Schirner:
  A Joint SW/HW Approach for Reducing Register File Vulnerability. 9:1-9:28
- Arun K. Kanuparthi, Ramesh Karri:
  Reliable Integrity Checking in Multicore Processors. 10:1-10:23
- Do-Heon Lee, Su-Kyung Yoon, Jung-Geun Kim, Charles C. Weems, Shin-Dug Kim:
  A New Memory-Disk Integrated System with HW Optimizer. 11:1-11:23
- Morteza Mohajjel Kafshdooz, Alireza Ejlali:
  Dynamic Shared SPM Reuse for Real-Time Multicore Embedded Systems. 12:1-12:25
- Wenhao Jia, Elba Garza, Kelly A. Shaw, Margaret Martonosi:
  GPU Performance and Power Tuning Using Regression Trees. 13:1-13:26
- Irshad Pananilath, Aravind Acharya, Vinay Vasista, Uday Bondhugula:
  An Optimizing Code Generator for a Class of Lattice-Boltzmann Computations. 14:1-14:23
- Shuangde Fang, Wenwen Xu, Yang Chen, Lieven Eeckhout, Olivier Temam, Yunji Chen, Chengyong Wu, Xiaobing Feng:
  Practical Iterative Optimization for the Data Center. 15:1-15:26
- Tao Zhang, Naifeng Jing, Kaiming Jiang, Wei Shu, Min-You Wu, Xiaoyao Liang:
  Buddy SM: Sharing Pipeline Front-End for Improved Energy Efficiency in GPGPUs. 16:1-16:23
- Hsiang-Yun Cheng, Matt Poremba, Narges Shahidi, Ivan Stalev, Mary Jane Irwin, Mahmut T. Kandemir, Jack Sampson, Yuan Xie:
  EECache: A Comprehensive Study on the Architectural Design for Energy-Efficient Last-Level Caches in Chip Multiprocessors. 17:1-17:22
- Arjun Suresh, Bharath Narasimha Swamy, Erven Rohou, André Seznec:
  Intercepting Functions for Memoization: A Case Study Using Transcendental Functions. 18:1-18:23
- Chung-Hsiang Lin, De-Yu Shen, Yi-Jung Chen, Chia-Lin Yang, Cheng-Yuan Michael Wang:
  SECRET: A Selective Error Correction Framework for Refresh Energy Reduction in DRAMs. 19:1-19:24
- Doug Simon, Christian Wimmer, Bernhard Urban, Gilles Duboscq, Lukas Stadler, Thomas Würthinger:
  Snippets: Taking the High Road to a Low Level. 20:1-20:25
- Raghuraman Balasubramanian, Vinay Gangadhar, Ziliang Guo, Chen-Han Ho, Cherin Joseph, Jaikrishnan Menon, Mario Paulo Drumond, Robin Paul, Sharath Prasad, Pradip Valathol, Karthikeyan Sankaralingam:
  Enabling GPGPU Low-Level Hardware Explorations with MIAOW: An Open-Source RTL Implementation of a GPGPU. 21:1-21:25
- Quan Chen, Minyi Guo:
  Locality-Aware Work Stealing Based on Online Profiling and Auto-Tuning for Multisocket Multicore Architectures. 22:1-22:24
- Madan Mohan Das, Gabriel Southern, Jose Renau:
  Section-Based Program Analysis to Reduce Overhead of Detecting Unsynchronized Thread Communication. 23:1-23:26
- Atieh Lotfi, Abbas Rahimi, Luca Benini, Rajesh K. Gupta:
  Aging-Aware Compilation for GP-GPUs. 24:1-24:20
- Brian P. Railing, Eric R. Hein, Thomas M. Conte:
  Contech: Efficiently Generating Dynamic Task Graphs for Arbitrary Parallel Programs. 25:1-25:24
Volume 12, Number 3, October 2015
- Mahdad Davari, Alberto Ros, Erik Hagersten, Stefanos Kaxiras:
  The Effects of Granularity and Adaptivity on Private/Shared Classification for Coherence. 26:1-26:21
- Mark Gottscho, Abbas BanaiyanMofrad, Nikil D. Dutt, Alex Nicolau, Puneet Gupta:
  DPCS: Dynamic Power/Capacity Scaling for SRAM Caches in the Nanoscale Era. 27:1-27:26
- Pierre Michaud, Andrea Mondelli, André Seznec:
  Revisiting Clustered Microarchitecture for Future Superscalar Cores: A Case for Wide Issue Clusters. 28:1-28:22
- Ragavendra Natarajan, Antonia Zhai:
  Leveraging Transactional Execution for Memory Consistency Model Emulation. 29:1-29:24
- Biswabandan Panda, Shankar Balachandran:
  CAFFEINE: A Utility-Driven Prefetcher Aggressiveness Engine for Multicores. 30:1-30:25
- Jishen Zhao, Sheng Li, Jichuan Chang, John L. Byrne, Laura L. Ramirez, Kevin T. Lim, Yuan Xie, Paolo Faraboschi:
  Buri: Scaling Big-Memory Computing with Hardware-Based Memory Expansion. 31:1-31:24
- Jan Lucas, Michael Andersch, Mauricio Alvarez-Mesa, Ben H. H. Juurlink:
  Spatiotemporal SIMT and Scalarization for Improving GPU Efficiency. 32:1-32:26
Volume 12, Number 4, January 2016
- Subhasis Das, Tor M. Aamodt, William J. Dally:
  Reuse Distance-Based Probabilistic Cache Replacement. 33:1-33:22
- Etem Deniz, Alper Sen:
  MINIME-GPU: Multicore Benchmark Synthesizer for GPUs. 34:1-34:25
- Li Tan, Zizhong Chen, Shuaiwen Leon Song:
  Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology. 35:1-35:27
- Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan:
  Tumbler: An Effective Load-Balancing Technique for Multi-CPU Multicore Systems. 36:1-36:24
- Erik Tomusk, Christophe Dubach, Michael F. P. O'Boyle:
  Four Metrics to Evaluate Heterogeneous Multicores. 37:1-37:25
- Morteza Hoseinzadeh, Mohammad Arjomand, Hamid Sarbazi-Azad:
  SPCM: The Striped Phase Change Memory. 38:1-38:25
- Chuntao Jiang, Zhibin Yu, Lieven Eeckhout, Hai Jin, Xiaofei Liao, Cheng-Zhong Xu:
  Two-Level Hybrid Sampled Simulation of Multithreaded Applications. 39:1-39:25
- Sandeep D'Souza, Soumya J., Santanu Chattopadhyay:
  Integrated Mapping and Synthesis Techniques for Network-on-Chip Topologies with Express Channels. 40:1-40:26
- Dimitrios Chasapis, Marc Casas, Miquel Moretó, Raul Vidal, Eduard Ayguadé, Jesús Labarta, Mateo Valero:
  PARSECSs: Evaluating the Impact of Task Parallelism in the PARSEC Benchmark Suite. 41:1-41:22
- Francisco Gaspar, Luís Taniça, Pedro Tomás, Aleksandar Ilic, Leonel Sousa:
  A Framework for Application-Guided Task Management on Heterogeneous Embedded Systems. 42:1-42:25
- Ehsan K. Ardestani, Rafael Trapani Possignolo, José Luis Briz, Jose Renau:
  Managing Mismatches in Voltage Stacking with CoreUnfolding. 43:1-43:26
- Prashant J. Nair, David A. Roberts, Moinuddin K. Qureshi:
  FaultSim: A Fast, Configurable Memory-Reliability Simulator for Conventional and 3D-Stacked Systems. 44:1-44:24
- Byeongcheol Lee:
  Adaptive Correction of Sampling Bias in Dynamic Call Graphs. 45:1-45:24
- Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra:
  Fence Placement for Legacy Data-Race-Free Programs via Synchronization Read Detection. 46:1-46:23
- Ding-Yong Hong, Chun-Chen Hsu, Cheng-Yi Chou, Wei-Chung Hsu, Pangfeng Liu, Jan-Jan Wu:
  Optimizing Control Transfer and Memory Virtualization in Full System Emulators. 47:1-47:24
- Aravind Sukumaran-Rajam, Philippe Clauss:
  The Polyhedral Model of Nonlinear Loops. 48:1-48:27
- Prashant J. Nair, David A. Roberts, Moinuddin K. Qureshi:
  Citadel: Efficiently Protecting Stacked Memory from TSV and Large Granularity Failures. 49:1-49:24
- Andrew Anderson, Avinash Malik, David Gregg:
  Automatic Vectorization of Interleaved Data Revisited. 50:1-50:25
- Lihang Zhao, Lizhong Chen, Woojin Choi, Jeffrey T. Draper:
  A Filtering Mechanism to Reduce Network Bandwidth Utilization of Transaction Execution. 51:1-51:26
- Olivier Serres, Abdullah Kayi, Ahmad Anbar, Tarek A. El-Ghazawi:
  Enabling PGAS Productivity with Hardware Support for Shared Address Mapping: A UPC Case Study. 52:1-52:26
- Riccardo Cattaneo, Giuseppe Natale, Carlo Sicignano, Donatella Sciuto, Marco Domenico Santambrogio:
  On How to Accelerate Iterative Stencil Loops: A Scalable Streaming-Based Approach. 53:1-53:26
- Unnikrishnan C., Rupesh Nasre, Y. N. Srikant:
  Falcon: A Graph Manipulation Language for Heterogeneous Systems. 54:1-54:27
- Rajshekar Kalayappan, Smruti R. Sarangi:
  FluidCheck: A Redundant Threading-Based Approach for Reliable Execution in Manycore Processors. 55:1-55:26
- Jesse Elwell, Ryan Riley, Nael B. Abu-Ghazaleh, Dmitry V. Ponomarev, Iliano Cervesato:
  Rethinking Memory Permissions for Protection Against Cross-Layer Attacks. 56:1-56:27
- Amir Morad, Leonid Yavits, Shahar Kvatinsky, Ran Ginosar:
  Resistive GP-SIMD Processing-In-Memory. 57:1-57:22
- Yaohua Wang, Dong Wang, Shuming Chen, Zonglin Liu, Shenggang Chen, Xiaowen Chen, Xu Zhou:
  Iteration Interleaving-Based SIMD Lane Partition. 58:1-58:18
- Tomi Äijö, Pekka Jääskeläinen, Tapio Elomaa, Heikki Kultala, Jarmo Takala:
  Integer Linear Programming-Based Scheduling for Transport Triggered Architectures. 59:1-59:22
- Qixiao Liu, Miquel Moretó, Jaume Abella, Francisco J. Cazorla, Daniel A. Jiménez, Mateo Valero:
  Sensible Energy Accounting with Abstract Metering for Multicore Systems. 60:1-60:26
- Miao Zhou, Yu Du, Bruce R. Childers, Daniel Mossé, Rami G. Melhem:
  Symmetry-Agnostic Coordinated Management of the Memory Hierarchy in Multicore Systems. 61:1-61:26
- Amir Yazdanbakhsh, Gennady Pekhimenko, Bradley Thwaites, Hadi Esmaeilzadeh, Onur Mutlu, Todd C. Mowry:
  RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads. 62:1-62:26
- Donghyuk Lee, Saugata Ghose, Gennady Pekhimenko, Samira Manabi Khan, Onur Mutlu:
  Simultaneous Multi-Layer Access: Improving 3D-Stacked Memory Bandwidth at Low Cost. 63:1-63:29
- Yeoul Na, Seon Wook Kim, Youngsun Han:
  JavaScript Parallelizing Compiler for Exploiting Parallelism from Data-Parallel HTML5 Applications. 64:1-64:25
- Hiroyuki Usui, Lavanya Subramanian, Kevin Kai-Wei Chang, Onur Mutlu:
  DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators. 65:1-65:28
- Morteza Mohajjel Kafshdooz, Mohammadkazem Taram, Sepehr Assadi, Alireza Ejlali:
  A Compile-Time Optimization Method for WCET Reduction in Real-Time Embedded Systems through Block Formation. 66:1-66:25