SCU: a GPU stream compaction unit for graph processing

A Segura, JM Arnau, A González - Proceedings of the 46th international …, 2019 - dl.acm.org
… time in several graph applications. We propose to offload the stream compaction operations
to a specialized unit, the SCU. The SCU is an efficient, compact and small footprint unit that …

Energy-efficient stream compaction through filtering and coalescing accesses in gpgpu memory partitions

A Segura, JM Arnau, A González - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
GPU extension that combines both the efficient SCU and the filtering mechanism of the IRU
to improve overall graph processing efficiency. We evaluate our proposal on top of a modern …

High-performance and energy-efficient irregular graph processing on GPU architectures

A Segura Salvador - 2021 - upcommons.upc.edu
… to a programmable Stream Compaction Unit (SCU) hardware … the graph-based algorithm
are efficiently executed on the GPU cores. The SCU is a small unit tightly integrated in the GPU

Tdgraph: a topology-driven accelerator for high-performance streaming graph processing

J Zhao, Y Yang, Y Zhang, X Liao, L Gu, L He… - Proceedings of the 49th …, 2022 - dl.acm.org
… We analysed the characteristics of streaming graph processing and made two main observations.
… a compacting and filtering technique to prepare data for SMs of GPU for higher GPU

Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads

A Segura, JM Arnau, A González - The Journal of Supercomputing, 2023 - Springer
… Despite these efforts, we show that irregular graph processing … SCU [8] proposes a
programmable GPU hardware extension for graph processing that is tailored to stream compaction

SCU: a GPU stream compaction unit for graph processing

A Segura Salvador, JM Arnau Montañés… - recercat.cat
stream compaction, and propose to offload this task to a … Stream Compaction Unit (SCU)
tailored to the requirements of this kernel. The SCU is a small unit tightly integrated in the GPU

Improving streaming graph processing performance using input knowledge

A Basak, Z Qu, J Lin, AR Alameldeen… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
… the performance of streaming graph processing. To improve graph update efficiency, we …
To complement adaptive batch reordering, we propose updating graphs dynamically, based …

WER: Maximizing Parallelism of Irregular Graph Applications Through GPU Warp EqualizeR

EM Huang, BW Cheng, MH Lin… - 2024 29th Asia and …, 2024 - ieeexplore.ieee.org
… In the context of large-scale graph processing, programmable General-Purpose Graphics
Processing Units (… GraphPEG [10] and SCU [11], on the other hand, proposed custom hardware …

Redzone stream compaction: removing k items from a list in parallel O(k) time

J Bontes, J Gain - ACM Transactions on Parallel Computing, 2024 - dl.acm.org
… Redzone stream compaction, the first parallel stream compaction algorithm … GPU and CPU,
if k is proportionally small (k ≪ n), Redzone outperforms existing parallel stream compaction

Near-Memory Parallel Indexing and Coalescing: Enabling Highly Efficient Indirect Access for SpMV

C Zhang, P Scheffler, T Benz, M Perotti… - … Design, Automation & …, 2024 - ieeexplore.ieee.org
… González, “Scu: a gpu stream compaction unit for graph processing,” in Proceedings of
the 46th international symposium on computer architecture, 2019, pp. 424–435. [20] S. …