In-Place Irregular Computation for Message-Passing Chip-Multiprocessors

Abstract

As chip-multiprocessors (CMPs) scale up, moving data to computation on chip becomes more expensive, so moving computation to data has the potential to improve efficiency. We propose an in-place computation co-design of many-simple-core CMPs for irregular applications. In this computing paradigm, an application's critical irregular data (or part of it) is partitioned into on-chip memory slices, and each slice is delegated to an adjacent core. On the hardware side, the design divides the cores into two load-balanced groups: one is responsible for accessing off-chip data and the other for accessing irregular data. Moreover, L2 caches are replaced with scratchpads, and inter-core message passing is supported in hardware. On the software side, we present algorithms for several typical irregular application kernels, including Breadth-First Search, hash-map, Sparse Matrix-Vector Multiplication and data-walk. Simulations show that, compared with conventional implementations based on cache coherence (CC), the design significantly improves performance and energy efficiency. Its limitations are also discussed.
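
The paradigm can be illustrated with a small software sketch of Breadth-First Search, one of the kernels named above. This is not the paper's algorithm; it is a minimal model under stated assumptions: vertices are assigned to slices by `vertex % num_cores`, Python deques stand in for hardware message queues, the traversal is level-synchronous, and the adjacency lists are read directly rather than fetched by a separate group of cores. The names `owner` and `in_place_bfs` are hypothetical.

```python
from collections import deque


def owner(vertex, num_cores):
    """Hypothetical partitioning rule: which core's slice holds this vertex."""
    return vertex % num_cores


def in_place_bfs(adjacency, source, num_cores=4):
    """Level-synchronous BFS in which only the owning core touches a vertex's state.

    Instead of reading a remote 'visited' entry, a core sends a message
    (vertex, depth) to the owner of that vertex, which updates its own slice.
    """
    # One private slice of the distance table per core (stand-in for a scratchpad).
    slices = [dict() for _ in range(num_cores)]
    # One inbox per core (stand-in for a hardware message queue).
    inbox = [deque() for _ in range(num_cores)]
    inbox[owner(source, num_cores)].append((source, 0))

    while any(inbox):
        next_inbox = [deque() for _ in range(num_cores)]
        for core in range(num_cores):
            while inbox[core]:
                vertex, depth = inbox[core].popleft()
                if vertex in slices[core]:
                    continue  # already visited; the check is local to the owner
                slices[core][vertex] = depth
                # Move computation to data: notify each neighbor's owner.
                for neighbor in adjacency.get(vertex, ()):
                    target = owner(neighbor, num_cores)
                    next_inbox[target].append((neighbor, depth + 1))
        inbox = next_inbox

    # Merge the per-core slices only to report the result.
    distances = {}
    for part in slices:
        distances.update(part)
    return distances


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    print(in_place_bfs(graph, source=0))
```

In this sketch no core ever reads or writes another core's slice, so no coherence traffic is modeled; the only cross-core communication is the explicit message to the owner.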

Publication
In International Conference on Parallel Processing Workshops 2021
Weimin Zheng
Professor