Multi-dimensional Array Operations for Signal Processing Algorithms
Game and graphics processors are increasingly used for scientific computing, and for signal processing in particular. This paper addresses the efficient mapping of high-dimensional array operations in signal processing algorithms onto such platforms. Fast Fourier transforms and convolutions are essential to many signal processing applications, and both rely on fast operations over high-dimensional data arrays. Game and graphics processors typically differ from general-purpose processors in memory hierarchy, memory capacity, and access patterns. We characterize their memory structures from the perspective of high-dimensional array operations. We say that an architecture supports d-dimensional accesses if a d-dimensional data array can be accessed along any one dimension as fast as along any other. Using this definition, we identify the mismatch between algorithmic dimension and architectural dimension and describe the resulting penalty in memory latency. We introduce an approach to reduce this latency and provide experimental results on the STI Cell Broadband Engine as supporting evidence.
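To make the notion of dimensional mismatch concrete, the following is a minimal illustrative sketch, not taken from the paper's implementation. It assumes a d-dimensional array stored in row-major order in a linear (one-dimensional) memory, and it computes the element offsets visited when walking along each dimension. The function names `row_major_strides` and `offsets_along_axis` are hypothetical and chosen only for this example.

```python
def row_major_strides(shape):
    """Return the element stride of each axis for a row-major array.

    Assumption: the array is laid out contiguously with the last axis
    varying fastest (C order).
    """
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides


def offsets_along_axis(shape, axis):
    """Linear element offsets visited when stepping along `axis`,
    with every other index held at zero."""
    stride = row_major_strides(shape)[axis]
    return [i * stride for i in range(shape[axis])]


if __name__ == "__main__":
    shape = (4, 8, 16)  # a small 3-dimensional array, sizes chosen arbitrarily
    for axis in range(len(shape)):
        offsets = offsets_along_axis(shape, axis)
        print(f"axis {axis}: stride {offsets[1] - offsets[0]}, "
              f"first offsets {offsets[:4]}")
```

In this layout only the last axis is visited contiguously; the other axes are visited with strides equal to the product of the trailing extents. On a memory system that favors contiguous transfers, such strided walks are the kind of access for which a latency penalty would be expected, which is the sense in which a linear memory supports fewer access dimensions than the algorithm requires.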