An Optimization of FMM under CPU+GPU Heterogeneous Architecture (IEEE)

Heterogeneous CPU+GPU architectures have become the main trend in high-performance computing and parallel processing in recent years. However, formulating scientific algorithms to exploit the performance of this architecture requires rethinking core methods. In this work, the main part of the fast multipole method (FMM) is accelerated on the heterogeneous architecture. Building on PetFMM, a Two Dimensional Threads Mapping Model (TDTMM) is proposed to lighten the workload per GPU thread. The proposed thread mapping model improves the execution efficiency of the hardware acceleration. Experimental results show that the proposed model is feasible and effective.
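
The abstract does not spell out how TDTMM assigns work to threads, so the following is only a generic sketch of the idea of lightening per-thread workload with a two-dimensional thread index. It is not the paper's model or PetFMM code. It simulates logical GPU threads in plain Python for a toy pairwise sum: a 1D mapping gives each thread one target and all sources, while a 2D mapping gives each thread one (target, source-chunk) pair and then reduces over the chunk index. The names `kernel`, `map_1d`, `map_2d`, and `chunks` are hypothetical and chosen for illustration.

```python
import math


def kernel(xt, xs):
    """Toy pairwise kernel 1/|xt - xs| (assumed for illustration only)."""
    d = abs(xt - xs)
    return 0.0 if d == 0.0 else 1.0 / d


def map_1d(targets, sources):
    """1D mapping: one logical thread per target, looping over every source."""
    out = [0.0] * len(targets)
    work = []  # kernel evaluations performed by each logical thread
    for tid, xt in enumerate(targets):
        acc = 0.0
        for xs in sources:
            acc += kernel(xt, xs)
        out[tid] = acc
        work.append(len(sources))
    return out, work


def map_2d(targets, sources, chunks):
    """2D mapping: one logical thread per (target, source-chunk) pair.

    The index ty selects the target and tx selects a slice of the sources.
    Partial sums are then reduced over tx, as a shared-memory reduction
    would do on a real GPU.
    """
    chunk_len = math.ceil(len(sources) / chunks)
    partial = [[0.0] * chunks for _ in targets]
    work = []
    for ty, xt in enumerate(targets):
        for tx in range(chunks):
            lo = tx * chunk_len
            hi = min(lo + chunk_len, len(sources))
            acc = 0.0
            for xs in sources[lo:hi]:
                acc += kernel(xt, xs)
            partial[ty][tx] = acc
            work.append(max(hi - lo, 0))
    out = [sum(row) for row in partial]
    return out, work


if __name__ == "__main__":
    targets = [0.0, 1.0, 2.0]
    sources = [i + 0.5 for i in range(8)]
    _, work_1d = map_1d(targets, sources)
    _, work_2d = map_2d(targets, sources, chunks=4)
    print("max work per thread, 1D:", max(work_1d))
    print("max work per thread, 2D:", max(work_2d))
```

In this sketch the total number of kernel evaluations is unchanged; the 2D mapping only spreads them over more threads, at the cost of an extra reduction step. Whether that trade-off pays off on real hardware depends on occupancy and memory access patterns, which this simulation does not model.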

Paper available at IEEE.

Computer Center, Shanghai University, Shanghai, China