Research on parallel cone-beam CT image reconstruction on CUDA-Enabled GPU (IEEE)
Computed tomography (CT) image reconstruction algorithms running on graphics processing units (GPUs) have recently attracted considerable attention. These methods often use cached texture memory to mitigate the GPU's high memory latency. However, texture-based methods remain inefficient because their cache hit rates are low. By studying the thread execution model of the GPU, this paper proposes an acceleration scheme based on the degree of streaming-multiprocessor-level parallelism. This parallel strategy gives the threads executing simultaneously on each multiprocessor closer memory-access locality, which improves utilization of the cached texture memory. Experimental results indicate that our acceleration scheme reduces computing time by 20%-30% for both forward and backward projection on the GPU.
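The locality argument can be illustrated with a toy model. The sketch below is not the paper's scheme and does not model real GPU hardware. It assumes a hypothetical texture cache that fetches 4x4 texel tiles and a group of 256 concurrently running threads, each reading one detector texel, then counts how many distinct tiles the group touches under two thread-to-texel mappings. All names (`texture_tiles_touched`, `strip_group`, `block_group`) and both sizes are illustrative assumptions.

```python
def texture_tiles_touched(thread_coords, tile=4):
    """Count distinct tile x tile texel blocks read by a group of threads.

    Assumption: the cache fetches whole square tiles, so fewer distinct
    tiles means more reuse per fetch (a higher hit rate in this toy model).
    """
    return len({(u // tile, v // tile) for u, v in thread_coords})


def strip_group(n_threads):
    # 1D mapping: concurrent threads read one long detector row.
    return [(i, 0) for i in range(n_threads)]


def block_group(n_threads, width):
    # 2D mapping: the same number of threads read a compact patch
    # that is `width` texels wide.
    return [(i % width, i // width) for i in range(n_threads)]


N_THREADS = 256  # assumed number of simultaneously executing threads

strip_tiles = texture_tiles_touched(strip_group(N_THREADS))
block_tiles = texture_tiles_touched(block_group(N_THREADS, 16))

print(strip_tiles, block_tiles)  # prints "64 16"
```

In this model the compact 16x16 mapping touches a quarter as many tiles as the 256x1 strip for the same number of reads, which is the kind of effect closer memory-access locality is meant to produce. Actual gains depend on the hardware's cache geometry and on the projection geometry, neither of which is modeled here.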
Paper available at IEEE.