Stories, Papers, WIKIs
| Title | Body | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| An Introduction to Workforce from a User's Perspective (10/7/2009) |
|
||||||||||||||||
| Introducing GMAC |
We are proud to announce the first public version of GMAC. GMAC is being developed by the Operating System Group at the Universitat Politecnica de Catalunya and the IMPACT Research Group at the University of Illinois under the University of Illinois/NCSA Open Source License. The project is hosted here. There you can find documentation, code and pre-built Debian packages. |
||||||||||||||||
| Accelerating Phase Unwrapping and Affine Transformations for Optical Quadrature Microscopy using CUDA |
Abstract: Optical Quadrature Microscopy (OQM) is a process which uses phase data to capture information about the sample being studied. OQM is part of an imaging framework developed by the Optical Science Laboratory at Northeastern University. In one particular application of interest, the framework is used to extract phase information from the image of an embryo to determine embryo viability. Phase Unwrapping is the process of reconstructing the real phase shift (propagation delay) of a sample from the measured “wrapped” representation, which lies between −π and +π. Unwrapping can be done using the Minimum LP Norm Phase Unwrap algorithm. Images are first preprocessed using an Affine Transform before they are unwrapped. Both of these steps are time-consuming and would benefit greatly from parallelization and acceleration. Faster processing would lower many research barriers (in terms of throughput and performance) present when using OQM. In this paper we report on accelerating Phase Unwrapping and Affine Transformations using NVIDIA’s CUDA programming model. We also run elementary noise removal on the GPU using NVIDIA’s CUBLAS (CUDA Basic Linear Algebra Subprograms) library. We integrate GPU execution into a Matlab environment to seamlessly interface to the pre-existing image acquisition system. By mapping the unwrap and noise removal to a GPU, and by also reducing the |
||||||||||||||||
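To illustrate what "wrapped" phase means in the OQM abstract above, here is a minimal Python sketch. It is not the Minimum LP Norm algorithm the paper uses (that is a 2D method); it swaps in the much simpler 1D neighbor-difference approach, which only works when the true phase changes by less than π between adjacent samples. The function names `wrap` and `unwrap_1d` are hypothetical.

```python
import math

def wrap(phase):
    """Map a phase value (radians) into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def unwrap_1d(wrapped):
    """Simple 1D unwrapping: sum the wrapped differences between neighbors.

    Assumes the true phase changes by less than pi between adjacent samples.
    """
    if not wrapped:
        return []
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        # The wrapped difference recovers the true local phase step.
        out.append(out[-1] + wrap(cur - prev))
    return out

# A smooth phase ramp from 0 to 9.5 rad, well beyond the [-pi, pi) range.
true_phase = [0.5 * k for k in range(20)]
measured = [wrap(p) for p in true_phase]   # what the instrument would report
recovered = unwrap_1d(measured)            # ramp reconstructed from wrapped data
```

Each output sample depends on the previous one, so this 1D form is inherently sequential; that dependency is one reason 2D methods formulated as global optimization problems are more natural candidates for GPU acceleration.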
| MR Image Reconstruction Using the GPU |
Abstract:
Magnetic resonance (MR) image reconstruction has reached a bottleneck where further speed improvement from the algorithmic perspective is difficult. However, some clinical practices such as real-time surgery monitoring demand faster reconstruction than what is currently available. For such dynamic imaging applications, radial sampling in k-space (i.e. projection acquisition) has recently seen a revival owing to its fast image acquisition, relatively good signal-to-noise ratio, and better resistance to motion artifacts, as compared with the conventional Cartesian scan. Concurrently, using the graphics processing unit (GPU) to improve algorithm performance has become increasingly popular. In this paper, an efficient GPU implementation of the fast Fourier transform (FFT) will first be described in detail, since the FFT is an important part of virtually all MR image reconstruction algorithms. Then, we evaluate the speed and image quality for the GPU implementation of two reconstruction algorithms that are suited for projection acquisition. The first algorithm is the look-up table based gridding algorithm. The second one is the filtered backprojection method which is widely used in computed tomography. Our results show that the GPU implementation is up to 100 times faster than a conventional CPU implementation with comparable image quality. |
||||||||||||||||
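Since the abstract above names the FFT as the core of MR reconstruction, a small reference version may help readers see what is being accelerated. The sketch below is a textbook recursive radix-2 Cooley-Tukey FFT in plain Python, checked against a direct DFT. It is not the paper's GPU implementation, and the function names `fft` and `dft` are hypothetical.

```python
import cmath

def dft(values):
    """Direct O(n^2) discrete Fourier transform, used as a reference."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; input length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])   # transform of even-indexed samples
    odd = fft(values[1::2])    # transform of odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out

signal = [1, 2, 3, 4, 0, 0, 0, 0]
spectrum = fft(signal)
```

The butterflies within one recursion level are independent of each other, which is the property that parallel FFT implementations exploit.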
| Accelerating MR image reconstruction on GPUs (IEEE) |
Note: Requires IEEE subscription.
Abstract:
With the explosive development of advanced image reconstruction algorithms, there is an urgent need for acceleration of these algorithms to facilitate their use in practical applications. This paper describes our experience using graphics processing units (GPUs) for advanced MR image reconstruction from non-Cartesian data. We show that implementation of MR image reconstruction on NVIDIA CUDA-enabled GPUs can significantly accelerate the solution of this type of image reconstruction problem. Given the acceleration afforded by the GPU, we expect our strategy to be applied to other computationally intensive imaging algorithms. |
||||||||||||||||
| Acceleration of Medical Image Registration using Graphics Process Units in Computing Normalized Mutual Information |
Abstract: This paper presents a computational performance analysis of an accelerated medical image registration using Graphics Processing Units (GPUs). In our previous work, a multi-resolution approach using normalized mutual information (NMI) has proven to be useful in medical image registration. In this paper, we propose an acceleration of the NMI procedure using GPU implementation because of the parallel processing capabilities. Registration algorithms were implemented on NVIDIA’s GeForce 9600 GT graphics processor with the Compute Unified Device Architecture (CUDA) programming environment. Experimental results showed that the GPU implementation improves the registration computational performance with a speedup factor of 38. In addition, the maximum speedup can be achieved with diligent data profiling. |
||||||||||||||||
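For readers unfamiliar with the similarity measure in the registration abstract above, here is a minimal Python sketch of normalized mutual information computed from intensity histograms. It assumes the common definition NMI = (H(A) + H(B)) / H(A, B); the abstract does not say which variant the authors use, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy in bits of a histogram given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def normalized_mutual_information(a, b):
    """NMI = (H(A) + H(B)) / H(A, B) for two equally sized intensity lists."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)   # joint histogram
    if h_ab == 0:
        raise ValueError("NMI is undefined when both images are constant")
    return (h_a + h_b) / h_ab

fixed = [0, 0, 1, 1, 2, 2, 3, 3]      # flattened toy "image"
shifted = [3, 0, 0, 1, 1, 2, 2, 3]    # same intensities, misaligned by one pixel

nmi_aligned = normalized_mutual_information(fixed, fixed)
nmi_shifted = normalized_mutual_information(fixed, shifted)
```

A registration loop would evaluate this measure for many candidate transforms and keep the one with the highest score. Building the joint histogram dominates the cost and is the kind of per-pixel work that maps onto a GPU.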
| High Quality Volume Rendering for Large Medical Datasets Using GPUs |
Abstract:
In this paper, we present efficient, high-quality volume rendering techniques for large volume datasets using graphics processing units (GPUs). We employ the 3D texture mapping capability commonly available in modern GPUs as a core rendering engine and take advantage of combinations of HW-supported occlusion queries, stencil tests and programmable shaders to accelerate the whole rendering process. As a preprocessing step, we subdivide the entire volume dataset into a union of subvolumes of a uniform size. For each subvolume, we also create a set of filtered visible subvolumes (FVS). The FVS is defined as a set of subvolumes that contain the visible voxels. Before executing an interactive rendering loop, using FVS, we find the boundary subvolumes that are closest to the bounding planes enclosing the entire volume data, and pre-fetch them from main memory to texture memory as they are likely to be rendered regardless of a change of viewpoint. Then, by rendering the boundary subvolumes onto the stencil buffer, we create an initial occlusion map. At runtime, as we render each subvolume, the occlusion map is updated accordingly. Moreover, using the occlusion map, we issue a series of HW-supported occlusion queries to cull away occluded subvolumes and also perform an early ray termination based on the stencil test. We have implemented the volume rendering algorithm and, for a large volume dataset of 512×512×1024 dimensions, we achieve real-time performance (i.e., 2~3 FPS) on a Pentium IV 2.8 GHz PC equipped with an ATI 9800Pro graphics card with 256MB video memory and 256MB AGP memory without any loss of image quality.
Note: Requires purchase or SpringerLink subscription to view in full.
|
||||||||||||||||
| How GPUs Can Improve the Quality of Magnetic Resonance Imaging |
Abstract:
In magnetic resonance imaging (MRI), non-Cartesian scan trajectories are advantageous in a wide variety of emerging applications. Advanced reconstruction algorithms that operate directly on non-Cartesian scan data using optimality criteria such as least-squares (LS) can produce significantly better images than conventional algorithms that apply a fast Fourier transform (FFT) after interpolating the scan data onto a Cartesian grid. However, advanced LS reconstructions require significantly more computation than conventional reconstructions based on the FFT. For example, one LS algorithm requires nearly six hours to reconstruct a single three-dimensional image on a modern CPU. Our work demonstrates that this advanced reconstruction can be performed quickly and efficiently on a modern GPU, with the reconstruction of a 64³ 3D image requiring just three minutes, an acceptable latency for key applications. This paper describes how the reconstruction algorithm leverages the resources of the GeForce 8800 GTX (G80) to achieve over 150 GFLOPS in performance. We find that the combination of tiling the data and storing the data in the G80’s constant memory dramatically reduces the algorithm’s required bandwidth to off-chip memory. The G80’s special functional units provide substantial acceleration for the trigonometric computations in the algorithm’s inner loops. Finally, experiment-driven code transformations increase the reconstruction’s performance by as much as 60% to 80%. |
||||||||||||||||
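The tiling idea mentioned in the abstract above can be sketched in plain Python. The example assumes the inner loop is a direct Fourier sum over non-Cartesian scan samples, one accumulation per voxel; the abstract does not spell out the exact computation, so treat the formula, the function name `adjoint_fourier_sum`, and the `tile_size` parameter as illustrative assumptions. Python has no constant memory or registers, so the comments only mark where those would apply on a GPU.

```python
import math

def adjoint_fourier_sum(voxels, samples, tile_size=4):
    """For each voxel x, accumulate sum over samples of d * exp(2*pi*i * k.x).

    voxels:  list of (x, y, z) positions
    samples: list of (kx, ky, kz, d) scan points with complex value d
    Samples are processed in fixed-size tiles, mirroring a GPU version that
    would copy one tile at a time into fast read-only memory.
    """
    result = [0j] * len(voxels)
    for start in range(0, len(samples), tile_size):
        tile = samples[start:start + tile_size]     # one chunk of scan data
        for v, (x, y, z) in enumerate(voxels):
            acc = result[v]                          # per-voxel running sum
            for kx, ky, kz, d in tile:
                angle = 2 * math.pi * (kx * x + ky * y + kz * z)
                # The sin/cos pair is the trigonometric work in the inner loop.
                acc += d * complex(math.cos(angle), math.sin(angle))
            result[v] = acc
    return result

voxels = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.1, 0.2, 0.3)]
samples = [(1.0, 0.0, 0.0, 1 + 0j), (0.5, -1.5, 2.0, 0.3 - 0.7j),
           (-2.0, 1.0, 0.25, -1j), (0.0, 3.0, -1.0, 2 + 0j),
           (1.5, 1.5, 1.5, 0.5 + 0.5j)]

tiled = adjoint_fourier_sum(voxels, samples, tile_size=2)
untiled = adjoint_fourier_sum(voxels, samples, tile_size=len(samples))
```

Every voxel reads every sample in a tile, so keeping the tile in a small, fast, broadcast-friendly memory means each sample is fetched from off-chip memory once per tile rather than once per voxel.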
| Accelerating Advanced MRI Reconstructions on GPUs |
Abstract: Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. At present, MR imaging is often limited by high noise levels, significant imaging artifacts, and/or long data acquisition (scan) times. Advanced image reconstruction algorithms can mitigate these limitations and improve image quality by simultaneously operating on scan data acquired with arbitrary trajectories and incorporating additional information such as anatomical constraints. However, the improvements in image quality come at the expense of a considerable increase in computation. This paper describes the acceleration of an advanced reconstruction algorithm on NVIDIA's Quadro FX 5600. Optimizations such as register allocating the voxel data, tiling the scan data, and storing the scan data in the Quadro's constant memory dramatically reduce the reconstruction's required bandwidth to off-chip memory. The Quadro's special functional units provide substantial acceleration of the trigonometric computations in the algorithm's inner loops, and experimentally-tuned code transformations increase the reconstruction's performance by an additional 20%. |
||||||||||||||||
| Real-Time Dynamic Display of Registered 4D Cardiac MR and Ultrasound Images Using a GPU |
Abstract: In minimally invasive image-guided surgical interventions, different imaging modalities, such as magnetic resonance imaging (MRI) or computed tomography (CT), and real-time three-dimensional (3D) ultrasound (US), can provide complementary, multi-spectral image information. Multimodality dynamic image registration is a well-established approach that permits real-time diagnostic information to be enhanced by placing lower-quality real-time images within a high quality anatomical context. For the guidance of cardiac procedures, it would be valuable to register dynamic MRI or CT with intraoperative US. However, in practice, either the high computational cost prohibits such real-time visualization of volumetric multimodal images in a real-world medical environment, or else the resulting image quality is not satisfactory for accurate guidance during the intervention. Modern graphics processing units (GPUs) provide the programmability, parallelism and increased computational precision to begin to address this problem. In this work, we first outline our research on dynamic 3D cardiac MR and US image acquisition, real-time dual-modality registration and US tracking. Then we describe image processing and optimization techniques for 4D (3D + time) cardiac image real-time rendering. We also present our multimodality 4D medical image visualization engine, which directly runs on a GPU in real-time by exploiting the advantages of the graphics hardware. In addition, techniques such as multiple transfer functions for different imaging modalities, dynamic texture binding, advanced texture sampling and multimodality image compositing are employed to facilitate the real-time display and manipulation of the registered dual-modality dynamic 3D MR and US cardiac datasets. |
