Stories, Papers, WIKIs

Geospatial Overlay Computation on the GPU (ACM)

Abstract:

General-purpose computing on graphics processing units provides a relatively low-cost mechanism to achieve high computational throughput on desktop computers. However, the architecture of GPUs is fundamentally different from that of CPUs; thus, traditional algorithms cannot simply be run on a GPU. In this paper, we develop algorithms for the line segment intersection problem and the arrangement problem for GPU architectures.
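
The abstract does not describe the paper's algorithms, so the following is only a brute-force baseline for the line segment intersection problem, written in Python on the CPU. It tests every pair of segments with the standard orientation predicate. The function names (orient, on_segment, segments_intersect, all_intersections) are illustrative and not taken from the paper. Each pair test is independent of the others, which is the kind of data parallelism a GPU kernel could exploit.

```python
from itertools import combinations


def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p).

    Returns 1 for a left turn, -1 for a right turn, 0 if collinear.
    """
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def on_segment(p, q, r):
    """Given collinear p, q, r: does r lie within the bounding box of pq?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(s, t):
    """True if closed segments s = (a, b) and t = (c, d) share a point."""
    (a, b), (c, d) = s, t
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    # General case: each segment straddles the line through the other.
    if o1 != o2 and o3 != o4:
        return True
    # Degenerate cases: an endpoint lies on the other segment.
    return ((o1 == 0 and on_segment(a, b, c))
            or (o2 == 0 and on_segment(a, b, d))
            or (o3 == 0 and on_segment(c, d, a))
            or (o4 == 0 and on_segment(c, d, b)))


def all_intersections(segments):
    """Brute-force O(n^2) scan; returns index pairs of intersecting segments."""
    return [(i, j)
            for (i, s), (j, t) in combinations(enumerate(segments), 2)
            if segments_intersect(s, t)]
```

With integer coordinates the predicate is exact; with floating-point input, nearly collinear cases may be misclassified.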

Paper available at ACM.

TerraNNI: Natural Neighbor Interpolation on a 3D Grid using a GPU (ACM)

Abstract:

With the modern focus on LiDAR technology, the amount of topographic data, in the form of massive point clouds, has increased dramatically. Furthermore, due to the popularity of LiDAR, repeated surveys of the same areas are becoming more common. This trend will only increase as topographic changes prompt surveys over already scanned terrain, in which case we obtain large spatio-temporal data sets.

In dynamic terrains, such as coastal regions, such spatio-temporal data can offer interesting insight into how the terrain changes over time. An initial step in the analysis of such data is to create a digital elevation model representing the terrain over time. In the case of spatio-temporal data sets, those models often represent elevation on a 3D volumetric grid. This involves interpolating the elevation of LiDAR points onto these grid points. In this paper we show how to efficiently perform natural neighbor interpolation over a 3D volumetric grid. Using a graphics processing unit (GPU), we describe different algorithms to attain speed and GPU-memory trade-offs. Our algorithm extends to higher dimensions. Our experimental results demonstrate that the algorithm is efficient and scalable.
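
As a rough illustration of natural neighbor interpolation itself, and not of the TerraNNI algorithm, the sketch below uses a discrete (raster) approximation in 2D rather than on a 3D grid. Each raster cell belongs to its nearest data site; inserting the query point "steals" the cells that are closer to the query than to their current site, and the estimate is the average of the stolen cells' site values. The function name, the brute-force nearest-site search, and the square integer raster are all assumptions made for this example.

```python
def discrete_natural_neighbor(sites, query, grid_size):
    """Discrete natural neighbor estimate at `query` on a grid_size x grid_size raster.

    sites: list of (x, y, value) tuples with coordinates inside the raster.
    query: (x, y) location to interpolate at.
    """
    qx, qy = query
    total = 0.0
    stolen = 0
    for cy in range(grid_size):
        for cx in range(grid_size):
            # Nearest site to this cell (its discrete Voronoi owner), brute force.
            best_d, best_v = min(
                ((sx - cx) ** 2 + (sy - cy) ** 2, v) for sx, sy, v in sites
            )
            # The cell is stolen if the query is strictly closer than its owner.
            if (qx - cx) ** 2 + (qy - cy) ** 2 < best_d:
                total += best_v
                stolen += 1
    if stolen == 0:
        # No cell was stolen (for example, the query coincides with a site):
        # fall back to the value of the nearest site.
        return min(((sx - qx) ** 2 + (sy - qy) ** 2, v) for sx, sy, v in sites)[1]
    return total / stolen
```

The per-cell work is independent, so the two loops over cells are the natural candidates for parallel execution; the accuracy of the estimate depends on the raster resolution.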

Paper available at ACM.

Accelerating Sketch-Based Computations with GPU: A Case Study for Network Traffic Change Detection (ACM)

Abstract:

Sketch-based algorithms are widely used in networking applications due to their many good attributes. We propose to use a Graphics Processing Unit (GPU) as an accelerating engine to offload heavy sketch computations for network traffic change detection. Our experimental results show that the GPU can conduct fast change detection, with query throughput of up to 9 million distinct keys per second. It is capable of efficiently processing sketch data structures for a wide range of applications at fine-grained time scales.
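
The abstract does not say which sketch the authors use, so the following stands in a count-min sketch purely to illustrate the idea of sketch-based change detection: traffic in two time intervals is summarized in two small sketches, and a key is flagged when its estimated count differs between them by at least a threshold. The class name, the SHA-256-based row hashing, and the detect_changes helper are assumptions for this example, not the paper's design.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; query() never underestimates a key's count."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def query(self, key):
        # Collisions only add to a counter, so the row minimum is the best estimate.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def detect_changes(prev, cur, keys, threshold):
    """Keys whose estimated count changed by at least `threshold` between intervals."""
    return [k for k in keys if abs(cur.query(k) - prev.query(k)) >= threshold]
```

Every update and every query touches one counter per row and is independent of other keys, which is what makes batches of queries a plausible fit for GPU threads.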

Paper available at ACM.

Interactive GPU-Based Octree Generation and Traversal (ACM)

Abstract:

GPU-based ray casting, as introduced by Krüger and Westermann [2003], is an effective method for volumetric rendering. Unfortunately, conventional methods of Empty Space Skipping (ESS) using spatial partitioning, which accelerate ray casting by culling ray-surface intersection tests in empty parts of the volume, do not align well with GPU architectures. CPUs are usually required for tree generation and parsing, as well as the data transfer from CPU to GPU. Such CPU-based pre-processing is time-consuming, with the result that spatial tree structures are invariably applied to static datasets.

Paper available at ACM.

Gamma Photon Transport on the GPU for PET (ACM)

Abstract:

This paper proposes a Monte Carlo algorithm for gamma-photon transport that partially reuses random paths and is appropriate for parallel GPU implementation. According to the requirements of the application of the simulation results in reconstruction algorithms, the method aims at similar relative rather than absolute errors of the detectors. The resulting algorithm is SIMD-like, which is a requirement of efficient GPU implementation, i.e., all random paths are built with the same sequence of instructions, and thus can be simulated on parallel threads that have practically no conditional branches. The algorithm is a combined method that separates the low-dimensional part that cannot be well mimicked by importance sampling and computes it by a deterministic quadrature, while the high-dimensional part that is made low-variation by importance sampling is handled by the Monte Carlo method. The deterministic quadrature is based on a geometric interpretation of a direct, i.e., non-scattered, effect of a photon on all detectors.

Paper available at ACM.

A Comparison of Three Commodity-Level Parallel Architectures: Multi-core CPU, Cell BE and GPU (ACM)

Abstract:

We explore three commodity parallel architectures: multi-core CPUs, the Cell BE processor, and graphics processing units. We have implemented four algorithms on these three architectures: solving the heat equation, inpainting using the heat equation, computing the Mandelbrot set, and MJPEG movie compression. We use these four algorithms to exemplify the benefits and drawbacks of each parallel architecture.
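
Of the four benchmark algorithms, the Mandelbrot set is the simplest to show in a few lines. The sketch below is a plain sequential Python version, not the authors' implementation; the function names, the default viewing window, and the iteration limit are assumptions. Every pixel is computed independently of all others, which is why this workload maps easily onto each of the three architectures.

```python
def escape_time(c, max_iter):
    """Iterations of z -> z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def mandelbrot_grid(width, height, max_iter=50,
                    x_range=(-2.0, 1.0), y_range=(-1.5, 1.5)):
    """Escape times on a width x height grid covering the given window."""
    rows = []
    for j in range(height):
        y = y_range[0] + (y_range[1] - y_range[0]) * j / max(height - 1, 1)
        row = []
        for i in range(width):
            x = x_range[0] + (x_range[1] - x_range[0]) * i / max(width - 1, 1)
            row.append(escape_time(complex(x, y), max_iter))
        rows.append(row)
    return rows
```

Points that return max_iter are treated as belonging to the set; the per-pixel loop length varies, so neighbouring pixels can do very different amounts of work.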

Paper available at ACM.

A Discussion on Calculating Eigenvalues of Real Symmetric Tridiagonal Matrices on a GPU (ACM)

Abstract:

While GPUs are attracting attention as accelerators in a wide range of application areas, compatibility between the architecture and the selected algorithm is important to effectively bring out their potential performance. This paper focuses on eigenvalue calculation from a given real symmetric tridiagonal matrix and compares GPU implementations of the QR method and the bisection method. Implementations for a total of four different GPU architectures are shown and compared to reveal the affinity between algorithms and architectures.
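
For readers unfamiliar with the bisection method, here is a minimal sequential sketch in Python; it is not the paper's GPU implementation. It assumes the matrix is given by its diagonal d and off-diagonal e, uses a Sturm-sequence count of the eigenvalues below a trial value, and starts from Gershgorin bounds. The function names and the tolerance are choices made for this example.

```python
def count_below(d, e, x):
    """Number of eigenvalues less than x, via the Sturm sequence of the matrix.

    d: diagonal entries (length n); e: off-diagonal entries (length n - 1).
    """
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - off / q
        if q == 0.0:
            q = 1e-30  # nudge off an exact zero to avoid division by zero
        if q < 0.0:
            count += 1
    return count


def gershgorin_bounds(d, e):
    """Interval guaranteed to contain every eigenvalue."""
    n = len(d)
    lo, hi = float("inf"), float("-inf")
    for i in range(n):
        r = (abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
        lo = min(lo, d[i] - r)
        hi = max(hi, d[i] + r)
    return lo, hi


def kth_eigenvalue(d, e, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 0 is the smallest), found by bisection."""
    lo, hi = gershgorin_bounds(d, e)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval can no longer be split in floating point
        if count_below(d, e, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Each eigenvalue is located independently of the others, so one natural parallelization assigns one eigenvalue (one value of k) per thread; the QR method, by contrast, iterates on the whole matrix.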

Paper available at ACM.

Accelerating 3-DES Performance Using GPU (ACM)

Abstract:

Various cryptography algorithms have been developed to provide different levels of data security for application domains such as storage security, personal identification, and secure web browsing. They consume massive amounts of resources on the server side while processing encryption and decryption requests from clients. In this paper, we try to utilize the GPU (Graphics Processing Unit) to speed up data encryption and decryption, to reduce the computing resources spent on security, and to improve web server throughput. We chose the widely used 3-DES and implemented it on a GPU. In our implementation, we observed that the GPU cipher performs 5 times faster than the OpenSSL implementation on a CPU. As a result, we show a promising direction for offloading data encryption and decryption onto the GPU.

Paper available at ACM.

Fast Ellipse Detection Algorithm Using Hough Transform on the GPU (ACM)

Abstract:

GPUs (Graphics Processing Units) are specialized microprocessors that accelerate 3D or 2D graphics operations. Recent GPUs, which have many processing units connected with a global memory, can be used for general purpose parallel computation. To utilize their powerful computing ability, GPUs are widely used for general purpose computing. The main purpose of this paper is to present an ellipse detection algorithm using the Hough transform. The feature of our algorithm is that, to reduce computational time and space, the parameter spaces in the Hough transform are decomposed for each parameter and each parameter is computed in series. Also, we implemented our algorithm on a modern GPU system. The experimental results show that, for an input image of size 2040×2040, our GPU implementation can achieve a speedup factor of approximately 64 over the sequential implementation without GPU support.

Paper available at ACM.

LRF Algorithm Parallel Computing Based on GPU (ACM)

Abstract:

This paper presents parallel computing for the P2P protocol LRF based on the GPU. We have investigated two ways to map the LRF algorithm to the GPU. One is to use a single thread to control a peer. The other is to use a single thread to control a data block. Both ways can speed up the distributed block scheduling.

Paper available at ACM.