Stories, Papers, WIKIs

A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets

Abstract:

 

We present an adaptive out-of-core technique for rendering massive scalar volumes employing single-pass GPU ray casting. The method is based on the decomposition of a volumetric dataset into small cubical bricks, which are then organized into an octree structure maintained out-of-core. The octree contains the original data at the leaves, and a filtered representation of children at inner nodes. At runtime an adaptive loader, executing on the CPU, updates a view and transfer function-dependent working set of bricks maintained on GPU memory by asynchronously fetching data from the out-of-core octree representation. At each frame, a compact indexing structure, which spatially organizes the current working set into an octree hierarchy, is encoded in a small texture. This data structure is then exploited by an efficient stackless ray casting algorithm, which computes the volume rendering integral by visiting non-empty bricks in front-to-back order and adapting sampling density to brick resolution. Block visibility information is fed back to the loader to avoid refinement and data loading of occluded zones. The resulting method is able to interactively explore multi-gigavoxel datasets on a desktop PC.  
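
The compact index over the working set can be pictured with a small sketch. The following Python example is only an illustration under assumptions, not the paper's data layout or its ray traversal: the `Node` record, the flat `nodes` list (standing in for the small index texture), and the `find_brick` helper are all hypothetical names. It shows how a point in the unit cube can be located in an octree stored as a flat array by iterating from the root, without a stack.

```python
from typing import List, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    # Index of the first of 8 consecutive children, or -1 for a leaf (assumed layout).
    first_child: int
    # Hypothetical id of a brick in the working set; None marks an empty node.
    brick: Optional[int]


def find_brick(nodes: List[Node], p: Tuple[float, float, float]):
    """Iteratively descend from the root to the leaf containing p in [0,1)^3.

    Returns (brick id or None, depth of the leaf). No stack is needed because
    the point is re-expressed in the local coordinates of each child.
    """
    x, y, z = p
    index, depth = 0, 0
    while nodes[index].first_child >= 0:
        # Scale to [0,2)^3; the integer parts select the child octant.
        x, y, z = 2.0 * x, 2.0 * y, 2.0 * z
        ix, iy, iz = int(x), int(y), int(z)
        index = nodes[index].first_child + ix + 2 * iy + 4 * iz
        # Keep the fractional parts as the point's coordinates inside the child.
        x, y, z = x - ix, y - iy, z - iz
        depth += 1
    return nodes[index].brick, depth


# Toy tree: the root has 8 children; only octant 0 is refined one level further.
nodes = (
    [Node(1, None)]                              # 0: root
    + [Node(9, None)]                            # 1: octant 0, inner node
    + [Node(-1, 10 + i) for i in range(1, 8)]    # 2..8: coarse leaves
    + [Node(-1, 100 + i) for i in range(8)]      # 9..16: fine leaves
)

coarse = find_brick(nodes, (0.75, 0.25, 0.25))
fine = find_brick(nodes, (0.1, 0.3, 0.1))
```

A ray caster could use the returned depth to choose its step length, which is one simple way to adapt sampling density to brick resolution.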

 

Note: Requires SpringerLink subscription or purchase to view in full.

Real-Time Digital Holographic Microscopy Observable in Multi-View and Multi-Resolution

Abstract:

 

We propose a real-time digital holographic microscopy system that simultaneously reconstructs multiple images with arbitrary resolution, depth, and position, using Shifted-Fresnel diffraction instead of Fresnel diffraction. The system uses four graphics processing units (GPUs) to perform the multiple reconstructions in real time. We demonstrate four reconstructed images from a hologram with arbitrary depths, positions, and resolutions.

GPU-Based Frequency Domain Volume Rendering

Abstract: Frequency domain volume rendering (FVR) is a volume rendering technique with lower computational complexity than other techniques. In this paper the FVR algorithm is accelerated by a factor of 17 by mapping the rendering stage to the GPU. The overall hardware-accelerated pipeline is discussed and the changes with respect to previous work are pointed out. The three-dimensional transformation into the frequency domain is done in a pre-processing step. The rendering step is computed completely on the GPU. First the projection slice is extracted. Four different interpolation schemes are used for resampling the slice from the data represented by a 3D texture. The extracted slice is transformed back into the spatial domain using the inverse Fast Fourier or Fast Hartley Transform. The rendering stage is implemented through shader programs running on programmable graphics hardware, achieving highly interactive framerates.
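
FVR rests on the Fourier projection-slice theorem: a slice through the origin of the volume's spectrum is the spectrum of the projection perpendicular to that slice. The sketch below is a 2D, axis-aligned analogue written with a naive DFT in plain Python, so it needs no interpolation and is not the GPU pipeline from the paper. The function names (`dft1`, `dft2`, `project_along_y`) and the toy image are assumptions made for illustration.

```python
import cmath


def dft1(signal):
    """Naive 1D discrete Fourier transform (O(n^2)), adequate for tiny inputs."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def dft2(image):
    """2D DFT of image[y][x], returned as spectrum[ky][kx]."""
    n_rows, n_cols = len(image), len(image[0])
    # Transform every row along x first.
    rows = [dft1(row) for row in image]
    # Then transform every column of the row spectra along y.
    cols = [dft1([rows[y][kx] for y in range(n_rows)]) for kx in range(n_cols)]
    return [[cols[kx][ky] for kx in range(n_cols)] for ky in range(n_rows)]


def project_along_y(image):
    """Sum the image along y, i.e. an X-ray-like projection onto the x axis."""
    return [sum(row[x] for row in image) for x in range(len(image[0]))]


# Toy 4x4 "density" image (made-up values).
image = [
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 3.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 4.0, 1.0],
]

central_slice = dft2(image)[0]                        # the ky = 0 row of the spectrum
projection_spectrum = dft1(project_along_y(image))    # spectrum of the projection
max_error = max(abs(a - b) for a, b in zip(central_slice, projection_spectrum))
```

In this toy case the central slice and the projection's spectrum agree up to floating-point rounding. For arbitrary view directions the slice no longer falls on grid points, which is where the interpolation schemes mentioned in the abstract come in.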

Instant Sound Scattering

Abstract:

 

Real-time sound rendering engines often render occlusion and early sound reflection effects using geometrical techniques such as ray or beam tracing. They can only achieve interactive rendering for environments of low local complexity resulting in crude effects which can degrade the sense of immersion. However, surface detail or complex dynamic geometry has a strong influence on sound propagation and the resulting auditory perception. This paper focuses on high-quality modeling of first-order sound scattering. Based on a surface-integral formulation and the Kirchhoff approximation, we propose an efficient evaluation of scattering effects, including both diffraction and reflection, that leverages programmable graphics hardware for dense sampling of complex surfaces. We evaluate possible surface simplification techniques and show that combined normal and displacement maps can be successfully used for audio scattering calculations. We present an auralization framework that can render scattering effects interactively thus providing a more compelling experience. We demonstrate that, while only considering first order phenomena, our approach can provide realistic results for a number of practical interactive applications. It can also process highly detailed models containing millions of unorganized triangles in minutes, generating high-quality scattering filters. Resulting simulations compare well with on-site recordings showing that the Kirchhoff approximation can be used for complex scattering problems. 

 

Real-time Generation of Digital Bas-Reliefs

Abstract:

 

     Bas-relief is a form of sculpture where carved or chiseled forms protrude partially and shallowly from the background. Occupying an intermediate place between painting and full 3D sculpture, bas-relief sculpture exploits properties of human visual perception in order to maintain perceptually salient 3D information. In this paper, we present two methods for automatic bas-relief generation from 3D digital shapes. Both methods are inspired by techniques developed for high dynamic range image compression and have the bilateral filter as the main ingredient. We demonstrate that the methods are capable of preserving fine shape features and achieving good compression without compromising the quality of surface details.
     For artists, bas-relief generation starts from managing the viewer's point of view and compositing the scene. Therefore we strive in our work to streamline this process by focusing on easy and intuitive user interaction which is paramount to artistic applications. Our algorithms allow for real time computation thanks to our implementation on graphics hardware. Besides interactive production of stills, this work offers the possibility for generating bas-relief animations. Last but not least, we explore the generation of artistic reliefs that mimic cubism in painting.

     In our framework, we capitalize on the highly parallel nature of our method and exploit the properties of modern graphics hardware. This allowed us to devise an OpenGL application that implements the full algorithmic pipeline of our techniques.

Performance Analysis of the OP2 Framework on Many-core Architectures

Abstract:

This paper presents a performance analysis and benchmarking study of the OP2 "active" library, which provides an
abstraction framework for the solution of parallel unstructured mesh applications. OP2 aims to decouple the scientific
specification of the application from its parallel implementation, and thereby achieve code longevity and near-optimal
performance through re-targeting the back-end to different hardware.

Runtime performance results are presented for a representative unstructured mesh application written using OP2
on a variety of many-core processor systems, including the traditional X86 architectures from Intel (Xeon based on the
older Penryn and current Nehalem micro-architectures) and GPU offerings from NVIDIA (GTX260, Tesla C2050). Our
analysis demonstrates the contrasting performance between the use of CPU (OpenMP) and GPU (CUDA) parallel implementations
for the solution on an industrial sized unstructured mesh consisting of about 1.5 million edges.

Results show the significance of choosing the correct partition and thread-block configuration, the factors limiting
the GPU performance and insights into optimizations for improved performance.

Adaptive Sampling in Three Dimensions for Volume Rendering on GPUs

Abstract:

 

Direct volume rendering of large volumetric data sets on programmable graphics hardware is often limited by the amount of available graphics memory and the bandwidth from main memory to graphics memory. Therefore, several approaches to volume rendering from compact representations of volumetric data have been published that avoid most of the data transfer between main memory and the graphics processing unit (GPU) at the cost of additional data decompression by the GPU. To reduce this performance cost, adaptive sampling techniques have been proposed; these are, however, usually restricted to sampling along the view direction. In this work, we present a GPU-based volume rendering algorithm with adaptive sampling in all three spatial directions; i.e., not only in the view direction but also in the two perpendicular directions of the image plane. This approach allows us to reduce the number of samples dramatically without compromising image quality; thus, it is particularly well suited for many compressed representations of volumetric data that require computationally expensive GPU-based sampling of data.

GPU Improvements on the Sorting and Projection of Tetrahedral Meshes for Direct Volume Rendering

Abstract:

 

Direct Volume Rendering is one of the most popular visualization techniques. Although approaches such as ray-casting or slicing are
fast and well-implemented on graphics hardware for regular and irregular grids, cell projection techniques are still time-consuming for
large tetrahedral meshes. We propose improvements to the pipeline of cell projection techniques based on the SXMPVO [4] and the
Projected Tetrahedra [8] (PT) algorithms. Specifically, we exploit new functionalities of the latest graphics hardware to remove bottlenecks
in the sorting and rendering phases.

Interactive GigaVoxels

Abstract:

 

We propose a new approach for the interactive rendering of large, highly detailed scenes. It is based on a new representation and algorithm for large and detailed volume data, especially well suited to cases where detail is concentrated at the interface between free space and clusters of density. This is for instance the case with cloudy skies and landscapes, as well as data currently represented as hypertextures or volumetric textures. Existing approaches do not efficiently store, manage and render such data, especially at high resolution and over large extents. Our method is based on a dynamic generalized octree with MIP-mapped 3D texture bricks in its leaves. Data is stored only for regions visible from the current viewpoint, at the appropriate resolution. Since our target scenes contain many sparse opaque clusters, this maintains low memory and bandwidth consumption during exploration. Ray marching allows us to stop quickly when reaching opaque regions. We also efficiently skip areas of constant density. A key originality of our algorithm is that it directly relies on the ray-marcher to detect missing data. The march along every ray in every pixel may be interrupted while data is generated or loaded. It hence achieves interactive performance on very large volume data sets. Both our data structure and algorithm are well-fitted to modern GPUs. We demonstrate our approach with several typical situations: exploration of a 3D scan (8192^3 resolution), of hypertextured meshes (16384^3 virtual resolution), and of a Sierpinski sponge (8.4M^3 virtual resolution), all rendered at an interactive frame rate of 10 to 20 fps and fitting the limited GPU memory budget.
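
The remark about stopping when a ray reaches opaque regions corresponds to front-to-back compositing with early ray termination. The Python sketch below is a generic illustration of that idea, not code from the GigaVoxels system; the `march_ray` name, the scalar grey colors, and the `opacity_threshold` parameter are assumptions.

```python
def march_ray(samples, opacity_threshold=0.99):
    """Composite (color, alpha) samples given in front-to-back order.

    Returns (accumulated color, accumulated alpha, samples visited). The loop
    stops as soon as the accumulated opacity reaches the threshold, because
    anything further along the ray would be hidden.
    """
    color_acc, alpha_acc, visited = 0.0, 0.0, 0
    for color, alpha in samples:
        visited += 1
        # Contribution of this sample, attenuated by what already lies in front.
        weight = (1.0 - alpha_acc) * alpha
        color_acc += weight * color
        alpha_acc += weight
        if alpha_acc >= opacity_threshold:
            break  # early ray termination
    return color_acc, alpha_acc, visited


# Empty space, then an opaque sample; the third sample is never visited.
opaque_hit = march_ray([(1.0, 0.0), (0.5, 1.0), (0.9, 1.0)])

# Two half-transparent samples: both are visited and blended.
translucent = march_ray([(1.0, 0.5), (0.0, 0.5)])
```

With sparse opaque clusters, most rays either cross empty space or terminate shortly after entering a cluster, so few samples are needed per ray.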

 

GPU Rendering of Relief Mapped Conical Frusta

Abstract:

 

This paper proposes to use relief-mapped conical frusta (cones cut by planes) to skin skeletal objects. Based on this
representation, current programmable graphics hardware can perform the rendering with only minimal communication
between the CPU and GPU. A consistent definition of conical frusta including texture parametrization and
a continuous surface normal is provided. Rendering is performed by analytical ray casting of the relief-mapped
frusta directly on the GPU. We demonstrate both static and animated objects rendered using our technique and
compare to polygonal renderings of similar quality.
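
Analytical ray casting of a frustum reduces to solving a quadratic in the ray parameter. The Python sketch below covers only a simplified case and is not the paper's method: the frustum is aligned with the z axis, only the lateral surface is tested (no end caps, no oblique cutting planes, no relief map), and the `intersect_frustum` name and its parameters are assumptions.

```python
import math


def intersect_frustum(origin, direction, r0, r1, height, eps=1e-12):
    """Nearest hit of a ray with the lateral surface of a z-aligned frustum.

    The surface is x^2 + y^2 = (r0 + k*z)^2 for 0 <= z <= height, where
    k = (r1 - r0) / height. Returns the ray parameter t >= 0 of the nearest
    hit, or None if the ray misses. End caps are ignored.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    k = (r1 - r0) / height
    radius_at_origin = r0 + k * oz  # surface radius at the ray origin's z

    # Substituting the ray into the surface equation gives a*t^2 + b*t + c = 0.
    a = dx * dx + dy * dy - (k * dz) ** 2
    b = 2.0 * (ox * dx + oy * dy - radius_at_origin * k * dz)
    c = ox * ox + oy * oy - radius_at_origin ** 2

    if abs(a) < eps:
        # Degenerate case: the equation is linear in t.
        roots = [] if abs(b) < eps else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None
        s = math.sqrt(disc)
        roots = [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]

    hits = []
    for t in roots:
        z = oz + t * dz
        # Keep hits in front of the origin, within the height range, and on
        # the real nappe of the cone (non-negative radius).
        if t >= 0.0 and 0.0 <= z <= height and r0 + k * z >= 0.0:
            hits.append(t)
    return min(hits) if hits else None


# A cylinder is the special case r0 == r1.
t_cylinder = intersect_frustum((-3.0, 0.0, 1.0), (1.0, 0.0, 0.0), 1.0, 1.0, 2.0)
# A cone tapering from radius 1 at z = 0 to radius 0 at z = 1.
t_cone = intersect_frustum((-3.0, 0.0, 0.5), (1.0, 0.0, 0.0), 1.0, 0.0, 1.0)
# A ray passing above the frustum misses it.
t_miss = intersect_frustum((-3.0, 0.0, 5.0), (1.0, 0.0, 0.0), 1.0, 1.0, 2.0)
```

A renderer following the abstract would additionally clip against the cutting planes and continue into the relief map from the hit point; both steps are omitted here.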