Data Structures and Algorithms

Greetings! I'd like to introduce myself to this community and look forward to posting and conversing with you all about algorithms and data structures on the GPU. This is an area that's particularly interesting to me as a researcher; algorithms and data structures have been the core of much of our group's research.


My name is John Owens; I'm an associate professor of electrical and computer engineering at UC Davis. My research group of fantastic graduate students study problems across GPU computing, both the fundamentals of GPU computing (including data structures and algorithms) as well as applications. Beyond just research papers, I am also one of the maintainers of the CUDPP library (the “CUDA Data-Parallel Primitives”) that contains a number of freely available algorithms and data structures; I frequently help teach one-day courses on GPU computing at conferences like Supercomputing, Siggraph, ASPLOS, and IEEE Visualization; and I've been on the conference committee for the Graphics Hardware and High Performance Graphics conferences for several years, and will cochair HPG in 2011.


I first learned about parallel computing as a graduate student on the Imagine project at Stanford, working with (current NVIDIA chief scientist) Bill Dally and an amazing group of graduate student colleagues. Imagine was a data-parallel machine and so a good deal of our collective effort was finding efficient parallel algorithms that would run on Imagine. Learning to “think in parallel” was a lengthy process, but one that I think is increasingly important for computer engineers on today's and tomorrow's computer systems.


So: In this forum I'd like to concentrate on interesting developments in the area of parallel algorithms and data structures, with an emphasis on GPU computing. I particularly want to discuss algorithms and data structures that have general use beyond a single application. One of the benefits of working on CPUs is an enormous knowledge base of data structures and algorithms (often with associated and widely available libraries); in the GPU world we're nowhere near that point. The result is a lot of interesting problems to solve!


I talked with my graduate students and some other colleagues to get an idea of what might be useful to discuss here. I don't want to discuss things in the detail that's in an academic paper—I'll often just refer you to the papers for details—but I hope I can communicate the big ideas behind the work that may be useful for others solving similar problems, or who want to use a particular algorithm or data structure and understand how it works.


I am happy to take suggestions on interesting topics! Here's some topics that my graduate students and other colleagues thought might be interesting to begin:



  • The radix sort algorithm of Satish et al. (IPDPS 2009) (also the sort currently used in CUDPP)

  • Lars Nyland et al.'s all-pairs n-body simulation work (GPU Gems 3). This is one of my favorite GPU computing papers because the authors really motivate the design decisions they made in structuring their implementation of this algorithm.

  • Algorithms that deal with nested or irregular parallelism, such as tree traversal; how do we make these run efficiently on a wide data-parallel machine?

  • Work queues. How do we efficiently distribute work, particularly irregular work that can potentially generate an irregular amount of additional work, to many parallel compute units?

  • The GPU-based hash table work that Dan Alcantara recently presented at Siggraph Asia in December 2009. I was a coauthor on this work and very much enjoyed working with the team that made this paper possible. If you like elegant algorithms, you'll enjoy hearing about “cuckoo hashing.”

Looking forward to your feedback, suggestions, and comments.

Featured Stories and Papers

Latest Stories and Papers

Title New Replies Last postsort icon
A GPU Algorithm Design for Resource Constrained Project Scheduling Problem (IEEE) New 0 new 06-12-2013
Many-Core Accelerated LIBOR Swaption Portfolio Pricing (IEEE) New 0 new 06-06-2013
Evaluating Polynomials in Several Variables and their Derivatives on a GPU Computing Processor (IEEE) New 0 new 06-05-2013
The Parallel Generation of 2-D Hilbert Space-filling Curve on GPU (IEEE) New 0 new 05-20-2013
Poster: Scalable Fast Multipole Methods for Vortex Element Methods (IEEE) New 0 new 05-15-2013
Shepard: A Fast Exact Match Short Read Aligner (IEEE) New 0 new 05-09-2013
An Algorithm to Solve the Dominating Set Problem on GPUs (IEEE) New 0 new 05-09-2013
A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters (IEEE) New 0 new 05-09-2013
Accelerating Cost Aggregation for Real-Time Stereo Matching (IEEE) New 0 new 05-06-2013
Parallel Implementation for SAM Algorithm based on GPU and Distributed Computing (IEEE) New 0 new 05-02-2013
Task-based Parallel Breadth-first Search in Heterogeneous Environments (IEEE) New 0 new 05-01-2013
An OpenCL Implementation of Sketch-Based Network Traffic Change Detection on GPU (IEEE) New 0 new 04-25-2013
GPGPU Implementation of Visual Tracking by Particle Filter with Pixel Ratio Likelihood (IEEE) New 0 new 04-24-2013
NUFFT-based SAR Backprojection on Multiple GPUs (IEEE) New 0 new 04-23-2013
GPU-accelerated Time-domain Marching-onin- degree Solution (IEEE) New 0 new 04-23-2013
Lock-free Hash Table on Graphics Processors (IEEE) New 0 new 04-23-2013

Latest Forum Posts

Latest Wiki Pages

Title New Replies Last postsort icon


WIKI RSS Feed

Unread Items

Type Title Author Replies Last postsort icon