Stories, Papers, WIKIs
| Title | Body |
|---|---|
| Introducing GMAC |
We are proud to announce the first public version of GMAC. GMAC is being developed by the Operating System Group at the Universitat Politecnica de Catalunya and the IMPACT Research Group at the University of Illinois under the University of Illinois/NCSA Open Source License. The project is hosted here. There you can find documentation, code and pre-built Debian packages. |
| Continuous Maximal Flows and Wulff Shapes: Application to MRFs |
Abstract:
Convex and continuous energy formulations for low level vision problems enable efficient search procedures for the corresponding globally optimal solutions. In this work we extend the well-established continuous, isotropic capacity-based maximal flow framework to the anisotropic setting. By using powerful results from convex analysis, a very simple and efficient minimization procedure is derived.
Further, we show that many important properties carry over to the new anisotropic framework, e.g. globally optimal binary results can be achieved simply by thresholding the continuous solution. In addition, we unify the anisotropic continuous maximal flow approach with a recently proposed convex and continuous formulation for Markov random fields, thereby allowing more general smoothness priors to be incorporated. Dense stereo results are included to illustrate the capabilities of the proposed approach. ... The underlying update equations (Eq. 12) are well suited to acceleration on a modern GPU. Our current CUDA-based implementation, executed on a GeForce 8800 Ultra, achieves two frames per second for 320 × 240 images and 32 disparity levels (aiming for a duality gap of at most 2%).
|
| From Structure-from-Motion Point Clouds to Fast Location Recognition |
Abstract:
Efficient view registration with respect to a given 3D reconstruction has many applications like inside-out tracking in indoor and outdoor environments, and geo-locating images from large photo collections. We present a fast location recognition technique based on structure from motion point clouds. Vocabulary tree-based indexing of features directly returns relevant fragments of 3D models instead of documents from the images database. Additionally, we propose a compressed 3D scene representation which improves recognition rates while simultaneously reducing the computation time and the memory consumption. The design of our method is based on algorithms that efficiently utilize modern graphics processing units to deliver real-time performance for view registration. We demonstrate the approach by matching hand-held outdoor videos to known 3D urban models, and by registering images from online photo collections to the corresponding landmarks.
... we employ a CUDA-based approach executed on the GPU for faster determination of the respective visual words. The speed-up from the GPU approach (a factor of about 15 to 20 on a GeForce GTX280 vs. an Intel Pentium D 3.2 GHz) allows more descriptor comparisons to be incorporated, i.e. a deeper tree with a smaller branching factor can be replaced by a shallower tree with a significantly higher number of branches. |
| Parallel Data Mining on Graphics Processors |
Abstract:
We introduce GPUMiner, a novel parallel data mining system that utilizes new-generation graphics processing units (GPUs). Our system relies on the massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture provided by GPUs. As special-purpose co-processors, these processors are highly optimized for graphics rendering and rely on the CPU for data input/output as well as complex program control. Therefore, we design GPUMiner to consist of the following three components: (1) a CPU-based storage and buffer manager to handle I/O and data transfer between the CPU and the GPU, (2) a GPU-CPU co-processing parallel mining module, and (3) a GPU-based mining visualization module. We design the GPU-CPU co-processing scheme in mining depending on the complexity and inherent parallelism of individual mining algorithms. We provide the visualization module to help users observe and interact with the mining process online. We have implemented the k-means clustering and the Apriori frequent pattern mining algorithms in GPUMiner. Our preliminary results have shown significant speedups over state-of-the-art CPU implementations on a PC with a G80 GPU and a quad-core CPU. We will demonstrate the mining process through our visualization module. Code and documentation of GPUMiner are available at http://code.google.com/p/gpuminer/.
|
| Speeding Up Evolutionary Learning Algorithms using GPUs |
Abstract:
This paper proposes a multithreaded Genetic Programming classification evaluation model using NVIDIA CUDA GPUs to reduce the long computation times caused by poor performance on large problems. Two different classification algorithms are benchmarked using UCI Machine Learning data sets. Experimental results compare the performance of single-threaded and multithreaded Java, C and GPU code, and show that our proposal is far more efficient. |
| GPUML: Graphical processors for speeding up kernel machines |
Algorithms based on kernel methods play a central role in statistical machine learning. At their core are a number of linear algebra operations on matrices of kernel functions which take as arguments the training and testing data. These range from the simple matrix-vector product, to more complex matrix decompositions, and iterative formulations of these. Often the algorithms scale quadratically or cubically, both in memory and operational complexity, and as data sizes increase, kernel methods scale poorly. We use parallelized approaches on a multi-core graphical processor (GPU) to partially address this lack of scalability. GPUs are used to scale three different classes of problems: a simple kernel matrix-vector product, the iterative solution of linear systems of kernel functions, and QR and Cholesky decompositions of kernel matrices. The application of these accelerated approaches to scaling several kernel-based learning approaches is shown, and in each case substantial speedups are obtained. The core software is released as an open source package, GPUML. |
| GPU Accelerated Acoustic Likelihood Computations |
Abstract:
This paper introduces the use of a Graphics Processing Unit (GPU) for computing acoustic likelihoods in a speech recognition system. In addition to their high availability, GPUs provide high computing performance at low cost. We have used an NVIDIA GeForce 8800GTX programmed with CUDA (Compute Unified Device Architecture), which exposes the GPU as a parallel coprocessor. The acoustic likelihoods are computed as dot products, operations for which GPUs are highly efficient. The implementation in our speech recognition system shows that the GPU is 5x faster than the CPU SSE-based implementation. This improvement led to a speed-up of 35% on a large vocabulary task. |
| Large-scale Deep Unsupervised Learning using Graphics Processors |
Abstract: The promise of unsupervised learning methods lies in their potential to use vast amounts of unlabeled data to learn complex, highly nonlinear models with millions of free parameters. We consider two well-known unsupervised learning models, deep belief networks (DBNs) and sparse coding, that have recently been applied to a flurry of machine learning applications (Hinton & Salakhutdinov, 2006; Raina et al., 2007). Unfortunately, current learning algorithms for both models are too slow for large-scale applications, forcing researchers to focus on smaller-scale models, or to use fewer training examples.
In this paper, we suggest massively parallel methods to help resolve these problems. We argue that modern graphics processors far surpass the computational capabilities of multicore CPUs, and have the potential to revolutionize the applicability of deep unsupervised learning methods. We develop general principles for massively parallelizing unsupervised learning tasks using graphics processors. We show that these principles can be applied to successfully scaling up learning algorithms for both DBNs and sparse coding. Our implementation of DBN learning is up to 70 times faster than a dual-core CPU implementation for large models. For example, we are able to reduce the time required to learn a four-layer DBN with 100 million free parameters from several weeks to around a single day. For sparse coding, we develop a simple, inherently parallel algorithm, that leads to a 5 to 15-fold speedup over previous methods. |
| A GPU Based Implementation of Center-Surround Distribution Distance for Feature Extraction and Matching |
The release of general purpose GPU programming environments has garnered universal access to computing performance that was once only available to super-computers. The availability of such computational power has fostered the creation and re-deployment of algorithms, new and old, creating entirely new classes of applications. In this paper, a GPU implementation of the Center-Surround Distribution Distance (CSDD) algorithm for detecting features within images and video is presented. While an optimized CPU implementation requires anywhere from several seconds to tens of minutes to perform analysis of an image, the GPU based approach has the potential to improve upon this by up to 28X, with no loss in accuracy. |
| GPU-Accelerated Large Scale Analytics |
Abstract:
In this paper, we report our research on using GPUs as accelerators for Business Intelligence (BI)
|
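
The "GPU Accelerated Acoustic Likelihood Computations" entry above states that acoustic likelihoods are computed as dot products. The abstract does not give the formulation, so the following is only a minimal sketch of one common way this can work, assuming diagonal-covariance Gaussians: the log-likelihood is rewritten as a dot product between an expanded feature vector `[x², x, 1]` and a precomputed weight vector. All function names here are hypothetical, and plain Python stands in for what would be a batched matrix product on a GPU.

```python
import math

def gaussian_to_weights(mean, var):
    """Pack a diagonal-covariance Gaussian into a single weight vector.

    Assumption (not from the paper): log N(x; mean, var) is expanded as
    sum_d(-0.5*x_d^2/var_d + x_d*mean_d/var_d) + const.
    """
    const = sum(-0.5 * m * m / v - 0.5 * math.log(2.0 * math.pi * v)
                for m, v in zip(mean, var))
    quadratic = [-0.5 / v for v in var]           # weights for x_d^2
    linear = [m / v for m, v in zip(mean, var)]   # weights for x_d
    return quadratic + linear + [const]

def expand_features(x):
    """Build [x_1^2, ..., x_D^2, x_1, ..., x_D, 1] once per frame."""
    return [xi * xi for xi in x] + list(x) + [1.0]

def log_likelihood_dot(weights, expanded):
    """Log-likelihood as one dot product (the GPU-friendly form)."""
    return sum(w * f for w, f in zip(weights, expanded))

def log_likelihood_direct(x, mean, var):
    """Reference: textbook diagonal Gaussian log-density."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - 0.5 * (xi - m) ** 2 / v
               for xi, m, v in zip(x, mean, var))

if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    mean = [0.0, 1.0, 2.0]
    var = [1.0, 0.5, 2.0]
    w = gaussian_to_weights(mean, var)
    print(log_likelihood_dot(w, expand_features(x)))
    print(log_likelihood_direct(x, mean, var))
```

Because the expanded features depend only on the frame and the weights only on the model, scoring many Gaussians against many frames reduces to a single matrix-matrix product, which is the kind of operation the abstract describes GPUs as handling efficiently.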
