Stories, Papers, WIKIs
| Title | Body |
|---|---|
| Towards Chip-on-Chip Neuroscience: Fast Mining of Neuronal Spike Streams Using Graphics Hardware | Abstract: |
| Parallelisation of Fuzzy Inference on a Graphics Processor Unit Using the Compute Unified Device Architecture | Abstract: The inherently parallel nature of fuzzy inference is rarely exploited by fuzzy systems researchers. Hardware implementations, such |
| A Translation System for Enabling Data Mining Applications on GPUs (ACM) | Abstract: Modern GPUs offer much computing power at a very modest cost. Even though CUDA and other related recent developments are accelerating the use of GPUs for general-purpose applications, several challenges still remain in programming GPUs. Thus, it is clearly desirable to be able to program GPUs using a higher-level interface. In this paper, we offer a solution that targets a specific class of applications: data mining and scientific data analysis applications. Our work is driven by the observation that a common processing structure, that of generalized reductions, fits a large number of popular data mining algorithms. In our solution, the programmers simply need to specify the sequential reduction loop(s) with some additional information about the parameters. We use program analysis and code generation to map the applications to a GPU. Several additional optimizations are also performed by the system. We have evaluated our system using three popular data mining applications: k-means clustering, EM clustering, and Principal Component Analysis (PCA). The main observations from our experiments are as follows. The speedup that each of these applications achieves over a sequential CPU version ranges between 20 and 50. The automatically generated version did not have any noticeable overheads compared to hand-written code. Finally, the optimizations performed in the system resulted in significant performance improvements. |
| Data Visualization and Mining using the GPU | Abstract: An exciting development in the computing industry has been the emergence of the graphics processing unit (GPU) as a fast general-purpose co-processor. Initially designed for gaming applications, today's GPUs demonstrate impressive computing power and high levels of parallelism and are now being used for a variety of applications far removed from traditional graphics rendering settings. |
| Data Mining Using Graphics Processing Units | Abstract: During the last few years, Graphics Processing Units (GPUs) have evolved from simple devices for display signal preparation into powerful coprocessors that not only support typical computer graphics tasks such as rendering of 3D scenarios but can also be used for general numeric and symbolic computation tasks such as simulation and optimization. As a major advantage, GPUs provide extremely high parallelism (with several hundred simple programmable processors) combined with high memory-transfer bandwidth at low cost. In this paper, we propose several algorithms for computationally expensive data mining tasks like similarity search and clustering which are designed for the highly parallel environment of a GPU. We define a multidimensional index structure which is particularly suited to support similarity queries under the restricted programming model of a GPU, and define a similarity join method. Moreover, we define highly parallel algorithms for density-based and partitioning clustering. In an extensive experimental evaluation, we demonstrate the superiority of our algorithms running on the GPU over their conventional counterparts on the CPU. |
| Real-time Foreground Segmentation on GPUs using Local Online Learning and Global Graph Cut Optimization | Abstract: |
| Locally-Connected Hierarchical Neural Networks for GPU-Accelerated Object Recognition | Abstract: Convolutional neural networks have achieved good recognition results on image datasets while being computationally efficient, i.e., scaling well with the number of training patterns and the resolution of the patterns. Here we investigate a neural network model that has a similar hierarchical structure, but does not employ weight sharing. Instead, each neuron has a fixed receptive field with unique connection weights. To deal with the enormous number of weights resulting from this architecture, we implemented a parallel version of the model using Nvidia’s CUDA framework. This implementation is up to 82 times faster than a serial CPU implementation. Our model achieves state-of-the-art recognition performance on the NORB normalized-uniform dataset (2.87% error rate) and good results on the MNIST dataset (0.76% error rate). This suggests that large networks with local, non-shared connections might be an interesting architecture for object recognition tasks. To further evaluate the model, we created a large, publicly available training and testing set, which consists of objects extracted from the LabelMe natural image dataset. |
| Towards Automated Learning of Object Detectors | Abstract: |
| Active Structured Learning for High-Speed Object Detection | Abstract: High-speed, smooth, and accurate visual tracking of objects in arbitrary, unstructured environments is essential for robotics and human motion analysis. However, building a system that can adapt to arbitrary objects and a wide range of lighting conditions is a challenging problem, especially if hard real-time constraints apply, as in robotics scenarios. In this work, we introduce a method for learning a discriminative object tracking system based on the recent structured regression framework for object localization. Using a kernel function that allows fast evaluation on the GPU, the resulting system can process video streams at speeds of 100 frames per second or more. Consecutive frames in high-speed video sequences are typically very redundant, and for training an object detection system, it is sufficient to have training labels from only a subset of all images. We propose an active learning method that selects training examples in a data-driven way, thereby minimizing the number of training labels required. Experiments on realistic data show that the active learning is superior to previously used methods for dataset subsampling for this task. |
| Learning Two-View Stereo Matching | Abstract: |

BayWebSoft