Current applications of GPUs to parallel computing tasks show excellent speed-ups compared to CPUs. However, no existing framework enables automatic distribution of data and processing across multiple GPUs, modular kernel design, and efficient co-usage of CPU and GPU processors. All of these elements are necessary for users to easily perform 'Big Data' analysis and to create their own modules for the processing functionality they need. We propose a framework for in-memory 'Big Text Data' analytics that provides mechanisms for automatic data segmentation, distribution, execution, and result retrieval across multiple cards (CPU, GPU & FPGA) and machines, together with a modular design for easy addition of new GPU kernels. The architecture and components of the framework, such as multi-card data distribution and execution, data structures for efficient memory access, algorithms for parallel GPU computation, and result retrieval, are described in detail. Some of the kernels in the framework are evaluated on Big Data against multi-core CPUs to demonstrate the performance and feasibility of using the framework for 'Big Data' analytics, providing an alternative and cheaper HPC solution.
Paper available at IEEE.