Massively parallel mappings with near-linear scaling behavior
How do you map your problems to GPGPUs? What sort of scaling and performance do you get?
For machine learning, I use the mapping I originally created for the Connection Machine when I was a staff scientist at Los Alamos National Laboratory. Please see the attached PowerPoint slides, excerpted from the SC09 talk I gave in the Texas Advanced Computing Center (TACC) booth, which demonstrate how the TACC Ranger supercomputer delivered 363 TF/s with near-linear scaling to 60,000 cores. Apparently, you have to log in to see the attached slides.
This same mapping achieves very high performance and near-linear scaling on GPGPUs. Apologies that the one GPU scaling plot is several years old and rather dated; the scaling behavior still holds on current systems.
It also works great for optimization problems such as training neural networks, on a variety of machine architectures including GPGPUs, SIMD, MIMD, vector, vector-parallel, and cluster systems. Perhaps it will benefit your applications.
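To make the idea concrete, here is a minimal sketch of the mapping's core structure (my own illustration under assumptions, not the author's actual code): an objective function that decomposes into a sum of independent per-example terms can be evaluated by having each processor compute a partial sum over its shard of the data, followed by a single reduction. On a GPU the shards would map to thread blocks; the sequential loop here just stands in for that parallelism.

```python
# Hedged sketch of the data-parallel objective-function mapping.
# Assumption: the objective is a sum of independent per-example terms,
# here the sum of squared errors for a 1-D linear model y = w*x.

def partial_sse(w, shard):
    """Per-shard partial sum of squared errors."""
    return sum((w * x - y) ** 2 for x, y in shard)

def parallel_sse(w, data, n_shards=4):
    """Partition the data, evaluate each shard independently, then reduce.
    On a GPGPU each shard corresponds to a thread block; the final sum is
    the reduction step."""
    shards = [data[i::n_shards] for i in range(n_shards)]
    return sum(partial_sse(w, s) for s in shards)  # reduction

# Tiny synthetic data set with an exact linear relationship y = 2x.
data = [(x, 2.0 * x) for x in range(100)]

# The sharded evaluation agrees with a single-pass evaluation for any shard
# count, which is why the mapping scales: only one scalar per shard must be
# communicated back for the reduction.
assert parallel_sse(2.0, data) == 0.0
assert parallel_sse(0.5, data, n_shards=8) == partial_sse(0.5, data)
```

Because each shard contributes only one scalar to the reduction, communication stays tiny relative to computation, which is the usual source of near-linear scaling for this kind of mapping.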
Aside from ANNs, this same mapping can be generally applied to other problems such as:
- Locally Weighted Linear Regression (LWLR)
- Naive Bayes (NB)
- Gaussian Discriminative Analysis (GDA)
- Logistic Regression (LR)
- Independent Component Analysis (ICA)
- Expectation Maximization (EM)
- Support Vector Machine (SVM)
- Others: MDS, ordinal MDS, etc.
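What these algorithms have in common is that their sufficient statistics are sums (or counts) over the training examples, so per-shard partials combine exactly. As a hypothetical illustration (my sketch, not from the slides), Naive Bayes is the simplest case: its statistics are per-class feature counts, and merging shard counts reproduces the single-pass result.

```python
# Hedged illustration: Naive Bayes sufficient statistics computed per shard
# and combined by a reduction. Data format (assumed for this sketch):
# each example is (feature_tuple, class_label).
from collections import Counter

def shard_counts(shard):
    """Per-shard sufficient statistics: per-class example totals plus
    (class, feature_index, feature_value) occurrence counts."""
    counts = Counter()
    for features, label in shard:
        counts[("class", label)] += 1
        for j, v in enumerate(features):
            counts[(label, j, v)] += 1
    return counts

def reduce_counts(partials):
    """The reduction step: component-wise addition of partial counts."""
    total = Counter()
    for p in partials:
        total.update(p)
    return total

data = [((0, 1), "spam"), ((1, 1), "spam"), ((0, 0), "ham"), ((1, 0), "ham")]
shards = [data[:2], data[2:]]
merged = reduce_counts(shard_counts(s) for s in shards)

# Sharded statistics match the single-pass statistics exactly.
assert merged == shard_counts(data)
```

The same pattern holds for the heavier entries in the list: LWLR and GDA accumulate sums of outer products, EM accumulates expected sufficient statistics per iteration, and so on, which is why one mapping serves them all.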
I have personally used – and continue to use – this same mapping in a wide variety of collaborations and publications, including:
- Nonlinear systems modeling
- Optimization problems
- Predicting coding regions in Eukaryotic DNA
- Computationally searching for ligands with high binding affinities to desired target substrates
- Numerical bifurcation analysis in Rayleigh–Bénard convection experiments
- Numerical integration and many other problems
Attachment: gpucomputing TACC presentation.pps (1.23 MB)