Optimizing GPU-enabled applications introduces many "magic" constants, such as block size, parallelism granularity, and thread weight. Although these constants directly affect performance, their optimal values cannot be determined statically in the general case, because they depend on both the characteristics of the data and the accelerator architecture. This paper describes an approach to automating the tuning of these constants that speeds up the considered benchmarks by 20-60 percent on average.
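
To illustrate what tuning such a constant empirically can look like, the following is a minimal sketch and not the method described in the paper. It assumes the simplest possible strategy, an exhaustive search that times a workload for each candidate value and keeps the fastest. The names `tune_constant`, `run`, and `chunked_sum` are hypothetical, and the CPU workload is only a stand-in for a GPU kernel launch.

```python
import time
from typing import Callable, Iterable, Tuple


def tune_constant(
    run: Callable[[int], None],
    candidates: Iterable[int],
    repeats: int = 3,
    timer: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Return (best_value, best_time) found by exhaustive timing.

    Assumption: `run(value)` executes the workload with the constant set
    to `value`. For each candidate, the minimum over `repeats` runs is
    used to reduce the influence of measurement noise.
    """
    best_value, best_time = None, float("inf")
    for value in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            start = timer()
            run(value)
            elapsed = min(elapsed, timer() - start)
        if elapsed < best_time:
            best_value, best_time = value, elapsed
    if best_value is None:
        raise ValueError("candidates must not be empty")
    return best_value, best_time


if __name__ == "__main__":
    data = list(range(200_000))

    def chunked_sum(block_size: int) -> None:
        # Hypothetical stand-in workload: sum the data in chunks of block_size.
        total = 0
        for i in range(0, len(data), block_size):
            total += sum(data[i:i + block_size])

    best, seconds = tune_constant(chunked_sum, [32, 64, 128, 256, 512])
    print(f"fastest block size on this machine and input: {best} ({seconds:.6f} s)")
```

Because the result depends on both the input and the hardware, the search has to be repeated when either changes, which is the cost an automated approach aims to hide from the programmer.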

Inst. of Space Technol. & Comput. Sci., Siberian Fed. Univ., Krasnoyarsk, Russia
