Accelerating wavelet-based video coding on graphics hardware using CUDA (IEEE)
The discrete wavelet transform (DWT) has a wide range of applications, from signal processing to video and image compression. By means of the lifting scheme, this transform can be performed in a memory- and computation-efficient way on modern programmable GPUs, which can be regarded as massively parallel co-processors through NVIDIA's CUDA compute paradigm. The method is scalable and is the fastest GPU implementation among the methods considered. We have integrated our DWT into the Dirac wavelet video codec (DWVC), whose overlapped block motion compensation and frame arithmetic have also been accelerated using CUDA.
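To illustrate why the lifting scheme suits a data-parallel processor, here is a minimal CPU sketch of one level of a 1D lifting transform. It is not the paper's CUDA kernel. The LeGall 5/3 integer filter, the clamped boundary handling, and the function names `lift_forward_53` and `lift_inverse_53` are all assumptions chosen only for illustration. Within each lifting step, every output depends only on values from the other band, so on a GPU each loop iteration could in principle be handled by its own thread.

```python
def lift_forward_53(x):
    """One level of a 1D integer 5/3 lifting transform (illustrative sketch)."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch expects an even-length signal")
    s = list(x[0::2])  # even samples, become the low-pass (approximation) band
    d = list(x[1::2])  # odd samples, become the high-pass (detail) band
    m = len(s)
    # Predict step: each odd sample is replaced by its prediction error
    # relative to the average of its two even neighbours (clamped at the edge).
    for i in range(m):
        d[i] -= (s[i] + s[min(i + 1, m - 1)]) >> 1
    # Update step: each even sample is adjusted using the neighbouring
    # detail coefficients (clamped at the edge).
    for i in range(m):
        s[i] += (d[max(i - 1, 0)] + d[i] + 2) >> 2
    return s, d


def lift_inverse_53(s, d):
    """Invert lift_forward_53 by undoing the lifting steps in reverse order."""
    s, d = list(s), list(d)
    m = len(s)
    # Undo the update step first, using the unchanged detail band.
    for i in range(m):
        s[i] -= (d[max(i - 1, 0)] + d[i] + 2) >> 2
    # Then undo the predict step, using the restored even samples.
    for i in range(m):
        d[i] += (s[i] + s[min(i + 1, m - 1)]) >> 1
    # Interleave even and odd samples back into one signal.
    x = [0] * (2 * m)
    x[0::2] = s
    x[1::2] = d
    return x


signal = [3, 7, 1, 8, 2, 9, 4, 6]
low, high = lift_forward_53(signal)
restored = lift_inverse_53(low, high)
```

Because each step only adds to one band a function of the other band, the inverse simply subtracts the same quantities in reverse order, so `restored` equals `signal` exactly. The transform also needs no buffer beyond the two bands, which is the memory saving that lifting is known for.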
Paper available at IEEE.