Real-time pipelined data processing
I am investigating using one or more Teslas instead of 64 DSPs for the next generation of an adaptive optics system. With the DSPs, images from a high-speed camera are fed directly into the DSPs via DMA transfers, then each DSP processes part of the image and computes part of a large matrix multiplication. The partial results are summed together in one DSP, which then commands a large deformable mirror via another DMA transfer.
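To make the data flow concrete, here is a minimal sketch in C of the partitioning described above. The sizes, the names (partial_matvec, mirror_commands, N_PIXELS, N_ACT) and the layout of the matrix are assumptions made up for illustration, not the real system. In the real system each slice runs on its own DSP in parallel; here a plain loop stands in for the 64 DSPs, and the DMA transfers are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the example small. */
#define N_PIXELS 8   /* length of the flattened camera image */
#define N_ACT    2   /* number of deformable mirror commands */

/* Work done by one "DSP": multiply its slice of pixels [begin, end)
 * by the matching columns of the matrix, giving a partial result. */
void partial_matvec(const float m[N_ACT][N_PIXELS],
                    const float *pixels,
                    size_t begin, size_t end,
                    float partial[N_ACT])
{
    assert(begin <= end && end <= N_PIXELS);
    for (size_t a = 0; a < N_ACT; ++a) {
        float acc = 0.0f;
        for (size_t p = begin; p < end; ++p)
            acc += m[a][p] * pixels[p];
        partial[a] = acc;
    }
}

/* Summing stage: split the image into n_parts equal slices, compute
 * each partial product, and accumulate them into the mirror commands. */
void mirror_commands(const float m[N_ACT][N_PIXELS],
                     const float *pixels,
                     size_t n_parts,
                     float commands[N_ACT])
{
    assert(n_parts > 0 && N_PIXELS % n_parts == 0);
    size_t chunk = N_PIXELS / n_parts;

    for (size_t a = 0; a < N_ACT; ++a)
        commands[a] = 0.0f;

    /* Sequential stand-in for the parallel DSPs. */
    for (size_t d = 0; d < n_parts; ++d) {
        float partial[N_ACT];
        partial_matvec(m, pixels, d * chunk, (d + 1) * chunk, partial);
        for (size_t a = 0; a < N_ACT; ++a)
            commands[a] += partial[a];
    }
}
```

Calling mirror_commands(m, pixels, 4, commands) splits the 8-pixel image into 4 slices of 2 pixels each. On a Tesla I would expect each slice, or each output row, to map to a group of threads instead, but that mapping is exactly what I am unsure about.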
I have two basic questions.
1. Is it possible/reasonable to load a program into a Tesla and then continuously feed, via PCI bus DMA transfers, sets of data to the Tesla to be processed?
2. Is it possible/reasonable for the Tesla to DMA the results of processing each data set to another device on the PCI bus?
Another concern I have is programming language. In earlier DSP adaptive optics systems we have designed, we achieved almost a factor of 10 decrease in processing time when we programmed the DSPs in assembly language rather than C. Is assembly language programming of the Tesla supported and reasonable to do?