The attached code is described in the paper "Floating-Point Data Compression at 75 Gb/s on a GPU", available from the ACM (http://dl.acm.org/citation.cfm?doid=1964179.1964189) or at http://www.cs.txstate.edu/~mb92/papers/gpgpu11.pdf.

For up-to-date code, compilation instructions and command-line examples, please refer to: http://www.cs.txstate.edu/~burtscher/research/GFC/

GFC is a GPU-based compressor/decompressor written in CUDA C for binary IEEE 754 64-bit double-precision floating-point data. Two versions are provided: v1 targets devices of compute capability 1.x and v2 targets devices of compute capability 2.x.

Sample little-endian datasets are available at: http://www.csl.cornell.edu/~burtscher/research/FPC/datasets.html

In compress mode, the executable accepts three arguments: x, y, and z. The data set is compressed using x blocks and y warps per block. The optional z parameter specifies the dimensionality of the data set and defaults to 1 when omitted. Decompression requires no arguments.

Data is read from standard input, compressed or decompressed, and written to standard output. For best performance, choose x and y so that x*y equals the maximum number of warps that can be resident on the target GPU.
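
As a rough illustration of the x*y recommendation, the following C sketch computes the number of warps a GPU can hold at once and derives x for a chosen y. The helper names max_resident_warps and blocks_for are hypothetical and not part of GFC. The inputs are assumed to come from the device specification; in CUDA they are available in the multiProcessorCount, maxThreadsPerMultiProcessor, and warpSize fields filled in by cudaGetDeviceProperties. The sketch ignores register and shared-memory limits, which may reduce the number of warps that are actually resident.

```c
/* Hypothetical helper (not part of GFC): upper bound on the number of
   warps that can be resident on the whole GPU at the same time. */
static int max_resident_warps(int num_sms, int max_threads_per_sm, int warp_size)
{
    int warps_per_sm = max_threads_per_sm / warp_size;
    return num_sms * warps_per_sm;
}

/* Hypothetical helper (not part of GFC): given the total number of
   resident warps and a chosen y (warps per block), return the largest x
   (number of blocks) such that x*y does not exceed the total. */
static int blocks_for(int total_warps, int warps_per_block)
{
    return total_warps / warps_per_block;
}
```

For example, a compute capability 2.0 device with 14 multiprocessors, 1536 resident threads per multiprocessor, and a warp size of 32 gives max_resident_warps(14, 1536, 32) = 672. With y = 24, blocks_for(672, 24) yields x = 28.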

Important: The provided code is not meant to be used as is. (While it should work correctly, it is slow due to sequential data transfers.) Rather, it is meant as an example of how to call the parallel compression and decompression routines from your own code.

Note that the length of the raw data file must be a multiple of 8 bytes, and the file should contain nothing but binary double-precision values. Only little-endian systems are currently supported.
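
The two input requirements above can be checked before any data is handed to the compressor. The following C sketch is one possible way to do so; the helper names host_is_little_endian and count_doubles are hypothetical and do not appear in the GFC source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GFC): returns 1 if the host stores
   multi-byte integers with the least significant byte first, else 0. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper (not part of GFC): returns the number of 64-bit
   doubles in a buffer of `bytes` bytes, or -1 if the length is not a
   multiple of 8 and the buffer therefore cannot be a valid input. */
static long long count_doubles(size_t bytes)
{
    if (bytes % 8 != 0) {
        return -1;
    }
    return (long long)(bytes / 8);
}
```

A caller would typically reject the input if host_is_little_endian() returns 0 or if count_doubles() returns -1 for the file size.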