cudaMalloc in cuda 3.0, Segmentation fault on cudaMalloc

9 replies
ArdiSSchool10
Offline
Joined: 07/08/2010

Hi all,

I am new to CUDA programming. I am working on a project to parallelize a sequential program.

I started with CUDA 2.3 on a GTX 295 card. I have since obtained access to a server that has a Fermi card and CUDA 3.0.

When I run my program on this server, I get a segmentation fault on the first cudaMalloc of the program.

I have tried some simple programs there and they run well. Could the problem be in my Makefile?

Do I have to change any of the libraries I link against, or is there something else I can do to resolve this problem?

Thank you to all the users and guests of this forum.

I attach part of my Makefile:

####### CUDA options

# path of cuda

CUDAPATH = /usr/local/cuda

# path of cuda compiler

NVCC = nvcc

# nvcc flags

#NVCC_FLAGS = -O0 -use_fast_math

NVCC_FLAGS = -G -g

#NVCC_FLAGS = -O0 -arch sm_13

# CXX_FLAGS = -I$(CUDAPATH)/include/

# linking library

LD_FLAGS = -L$(CUDAPATH)/lib64/

# necessary at linking phase emulation

#LD_LIBRARIES = -lcublasemu

# necessary at linking phase device

LD_LIBRARIES = -lcuda -lcudart

Nadeem
Offline
Joined: 11/15/2009

Hi There,

I have not tried to reproduce this, but I see you still have "-arch sm_13" as an NVCC flag. Fermi is architecture sm_20, so this may have an effect; I don't know.

-Nadeem

ArdiSSchool10
Offline
Joined: 07/08/2010

Thank you, Nadeem, for your answer, but that flag is commented out in my Makefile, just as in the excerpt above. I used those flags when I wanted to test the results of my program in double precision on the GTX 295 card.

John Stratton
Offline
Joined: 12/08/2009
Code snippets?

Without code snippets it's harder to figure out.  Can you post just the lines of code causing the segfault, as well as the lines declaring any of the variables used in those lines?

--John

ArdiSSchool10
Offline
Joined: 07/08/2010

Here are the snippets. I should stress that with CUDA 2.3 on the GTX 295 I do not get a segmentation fault at this point. The problem occurs only when I run the program on the server with the Fermi card and CUDA 3.0.

int* mat;

// mat_length_max is a constant (in my case it has a value of 40) and VProd is a macro
// which does cells.x*cells.y*cells.z (and in my case cells.x=cells.y=cells.z=2)
size_mat = mat_length_max * VProd(cells) * sizeof(int);

// here there is a segmentation fault and this is the first cudaMalloc of the program
cudaMalloc((void**)&mat, size_mat);

cudaError_t error = cudaGetLastError();
if (error != cudaSuccess)
    printf("cudaMalloc: %s\n", cudaGetErrorString(error));
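With the values given above (40 * 2*2*2 * sizeof(int)), the request is only about 1280 bytes, so the size itself is unlikely to be the issue. As a minimal sketch, the status returned by cudaMalloc can also be checked directly instead of going through cudaGetLastError. The helper name `checked_device_alloc` is hypothetical and not part of the original program, and a crash inside the call itself would still not be reported this way.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original program): allocates `bytes` of
// device memory and reports the status returned by cudaMalloc itself.
static cudaError_t checked_device_alloc(int** out, size_t bytes)
{
    *out = NULL;
    cudaError_t status = cudaMalloc((void**)out, bytes);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc(%lu bytes) failed: %s\n",
                (unsigned long)bytes, cudaGetErrorString(status));
        *out = NULL;  // never hand back a pointer from a failed allocation
    }
    return status;
}
```

A call such as `checked_device_alloc(&mat, size_mat)` would replace the bare cudaMalloc line, assuming `mat` and `size_mat` are declared as in the snippet.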

 

John Stratton
Offline
Joined: 12/08/2009

Yeah, I can't see anything obviously missing there.  I assume all of that is from the body of the same function?

It might mean there's a bug in your system or the CUDA version you installed, or it's a more complex problem than that. 
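One way to look for an installation mismatch is to compare the CUDA version the driver supports with the version of the runtime the program is linked against. The sketch below uses cudaDriverGetVersion and cudaRuntimeGetVersion. The helper names are hypothetical, and a passing check does not rule out other installation problems.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Versions are encoded as 1000*major + 10*minor, e.g. 3000 for CUDA 3.0.
// Hypothetical helper: true when the driver supports at least the runtime version.
static bool driver_supports_runtime(int driver_version, int runtime_version)
{
    return driver_version >= runtime_version;
}

// Hypothetical helper: prints both versions and warns on a mismatch.
static void report_cuda_versions(void)
{
    int driver_version = 0;
    int runtime_version = 0;
    cudaError_t d = cudaDriverGetVersion(&driver_version);
    cudaError_t r = cudaRuntimeGetVersion(&runtime_version);
    if (d != cudaSuccess || r != cudaSuccess) {
        fprintf(stderr, "could not query CUDA versions\n");
        return;
    }
    printf("driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
           driver_version / 1000, (driver_version % 1000) / 10,
           runtime_version / 1000, (runtime_version % 1000) / 10);
    if (!driver_supports_runtime(driver_version, runtime_version))
        printf("warning: the driver is older than the runtime\n");
}
```

Calling `report_cuda_versions()` at the very start of main, before any allocation, keeps the check independent of the failing code path.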

If you can run it in a debugger to figure out exactly what the address is that you're segfaulting on, and check the values exposed to that function (address of mat, size_mat, etc.) you might be able to track it down. 

 

--John

ArdiSSchool10
Offline
Joined: 07/08/2010

Hi John,

thank you for your answer. I guess the problem was with the CUDA drivers. It seems the system administrators of the server have since done some reinstallation.

I now have a different problem: the program runs normally, but when I compile it with the -g -G options and debug it with cuda-gdb, it fails with a segmentation fault at exactly the lines I described above.

The system has 2 GPU cards: 1 Fermi and 1 Tesla card. Do I have to explicitly target one specific GPU card, or is this likely a different problem?

Thank you,

Ardita

 

John Stratton
Offline
Joined: 12/08/2009

Most likely another problem: by default the driver should automatically pick one of the GPUs for you.  You can make sure it's not a problem by adding a cudaSetDevice call to initialize the runtime and pick a specific device before the malloc.
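A minimal sketch of that suggestion, assuming the device index is passed in by the caller. The helper name `select_device_then_alloc` is hypothetical and not taken from the program in this thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: selects a device explicitly, then performs the first
// allocation, checking the status of each call separately.
static cudaError_t select_device_then_alloc(int device, int** out, size_t bytes)
{
    *out = NULL;

    // Pick the device before any other runtime call touches the GPU.
    cudaError_t status = cudaSetDevice(device);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice(%d) failed: %s\n",
                device, cudaGetErrorString(status));
        return status;
    }

    status = cudaMalloc((void**)out, bytes);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed on device %d: %s\n",
                device, cudaGetErrorString(status));
        *out = NULL;
    }
    return status;
}
```

Checking the two calls separately shows whether a failure comes from device selection or from the allocation itself.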

ArdiSSchool10
Offline
Joined: 07/08/2010

The messages I have from cuda-gdb are:

Breakpoint 1, cuda_allocation () at kernel.cu:133
133 size_mat=mat_length_max*VProd(cells)*sizeof(int);
Current language: auto; currently c++
(cuda-gdb) n
134 size_site=nSites*sizeof(Site);
(cuda-gdb)
135 size_neigh=nebrTabMax*sizeof(int);
(cuda-gdb)
136 size_nebrEl=nSites*sizeof(int);
(cuda-gdb)
137 size_inertia = nTypes;
(cuda-gdb)
138 size_massDev = nTypes;
(cuda-gdb)
140 size_interactionTypeDev1 = nebrTabMax;
(cuda-gdb)
141 size_sigDev = nTypes*nTypes;
(cuda-gdb)
142 size_epsDev = nTypes*nTypes;
(cuda-gdb)
143 size_productElectroMomentsDev = nTypes*nTypes;
(cuda-gdb)
145 widthTex = ceil(sqrt(nSites));
(cuda-gdb)
147 channelDesc_V = cudaCreateChannelDesc();
(cuda-gdb)
149 cudaMalloc((void**)&mat, size_mat);
(cuda-gdb)
Segmentation fault

I added a cudaSetDevice(0) call at the beginning of this function, and I check for any CUDA errors with cudaGetLastError.

I also added an error check after the cudaCreateChannelDesc() call.

The situation doesn't change. The only thing I noticed is that no CUDA error is reported before the segmentation fault.

But when I did the same with cudaSetDevice(1) instead of device 0 (on the server, the device query detects device 0, which is a Fermi card, and devices 1 and 2, which are GTX295 cards), I got a CUDA error after cudaSetDevice(1): invalid argument
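Since cudaSetDevice uses the runtime's own device numbering, it can help to print what the runtime sees from inside the program itself. The sketch below lists each device with its compute capability and returns the first one at 2.0 or above (Fermi is sm_20, while the GTX 295 is compute capability 1.3). The helper names are hypothetical, and this does not by itself explain the "invalid argument" error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Fermi corresponds to compute capability 2.x. Older runtimes report 9999 for
// the emulation placeholder device, so that value is excluded.
static bool is_fermi_or_newer(int major)
{
    return major >= 2 && major != 9999;
}

// Hypothetical helper: lists the devices the runtime sees and returns the
// index of the first one with compute capability >= 2.0, or -1 if none.
static int find_first_fermi_device(void)
{
    int count = 0;
    cudaError_t status = cudaGetDeviceCount(&count);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(status));
        return -1;
    }

    int chosen = -1;
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        status = cudaGetDeviceProperties(&prop, dev);
        if (status != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                    dev, cudaGetErrorString(status));
            continue;
        }
        printf("device %d: %s, compute capability %d.%d\n",
               dev, prop.name, prop.major, prop.minor);
        if (chosen < 0 && is_fermi_or_newer(prop.major))
            chosen = dev;
    }
    return chosen;
}
```

The returned index can then be passed to cudaSetDevice, provided it is not -1.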

ArdiSSchool10
Offline
Joined: 07/08/2010