Global device variables in CUDA: bad practice?
I am designing a library that has a large contingent of CUDA kernels to perform parallel computations. The kernels act on a common object, a computational grid, which is defined using C++-style objects. The computational domain doesn't need to be accessed on the host side, so creating it on the device side and keeping it there makes sense for now. I'm wondering if the following is considered "good practice":
Suppose the computational grid class is called `domain`. First, define a global device-side variable to store the computational domain:

```cuda
__device__ domain* d;
```
Then initialize the computational domain using a CUDA kernel:

```cuda
__global__ void initDomain(ParamType p) {
    d = new domain(p);
}
```
Then perform computations using the domain in other kernels:

```cuda
__global__ void doComputation(double* x, double* y) {
    d->doThing(x, y);
    //...
}
```
If the domain remains fixed (i.e. the kernels don't modify the domain once it's created), is this OK? Is there a better way? I tried creating the `domain` object on the host side and copying it to the device, but it turned out to be a hassle because `domain` is a relatively complex type that is a pain to copy over using e.g. `cudaMemcpy` or `thrust::device_new` (at least, I couldn't get it to work nicely).
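For reference, a minimal self-contained sketch of the pattern described in the question; the `domain` class here is a hypothetical stand-in, and error checking is omitted for brevity:

```cuda
#include <cstdio>

// Hypothetical stand-in for the real, more complex domain class.
struct domain {
    int n;
    __device__ domain(int n_) : n(n_) {}
    __device__ void doThing(const double* x, double* y) const {
        for (int i = 0; i < n; ++i) y[i] = 2.0 * x[i];
    }
};

// Global device-side pointer to the domain.
__device__ domain* d;

__global__ void initDomain(int n) { d = new domain(n); }
__global__ void freeDomain()      { delete d; }

__global__ void doComputation(const double* x, double* y) {
    d->doThing(x, y);
}

int main() {
    const int n = 4;
    double *x, *y;
    cudaMallocManaged(&x, n * sizeof(double));
    cudaMallocManaged(&y, n * sizeof(double));
    for (int i = 0; i < n; ++i) x[i] = i;

    initDomain<<<1, 1>>>(n);       // construct the domain on the device
    doComputation<<<1, 1>>>(x, y); // use it from another kernel
    freeDomain<<<1, 1>>>();        // device-side delete to match device-side new
    cudaDeviceSynchronize();

    for (int i = 0; i < n; ++i) printf("%f\n", y[i]);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Note that memory allocated with device-side `new` comes from the device heap and must be freed with device-side `delete`; the heap size can be raised via `cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...)` if the domain is large.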
Yes, it's OK.
Maybe you can improve performance by using `__constant__` memory: with this keyword, the object is available in all kernels in fast, cached memory.
In order to copy the object, you must use `cudaMemcpyToSymbol`. Please note that there are some restrictions: the object is read-only in device code, and it must not have a non-trivial default constructor (`__constant__` variables cannot be dynamically initialized).
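A minimal sketch of the `__constant__` approach, assuming the domain can be reduced to a flat POD struct (the field names here are made up for illustration):

```cuda
// Hypothetical POD-style domain: no constructor, so it is legal
// as a __constant__ variable (no dynamic initialization allowed).
struct domain {
    int    nx, ny;
    double dx, dy;
};

__constant__ domain d;  // read-only in device code, served from the constant cache

__global__ void doComputation(const double* x, double* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Every thread reads d from fast constant memory.
    if (i < d.nx * d.ny) y[i] = d.dx * x[i];
}

int main() {
    domain h = {64, 64, 0.1, 0.1};              // build the object on the host
    cudaMemcpyToSymbol(d, &h, sizeof(domain));  // copy it into constant memory
    // ... allocate x, y and launch doComputation ...
    return 0;
}
```

Constant memory works best when all threads in a warp read the same address, which is exactly the access pattern for shared grid parameters like these.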
You can find more information here.
If the object is complex and hard to copy, maybe you can try unified memory, or pass the variable by value to the kernel.
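For the unified-memory route, one possible sketch (again with a hypothetical `domain`): construct the object with placement-new inside a managed allocation, so host and device dereference the same pointer:

```cuda
#include <new>  // placement new

struct domain {
    int    n;
    double dx;
    domain(int n_, double dx_) : n(n_), dx(dx_) {}
    __host__ __device__ void doThing(const double* x, double* y) const {
        for (int i = 0; i < n; ++i) y[i] = dx * x[i];
    }
};

// The domain is passed explicitly instead of through a global variable.
__global__ void doComputation(const domain* d, const double* x, double* y) {
    d->doThing(x, y);
}

int main() {
    void* buf;
    cudaMallocManaged(&buf, sizeof(domain));
    domain* d = new (buf) domain(16, 0.5);  // construct on the host, in managed memory

    // ... allocate x, y and launch doComputation<<<grid, block>>>(d, x, y) ...

    cudaDeviceSynchronize();
    d->~domain();   // placement-new requires an explicit destructor call
    cudaFree(buf);
    return 0;
}
```

If the object is small and trivially copyable, passing it by value as a kernel argument is even simpler: kernel arguments land in constant memory automatically, with no explicit copy call at all.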