Global device variables in CUDA: bad practice?


I am designing a library that has a large contingent of CUDA kernels to perform parallel computations. The kernels act on a common object, the computational grid, which is defined using C++-style objects. The computational domain doesn't need to be accessed on the host side, so creating it on the device side and keeping it there makes sense for now. I'm wondering if the following is considered "good practice":

  • Suppose the computational grid class is called domain. First, define a global device-side variable to store the computational domain:

    __device__ domain* d;
  • Then, initialize the computational domain using a CUDA kernel:

    __global__ void initdomain(paramtype p)
    {
        d = new domain(p);
    }
  • Then, perform computations using the domain in other kernels:

    __global__ void docomputation(double *x, double *y)
    {
        d->dothing(x, y);
        // ...
    }

If the domain remains fixed (i.e. the kernels don't modify the domain once it's created), is this OK? Is there a better way? I tried creating the domain object on the host side and copying it to the device, but that turned out to be a hassle because domain is a relatively complex type, which makes it a pain to copy over using e.g. cudaMemcpy or thrust::device_new (at least, I couldn't get it to work nicely).
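Putting the three steps above together, a minimal self-contained sketch of the pattern might look like the following. The domain, paramtype, initdomain, docomputation, and dothing names come from the question; the members and the cleanup kernel are illustrative assumptions.

```cuda
#include <cstdio>

// Illustrative stand-in for the question's parameter type.
struct paramtype { int n; };

// Illustrative stand-in for the question's domain class.
struct domain {
    int n;
    __device__ domain(paramtype p) : n(p.n) {}
    __device__ void dothing(double *x, double *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = 2.0 * x[i];
    }
};

// Global device-side pointer to the computational domain.
__device__ domain* d;

// Construct the domain once, on the device, from a single thread.
// Device-side `new` allocates from the device heap.
__global__ void initdomain(paramtype p) { d = new domain(p); }

// All other kernels share the same global pointer.
__global__ void docomputation(double *x, double *y) { d->dothing(x, y); }

// Device-side `new` must be matched by device-side `delete`.
__global__ void freedomain() { delete d; }

int main() {
    const int n = 256;
    double *x, *y;
    cudaMalloc(&x, n * sizeof(double));
    cudaMalloc(&y, n * sizeof(double));

    initdomain<<<1, 1>>>(paramtype{n});  // one thread builds the object
    docomputation<<<1, n>>>(x, y);
    freedomain<<<1, 1>>>();
    cudaDeviceSynchronize();

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Note that memory allocated with device-side `new` cannot be freed with cudaFree from the host, hence the separate freedomain kernel.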

Yes, it's OK.

You may be able to improve performance by using

__constant__

With this keyword, the object is available to all kernels in fast, cached constant memory.

To copy the object over, you must use cudaMemcpyToSymbol. Please note that there are some restrictions: the object is read-only in device code, and its type must not require dynamic initialization (in practice, it needs a trivial default constructor).

You can find more information here.
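A hedged sketch of the __constant__ approach described above. The gridparams type here is an assumed, trivially-copyable stand-in for the question's grid object, since __constant__ variables cannot have dynamic initialization:

```cuda
#include <cstdio>

// A trivially-copyable POD type: __constant__ variables may not have
// non-trivial constructors or dynamic initialization.
struct gridparams {
    int nx, ny;
    double dx, dy;
};

// The object lives in fast, cached constant memory; read-only in kernels.
__constant__ gridparams g;

__global__ void usegrid(double *out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < g.nx * g.ny) out[i] = (i % g.nx) * g.dx;
}

int main() {
    gridparams h = {64, 64, 0.1, 0.1};
    // Copy from host memory to the __constant__ symbol.
    cudaMemcpyToSymbol(g, &h, sizeof(gridparams));

    double *out;
    cudaMalloc(&out, h.nx * h.ny * sizeof(double));
    usegrid<<<(h.nx * h.ny + 255) / 256, 256>>>(out);
    cudaDeviceSynchronize();
    cudaFree(out);
    return 0;
}
```

Constant memory pays off when all threads in a warp read the same address, which is typical for shared grid parameters.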

If the object is complex and hard to copy, maybe you can use unified memory instead, and pass the variable to the kernel as a parameter.
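A minimal sketch of the unified-memory alternative, under the assumption that domain can be constructed on the host: allocate the object with cudaMallocManaged, construct it in place, and pass the pointer to kernels as an ordinary argument (the member names are again taken from the question, the rest is illustrative).

```cuda
#include <new>  // placement new

struct domain {
    int n;
    domain(int n) : n(n) {}  // host-side construction is possible here
    __device__ void dothing(double *x, double *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = 2.0 * x[i];
    }
};

// The domain is passed explicitly instead of living in a global variable.
__global__ void docomputation(domain *d, double *x, double *y) {
    d->dothing(x, y);
}

int main() {
    const int n = 256;

    // Allocate the object in unified (managed) memory, construct it in
    // place on the host, then hand the same pointer to kernels directly.
    domain *d;
    cudaMallocManaged(&d, sizeof(domain));
    new (d) domain(n);

    double *x, *y;
    cudaMallocManaged(&x, n * sizeof(double));
    cudaMallocManaged(&y, n * sizeof(double));

    docomputation<<<1, n>>>(d, x, y);
    cudaDeviceSynchronize();

    d->~domain();  // placement new requires an explicit destructor call
    cudaFree(d); cudaFree(x); cudaFree(y);
    return 0;
}
```

This sidesteps the deep-copy problem from the question: both host and device dereference the same managed pointer, so no cudaMemcpy of the object is needed.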

