Define Documentation

#define CU_DEVICE_CPU ((CUdevice)-1)

Device that represents the CPU

#define CU_DEVICE_INVALID ((CUdevice)-2)

Device that represents an invalid device

#define CU_IPC_HANDLE_SIZE 64

CUDA IPC handle size

#define CU_LAUNCH_PARAM_BUFFER_POINTER ((void*)0x01)

Indicator that the next value in the extra parameter to cuLaunchKernel will be a pointer to a buffer containing all kernel parameters used for launching kernel f. This buffer needs to honor all alignment/padding requirements of the individual parameters. If CU_LAUNCH_PARAM_BUFFER_SIZE is not also specified in the extra array, then CU_LAUNCH_PARAM_BUFFER_POINTER will have no effect.

#define CU_LAUNCH_PARAM_BUFFER_SIZE ((void*)0x02)

Indicator that the next value in the extra parameter to cuLaunchKernel will be a pointer to a size_t which contains the size of the buffer specified with CU_LAUNCH_PARAM_BUFFER_POINTER. It is required that CU_LAUNCH_PARAM_BUFFER_POINTER also be specified in the extra array if the value associated with CU_LAUNCH_PARAM_BUFFER_SIZE is not zero.

#define CU_LAUNCH_PARAM_END ((void*)0x00)

End of array terminator for the extra parameter to cuLaunchKernel
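The three CU_LAUNCH_PARAM_* values above work together: the extra array is a flat list of indicator/value pairs terminated by CU_LAUNCH_PARAM_END, and the buffer must pack the kernel's parameters with their natural alignment. The sketch below illustrates the packing for a hypothetical kernel taking (int n, float *data); the macro values are mirrored from the defines above, while pack_args and the kernel signature are illustrative, not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values mirrored from cuda.h (see the defines above). */
#define CU_LAUNCH_PARAM_BUFFER_POINTER ((void*)0x01)
#define CU_LAUNCH_PARAM_BUFFER_SIZE    ((void*)0x02)
#define CU_LAUNCH_PARAM_END            ((void*)0x00)

/* Round off up to the next multiple of align (align must be a power of two). */
#define ALIGN_UP(off, align) (((off) + (align) - 1) & ~((size_t)(align) - 1))

/* Hypothetical helper: pack the arguments of a kernel taking (int n,
 * float *data) into buf, honoring each parameter's natural alignment,
 * and return the packed size. */
static size_t pack_args(char *buf, int n, float *data)
{
    size_t off = 0;
    memcpy(buf + off, &n, sizeof n);        /* int is aligned at offset 0 */
    off += sizeof n;
    off = ALIGN_UP(off, sizeof data);       /* pad so the pointer is aligned */
    memcpy(buf + off, &data, sizeof data);
    off += sizeof data;
    return off;
}

/* The buffer and its size would then be handed to cuLaunchKernel via the
 * extra parameter (kernelParams passed as NULL), e.g.:
 *
 *   void *extra[] = {
 *       CU_LAUNCH_PARAM_BUFFER_POINTER, buf,
 *       CU_LAUNCH_PARAM_BUFFER_SIZE,    &size,
 *       CU_LAUNCH_PARAM_END
 *   };
 *   cuLaunchKernel(f, gx, gy, gz, bx, by, bz, shmem, stream, NULL, extra);
 */
```

Note that CU_LAUNCH_PARAM_BUFFER_SIZE takes a pointer to a size_t, not the size itself, which is why &size appears in the extra array.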

#define CU_MEMHOSTALLOC_DEVICEMAP 0x02

If set, host memory is mapped into CUDA address space and cuMemHostGetDevicePointer() may be called on the host pointer. Flag for cuMemHostAlloc()

#define CU_MEMHOSTALLOC_PORTABLE 0x01

If set, host memory is portable between CUDA contexts. Flag for cuMemHostAlloc()

#define CU_MEMHOSTALLOC_WRITECOMBINED 0x04

If set, host memory is allocated as write-combined - fast to write, faster to DMA, slow to read except via SSE4 streaming load instruction (MOVNTDQA). Flag for cuMemHostAlloc()

#define CU_MEMHOSTREGISTER_DEVICEMAP 0x02

If set, host memory is mapped into CUDA address space and cuMemHostGetDevicePointer() may be called on the host pointer. Flag for cuMemHostRegister()

#define CU_MEMHOSTREGISTER_IOMEMORY 0x04

If set, the passed memory pointer is treated as pointing to some memory-mapped I/O space, e.g. belonging to a third-party PCIe device. On Windows the flag is a no-op. On Linux that memory is marked as non cache-coherent for the GPU and is expected to be physically contiguous. It may return CUDA_ERROR_NOT_PERMITTED if run as an unprivileged user, CUDA_ERROR_NOT_SUPPORTED on older Linux kernel versions. On all other platforms, it is not supported and CUDA_ERROR_NOT_SUPPORTED is returned. Flag for cuMemHostRegister()

#define CU_MEMHOSTREGISTER_PORTABLE 0x01

If set, host memory is portable between CUDA contexts. Flag for cuMemHostRegister()

#define CU_PARAM_TR_DEFAULT -1

For texture references loaded into the module, use default texunit from texture reference.

#define CU_STREAM_LEGACY ((CUstream)0x1)

Legacy stream handle

Stream handle that can be passed as a CUstream to use an implicit stream with legacy synchronization behavior.

See the driver API documentation on default stream synchronization behavior for details.

#define CU_STREAM_PER_THREAD ((CUstream)0x2)

Per-thread stream handle

Stream handle that can be passed as a CUstream to use an implicit stream with per-thread synchronization behavior.

See the driver API documentation on per-thread default stream synchronization behavior for details.

#define CU_TRSA_OVERRIDE_FORMAT 0x01

Override the texref format with a format inferred from the array. Flag for cuTexRefSetArray()

#define CU_TRSF_NORMALIZED_COORDINATES 0x02

Use normalized texture coordinates in the range [0,1) instead of [0,dim). Flag for cuTexRefSetFlags()
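The two coordinate conventions can be contrasted with a small sketch. The helpers below are illustrative, not part of the API; the (i + 0.5)/dim texel-centre formula follows the CUDA texture fetching rules, assuming clamped addressing.

```c
#include <assert.h>

/* Without CU_TRSF_NORMALIZED_COORDINATES, coordinates address texels in
 * [0,dim); with the flag set, the same texel centres lie in [0,1). */
static float texel_centre_unnormalized(int i)
{
    return i + 0.5f;                    /* centre of texel i, range [0,dim) */
}

static float texel_centre_normalized(int i, int dim)
{
    return (i + 0.5f) / (float)dim;     /* same centre, range [0,1) */
}
```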

#define CU_TRSF_READ_AS_INTEGER 0x01

Read the texture as integers rather than promoting the values to floats in the range [0,1]. Flag for cuTexRefSetFlags()

#define CU_TRSF_SRGB 0x10

Perform sRGB->linear conversion during texture read. Flag for cuTexRefSetFlags()

#define CUDA_ARRAY3D_2DARRAY 0x01

Deprecated, use CUDA_ARRAY3D_LAYERED

#define CUDA_ARRAY3D_CUBEMAP 0x04

If set, the CUDA array is a collection of six 2D arrays, representing faces of a cube. The width of such a CUDA array must be equal to its height, and Depth must be six. If CUDA_ARRAY3D_LAYERED flag is also set, then the CUDA array is a collection of cubemaps and Depth must be a multiple of six.

#define CUDA_ARRAY3D_DEPTH_TEXTURE 0x10

If set, this flag indicates that the CUDA array is a DEPTH_TEXTURE.

#define CUDA_ARRAY3D_LAYERED 0x01

If set, the CUDA array is a collection of layers, where each layer is either a 1D or a 2D array and the Depth member of CUDA_ARRAY3D_DESCRIPTOR specifies the number of layers, not the depth of a 3D array.
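To make the Depth-as-layer-count rule concrete, the sketch below describes a 2D layered array of four 256x256 layers. The struct is a local mirror of the relevant CUDA_ARRAY3D_DESCRIPTOR fields for illustration only; real code would include cuda.h and pass the driver's descriptor to cuArray3DCreate().

```c
#include <assert.h>
#include <stddef.h>

#define CUDA_ARRAY3D_LAYERED 0x01   /* mirrored from the define above */

/* Illustrative subset of CUDA_ARRAY3D_DESCRIPTOR (Format and NumChannels
 * omitted for brevity). */
typedef struct {
    size_t Width, Height, Depth;
    unsigned int Flags;
} Array3DDescSketch;

/* With CUDA_ARRAY3D_LAYERED set, Depth counts layers, not the extent of
 * a third dimension: here, 4 layers of 256x256 elements each. */
static Array3DDescSketch make_layered_desc(void)
{
    Array3DDescSketch d = { 256, 256, 4, CUDA_ARRAY3D_LAYERED };
    return d;
}
```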

#define CUDA_ARRAY3D_SURFACE_LDST 0x02

This flag must be set in order to bind a surface reference to the CUDA array

#define CUDA_ARRAY3D_TEXTURE_GATHER 0x08

This flag must be set in order to perform texture gather operations on a CUDA array.

#define CUDA_VERSION 8000

CUDA API version number

#define MAX_PLANES 3

Maximum number of planes per frame

Typedef Documentation

typedef struct CUarray_st* CUarray

CUDA array

typedef struct CUctx_st* CUcontext

CUDA context

typedef int CUdevice

CUDA device

typedef unsigned int CUdeviceptr

CUDA device pointer. CUdeviceptr is defined as an unsigned integer type whose size matches the size of a pointer on the target platform.

enum CUeglFrameType

CUDA EglFrame type - array or pointer

enum CUeglResourceLocationFlags

Resource location flags: sysmem or vidmem. If the producer resides in sysmem and CU_EGL_RESOURCE_LOCATION_VIDMEM is set, an additional copy of the resource from sysmem to vidmem is involved.

Enumerator:

CU_EGL_RESOURCE_LOCATION_SYSMEM

Resource location sysmem

CU_EGL_RESOURCE_LOCATION_VIDMEM

Resource location vidmem

enum CUevent_flags

Event creation flags

Enumerator:

CU_EVENT_DEFAULT

Default event flag

CU_EVENT_BLOCKING_SYNC

Event uses blocking synchronization

CU_EVENT_DISABLE_TIMING

Event will not record timing data

CU_EVENT_INTERPROCESS

Event is suitable for interprocess use. CU_EVENT_DISABLE_TIMING must be set

enum CUfilter_mode

Texture reference filtering modes

Enumerator:

CU_TR_FILTER_MODE_POINT

Point filter mode

CU_TR_FILTER_MODE_LINEAR

Linear filter mode

enum CUfunc_cache

Function cache configurations

Enumerator:

CU_FUNC_CACHE_PREFER_NONE

no preference for shared memory or L1 (default)

CU_FUNC_CACHE_PREFER_SHARED

prefer larger shared memory and smaller L1 cache

CU_FUNC_CACHE_PREFER_L1

prefer larger L1 cache and smaller shared memory

CU_FUNC_CACHE_PREFER_EQUAL

prefer equal sized L1 cache and shared memory

enum CUfunction_attribute

Function properties

Enumerator:

CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK

The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.

CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES

The size in bytes of statically-allocated shared memory required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.

CU_FUNC_ATTRIBUTE_CONST_SIZE_BYTES

The size in bytes of user-allocated constant memory required by this function.

CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES

The size in bytes of local memory used by each thread of this function.

CU_FUNC_ATTRIBUTE_NUM_REGS

The number of registers used by each thread of this function.

CU_FUNC_ATTRIBUTE_PTX_VERSION

The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13. Note that this may return the undefined value of 0 for cubins compiled prior to CUDA 3.0.

CU_FUNC_ATTRIBUTE_BINARY_VERSION

The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13. Note that this will return a value of 10 for legacy cubins that do not have a properly-encoded binary architecture version.
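The encoding used by both version attributes (major * 10 + minor) can be decoded trivially; the helper names below are illustrative, not part of the API.

```c
#include <assert.h>

/* Decode an encoded version value: 13 denotes 1.3, 35 denotes 3.5. */
static int version_major(int v) { return v / 10; }
static int version_minor(int v) { return v % 10; }
```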

CU_FUNC_ATTRIBUTE_CACHE_MODE_CA

Indicates whether the function has been compiled with the user-specified option '-Xptxas --dlcm=ca' set.

enum CUgraphicsMapResourceFlags

Flags for mapping and unmapping interop resources

enum CUgraphicsRegisterFlags

Flags to register a graphics resource

enum CUipcMem_flags

CUDA Ipc Mem Flags

Enumerator:

CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS

Automatically enable peer access between remote devices as needed

enum CUjit_cacheMode

Caching modes for dlcm

Enumerator:

CU_JIT_CACHE_OPTION_NONE

Compile with no -dlcm flag specified

CU_JIT_CACHE_OPTION_CG

Compile with L1 cache disabled

CU_JIT_CACHE_OPTION_CA

Compile with L1 cache enabled

enum CUjit_fallback

Cubin matching fallback strategies

Enumerator:

CU_PREFER_PTX

Prefer to compile ptx if exact binary match not found

CU_PREFER_BINARY

Prefer to fall back to compatible binary code if exact match not found

enum CUjit_option

Online compiler and linker options

Enumerator:

CU_JIT_MAX_REGISTERS

Max number of registers that a thread may use.

Option type: unsigned int

Applies to: compiler only

CU_JIT_THREADS_PER_BLOCK

IN: Specifies minimum number of threads per block to target compilation for

OUT: Returns the number of threads the compiler actually targeted. This restricts the resource utilization of the compiler (e.g. max registers) such that a block with the given number of threads should be able to launch based on register limitations. Note that this option does not currently take into account any other resource limitations, such as shared memory utilization.

Cannot be combined with CU_JIT_TARGET.

Option type: unsigned int

Applies to: compiler only

CU_JIT_WALL_TIME

Overwrites the option value with the total wall clock time, in milliseconds, spent in the compiler and linker

Option type: float

Applies to: compiler and linker

CU_JIT_INFO_LOG_BUFFER

Pointer to a buffer in which to print any log messages that are informational in nature (the buffer size is specified via option CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES)

The remaining JIT options are used for internal purposes only in this version of CUDA.

enum CUjit_target

Online compilation targets

Enumerator:

CU_TARGET_COMPUTE_10

Compute device class 1.0

CU_TARGET_COMPUTE_11

Compute device class 1.1

CU_TARGET_COMPUTE_12

Compute device class 1.2

CU_TARGET_COMPUTE_13

Compute device class 1.3

CU_TARGET_COMPUTE_20

Compute device class 2.0

CU_TARGET_COMPUTE_21

Compute device class 2.1

CU_TARGET_COMPUTE_30

Compute device class 3.0

CU_TARGET_COMPUTE_32

Compute device class 3.2

CU_TARGET_COMPUTE_35

Compute device class 3.5

CU_TARGET_COMPUTE_37

Compute device class 3.7

CU_TARGET_COMPUTE_50

Compute device class 5.0

CU_TARGET_COMPUTE_52

Compute device class 5.2

CU_TARGET_COMPUTE_53

Compute device class 5.3

CU_TARGET_COMPUTE_60

Compute device class 6.0

CU_TARGET_COMPUTE_61

Compute device class 6.1

CU_TARGET_COMPUTE_62

Compute device class 6.2

enum CUjitInputType

Device code formats

Enumerator:

CU_JIT_INPUT_CUBIN

Compiled device-class-specific device code

Applicable options: none

CU_JIT_INPUT_PTX

PTX source code

Applicable options: PTX compiler options

CU_JIT_INPUT_FATBINARY

Bundle of multiple cubins and/or PTX of some device code

Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY

CU_JIT_INPUT_OBJECT

Host object with embedded device code

Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY

CU_JIT_INPUT_LIBRARY

Archive of host objects with embedded device code

Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY

enum CUlimit

Limits

Enumerator:

CU_LIMIT_STACK_SIZE

GPU thread stack size

CU_LIMIT_PRINTF_FIFO_SIZE

GPU printf FIFO size

CU_LIMIT_MALLOC_HEAP_SIZE

GPU malloc heap size

CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH

GPU device runtime launch synchronize depth

CU_LIMIT_DEV_RUNTIME_PENDING_LAUNCH_COUNT

GPU device runtime pending launch count

enum CUmem_advise

Memory advise values

Enumerator:

CU_MEM_ADVISE_SET_READ_MOSTLY

Data will mostly be read and only occasionally be written to

CU_MEM_ADVISE_UNSET_READ_MOSTLY

Undo the effect of CU_MEM_ADVISE_SET_READ_MOSTLY

CU_MEM_ADVISE_SET_PREFERRED_LOCATION

Set the preferred location for the data as the specified device

CU_MEM_ADVISE_UNSET_PREFERRED_LOCATION

Clear the preferred location for the data

CU_MEM_ADVISE_SET_ACCESSED_BY

Data will be accessed by the specified device, so prevent page faults as much as possible

CU_MEM_ADVISE_UNSET_ACCESSED_BY

Let the Unified Memory subsystem decide on the page faulting policy for the specified device

enum CUmem_range_attribute

Memory range attributes

Enumerator:

CU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY

Whether the range will mostly be read and only occasionally be written to

CU_MEM_RANGE_ATTRIBUTE_PREFERRED_LOCATION

The preferred location of the range

CU_MEM_RANGE_ATTRIBUTE_ACCESSED_BY

Memory range has CU_MEM_ADVISE_SET_ACCESSED_BY set for specified device

CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION

The last location to which the range was prefetched

enum CUmemAttach_flags

CUDA Mem Attach Flags

Enumerator:

CU_MEM_ATTACH_GLOBAL

Memory can be accessed by any stream on any device

CU_MEM_ATTACH_HOST

Memory cannot be accessed by any stream on any device

CU_MEM_ATTACH_SINGLE

Memory can only be accessed by a single stream on the associated device

enum CUmemorytype

Memory types

Enumerator:

CU_MEMORYTYPE_HOST

Host memory

CU_MEMORYTYPE_DEVICE

Device memory

CU_MEMORYTYPE_ARRAY

Array memory

CU_MEMORYTYPE_UNIFIED

Unified device or host memory

enum CUoccupancy_flags

Occupancy calculator flag

Enumerator:

CU_OCCUPANCY_DEFAULT

Default behavior

CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE

Assume global caching is enabled and cannot be automatically turned off

enum CUpointer_attribute

Pointer information

Enumerator:

CU_POINTER_ATTRIBUTE_CONTEXT

The CUcontext on which a pointer was allocated or registered

CU_POINTER_ATTRIBUTE_MEMORY_TYPE

The CUmemorytype describing the physical location of a pointer

CU_POINTER_ATTRIBUTE_DEVICE_POINTER

The address at which a pointer's memory may be accessed on the device

CU_POINTER_ATTRIBUTE_HOST_POINTER

The address at which a pointer's memory may be accessed on the host

CU_POINTER_ATTRIBUTE_P2P_TOKENS

A pair of tokens for use with the nv-p2p.h Linux kernel interface

CU_POINTER_ATTRIBUTE_SYNC_MEMOPS

Synchronize every synchronous memory operation initiated on this region

CU_POINTER_ATTRIBUTE_BUFFER_ID

A process-wide unique ID for an allocated memory region

CU_POINTER_ATTRIBUTE_IS_MANAGED

Indicates if the pointer points to managed memory

enum CUresourcetype

Resource types

Enumerator:

CU_RESOURCE_TYPE_ARRAY

Array resource

CU_RESOURCE_TYPE_MIPMAPPED_ARRAY

Mipmapped array resource

CU_RESOURCE_TYPE_LINEAR

Linear resource

CU_RESOURCE_TYPE_PITCH2D

Pitch 2D resource

enum CUresourceViewFormat

Resource view format

Enumerator:

CU_RES_VIEW_FORMAT_NONE

No resource view format (use underlying resource format)

CU_RES_VIEW_FORMAT_UINT_1X8

1 channel unsigned 8-bit integers

CU_RES_VIEW_FORMAT_UINT_2X8

2 channel unsigned 8-bit integers

CU_RES_VIEW_FORMAT_UINT_4X8

4 channel unsigned 8-bit integers

CU_RES_VIEW_FORMAT_SINT_1X8

1 channel signed 8-bit integers

CU_RES_VIEW_FORMAT_SINT_2X8

2 channel signed 8-bit integers

CU_RES_VIEW_FORMAT_SINT_4X8

4 channel signed 8-bit integers

CU_RES_VIEW_FORMAT_UINT_1X16

1 channel unsigned 16-bit integers

CU_RES_VIEW_FORMAT_UINT_2X16

2 channel unsigned 16-bit integers

CU_RES_VIEW_FORMAT_UINT_4X16

4 channel unsigned 16-bit integers

CU_RES_VIEW_FORMAT_SINT_1X16

1 channel signed 16-bit integers

CU_RES_VIEW_FORMAT_SINT_2X16

2 channel signed 16-bit integers

CU_RES_VIEW_FORMAT_SINT_4X16

4 channel signed 16-bit integers

CU_RES_VIEW_FORMAT_UINT_1X32

1 channel unsigned 32-bit integers

CU_RES_VIEW_FORMAT_UINT_2X32

2 channel unsigned 32-bit integers

CU_RES_VIEW_FORMAT_UINT_4X32

4 channel unsigned 32-bit integers

CU_RES_VIEW_FORMAT_SINT_1X32

1 channel signed 32-bit integers

CU_RES_VIEW_FORMAT_SINT_2X32

2 channel signed 32-bit integers

CU_RES_VIEW_FORMAT_SINT_4X32

4 channel signed 32-bit integers

CU_RES_VIEW_FORMAT_FLOAT_1X16

1 channel 16-bit floating point

CU_RES_VIEW_FORMAT_FLOAT_2X16

2 channel 16-bit floating point

CU_RES_VIEW_FORMAT_FLOAT_4X16

4 channel 16-bit floating point

CU_RES_VIEW_FORMAT_FLOAT_1X32

1 channel 32-bit floating point

CU_RES_VIEW_FORMAT_FLOAT_2X32

2 channel 32-bit floating point

CU_RES_VIEW_FORMAT_FLOAT_4X32

4 channel 32-bit floating point

CU_RES_VIEW_FORMAT_UNSIGNED_BC1

Block compressed 1

CU_RES_VIEW_FORMAT_UNSIGNED_BC2

Block compressed 2

CU_RES_VIEW_FORMAT_UNSIGNED_BC3

Block compressed 3

CU_RES_VIEW_FORMAT_UNSIGNED_BC4

Block compressed 4 unsigned

CU_RES_VIEW_FORMAT_SIGNED_BC4

Block compressed 4 signed

CU_RES_VIEW_FORMAT_UNSIGNED_BC5

Block compressed 5 unsigned

CU_RES_VIEW_FORMAT_SIGNED_BC5

Block compressed 5 signed

CU_RES_VIEW_FORMAT_UNSIGNED_BC6H

Block compressed 6 unsigned half-float

CU_RES_VIEW_FORMAT_SIGNED_BC6H

Block compressed 6 signed half-float

CU_RES_VIEW_FORMAT_UNSIGNED_BC7

Block compressed 7

enum CUresult

Error codes

Enumerator:

CUDA_SUCCESS

The API call returned with no errors. In the case of query calls, this can also mean that the operation being queried is complete (see cuEventQuery() and cuStreamQuery()).

CUDA_ERROR_INVALID_VALUE

This indicates that one or more of the parameters passed to the API call is not within an acceptable range of values.

CUDA_ERROR_OUT_OF_MEMORY

The API call failed because it was unable to allocate enough memory to perform the requested operation.

CUDA_ERROR_NOT_INITIALIZED

This indicates that the CUDA driver has not been initialized with cuInit() or that initialization has failed.

CUDA_ERROR_DEINITIALIZED

This indicates that the CUDA driver is in the process of shutting down.

CUDA_ERROR_PROFILER_DISABLED

This indicates that the profiler is not initialized for this run. This can happen when the application is running with external profiling tools like the Visual Profiler.

CUDA_ERROR_PROFILER_NOT_INITIALIZED

Deprecated

This error return is deprecated as of CUDA 5.0. It is no longer an error to attempt to enable/disable the profiling via cuProfilerStart or cuProfilerStop without initialization.

CUDA_ERROR_PROFILER_ALREADY_STARTED

Deprecated

This error return is deprecated as of CUDA 5.0. It is no longer an error to call cuProfilerStart() when profiling is already enabled.

CUDA_ERROR_PROFILER_ALREADY_STOPPED

Deprecated

This error return is deprecated as of CUDA 5.0. It is no longer an error to call cuProfilerStop() when profiling is already disabled.

CUDA_ERROR_NO_DEVICE

This indicates that no CUDA-capable devices were detected by the installed CUDA driver.

CUDA_ERROR_INVALID_DEVICE

This indicates that the device ordinal supplied by the user does not correspond to a valid CUDA device.

CUDA_ERROR_INVALID_IMAGE

This indicates that the device kernel image is invalid. This can also indicate an invalid CUDA module.

CUDA_ERROR_INVALID_CONTEXT

This most frequently indicates that there is no context bound to the current thread. This can also be returned if the context passed to an API call is not a valid handle (such as a context that has had cuCtxDestroy() invoked on it). This can also be returned if a user mixes different API versions (i.e. 3010 context with 3020 API calls). See cuCtxGetApiVersion() for more details.

CUDA_ERROR_CONTEXT_ALREADY_CURRENT

This indicated that the context being supplied as a parameter to the API call was already the active context.

Deprecated

This error return is deprecated as of CUDA 3.2. It is no longer an error to attempt to push the active context via cuCtxPushCurrent().

CUDA_ERROR_MAP_FAILED

This indicates that a map or register operation has failed.

CUDA_ERROR_UNMAP_FAILED

This indicates that an unmap or unregister operation has failed.

CUDA_ERROR_ARRAY_IS_MAPPED

This indicates that the specified array is currently mapped and thus cannot be destroyed.

CUDA_ERROR_ALREADY_MAPPED

This indicates that the resource is already mapped.

CUDA_ERROR_NO_BINARY_FOR_GPU

This indicates that there is no kernel image available that is suitable for the device. This can occur when a user specifies code generation options for a particular CUDA source file that do not include the corresponding device configuration.

CUDA_ERROR_ALREADY_ACQUIRED

This indicates that a resource has already been acquired.

CUDA_ERROR_NOT_MAPPED

This indicates that a resource is not mapped.

CUDA_ERROR_NOT_MAPPED_AS_ARRAY

This indicates that a mapped resource is not available for access as an array.

CUDA_ERROR_NOT_MAPPED_AS_POINTER

This indicates that a mapped resource is not available for access as a pointer.

CUDA_ERROR_ECC_UNCORRECTABLE

This indicates that an uncorrectable ECC error was detected during execution.

CUDA_ERROR_UNSUPPORTED_LIMIT

This indicates that the CUlimit passed to the API call is not supported by the active device.

CUDA_ERROR_CONTEXT_ALREADY_IN_USE

This indicates that the CUcontext passed to the API call can only be bound to a single CPU thread at a time but is already bound to a CPU thread.

CUDA_ERROR_PEER_ACCESS_UNSUPPORTED

This indicates that peer access is not supported across the given devices.

CUDA_ERROR_INVALID_PTX

This indicates that a PTX JIT compilation failed.

CUDA_ERROR_INVALID_GRAPHICS_CONTEXT

This indicates an error with the OpenGL or DirectX context.

CUDA_ERROR_NVLINK_UNCORRECTABLE

This indicates that an uncorrectable NVLink error was detected during the execution.

CUDA_ERROR_INVALID_SOURCE

This indicates that the device kernel source is invalid.

CUDA_ERROR_FILE_NOT_FOUND

This indicates that the file specified was not found.

CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND

This indicates that a link to a shared object failed to resolve.

CUDA_ERROR_SHARED_OBJECT_INIT_FAILED

This indicates that initialization of a shared object failed.

CUDA_ERROR_OPERATING_SYSTEM

This indicates that an OS call failed.

CUDA_ERROR_INVALID_HANDLE

This indicates that a resource handle passed to the API call was not valid. Resource handles are opaque types like CUstream and CUevent.

CUDA_ERROR_NOT_FOUND

This indicates that a named symbol was not found. Examples of symbols are global/constant variable names, texture names, and surface names.

CUDA_ERROR_NOT_READY

This indicates that asynchronous operations issued previously have not completed yet. This result is not actually an error, but must be indicated differently than CUDA_SUCCESS (which indicates completion). Calls that may return this value include cuEventQuery() and cuStreamQuery().

CUDA_ERROR_ILLEGAL_ADDRESS

While executing a kernel, the device encountered a load or store instruction on an invalid memory address. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

This indicates that a launch did not occur because it did not have appropriate resources. This error usually indicates that the user has attempted to pass too many arguments to the device kernel, or the kernel launch specifies too many threads for the kernel's register count. Passing arguments of the wrong size (i.e. a 64-bit pointer when a 32-bit int is expected) is equivalent to passing too many arguments and can also result in this error.

CUDA_ERROR_LAUNCH_TIMEOUT

This indicates that the device kernel took too long to execute. This can only occur if timeouts are enabled - see the device attribute CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT for more information. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_LAUNCH_INCOMPATIBLE_TEXTURING

This error indicates a kernel launch that uses an incompatible texturing mode.

CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED

This error indicates that a call to cuCtxEnablePeerAccess() is trying to re-enable peer access to a context which has already had peer access to it enabled.

CUDA_ERROR_PEER_ACCESS_NOT_ENABLED

This error indicates that cuCtxDisablePeerAccess() is trying to disable peer access which has not been enabled yet via cuCtxEnablePeerAccess().

CUDA_ERROR_PRIMARY_CONTEXT_ACTIVE

This error indicates that the primary context for the specified device has already been initialized.

CUDA_ERROR_CONTEXT_IS_DESTROYED

This error indicates that the context current to the calling thread has been destroyed using cuCtxDestroy, or is a primary context which has not yet been initialized.

CUDA_ERROR_ASSERT

A device-side assert triggered during kernel execution. The context cannot be used anymore, and must be destroyed. All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.

CUDA_ERROR_TOO_MANY_PEERS

This error indicates that the hardware resources required to enable peer access have been exhausted for one or more of the devices passed to cuCtxEnablePeerAccess().

CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED

This error indicates that the memory range passed to cuMemHostRegister() has already been registered.

CUDA_ERROR_HOST_MEMORY_NOT_REGISTERED

This error indicates that the pointer passed to cuMemHostUnregister() does not correspond to any currently registered memory region.

CUDA_ERROR_HARDWARE_STACK_ERROR

While executing a kernel, the device encountered a stack error. This can be due to stack corruption or exceeding the stack size limit. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_ILLEGAL_INSTRUCTION

While executing a kernel, the device encountered an illegal instruction. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_MISALIGNED_ADDRESS

While executing a kernel, the device encountered a load or store instruction on a memory address which is not aligned. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_INVALID_ADDRESS_SPACE

While executing a kernel, the device encountered an instruction which can only operate on memory locations in certain address spaces (global, shared, or local), but was supplied a memory address not belonging to an allowed address space. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_INVALID_PC

While executing a kernel, the device program counter wrapped its address space. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_LAUNCH_FAILED

An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.

CUDA_ERROR_NOT_PERMITTED

This error indicates that the attempted operation is not permitted.

CUDA_ERROR_NOT_SUPPORTED

This error indicates that the attempted operation is not supported on the current system or device.

CUDA_ERROR_UNKNOWN

This indicates that an unknown internal error has occurred.

enum CUsharedconfig

Shared memory configurations

Enumerator:

CU_SHARED_MEM_CONFIG_DEFAULT_BANK_SIZE

set default shared memory bank size

CU_SHARED_MEM_CONFIG_FOUR_BYTE_BANK_SIZE

set shared memory bank width to four bytes

CU_SHARED_MEM_CONFIG_EIGHT_BYTE_BANK_SIZE

set shared memory bank width to eight bytes

enum CUstream_flags

Stream creation flags

Enumerator:

CU_STREAM_DEFAULT

Default stream flag

CU_STREAM_NON_BLOCKING

Stream does not synchronize with stream 0 (the NULL stream)

enum CUstreamBatchMemOpType

Operations for cuStreamBatchMemOp

Enumerator:

CU_STREAM_MEM_OP_WAIT_VALUE_32

Represents a cuStreamWaitValue32 operation

CU_STREAM_MEM_OP_WRITE_VALUE_32

Represents a cuStreamWriteValue32 operation

CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES

This has the same effect as CU_STREAM_WAIT_VALUE_FLUSH, but as a standalone operation.

Follow the wait operation with a flush of outstanding remote writes. This means that, if a remote write operation is guaranteed to have reached the device before the wait can be satisfied, that write is guaranteed to be visible to downstream device work. The device is permitted to reorder remote writes internally. For example, this flag would be required if two remote writes arrive in a defined order, the wait is satisfied by the second write, and downstream work needs to observe the first write.

enum CUstreamWriteValue_flags

Flags for cuStreamWriteValue32

Enumerator:

CU_STREAM_WRITE_VALUE_DEFAULT

Default behavior

CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER

Permits the write to be reordered with writes which were issued before it, as a performance optimization. Normally, cuStreamWriteValue32 will provide a memory fence before the write, which has similar semantics to __threadfence_system() but is scoped to the stream rather than a CUDA thread.