SkePU  1.2
skepu::DeviceMemPointer_CU< T > Class Template Reference

A class representing a CUDA device memory allocation for a container.

#include <device_mem_pointer_cu.h>

Inherits skepu::MemPointerBase.

Public Member Functions

 DeviceMemPointer_CU (T *start, size_t numElements, Device_CU *device, std::string name="")
 
 DeviceMemPointer_CU (T *start, size_t rows, size_t cols, Device_CU *device, bool usePitch=false, std::string name="")
 
 ~DeviceMemPointer_CU ()
 
void copyHostToDevice (size_t numElements=0) const
 
void copyDeviceToHost (size_t numElements=0) const
 
void copiesOverlapInf (DeviceMemPointer_CU< T > *otherCopy, UpdateInf< T > *updateStruct, size_t &sizeUpdStr)
 
T * getDeviceDataPointer () const
 
unsigned int getDeviceID () const
 
void changeDeviceData ()
 
bool deviceDataHasChanged () const
 
void markCopyInvalid ()
 
bool isCopyValid () const
 
bool doCopiesOverlap (DeviceMemPointer_CU< T > *otherCopy, bool oneUnitCheck=false)
 
bool doOverlapAndCoverFully (DeviceMemPointer_CU< T > *otherCopy)
 
bool doRangeOverlap (T *hostPtr, size_t numElements)
 
void copyInfFromHostToDevice (UpdateInf< T > *updateStruct, size_t &sizeUpdStr)
 
void copyAllRangesToDevice (UpdateInf< T > *updateStruct, const size_t sizeUpdStr, size_t streamID=0)
 
void resetRanges ()
 

Detailed Description

template<typename T>
class skepu::DeviceMemPointer_CU< T >

A class representing a CUDA device memory allocation for a container.

This class represents a CUDA device 1D memory allocation and controls the data transfers between host and device.

Constructor & Destructor Documentation

template<typename T >
skepu::DeviceMemPointer_CU< T >::DeviceMemPointer_CU ( T *  start,
size_t  numElements,
Device_CU *  device,
std::string  name = "" 
)

The constructor allocates a certain amount of space in device memory and stores a pointer to some data in host memory.

Parameters
start  Pointer to data in host memory.
numElements  Number of elements to allocate memory for.
device  Pointer to a Device_CU object of a valid CUDA device to allocate memory on.

Initializes the ranges that should be checked for overlap with other copies.

References skepu::DeviceAllocations_CU< T >::addAllocation(), skepu::Device_CU::getDeviceID(), and skepu::DeviceAllocations_CU< T >::getInstance().


template<typename T >
skepu::DeviceMemPointer_CU< T >::DeviceMemPointer_CU ( T *  start,
size_t  rows,
size_t  cols,
Device_CU *  device,
bool  usePitch = false,
std::string  name = "" 
)

The constructor allocates a certain amount of space in device memory and stores a pointer to some data in host memory.

Parameters
start  Pointer to data in host memory.
rows  Number of rows to allocate memory for.
cols  Number of columns to allocate memory for.
device  Pointer to a Device_CU object of a valid CUDA device to allocate memory on.
usePitch  Whether to pad rows to ensure proper coalescing for row-wise access from CUDA global memory.

Initializes the ranges that should be checked for overlap with other copies.

References skepu::DeviceAllocations_CU< T >::addAllocation(), skepu::Device_CU::getDeviceID(), and skepu::DeviceAllocations_CU< T >::getInstance().
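
The padding that usePitch refers to can be illustrated with a small host-only sketch. It rounds the byte width of a row up to a multiple of an alignment value so that every row starts at an aligned address. The helper names and the 512-byte alignment used in the usage note are assumptions for illustration; in CUDA the actual pitch is chosen by the runtime (for example by cudaMallocPitch), and this page does not state how SkePU obtains it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of SkePU: rounds the byte width of one row
// up to the next multiple of `alignment`, which is the idea behind a
// pitched (padded) allocation.
std::size_t pitchedRowBytes(std::size_t cols, std::size_t elemSize,
                            std::size_t alignment)
{
    assert(alignment > 0);
    const std::size_t rowBytes = cols * elemSize;
    return (rowBytes + alignment - 1) / alignment * alignment;
}

// Byte offset of element (row, col) inside a pitched allocation.
std::size_t pitchedOffset(std::size_t row, std::size_t col,
                          std::size_t pitchBytes, std::size_t elemSize)
{
    return row * pitchBytes + col * elemSize;
}
```

With an assumed alignment of 512 bytes, a row of 100 four-byte elements occupies 400 bytes of data but 512 bytes of pitch, so row 2 starts at byte 1024 rather than byte 800.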


template<typename T >
skepu::DeviceMemPointer_CU< T >::~DeviceMemPointer_CU ( )

The destructor releases the allocated device memory.

References skepu::DeviceAllocations_CU< T >::getInstance(), and skepu::DeviceAllocations_CU< T >::removeAllocation().


Member Function Documentation

template<typename T >
void skepu::DeviceMemPointer_CU< T >::changeDeviceData ( )

template<typename T >
void skepu::DeviceMemPointer_CU< T >::copiesOverlapInf ( DeviceMemPointer_CU< T > *  otherCopy,
UpdateInf< T > *  updateStruct,
size_t &  sizeUpdStr 
)

Determines which elements (and how many) need to be copied into the current device copy from the copy passed as argument. If the passed copy is a superset, all required elements can be copied from it; otherwise some ranges may need to be copied from other sources.

Parameters
otherCopy  The other device copy.
updateStruct  The array of structures that is updated with new entries describing the data copies.
sizeUpdStr  The length of the updateStruct array.

Gets the pointer and size information for a range that is not copied yet.

First checks whether this not-yet-copied range overlaps with the potential source copy, i.e., otherCopy. If it does not, the range is skipped.

Scenario 1: otherCopy covers the whole range that needs to be copied, so nothing needs to be copied from the host.

Deletes this configuration.

Scenario 2: otherCopy is fully nested on both sides, so the parts to the left and to the right need to be copied from some other source.

Deletes this configuration.

Adds a configuration for the left part.

Adds a configuration for the right part.

Scenario 3: otherCopy is partially nested from the left side, so the left part needs to be copied from the host.

Deletes this configuration.

Adds a configuration for the left part.

Scenario 4: otherCopy is partially nested from the right side, so the right part needs to be copied from the host.

Deletes this configuration.

Adds a configuration for the right part.
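
The four scenarios above amount to intersecting the range still to be copied with the range that otherCopy covers and keeping whatever is left on either side. The following is a minimal host-only sketch of that idea. The types Range and SplitResult and the function splitAgainstSource are hypothetical and use element indices; they are not SkePU's UpdateInf structures, whose layout is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open element range [begin, end); not SkePU's UpdateInf.
struct Range { std::size_t begin; std::size_t end; };

struct SplitResult {
    bool overlaps = false;
    Range fromSource{0, 0};       // part that the source copy can provide
    std::vector<Range> remaining; // parts that still need another source
};

SplitResult splitAgainstSource(Range needed, Range source)
{
    assert(needed.begin <= needed.end && source.begin <= source.end);
    SplitResult r;
    const std::size_t lo = std::max(needed.begin, source.begin);
    const std::size_t hi = std::min(needed.end, source.end);
    if (lo >= hi) {               // no overlap: the range is left untouched
        r.remaining.push_back(needed);
        return r;
    }
    r.overlaps = true;
    r.fromSource = {lo, hi};
    if (needed.begin < lo) r.remaining.push_back({needed.begin, lo}); // left part
    if (hi < needed.end)   r.remaining.push_back({hi, needed.end});   // right part
    return r;
}
```

In this sketch, Scenario 1 yields an empty remaining list, Scenario 2 yields a left and a right part, and Scenarios 3 and 4 yield only the left or only the right part.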

References skepu::DeviceMemPointer_CU< T >::doRangeOverlap(), MAX_COPYINF_SIZE, and MAX_RANGES.

Referenced by skepu::Vector< T >::copyDataToAnInvalidDeviceCopy(), and skepu::Matrix< T >::copyDataToAnInvalidDeviceCopy().


template<typename T >
void skepu::DeviceMemPointer_CU< T >::copyAllRangesToDevice ( UpdateInf< T > *  updateStruct,
const size_t  sizeUpdStr,
size_t  streamID = 0 
)

Copies all ranges from other device copies (in the same or in different device memories) and from the main copy that resides in host memory. A copy plan is passed as argument that specifies what needs to be copied from which source. TODO: The method may optimize data transfers by overlapping possible communications.

Parameters
updateStruct  An array of structs containing information about the different HTD/DTD/DTH copies that need to be carried out.
sizeUpdStr  The length of the updateStruct array.
streamID  The CUDA stream ID used to possibly overlap HtD transfers with kernel executions (requires USE_MULTI_STREAM and USE_PINNED_MEMORY to be defined).

Open question in the source: how can data be copied into a copy that has changed? Copying from a copy that has changed is possible, but not in this way.

The copy can come from host memory, from another copy in the current device memory, or from another device's memory (possible only when peer-to-peer memory access is enabled).

Not supported yet; to be added in the future.

Referenced by skepu::Vector< T >::copyDataToAnInvalidDeviceCopy(), and skepu::Matrix< T >::copyDataToAnInvalidDeviceCopy().
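
As a rough illustration of executing a copy plan, the sketch below walks an array of plan entries and copies each range into a destination buffer. The CopyStep struct and applyPlan function are hypothetical, and plain std::copy on host vectors stands in for the host-to-device and device-to-device transfers that a CUDA implementation would issue. The real UpdateInf<T> layout is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plan entry; the real UpdateInf<T> layout is not shown here.
template <typename T>
struct CopyStep {
    const T* src;          // start of the source buffer (host or another copy)
    std::size_t srcOffset; // first element to read from the source
    std::size_t dstOffset; // first element to write in the destination
    std::size_t count;     // number of elements to copy
};

// Applies every step of the plan to `dst`. std::copy is a stand-in for the
// actual memory transfers.
template <typename T>
void applyPlan(std::vector<T>& dst, const CopyStep<T>* plan, std::size_t planSize)
{
    for (std::size_t i = 0; i < planSize; ++i) {
        const CopyStep<T>& s = plan[i];
        assert(s.dstOffset + s.count <= dst.size());
        std::copy(s.src + s.srcOffset, s.src + s.srcOffset + s.count,
                  dst.begin() + s.dstOffset);
    }
}
```

A plan might, for example, take the middle of the destination from another copy and both ends from the host buffer.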

template<typename T >
void skepu::DeviceMemPointer_CU< T >::copyDeviceToHost ( size_t  numElements = 0) const

Copies data from device memory to host memory. Only copies if data on device has been marked as changed.

Parameters
numElements  Number of elements to copy; the default value 0 means all elements.

Referenced by skepu::Reduce< ReduceFunc, ReduceFunc >::CU().
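
The two rules stated above (copy only when the device data is marked as changed, and treat 0 as "all elements") can be modeled on the host without CUDA. The class LazyCopyBack below is a hypothetical stand-in in which a std::vector plays the role of device memory. Clearing the changed flag after a copy is an assumption of this sketch and is not confirmed by this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only model; a std::vector stands in for device memory.
class LazyCopyBack {
public:
    LazyCopyBack(int* hostData, std::size_t numElements)
        : host_(hostData), device_(hostData, hostData + numElements) {}

    // Simulates a kernel writing to the device buffer.
    void writeOnDevice(std::size_t i, int value) {
        assert(i < device_.size());
        device_[i] = value;
        changed_ = true;
    }

    // numElements == 0 means "all elements". Nothing is copied unless the
    // device data has changed. Returns the number of elements copied.
    std::size_t copyDeviceToHost(std::size_t numElements = 0) {
        if (!changed_) return 0;
        const std::size_t n = (numElements == 0)
                                  ? device_.size()
                                  : std::min(numElements, device_.size());
        std::copy(device_.begin(), device_.begin() + n, host_);
        changed_ = false; // assumption: the flag is cleared after the copy
        return n;
    }

private:
    int* host_;
    std::vector<int> device_;
    bool changed_ = false;
};
```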

template<typename T >
void skepu::DeviceMemPointer_CU< T >::copyHostToDevice ( size_t  numElements = 0) const

Copies data from host memory to device memory.

Parameters
numElements  Number of elements to copy; the default value 0 means all elements.

Marks the copy as valid after the transfer.

Referenced by skepu::Vector< T >::updateDevice_CU(), skepu::SparseMatrix< T >::updateDevice_CU(), skepu::Matrix< T >::updateDevice_CU(), and skepu::SparseMatrix< T >::updateDevice_Index_CU().

template<typename T >
void skepu::DeviceMemPointer_CU< T >::copyInfFromHostToDevice ( UpdateInf< T > *  updateStruct,
size_t &  sizeUpdStr 
)

Copies data from host to device for the remaining portions of the copy. Assumes that the host copy is valid.

Parameters
updateStruct  The array of structures that keeps track of what needs to be copied.
sizeUpdStr  The length of the updateStruct array.

Gets the pointer and size information for a range that is not copied yet.

Deletes this configuration.

References MAX_COPYINF_SIZE.

Referenced by skepu::Vector< T >::copyDataToAnInvalidDeviceCopy(), and skepu::Matrix< T >::copyDataToAnInvalidDeviceCopy().
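
One way to picture this step: every range that no device copy could supply is turned into a host-to-device entry in the plan, and the list of pending ranges is then emptied. The sketch below uses the hypothetical types PendingRange and HostCopy and the hypothetical function fillFromHost; none of them are SkePU names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types; the real UpdateInf<T> layout is not shown on this page.
struct PendingRange { std::size_t begin; std::size_t end; }; // half-open [begin, end)
struct HostCopy { std::size_t offset; std::size_t count; };

// Appends a host-to-device copy entry to `plan` for every non-empty pending
// range, then clears the pending list.
void fillFromHost(std::vector<PendingRange>& pending, std::vector<HostCopy>& plan)
{
    for (const PendingRange& r : pending) {
        if (r.end > r.begin)
            plan.push_back({r.begin, r.end - r.begin});
    }
    pending.clear();
}
```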

template<typename T >
bool skepu::DeviceMemPointer_CU< T >::deviceDataHasChanged ( ) const

template<typename T >
bool skepu::DeviceMemPointer_CU< T >::doCopiesOverlap ( DeviceMemPointer_CU< T > *  otherCopy,
bool  oneUnitCheck = false 
)

Returns true if there exists any range that still needs to be written and that overlaps with otherCopy.

Referenced by skepu::Vector< T >::updateDevice_CU(), and skepu::Matrix< T >::updateDevice_CU().

template<typename T >
bool skepu::DeviceMemPointer_CU< T >::doOverlapAndCoverFully ( DeviceMemPointer_CU< T > *  otherCopy)

Checks whether the element range of the copy passed as argument is a subset of the element range that this object points to.

template<typename T >
bool skepu::DeviceMemPointer_CU< T >::doRangeOverlap ( T *  hostDataPointer,
size_t  numElements 
)

Checks whether the element range covered by the current copy overlaps with the range passed as argument.

Referenced by skepu::DeviceMemPointer_CU< T >::copiesOverlapInf().
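
Both range checks, overlap (doRangeOverlap) and full coverage (doOverlapAndCoverFully), reduce to comparisons on half-open ranges of host elements. The sketch below shows one way to write them. HostRange, rangesOverlap and coversFully are hypothetical names, and the code assumes both ranges point into the same host array, since comparing pointers into unrelated arrays is not meaningful.

```cpp
#include <cstddef>

// Hypothetical half-open range over host elements: [ptr, ptr + count).
// Both ranges are assumed to point into the same host array.
template <typename T>
struct HostRange { const T* ptr; std::size_t count; };

// True if the two ranges share at least one element.
template <typename T>
bool rangesOverlap(HostRange<T> a, HostRange<T> b)
{
    return a.count > 0 && b.count > 0 &&
           a.ptr < b.ptr + b.count && b.ptr < a.ptr + a.count;
}

// True if `inner` is non-empty and lies entirely inside `outer`.
template <typename T>
bool coversFully(HostRange<T> outer, HostRange<T> inner)
{
    return inner.count > 0 &&
           outer.ptr <= inner.ptr &&
           inner.ptr + inner.count <= outer.ptr + outer.count;
}
```

Ranges that only touch at a boundary, such as elements [0, 6) and [6, 10), do not overlap under this definition.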

template<typename T >
T * skepu::DeviceMemPointer_CU< T >::getDeviceDataPointer ( ) const

template<typename T >
unsigned int skepu::DeviceMemPointer_CU< T >::getDeviceID ( ) const

Returns
The device ID of the CUDA device that has the allocation.

template<typename T >
bool skepu::DeviceMemPointer_CU< T >::isCopyValid ( ) const

template<typename T >
void skepu::DeviceMemPointer_CU< T >::markCopyInvalid ( )

Marks the copy as invalid. Any further read operation first requires copying data to this copy. Also sets the modified flag to false.

References skepu::DeviceAllocations_CU< T >::getInstance(), and skepu::DeviceAllocations_CU< T >::removeAllocation().
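
The interplay of the valid and modified flags can be pictured with a flag-only model. CopyState below is a hypothetical class that borrows the method names from this page but involves no memory. The behavior of changeDeviceData is inferred from its name, and the initial flag values are assumptions.

```cpp
// Hypothetical flag-only model of a device copy; no memory is involved.
class CopyState {
public:
    void changeDeviceData() { changed_ = true; }   // assumed from the name
    bool deviceDataHasChanged() const { return changed_; }
    void copyHostToDevice() { valid_ = true; }     // a transfer marks the copy valid
    void markCopyInvalid() { valid_ = false; changed_ = false; }
    bool isCopyValid() const { return valid_; }

private:
    bool valid_ = false;   // initial value is an assumption
    bool changed_ = false;
};
```

In this model, a copy that has been invalidated reports neither valid nor changed data until copyHostToDevice is called again.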


template<typename T>
void skepu::DeviceMemPointer_CU< T >::resetRanges ( )
inline

Resets the ranges that should be checked for overlap with other copies.

Referenced by skepu::Vector< T >::copyDataToAnInvalidDeviceCopy(), and skepu::Matrix< T >::copyDataToAnInvalidDeviceCopy().


The documentation for this class was generated from the following file: