A class representing an execution environment.
#include <environment.h>
Public Member Functions | |
void | finishAll_CL () |
void | finishAll_CU (int lowID=-1, int highID=-1) |
void | finishAll () |
void | createOpenCLProgramForMatrixTranspose () |
Static Public Member Functions | |
static Environment * | getInstance () |
Public Attributes | |
std::vector< std::pair< int, int > > | m_peerCopyGpuIDsVector |
Protected Member Functions | |
Environment () | |
virtual | ~Environment () |
Protected Attributes | |
std::vector< std::pair< int, BackEnd > > | m_groupMapping |
A class representing an execution environment.
The Environment is used by the skeleton objects to define an execution environment for them to use. It mainly keeps track of which devices are available and gives access to them. It is implemented as a singleton, so only one environment is ever in use. Each skeleton stores a pointer to this instance, which is created by the first skeleton that is defined.
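The singleton arrangement described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `EnvironmentSketch`, `MapSketch`, `deviceCount()` and the single pretend device are hypothetical names and values introduced here; only `getInstance()` and the protected constructor and destructor mirror the documented interface. The sketch is not thread-safe and never destroys the instance.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the documented class (illustration only).
template <typename T>
class EnvironmentSketch {
public:
    // Returns the single shared instance, creating it on the first call.
    static EnvironmentSketch* getInstance() {
        if (instance_ == nullptr) {
            instance_ = new EnvironmentSketch();
        }
        return instance_;
    }

    // Hypothetical accessor: number of devices the environment knows about.
    std::size_t deviceCount() const { return devices_.size(); }

    // Copying is disabled so that a second environment cannot appear.
    EnvironmentSketch(const EnvironmentSketch&) = delete;
    EnvironmentSketch& operator=(const EnvironmentSketch&) = delete;

protected:
    // Protected, as in the documented class: clients must use getInstance().
    // Assumption: device discovery is faked as one device with ID 0.
    EnvironmentSketch() : devices_(1, 0) {}
    virtual ~EnvironmentSketch() {}

private:
    static EnvironmentSketch* instance_;
    std::vector<int> devices_;
};

template <typename T>
EnvironmentSketch<T>* EnvironmentSketch<T>::instance_ = nullptr;

// A skeleton-like client that stores the shared pointer on construction.
struct MapSketch {
    EnvironmentSketch<int>* env;
    MapSketch() : env(EnvironmentSketch<int>::getInstance()) {}
};
```

Two `MapSketch` objects constructed one after the other end up holding the same `env` pointer, because the first construction creates the instance and the second only retrieves it.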
skepu::Environment< T >::Environment | ( | ) | protected |
The constructor initializes the devices.
References skepu::measureOrLoadCUDABandwidth().
virtual skepu::Environment< T >::~Environment | ( | ) | protected, virtual |
The constructor initializes the devices.
void skepu::Environment< T >::createOpenCLProgramForMatrixTranspose | ( | ) |
A function called by the constructor. It creates the OpenCL program for the Matrix Transpose and saves a handle for the kernel. The program is built from a string containing the generic transpose kernel (see skepu::TransposeKernelNoBankConflicts_CL()). The type and function names in the generic kernel are replaced by specific code before it is compiled by the OpenCL JIT compiler.
It also handles the use of doubles automatically by including "#pragma OPENCL EXTENSION cl_khr_fp64: enable" if doubles are used.
References skepu::read_file_into_string(), skepu::replaceTextInString(), and skepu::TransposeKernelNoBankConflicts_CL().
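The text substitution step described above might look roughly like the following sketch. The helper names `replaceAll` and `buildTransposeSource`, and the placeholder tokens `TYPE` and `KERNELNAME`, are assumptions made for illustration; they are not taken from the library, whose own helper is skepu::replaceTextInString(). Only the pragma string comes from the description.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string& from,
                       const std::string& to) {
    if (from.empty()) return text;  // nothing sensible to search for
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}

// Specializes a generic kernel string for one element type.
// Assumption: the generic source uses the tokens TYPE and KERNELNAME.
std::string buildTransposeSource(const std::string& genericKernel,
                                 const std::string& typeName,
                                 const std::string& kernelName) {
    std::string source = replaceAll(genericKernel, "TYPE", typeName);
    source = replaceAll(source, "KERNELNAME", kernelName);
    if (typeName == "double") {
        // Double precision needs the fp64 extension enabled in OpenCL C.
        source = "#pragma OPENCL EXTENSION cl_khr_fp64: enable\n" + source;
    }
    return source;
}
```

The resulting string is what would then be handed to the OpenCL runtime for compilation; that step is omitted here because it requires an OpenCL context and device.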
void skepu::Environment< T >::finishAll | ( | ) |
Wrapper around the CUDA and OpenCL variants (finishAll_CU() and finishAll_CL()). Does nothing if neither backend is used, which makes calling code more portable.
Referenced by skepu::Map< MapFunc >::finishAll(), and skepu::Generate< GenerateFunc >::finishAll().
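One way such a portable wrapper could be structured is with compile-time switches, as in the sketch below. The struct `BackendSketch`, its call counters, and the macro names `SKETCH_USE_CUDA` and `SKETCH_USE_OPENCL` are hypothetical and introduced only for this example; the real backend calls are replaced by counters so the sketch compiles without CUDA or OpenCL.

```cpp
// Hypothetical stand-in that only counts synchronization calls.
struct BackendSketch {
    int cudaSyncs = 0;
    int openclSyncs = 0;

    // Stand-ins for the backend-specific variants.
    void finishAll_CU() { ++cudaSyncs; }
    void finishAll_CL() { ++openclSyncs; }

    // Portable wrapper: forwards to whichever backends were compiled in.
    // When neither macro is defined, the body is empty and the call is a no-op.
    void finishAll() {
#if defined(SKETCH_USE_CUDA)
        finishAll_CU();
#endif
#if defined(SKETCH_USE_OPENCL)
        finishAll_CL();
#endif
    }
};
```

With this shape, calling code can invoke `finishAll()` unconditionally and does not need its own preprocessor guards.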
void skepu::Environment< T >::finishAll_CL | ( | ) |
Finish all OpenCL functions on all devices.
void skepu::Environment< T >::finishAll_CU | ( | int lowID = -1, int highID = -1 | ) |
Finish all CUDA functions on all devices. Optionally, a range of device IDs to block on can be specified.
lowID | Optional. Specifies the lowest CUDA ID to synchronize on. |
highID | Optional. Specifies the highest CUDA ID to synchronize on. |
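The optional range parameters could be resolved along the following lines. This sketch assumes that a negative bound means "not specified" and falls back to the first or last device; that interpretation of the default value -1 is an assumption, and `resolveDeviceRange` and `numDevices` are hypothetical names not present in the library.

```cpp
#include <utility>

// Hypothetical helper: turns the optional bounds into a concrete,
// inclusive [low, high] range of device IDs.
// Assumption: a negative bound means the caller did not specify it.
std::pair<int, int> resolveDeviceRange(int numDevices,
                                       int lowID = -1, int highID = -1) {
    int low = (lowID < 0) ? 0 : lowID;
    int high = (highID < 0) ? numDevices - 1 : highID;
    if (high > numDevices - 1) {
        high = numDevices - 1;  // clamp to the last available device
    }
    return std::make_pair(low, high);
}
```

A synchronization routine would then loop from the first to the second element of the returned pair and block on each device in turn.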
static Environment * skepu::Environment< T >::getInstance | ( | ) | static |
Returns a pointer to the single instance. On the first call, a new instance is created.
Referenced by skepu::Vector< T >::copyDataToAnInvalidDeviceCopy(), skepu::Matrix< T >::copyDataToAnInvalidDeviceCopy(), skepu::cpu_tune_wrapper_map(), skepu::cpu_tune_wrapper_maparray(), skepu::cpu_tune_wrapper_mapoverlap(), skepu::cpu_tune_wrapper_mapreduce(), skepu::cpu_tune_wrapper_reduce(), skepu::createDefaultConfiguration(), skepu::cuda_tune_wrapper_map(), skepu::cuda_tune_wrapper_maparray(), skepu::cuda_tune_wrapper_mapoverlap(), skepu::cuda_tune_wrapper_mapreduce(), skepu::cuda_tune_wrapper_reduce(), skepu::Generate< GenerateFunc >::Generate(), skepu::Map< MapFunc >::Map(), skepu::MapArray< MapArrayFunc >::MapArray(), skepu::MapOverlap< MapOverlapFunc >::MapOverlap(), skepu::MapOverlap2D< MapOverlap2DFunc >::MapOverlap2D(), skepu::MapReduce< MapFunc, ReduceFunc >::MapReduce(), skepu::Matrix< T >::Matrix(), skepu::omp_tune_wrapper_map(), skepu::omp_tune_wrapper_maparray(), skepu::omp_tune_wrapper_mapoverlap(), skepu::omp_tune_wrapper_mapreduce(), skepu::omp_tune_wrapper_reduce(), skepu::Reduce< ReduceFuncRowWise, ReduceFuncColWise >::Reduce(), skepu::Reduce< ReduceFunc, ReduceFunc >::Reduce(), skepu::Scan< ScanFunc >::Scan(), skepu::Vector< T >::updateDevice_CU(), skepu::SparseMatrix< T >::updateDevice_CU(), skepu::Matrix< T >::updateDevice_CU(), and skepu::SparseMatrix< T >::updateDevice_Index_CU().
std::vector<std::pair<int, BackEnd> > skepu::Environment< T >::m_groupMapping | protected |
This attribute allows multiple skeleton implementations to be scheduled on the same backend.
std::vector<std::pair<int, int> > skepu::Environment< T >::m_peerCopyGpuIDsVector |
0 means not enabled, 1 means enabled between all GPU combinations, and -1 means enabled between only some of the GPUs.
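The three values can be read as a tri-state status. The sketch below derives such a status from a list of GPU ID pairs. How the library actually relates the status to the pair vector is not stated in this page, so the relationship is an assumption. `peerCopyStatus` is a hypothetical function, and it assumes the list holds each unordered pair at most once.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical classifier for the 0 / 1 / -1 convention described above.
// Assumption: `enabledPairs` lists each unordered GPU pair at most once.
int peerCopyStatus(int numGpus,
                   const std::vector<std::pair<int, int> >& enabledPairs) {
    if (numGpus < 2 || enabledPairs.empty()) {
        return 0;  // not enabled
    }
    // Number of distinct unordered pairs among numGpus devices.
    std::size_t allPairs =
        static_cast<std::size_t>(numGpus) * (numGpus - 1) / 2;
    return (enabledPairs.size() == allPairs) ? 1 : -1;
}
```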