SkePU

Autotunable Multi-Backend Skeleton Programming Framework for Multicore CPU and Multi-GPU Systems

Previous Releases


Previous Releases (Source Code and Documentation)

  1. Standalone SkePU: Version 1.1.1 (released 16/05/2014, latest patch 26/09/2014):
    Download source-code.
    Major new features:

    • adaptive off-line tuning mechanism for context-aware implementation selection at skeleton calls
      (described in our APPT-2013 paper and Chapter 3.4 of Usman Dastgeer's PhD thesis) and
    • new memory management mechanism for Vector and Matrix containers
      (described in our IJPP-2015 article and Chapter 4 of Usman Dastgeer's PhD thesis).
    • Patch v1.1.1 (26/09/2014): the tests for Smart Matrices were updated for CUDA 6.0, and a missing function (isMatrixOnDevice) was restored. The rest of the distribution is unchanged from the 16/05/2014 release.
    See also the HTML documentation generated by Doxygen.

  2. SkePU with StarPU integration: Version 0.8 (last updated 06/11/2012):
    Download source-code.
    It contains seven data-parallel skeletons and one task-parallel (farm) skeleton for vector and matrix operands, with multi-GPU and hybrid execution support. It includes several enhancements over the previous release; see the 'CHANGES' file for more information.
    See also the HTML documentation generated by Doxygen.
    Tested with StarPU 1.0.4, CUDA/nvcc 4.2, and GCC 4.7.1.

Older Releases

  1. Version 0.6: (2010)
    The first public release of SkePU.
    Download source-code.
    It contains seven data-parallel skeletons for vector operands with multi-GPU support.
    See also the HTML documentation generated by Doxygen.
  2. Version 0.7: (2011)
    Download source-code.
    It contains seven data-parallel skeletons for vector and (partial) matrix operands with efficient multi-GPU support on CUDA using a single host thread.
    See also the HTML documentation generated by Doxygen.
  3. Version 1.0: (2012)
    Download source-code.
    It contains seven data-parallel skeletons for vector, dense matrix, and sparse matrix operands with efficient multi-GPU support on CUDA and OpenCL using a single host thread. It includes several enhancements over the previous release; see the 'CHANGES' file for more information.
    See also the HTML documentation generated by Doxygen.
  4. Version 0.7 with StarPU integration: (2011)
    Download source-code.
    It contains seven data-parallel skeletons and one task-parallel (farm) skeleton for vector and matrix operands, with multi-GPU and hybrid execution support.
    See also the HTML documentation generated by Doxygen.