PELAB

OPEN MASTER THESIS PROJECTS

Research Group on Compiler Technology and Parallel Computing

Prof. Christoph Kessler

Internal Thesis Projects

Note:

All projects in this list require a solid background in either compiler construction or parallel programming (some require both); you should have successfully passed at least one major course in these areas, preferably at master level and including programming labs.

Note to non-LIU students (FAQ): If you want to do a thesis project with us, you must be registered on a master (or bachelor) program at Linköping University. It is generally not possible to do such projects remotely.

  • Smarter Containers in SkePU (30hp)

    SkePU is a C++-based open-source skeleton programming library for portable high-level programming of heterogeneous multicore systems, developed by our group at Linköping University in the context of two EU FP7 projects. The different back-ends (basically, implementation variants) provided for the SkePU skeletons allow it to support different types of processing units (PUs). This also enables automated tuning of the execution flow by selecting the expected fastest implementation at runtime, depending on the execution context. The public SkePU distribution currently supports multicore CPUs and GPUs, and there are also experimental back-ends for MPI and for Movidius Myriad1 developed in previous thesis projects.
    An important feature of SkePU is its so-called smart containers, currently Vector and Matrix. These are generic, STL-like abstractions of aggregate data that transparently optimize data transfer and memory management at runtime, implementing a generalized software caching scheme with sequential memory consistency. For details, see a recent article or Chapter 4 of Dastgeer's PhD thesis.
    In this project you will extend the smart container idea and implementation to include multiple different representations of data that have different performance implications on different types of execution units (e.g. CPU, GPU). For further information please contact us directly.
    Prerequisites: Multithreaded (OpenMP) and GPU (CUDA, OpenCL) programming (e.g. TDDD56), advanced C++ programming skills, interest in data structures, algorithms, and parallel computer architecture.
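
    The software caching idea behind smart containers can be illustrated with a minimal sketch. It is not the SkePU implementation or API: the class `SmartVectorSketch`, its methods, and the simulated "device" buffer are hypothetical names chosen for illustration, and a real container would issue actual GPU transfers. Each copy carries a validity flag, and data is copied only when a stale copy is about to be accessed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the SkePU API: a vector with one host copy and
// one simulated "device" copy, each guarded by a validity flag.
class SmartVectorSketch {
public:
    explicit SmartVectorSketch(std::size_t n)
        : host_(n, 0.0f), device_(n, 0.0f) {}

    // Host write: refresh the host copy if stale, then invalidate the device copy.
    void hostWrite(std::size_t i, float v) {
        assert(i < host_.size());
        syncToHost();
        host_[i] = v;
        deviceValid_ = false;
    }

    // Host read: only triggers a transfer if the host copy is stale.
    float hostRead(std::size_t i) {
        assert(i < host_.size());
        syncToHost();
        return host_[i];
    }

    // Stand-in for a kernel launch: scales every element on the "device".
    void deviceScale(float factor) {
        syncToDevice();
        for (float& x : device_) x *= factor;
        hostValid_ = false;
    }

    // Number of whole-container copies performed so far.
    std::size_t transfers() const { return transfers_; }

private:
    void syncToHost() {
        if (!hostValid_) { host_ = device_; hostValid_ = true; ++transfers_; }
    }
    void syncToDevice() {
        if (!deviceValid_) { device_ = host_; deviceValid_ = true; ++transfers_; }
    }

    std::vector<float> host_, device_;
    bool hostValid_ = true;
    bool deviceValid_ = true;
    std::size_t transfers_ = 0;
};
```

    In this sketch, two consecutive `deviceScale` calls after a host write cause a single host-to-device copy, because the second call finds the device copy still valid.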

  • Smart Copying Techniques for Smart Matrix Containers in SkePU (30hp)
    An important feature of SkePU is its so-called smart containers, currently Vector and Matrix. These are generic, STL-like abstractions of aggregate data that transparently optimize data transfer and memory management at runtime, implementing a generalized software caching scheme with sequential memory consistency. For details, see a recent article or Chapter 4 of Dastgeer's PhD thesis.
    On read or write accesses to vector/matrix elements, smart containers may trigger data copy operations that update stale local copies of elements before they are accessed. For that purpose, a copy plan is calculated to reduce transfer costs. However, the current solution and implementation for 2D data (Matrix container) still has ample room for improvement. In this project you will develop, implement and evaluate smart copying techniques to speed up the coherence copying operations at submatrix accesses.
    This is a research-oriented project. If the result looks publishable, we will encourage you to jointly write and submit a research paper to a conference and support your presentation.
    Prerequisites: Multithreaded (OpenMP) and GPU (CUDA, OpenCL) programming (e.g. TDDD56), advanced C++ programming skills, interest in optimization.
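
    To give a flavor of what a copy plan might look like, the following sketch works at row granularity. It is an assumption for illustration only and not the scheme used in SkePU: the type `RowRange` and the function `planCopies` are hypothetical. Given the stale row ranges of a local copy and the rows about to be accessed, it merges overlapping or adjacent stale pieces so that each contiguous block needs only one transfer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: an inclusive range of matrix rows.
struct RowRange {
    std::size_t first;
    std::size_t last;
};

// Intersect the stale ranges with the accessed rows [lo, hi] and merge
// overlapping or adjacent pieces, so each contiguous block is one transfer.
// Assumes first <= last in every input range.
std::vector<RowRange> planCopies(std::vector<RowRange> stale,
                                 std::size_t lo, std::size_t hi) {
    assert(lo <= hi);
    std::sort(stale.begin(), stale.end(),
              [](const RowRange& a, const RowRange& b) { return a.first < b.first; });
    std::vector<RowRange> plan;
    for (const RowRange& r : stale) {
        if (r.last < lo || r.first > hi) continue;  // rows not accessed: skip
        RowRange c{std::max(r.first, lo), std::min(r.last, hi)};
        if (!plan.empty() && c.first <= plan.back().last + 1) {
            // Overlaps or touches the previous block: extend it.
            plan.back().last = std::max(plan.back().last, c.last);
        } else {
            plan.push_back(c);
        }
    }
    return plan;
}
```

    A real submatrix copy plan would also have to consider the column dimension and the cost of strided versus contiguous transfers, which is where the project's research questions lie.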

  • [TAKEN] Automatic parallelization of legacy C programs for the low-power mobile multicore processor Myriad2 (30hp)
    Myriad is a low-power embedded processor family developed by Movidius for high-throughput image processing on mobile devices. Efficient code for this platform needs to leverage its specific hardware features such as the SIMD instructions, code loading and memory mapping, and to optimize the data transfer between host, master and slave cores.
    This project will build upon a previous toolchain for domain-specific automatic parallelization and develop a back-end for the Myriad2 processor. Further information will be given on request.
    This is a research-oriented project. If the result looks publishable, we will encourage you to jointly write and submit a research paper to a conference and support your presentation.

    Requirements: TDDB44 Compiler construction, or similar course. TDDD56 Multicore and GPU Programming, or similar course. Advanced computer architecture. Good programming skills in C++ and assembler language.

    Contact: Christoph Kessler, Erik Hansson.

  • Drake-based Thesis Topics
    Programming parallel architectures efficiently becomes more challenging as chips embed more and more processors. As the demand for throughput keeps increasing, power considerations become more important, which makes efficient programming harder still. Dataflow programming abstracts parallel computation by structuring parallel programs as sequential tasks that communicate through channels. Parallelism can be achieved through the concurrent execution of tasks (task parallelism) and by forwarding intermediate data from task to task as early as possible, so that successive tasks can run concurrently on different elements of a single data stream (pipeline parallelism). More performance can be achieved by running some tasks on several processors, and energy can be saved by scaling the voltage and frequency of the cores that run the tasks.
    We proposed Crown Scheduling to compute schedules of parallel streaming tasks with frequency scaling. We developed Drake, a C framework for implementing architecture-independent streaming applications, which delegates details such as task communication and scheduling to architecture-dependent back-end implementations and scheduling algorithms. This enables some performance portability and opens a variety of scheduling-related research questions. We propose several master thesis projects related to performance-oriented scheduling of streaming tasks and to experiments with Drake.

    • Drake Back-end for Myriad 2 (30hp)
      Myriad is a low-power multi-core vision processor family developed by Movidius. The student will implement a Drake back-end for Myriad 2 together with some streaming applications, such as demosaicing/debayering and/or other applications, and measure their performance in terms of time and/or energy. The work integrates with the EU project EXCESS.
    • Drake Benchmark Framework
      Define a performance benchmark for static schedulers of streaming applications that optimize energy performance. The task consists of implementing, with the Drake framework, several real-world and synthetic applications that reflect the challenges static schedulers have to overcome to produce good schedules. This work can be integrated into the Mimer toolchain, which automates most of the benchmark management work. The research community can use the resulting benchmark to evaluate and directly compare the effectiveness of scheduling techniques.
    • Comparative Study of Stream Programming Frameworks
      This project will evaluate the performance of Drake in comparison to other stream programming software, such as StreamIt, FastFlow or PREESM. It includes writing several equivalent versions of relevant applications for all frameworks and identifying the strengths and weaknesses of each framework, for example in code performance and programmability.
    Contact: Nicolas Melot
    Start: As soon as possible, but no later than August 2016.
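
    The dataflow model described above (sequential tasks communicating through channels, with intermediate data forwarded as early as possible) can be sketched as follows. This is not Drake code and does not use the Drake API: the tasks `produce`, `square` and `accumulate`, the `Channel` alias, and the round-robin loop in `runPipeline` are hypothetical, and the "scheduler" is a single-threaded simulation rather than a real mapping of tasks to cores.

```cpp
#include <cassert>
#include <queue>

// A channel is modeled as a FIFO queue of tokens.
using Channel = std::queue<int>;

// Each task is plain sequential code that handles at most one token per
// firing and reports whether it made progress.
bool produce(int& next, int limit, Channel& out) {
    if (next >= limit) return false;
    out.push(next++);
    return true;
}

bool square(Channel& in, Channel& out) {
    if (in.empty()) return false;
    int x = in.front();
    in.pop();
    out.push(x * x);
    return true;
}

bool accumulate(Channel& in, long& sum) {
    if (in.empty()) return false;
    sum += in.front();
    in.pop();
    return true;
}

// Round-robin firing: within one round, the three stages work on different
// elements of the same stream, which is the essence of pipeline parallelism.
long runPipeline(int n) {
    Channel a, b;
    int next = 0;
    long sum = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        progress |= accumulate(b, sum);   // consumes element i-2
        progress |= square(a, b);         // transforms element i-1
        progress |= produce(next, n, a);  // emits element i
    }
    return sum;
}
```

    On a real multicore target each task would run on its own core (or several), and a scheduler such as the one computed by Crown Scheduling would decide the mapping and the core frequencies.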

  • SkePU extensions and optimizations for the Myriad2 back-end (30hp)
    SkePU is a C++-based open-source skeleton programming library for portable high-level programming of heterogeneous multicore systems, developed by our group at Linköping University in the context of two EU FP7 projects. The different back-ends (basically, implementation variants) provided for the SkePU skeletons allow it to support different types of processing units (PUs). This also enables automated tuning of the execution flow by selecting the expected fastest implementation at runtime, depending on the execution context. The public SkePU distribution currently supports multicore CPUs and GPUs, and there are also experimental back-ends for MPI and for Movidius Myriad1 developed in previous thesis projects.
    Myriad is a low-power embedded processor family developed by Movidius for high-throughput image processing on mobile devices. Efficient code for this platform needs to leverage its specific hardware features such as the SIMD instructions, code loading and memory mapping, and to optimize the data transfer between host, master and slave cores.
    This project will extend SkePU with new domain-specific skeletons, further develop the current SkePU backend for Myriad2, and specifically investigate example applications from the domain of artificial neural networks and image processing.

    Requirements: TDDD56 Multicore and GPU Programming, or similar course. Advanced computer architecture. Good programming skills in C++ and assembler language.

    Contact: Christoph Kessler, Lu Li.

  • [TAKEN] Language embedding and compiler support for performance-portable skeleton programming (30hp)
    SkePU is a C++-based open-source skeleton programming library for portable high-level programming of heterogeneous multicore systems, developed by our group at Linköping University in the context of two EU FP7 projects. SkePU skeletons (map, reduce, mapoverlap, scan etc.) are templated generic functor objects that are parameterized in sequential user functions; by instantiating them with problem-specific user code, customized multi-variant code is generated automatically. In this way, SkePU hides all platform-specific coding details internally, exposing only a sequential-looking programming interface to the user.
    The different back-ends (basically, implementation variants) provided for the SkePU skeletons allow it to support different types of processing units (PUs). This also enables automated tuning of the execution flow by selecting the expected fastest implementation at runtime, depending on the execution context. The public SkePU distribution currently supports multicore CPUs and GPUs, and there are also experimental back-ends for MPI and for Movidius Myriad1 developed in previous thesis projects.
    The current parametrization mechanism in SkePU uses ad-hoc constructs such as preprocessor macros for defining and generating user functions. This should instead be better integrated in the C++ language. In this project you will re-design the API for SkePU user function specification, using a combination of template metaprogramming and compiler frontend extensions. The implementation will be based on the ROSE C++ source-to-source pre-compiler. This redesign will not only provide a cleaner, more type-safe programming interface but also enable new optimizations in generating platform-specific code, which is to be developed as a second step in the project, and which we will discuss further in our first meeting.

    Requirements: TDDB44 Compiler construction or similar course. TDDD56 Multicore and GPU Programming (OpenMP, CUDA, OpenCL). Good programming skills in C++ (esp. template programming).

    Contact: Christoph Kessler, Erik Hansson, Lu Li.
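
    The idea of skeletons as templated functor objects parameterized in sequential user functions can be sketched in plain C++. The names `MapSketch`, `ReduceSketch`, `squareFn` and `plusFn` are hypothetical, and SkePU's real interface differs; the sketch contains only a sequential variant, whereas a skeleton library would provide several back-end variants behind the same interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical map skeleton: applies a unary user function element-wise.
template <typename UserFn>
class MapSketch {
public:
    explicit MapSketch(UserFn f) : f_(f) {}

    template <typename T>
    std::vector<T> operator()(const std::vector<T>& in) const {
        std::vector<T> out(in.size());
        // Iterations are independent, so a back-end could run this loop
        // with OpenMP threads or as a GPU kernel instead.
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = f_(in[i]);
        return out;
    }

private:
    UserFn f_;
};

// Hypothetical reduce skeleton: combines all elements with a binary,
// associative user function.
template <typename UserFn>
class ReduceSketch {
public:
    explicit ReduceSketch(UserFn f) : f_(f) {}

    template <typename T>
    T operator()(const std::vector<T>& in) const {
        assert(!in.empty());
        T acc = in[0];
        for (std::size_t i = 1; i < in.size(); ++i) acc = f_(acc, in[i]);
        return acc;
    }

private:
    UserFn f_;
};

// Sequential user functions supplied by the application programmer.
inline float squareFn(float x) { return x * x; }
inline float plusFn(float a, float b) { return a + b; }
```

    The user code only ever names the skeleton and the user function; where and how the loop executes stays hidden inside the skeleton, which is what makes the interface look sequential.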

  • Extension of the design and toolchain of a system architecture description language enabling holistic energy optimization (30hp)
    XPDL is a novel XML-based system architecture description language for expressing high-level descriptions of the structure and relevant properties of computer systems (hardware components and system software). Its purpose is to support generic tools that access this information, e.g. for target-specific code generation or run-time optimizations. A particular focus of XPDL is on modeling hardware and system software features that are relevant for the energy efficiency of an application. A prototype parser, intermediate representation (IR) and simple back-end for a subset of XPDL have been developed in an earlier project.
    The purpose of this project is to improve the design of XPDL and extend its implementation (IR, compiler, code generator, runtime support) for parsing, representing, processing and querying XPDL models.
    The development is done in the context of a running international research project concerned with portable, holistic optimization of energy efficiency across the entire application and system software stack.

    Prerequisites: Compiler construction (TDDB44 or similar); some background in modern computer systems architecture and programming, including accelerator technology (e.g. TDDD56, TDDC78 or similar); some background in XML technology e.g. XSD, XSLT; good skills in C/C++ programming, Linux.

    Contact: Lu Li, Christoph Kessler.

  • Support for generalized stencil computations in SkePU (30hp)
    SkePU is an open-source C++ template library for portable and efficient high-level programming of GPU-based systems, using so-called skeletons. A skeleton is a generic software component modeling a specific pattern of computation; its implementation encapsulates platform-specific technical details such as parallelism and accelerator handling, communication, synchronization etc., while exposing a sequential programming interface to the programmer. SkePU currently provides one task-parallel and a number of data-parallel skeletons, including one that models stencil computations, i.e., computations that update each element of a matrix or image as a filter operation applied to its nearest neighbor elements.
    As a case study, this project will consider an open-source high-performance computing application from medical image processing that is currently implemented in C++ and CUDA. You will investigate the requirements for expressing its performance-critical parts with existing SkePU skeletons, and develop any extensions to the SkePU library that are needed to express the application more conveniently with SkePU skeletons.
    Prerequisites: TDDD56 Multicore and GPU Programming, or similar course on parallel programming. Advanced C/C++ programming skills.
    Contact: Christoph Kessler
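
    As a concrete instance of a stencil computation in the sense described above, the following sketch applies a 5-point averaging filter to a row-major matrix. The function `stencil5` and its border handling (border elements are copied unchanged) are assumptions made for illustration; they do not reflect SkePU's stencil skeleton or the case-study application.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major rows x cols matrix. Each interior element becomes the average
// of itself and its four nearest neighbors; border elements are copied.
std::vector<double> stencil5(const std::vector<double>& in,
                             std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<double> out(in);  // start from a copy so borders are kept
    for (std::size_t r = 1; r + 1 < rows; ++r) {
        for (std::size_t c = 1; c + 1 < cols; ++c) {
            const std::size_t i = r * cols + c;
            out[i] = (in[i] + in[i - 1] + in[i + 1] +
                      in[i - cols] + in[i + cols]) / 5.0;
        }
    }
    return out;
}
```

    Because every output element reads only from the input matrix, all updates are independent and can be computed in parallel, which is what makes this pattern a good fit for a data-parallel skeleton.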

  • Systematic Concurrent Debugging (30hp or 2x30hp)

    Contact: Ahmed Rezine or Christoph Kessler

  • Dynamic Optimization of Interprocessor Communication in the MPI Back-End of the SkePU skeleton programming library (30hp)
    By harnessing the computational power of modern GPUs via General-Purpose Computing on Graphics Processing Units (GPGPU), very fast calculations can be performed with a GPU cluster.
    This thesis project is about extending an existing MPI cluster back-end implementation of the SkePU skeleton programming library by data types that allow for the dynamic optimization of inter-node communication, and evaluating the implementation with several test programs including a computationally intensive application.
    The overall problem includes developing methods for determining the optimal partitioning of the problem, automated performance tuning for the best use of resources, possibly in a non-dedicated environment; also, devising new SkePU skeletons for some computations / communication patterns in the considered scientific computing problem. An application from computational fluid dynamics will be used as a case study.
    This Master thesis project covers the following tasks:
    - Research survey of related work.
    - Design and implementation of new skeleton backends in C/C++, MPI and CUDA/OpenCL.
    - Skeleton-based refactoring of the given benchmark application and experimental evaluation.
    - Documentation of the results in thesis report.
    Begin: ASAP.
    Prerequisites: Courses in programming of parallel computers and GPU computing (TDDC78 and TDDD56 or equivalent). Good background in OpenCL, CUDA, MPI, C/C++, algorithms, Linux.
    Contact: Christoph Kessler.

  • [TAKEN] Sparse-Matrix support for the SkePU library for portable CPU/GPU programming (30hp)
    This thesis project will extend the functionality of the SkePU library for high-level, portable programming of GPU-based systems, which was developed in our group.
    A matrix is called sparse if most of its entries are zeroes, such that a compressed storage format is more time- and space-efficient than the traditional 2D array representation. In this master thesis project you will extend SkePU with support for sparse matrix computations. In particular, you will design a smart container data structure for the representation of generic 2D sparse matrices and implement several of the data-parallel skeletons of SkePU so that they can be applied to sparse matrices in the same way as to dense matrices, with back-ends in sequential C++, OpenMP, CUDA and OpenCL. The implementation will be evaluated quantitatively on several GPU-based platforms. Further information is available on request; see the contact information below.
    The library is developed in C++ and OpenMP, and has implementations for CUDA and OpenCL. The prerequisites for this Master thesis project are good C++ programming skills and knowledge of GPU and parallel programming (e.g., TDDD56 and TDDC78).
    This is a research-oriented project.
    Contact: Christoph Kessler.
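
    The project text does not name a particular compressed storage format. As one common example, the sketch below uses compressed sparse row (CSR) storage, which keeps only the nonzero values plus their column indices and one offset per row. The names `CsrMatrix`, `toCsr` and `multiply` are hypothetical and unrelated to any SkePU container.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CSR storage: nonzeros of row r are values[rowStart[r] .. rowStart[r+1]).
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> rowStart;  // size rows + 1
    std::vector<std::size_t> colIndex;  // column of each stored value
    std::vector<double> values;         // the nonzero entries
};

// Compress a dense row-major matrix by dropping its zero entries.
CsrMatrix toCsr(const std::vector<double>& dense,
                std::size_t rows, std::size_t cols) {
    assert(dense.size() == rows * cols);
    CsrMatrix m;
    m.rows = rows;
    m.cols = cols;
    m.rowStart.push_back(0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const double v = dense[r * cols + c];
            if (v != 0.0) {
                m.colIndex.push_back(c);
                m.values.push_back(v);
            }
        }
        m.rowStart.push_back(m.values.size());
    }
    return m;
}

// Sparse matrix-vector product y = M * x; work is proportional to the
// number of nonzeros rather than rows * cols.
std::vector<double> multiply(const CsrMatrix& m, const std::vector<double>& x) {
    assert(x.size() == m.cols);
    std::vector<double> y(m.rows, 0.0);
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t k = m.rowStart[r]; k < m.rowStart[r + 1]; ++k) {
            y[r] += m.values[k] * x[m.colIndex[k]];
        }
    }
    return y;
}
```

    The rows of the product are independent of each other, so the outer loop is a natural candidate for a data-parallel skeleton, although rows with very different numbers of nonzeros raise load-balancing questions.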

Further thesis projects in compiler technology and parallel programming
on request (chrke at ida.liu.se).


External Thesis Projects

in cooperation with local industry partners
  • Performance optimization of security functions in IoT devices (30hp)

    See separate project description. In cooperation with Ericsson Research, Lund.
    Prerequisites: Solid background in computer networks, security, embedded systems, multicore architecture and programming, compilers, and C/C++. Ability to work independently.





Responsible for this page: Christoph Kessler, IDA

Page responsible: Webmaster
Last updated: 2016-05-04