OPEN MASTER THESIS PROJECTS

Research Group on Compiler Technology and Parallel Computing

Prof. Christoph Kessler

Internal Thesis Projects

Note:

Most projects in this list require a solid background in either compiler construction or parallel programming (some both); at least one major course (preferably at master level including programming labs) in these areas should be passed successfully. Specific prerequisites are listed below.

Note to non-LiU students (FAQ): If you want to do a thesis project with us, you must be registered in a master's (or bachelor's) program at Linköping University. It is generally not possible to do such projects remotely.

  • [taken (M.Å.)] Editor integration with source code analysis and debugging for the SkePU high-level parallel programming framework (30hp or 2x30hp)
    Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. The skeletons are typically provided as libraries, or in the case of SkePU, as a framework with both a library and a custom compiler toolchain. In effect, SkePU forms a skeleton programming language "embedded" in C++. While mostly C++-compatible, a SkePU program (when executing in a parallel context) introduces additional rules and semantics for certain programming constructs, in particular the user-functions. If the programmer violates these requirements, the result is a run-time fault such as aborted execution or non-deterministic output. In contrast, errors in the source code syntax will result in compile-time faults. However, as SkePU's library component is implemented as a header-only template metaprogramming library, compiler error messages tend to be very long and deeply nested, exposing unintelligible implementation details to the high-level user.
    Task: In this project, we aim to develop source-code editor integration for the SkePU framework. The programming environment shall be aware of fundamental SkePU constructs such as skeletons, user-functions, and smart data-containers. This integration is intended to help the programmer to write correct source code from the start (e.g. by providing code completion) as well as to simplify debugging of already written code. The main candidate approach for implementing the editor integration is by conforming to the Language Server Protocol (LSP). LSP is an open source JSON-based message specification for communication between IDEs (or other editors) and "language servers", separate binaries or libraries providing language-specific information to the editor about the files being processed. Using LSP for this project has two main benefits:
    1. The resulting implementation is open and editor-agnostic.
    2. An LSP server is available in LLVM/clang through the clangd project. SkePU's precompiler is already based on LLVM/clang, and it may be possible to integrate it with clangd.
    Prerequisites: Mandatory: Advanced C++ programming; Compiler construction fundamentals; Basic understanding of parallel programming concepts.
    Useful: Experience with Linux, LLVM, JSON, CUDA, OpenCL.
    Contact August Ernstsson or Christoph Kessler for further information on this project.
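As a rough illustration of the message format mentioned in the task description of this project, the sketch below frames a JSON payload the way the LSP base protocol expects (a Content-Length header, a blank line, then the JSON body) and assembles a `textDocument/publishDiagnostics` notification. The helper names `frame_lsp_message` and `make_diagnostic_notification` are hypothetical and not part of SkePU or clangd; the JSON is assembled by hand and assumes that the strings need no escaping.

```cpp
#include <string>

// Wrap a JSON payload in LSP base-protocol framing:
// "Content-Length: <bytes>", a blank line, then the payload itself.
std::string frame_lsp_message(const std::string& json_payload) {
    return "Content-Length: " + std::to_string(json_payload.size()) +
           "\r\n\r\n" + json_payload;
}

// Hypothetical helper: build a publishDiagnostics notification reporting
// one error (severity 1) at the start of a zero-based line.
// Assumption: uri and message contain no characters that need JSON escaping.
std::string make_diagnostic_notification(const std::string& uri, int line,
                                         const std::string& message) {
    const std::string pos =
        "{\"line\":" + std::to_string(line) + ",\"character\":0}";
    return "{\"jsonrpc\":\"2.0\",\"method\":\"textDocument/publishDiagnostics\","
           "\"params\":{\"uri\":\"" + uri + "\",\"diagnostics\":[{"
           "\"range\":{\"start\":" + pos + ",\"end\":" + pos + "},"
           "\"severity\":1,\"message\":\"" + message + "\"}]}}";
}
```

A language server would write `frame_lsp_message(make_diagnostic_notification(...))` to its output stream; which SkePU rule violations to report, and with what wording, is left open by the project description.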

  • Nested Parallelism in Algorithmic Skeleton Programming Frameworks (30hp or 2x30hp)
    Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU and Muesli implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. This can result in a very high degree of available parallelism in the target system. For simple programs, the skeleton abstraction works well and can utilize the parallelism efficiently with very few lines of code. However, with more complex applications the choice of the right skeleton patterns to use can be difficult, and sometimes there are no suitable patterns available in the provided skeleton set.
    Task: This project aims to extend the skeleton abstraction in SkePU and/or Muesli with multi-level or "nested" parallelism. The goal is to investigate whether the option to invoke new skeleton patterns from within a user-function can improve parallelization efficiency, programmer productivity, or both, and in which type(s) of applications this feature is advantageous. The execution contexts outside and within a skeleton/user-function differ greatly in the implementation of SkePU, which makes the addition of nested parallelism nontrivial. There are open questions regarding the syntax of nested skeleton calls, whether the set of available skeleton calls should be restricted for nested calls (likely to be the case), and how allocation of resources is affected by the introduction of nested parallelism.
    The aim is to incur no overhead from this feature when it is not used, and minimal overhead also for programs using nested parallelism. (It is therefore not advised to dynamically allocate resources during execution of a nested skeleton. Heuristics, static analysis, or other tools could be used to predict a sufficient amount of resources beforehand.)
    International collaboration: This project can, depending on the time frame, be conducted in collaboration with researchers from the University of Münster, Germany.
    Prerequisites:
    Mandatory: Advanced C/C++ programming; Good understanding of parallel programming concepts; Basic GPU programming with CUDA and/or OpenCL.
    Useful: Prior experience with SkePU, e.g. through the TDDD56 lab series.
    Contact August Ernstsson or Christoph Kessler for further information on this project.
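To make the idea of a nested skeleton call in this project more concrete, here is a minimal sequential sketch in plain C++. It is not the SkePU or Muesli API: `map_skeleton`, `reduce_skeleton`, and `row_sums` are hypothetical stand-ins that only show the shape of the problem, namely a user-function passed to an outer map that itself invokes a reduce.

```cpp
#include <vector>

// Sequential stand-in for a "reduce" skeleton: folds the data with a
// binary user-function, starting from init.
template <typename T, typename BinaryOp>
T reduce_skeleton(const std::vector<T>& data, T init, BinaryOp op) {
    T acc = init;
    for (const T& x : data) acc = op(acc, x);
    return acc;
}

// Sequential stand-in for a "map" skeleton: applies a unary user-function
// to every element and collects the results.
template <typename In, typename UnaryOp>
auto map_skeleton(const std::vector<In>& data, UnaryOp op)
    -> std::vector<decltype(op(data[0]))> {
    std::vector<decltype(op(data[0]))> out;
    out.reserve(data.size());
    for (const In& x : data) out.push_back(op(x));
    return out;
}

// Nested use: the user-function of the outer map calls a reduce on each row.
std::vector<int> row_sums(const std::vector<std::vector<int>>& rows) {
    return map_skeleton(rows, [](const std::vector<int>& row) {
        return reduce_skeleton(row, 0, [](int a, int b) { return a + b; });
    });
}
```

In a parallel framework the inner call would run inside the execution context of the outer skeleton, which is exactly where the open questions about syntax, permitted skeletons, and resource allocation arise.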

  • Software Testing Methodology and Framework for High-Level Parallel Programs (30hp, 2x30hp, or 16hp)
    SkePU is an open-source programming framework for portable, high-level, single-source programming of heterogeneous parallel computer systems, such as systems with GPU-accelerated multicore CPUs. In SkePU programs, parallelism is expressed using so-called (algorithmic) skeletons, which are generic, high-level programming constructs derived from higher-order functions such as map, reduce, scan, stencil etc., that can be instantiated by customization in problem-specific sequential code, and for which efficient parallel and accelerator-specific implementations are provided. SkePU programs look like well-structured sequential C++ code; instantiated skeletons can be invoked like any manually written C++ function, but inherit all parallel implementations from the different generic parallel implementations (also known as back-ends) of the skeleton.
    Unlike most other high-level parallel programming frameworks, the SkePU skeletons are variadic (can take any number of data-container operands) and polymorphic in both operand shape (accepting data-container operands of any shape, i.e., vectors, matrices, tensors) and element type. In addition, many skeletons can also be configured to specialize their behavior. Hence, a large number of such combinations may occur in practice. However, only a few of these combinations are currently tested. For SkePU development, it is nevertheless desirable to automatically check that, after changes made to a specific data-container shape or a specific skeleton type, SkePU still works consistently across all or many possible combinations. A possible approach to automating this is fuzz testing.
    This thesis project will develop a methodology for systematically generating test cases for SkePU programs and, depending on scope, also realize distributed parallel testing on GPU clusters.
    The project scope and depth can be configured to match a 16hp, 30hp or 2x30hp project.
    This is a research-oriented project. If the result looks publishable, we will encourage you to jointly write and submit a research paper to a conference and support your presentation.
    Prerequisites: Multithreaded (OpenMP) and GPU (CUDA, OpenCL) programming (e.g. TDDD56); advanced C++ programming skills; good background in software engineering, esp. software testing; Linux.
    Contact: Christoph Kessler.
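The combination space described for this project can be sketched as follows. The code enumerates skeleton, container-shape, and element-type combinations and draws a reproducible pseudo-random sample from them, which is one simple starting point for fuzz-style test generation. All names (`TestConfig`, `all_configs`, `sample_configs`) are hypothetical and not part of SkePU, and the lists of skeletons, shapes, and element types are supplied by the caller.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// One point in the (hypothetical) test space.
struct TestConfig {
    std::string skeleton;      // e.g. "map", "reduce", "scan", "stencil"
    std::string shape;         // e.g. "vector", "matrix", "tensor"
    std::string element_type;  // e.g. "int", "float", "double" (assumed)
};

// Enumerate the full Cartesian product of the three dimensions.
std::vector<TestConfig> all_configs(const std::vector<std::string>& skeletons,
                                    const std::vector<std::string>& shapes,
                                    const std::vector<std::string>& types) {
    std::vector<TestConfig> configs;
    for (const std::string& sk : skeletons)
        for (const std::string& sh : shapes)
            for (const std::string& ty : types)
                configs.push_back(TestConfig{sk, sh, ty});
    return configs;
}

// Draw `count` configurations (with repetition) using a fixed seed, so that
// a failing run can be reproduced from the seed alone.
std::vector<TestConfig> sample_configs(const std::vector<TestConfig>& all,
                                       std::size_t count, unsigned seed) {
    std::vector<TestConfig> picked;
    if (all.empty()) return picked;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, all.size() - 1);
    for (std::size_t i = 0; i < count; ++i) picked.push_back(all[pick(rng)]);
    return picked;
}
```

Exhaustive enumeration is feasible only while the dimensions stay small; once skeleton configuration options and operand counts are added, sampling as in `sample_configs` (or a smarter coverage-guided strategy) becomes the more realistic option.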

  • Parallel I/O for skeleton programs in SkePU (30hp or 16hp)
    The C++ based portable skeleton programming framework SkePU for heterogeneous multicore systems and clusters is designed to work on data types usually residing in main memory, so-called data-containers.
    This thesis project will investigate how SkePU data-containers can efficiently interface with the Hadoop Distributed File System (HDFS) in order to provide distributed parallel I/O on large distributed files. A prototype of the solution will be implemented in the open-source SkePU framework and evaluated with several simple big-data analytics computations.
    The project can be configured for Master or Bachelor thesis level.
    Prerequisites: Advanced C++ (esp. template metaprogramming), Big-Data Analytics and/or parallel programming courses, some familiarity with Linux, git, cmake, HDFS.
    Contact: Christoph Kessler.

  • [taken (E.F.)] Visualization of skeleton execution and data movement for SkePU (16hp or 30hp)
    SkePU is a C++ based framework for portable high-level programming for heterogeneous parallel systems, developed in our group as an open-source effort.
    The purpose of this thesis project is to investigate suitable ways of visualizing skeleton program execution on heterogeneous and cluster systems, and then to extend SkePU to generate a suitable graphical visualization of SkePU program execution over time (as in VAMPIR or Paraver for MPI execution or VITE for heterogeneous systems).
    Basic functionality might be achieved by extending SkePU to add target code that generates log files during execution, and converting these to the required input format for an existing open-source visualization tool (to be selected in the project), ideally capable of interactive (online) visualization.
    More advanced features may include the design and implementation of entirely new visualization methods for SkePU-specific features, such as container data access patterns and cache performance.
    The generated visualization must be target-platform-independent and ideally have only a few lightweight UI framework dependencies, such as HTML or JavaScript. It should make it possible to relate runtime events (skeleton instance executions, data transfers) to source code lines and variables, and provide means to properly display such information for cross-referencing.
    This project can be configured either as a bachelor or master thesis project.
    Prerequisites: Advanced C++ programming, Linux. Some background in parallel/GPU programming is useful but not absolutely required. Experience with using supercomputing resources at NSC, as in TDDC78, is useful.
    Contact: Christoph Kessler.
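The log-file approach described for the visualization project could start from a record like the one sketched below, which ties a runtime event to a source location and is written as one JSON object per line. The `TraceEvent` struct, its field names, and `to_json_line` are hypothetical; they do not reflect any existing SkePU logging format, and the serialization assumes the strings need no JSON escaping.

```cpp
#include <sstream>
#include <string>

// Hypothetical trace record linking a runtime event to its source location.
struct TraceEvent {
    long long start_us;       // event start time in microseconds
    long long duration_us;    // event duration in microseconds
    std::string kind;         // e.g. "skeleton" or "transfer"
    std::string name;         // e.g. skeleton instance or container variable
    std::string source_file;  // file containing the originating call
    int source_line;          // line of the originating call
};

// Serialize one event as a single-line JSON object.
// Assumption: the string fields contain no characters that need escaping.
std::string to_json_line(const TraceEvent& e) {
    std::ostringstream out;
    out << "{\"start_us\":" << e.start_us
        << ",\"duration_us\":" << e.duration_us
        << ",\"kind\":\"" << e.kind << "\""
        << ",\"name\":\"" << e.name << "\""
        << ",\"file\":\"" << e.source_file << "\""
        << ",\"line\":" << e.source_line << "}";
    return out.str();
}
```

A line-oriented format like this is easy to append to during execution and to convert afterwards into whatever input format the chosen visualization tool requires.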
  • [taken (A.S.)] Distributed Stream Processing with SkePU (30hp)
    SkePU is a C++ based framework for portable high-level programming for heterogeneous parallel systems, developed in our group as an open-source effort.
    The purpose of this thesis project is to design, implement and evaluate an extension of SkePU for data-flow driven distributed computing and for automatically deploying such SkePU applications on multi-node computer systems, such as MPI clusters or smaller client-server configurations. Inspirations for the design may be taken from stream-oriented programming models such as Spark Streaming or Flink, and from software architecture languages.
    Prerequisites: TDDC78 Programming of parallel computers, TDDD56 Multicore and GPU Programming, TDDD25 Distributed systems, Advanced C++ (template metaprogramming), Linux. Also useful: TDDE31 Big data analytics.
    Contact and more information: Christoph Kessler.

Further thesis projects for LiU-based students with an interest in compiler technology and/or parallel programming are available on request; please contact me.


External Thesis Projects

in cooperation with partners in industry or research institutes





Responsible for this page: Christoph Kessler, IDA

Page responsible: Webmaster
Last updated: 2023-12-07