OPEN MASTER THESIS PROJECTS

Research Group on Compiler Technology and Parallel Computing at PELAB

Prof. Christoph Kessler

Internal thesis projects currently available in my group
External thesis projects in cooperation with partners in industry or research institutes

Internal Thesis Projects

Note:
Most projects in this list require a solid background in either compiler construction or parallel programming (some both); at least one major course (preferably at master level including programming labs) in these areas should be passed successfully. Specific prerequisites are listed below.

Note to non-LIU students (FAQ): If you want to do a thesis project with us, you must be registered on a master (or bachelor) program at Linköping University. It is generally not possible to do such projects remotely.

[TAKEN (F.B.)] SkePU backend for CUDA tensor cores (30hp)
Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. The skeletons are typically provided as libraries, or in the case of SkePU, as a framework with both a library and a custom compiler toolchain. The SkePU library is implemented in modern C++ and involves template metaprogramming. SkePU is a long-term open-source effort at PELAB, Linköping University.
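To make the pattern idea concrete, the following is a minimal sequential sketch in plain standard C++. It does not use the SkePU API: the names map_skeleton, reduce_skeleton, mult, add, and dot are hypothetical and only illustrate how a skeleton is instantiated with a user-function. A real backend would execute the same patterns in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "map" skeleton: applies the user-function f element-wise.
// Sequential here; a parallel backend may process elements in any order.
template <typename T, typename F>
std::vector<T> map_skeleton(F f, const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());  // element-wise patterns need equally sized operands
    std::vector<T> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = f(a[i], b[i]);
    return out;
}

// Hypothetical "reduce" skeleton: folds all elements with the user-function f.
// For a parallel backend, f would have to be associative.
template <typename T, typename F>
T reduce_skeleton(F f, const std::vector<T>& v, T init) {
    T acc = init;
    for (const T& x : v) acc = f(acc, x);
    return acc;
}

// User-functions: plain sequential code supplied by the programmer.
inline float mult(float x, float y) { return x * y; }
inline float add(float x, float y) { return x + y; }

// A dot product expressed as map(mult) followed by reduce(add).
inline float dot(const std::vector<float>& a, const std::vector<float>& b) {
    return reduce_skeleton(add, map_skeleton(mult, a, b), 0.0f);
}
```

For example, dot({1, 2, 3}, {4, 5, 6}) evaluates 1*4 + 2*5 + 3*6. The caller only writes the two user-functions; how the patterns are executed is left to the skeleton implementation.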
This project will explore AI accelerator architectures as new SkePU targets, such as Google TPU, the NPU, and Nvidia Tensor Cores. Implementation and experimentation work will be limited to one of these platforms, namely, Tensor Cores.
Task: This thesis project will explore several different parallel AI accelerators, identify methods for efficiently mapping BLAS and CNN operations to these, develop a new SkePU backend for Nvidia tensor cores, and evaluate its performance. The possibilities and performance implications of hybrid computing involving both CUDA cores and tensor cores in parallel [Ho et al. 2022] should be investigated, too, possibly taking inspiration from earlier work demonstrating hybrid computing on CPU and CUDA cores [Öhberg et al. 2020].
Prerequisites: TDDD56 Multicore and GPU Programming (mandatory), Advanced Programming in C++ (mandatory), TDDE65 Programming of parallel computers (recommended). Linux programming skills.
Contact: August Ernstsson, Christoph Kessler

High-level program optimization in the SkePU precompiler (30hp)
Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. The skeletons are typically provided as libraries, or in the case of SkePU, as a framework with both a library and a custom pre-compiler toolchain. The pre-compiler is a source-to-source compiler based on LLVM clang. It performs a rather light-weight source code transformation; in particular, it generates platform-specific code variants from the user-functions that can be used with the different SkePU back-ends.
SkePU is a long-term open-source effort at PELAB, Linköping University.
Task: In this project, the pre-compiler will be extended by more advanced code transformations. For example, pattern-matching on the clang intermediate program representation can be applied to rewrite identified code structures into equivalent ones that are better supported in SkePU or in platform-specific libraries.
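As a rough illustration of such a pattern-matching rewrite, the sketch below uses a toy expression tree in standard C++. It is not the clang program representation and not part of the SkePU pre-compiler; the Expr type and the helpers var, bin, rewrite, and to_string are hypothetical. The rule replaces (x * y) + z by a fused multiply-add node, which is equivalent only up to floating-point rounding.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree (hypothetical; NOT the clang AST).
struct Expr {
    std::string op;                 // "var", "+", "*", or "fma"
    std::string name;               // variable name, used only if op == "var"
    std::unique_ptr<Expr> a, b, c;  // operands (c is used only by "fma")
};
using ExprPtr = std::unique_ptr<Expr>;

inline ExprPtr var(std::string n) {
    auto e = std::make_unique<Expr>();
    e->op = "var";
    e->name = std::move(n);
    return e;
}

inline ExprPtr bin(std::string op, ExprPtr l, ExprPtr r) {
    auto e = std::make_unique<Expr>();
    e->op = std::move(op);
    e->a = std::move(l);
    e->b = std::move(r);
    return e;
}

// Bottom-up rewrite rule: (x * y) + z  ==>  fma(x, y, z).
inline ExprPtr rewrite(ExprPtr e) {
    if (!e) return e;
    e->a = rewrite(std::move(e->a));
    e->b = rewrite(std::move(e->b));
    e->c = rewrite(std::move(e->c));
    if (e->op == "+" && e->a && e->a->op == "*") {
        auto f = std::make_unique<Expr>();
        f->op = "fma";
        f->c = std::move(e->b);     // addend z
        f->b = std::move(e->a->b);  // factor y
        f->a = std::move(e->a->a);  // factor x
        return f;
    }
    return e;
}

// Print a tree so that the effect of the rewrite can be inspected.
inline std::string to_string(const Expr& e) {
    if (e.op == "var") return e.name;
    assert(e.a && e.b);  // every non-variable node has at least two operands
    if (e.op == "fma")
        return "fma(" + to_string(*e.a) + ", " + to_string(*e.b) + ", " + to_string(*e.c) + ")";
    return "(" + to_string(*e.a) + " " + e.op + " " + to_string(*e.b) + ")";
}
```

Applying rewrite to the tree for (x * y) + z yields fma(x, y, z), while a tree that does not match the pattern, such as x * y, is returned unchanged.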
Please contact us directly for further information.
Prerequisites: TDDB44 Compiler Construction (mandatory), TDDD56 Multicore and GPU Programming (mandatory), Advanced Programming in C++ (recommended). Programming skills in Linux. Some background in artificial neural networks (DNN, CNN, ...) can be useful.
Contact: August Ernstsson, Christoph Kessler

[TAKEN (M.Å.)] Editor integration with source code analysis and debugging for the SkePU high-level parallel programming framework (30hp or 2x30hp)
Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. The skeletons are typically provided as libraries, or in the case of SkePU, as a framework with both a library and a custom compiler toolchain. In effect, SkePU forms a skeleton programming language "embedded" in C++. While mostly C++-compatible, a SkePU program (when executing in a parallel context) introduces additional rules and semantics for certain programming constructs, in particular the user-functions. If the programmer violates these requirements, the result is a run-time fault such as aborted execution or non-deterministic output. In contrast, errors in the source code syntax will result in compile-time faults. However, as SkePU's library component is implemented as a header-only template metaprogramming library, compiler errors tend to be very long and deeply nested, and they expose unintelligible implementation details to the high-level user.
Task: In this project, we aim to develop source-code editor integration for the SkePU framework. The programming environment shall be aware of fundamental SkePU constructs such as skeletons, user-functions, and smart data-containers. This integration is intended to help the programmer to write correct source code from the start (e.g. by providing code completion) as well as to simplify debugging of already written code. The main candidate approach for implementing the editor integration is by conforming to the Language Server Protocol (LSP). LSP is an open source JSON-based message specification for communication between IDEs (or other editors) and "language servers", separate binaries or libraries providing language-specific information to the editor about the files being processed. Using LSP for this project has two main benefits:
1. The resulting implementation is open and editor-agnostic.
2. An LSP server is available in LLVM/clang through the clangd project. SkePU's precompiler is already based on LLVM/clang, and it may be possible to integrate it with clangd.
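To give an impression of the messages involved, here is a minimal sketch in standard C++ that frames a completion request the way the LSP base protocol prescribes: a Content-Length header, a blank line, and a JSON-RPC 2.0 body. The helper names frame_lsp_message and completion_request are hypothetical, the example does not talk to clangd, and the JSON is assembled by hand only to keep the sketch free of dependencies.

```cpp
#include <cassert>
#include <string>

// Frame a JSON payload as required by the LSP base protocol: a
// Content-Length header (payload size in bytes), a blank line, then the JSON.
inline std::string frame_lsp_message(const std::string& json) {
    return "Content-Length: " + std::to_string(json.size()) + "\r\n\r\n" + json;
}

// Build a framed textDocument/completion request for a zero-based cursor
// position. Simplification: uri is assumed to need no JSON escaping.
inline std::string completion_request(int id, const std::string& uri,
                                      int line, int character) {
    assert(line >= 0 && character >= 0);  // LSP positions are zero-based
    const std::string json =
        "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id) +
        ",\"method\":\"textDocument/completion\",\"params\":{"
        "\"textDocument\":{\"uri\":\"" + uri + "\"},"
        "\"position\":{\"line\":" + std::to_string(line) +
        ",\"character\":" + std::to_string(character) + "}}}";
    return frame_lsp_message(json);
}
```

An editor would send the result of, say, completion_request(1, "file:///main.cpp", 4, 10) to the language server and receive a list of completion items in the response. A SkePU-aware server could then contribute skeleton-specific items in that list.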
Prerequisites: Mandatory: Advanced C++ programming; Compiler construction fundamentals; Basic understanding of parallel programming concepts.
Useful: Experience with Linux, LLVM, JSON, CUDA, OpenCL.
Contact August Ernstsson or Christoph Kessler for further information on this project.

Nested Parallelism in Algorithmic Skeleton Programming Frameworks (30hp or 2x30hp)
Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU and Muesli implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. This can result in a very high degree of available parallelism in the target system. For simple programs, the skeleton abstraction works well and can utilize the parallelism efficiently with very few lines of code. However, with more complex applications the choice of the right skeleton patterns to use can be difficult, and sometimes there are no suitable patterns available in the provided skeleton set.
Task: This project aims to extend the skeleton abstraction in SkePU and/or Muesli with multi-level or "nested" parallelism. The goal is to investigate whether the option to invoke new skeleton patterns from within a user-function can improve parallelization efficiency, programmer productivity, or both, and in which type(s) of applications this feature is advantageous. The execution contexts outside and within a skeleton/user-function differ greatly in the implementation of SkePU, which makes the addition of nested parallelism nontrivial. There are open questions regarding the syntax of nested skeleton calls, whether the set of available skeleton calls should be restricted for nested calls (likely to be the case), and how allocation of resources is affected by the introduction of nested parallelism.
The aim is to incur no overhead from this feature when it is not used, and minimal overhead also for programs using nested parallelism. (It is therefore not advised to dynamically allocate resources during execution of a nested skeleton. Heuristics, static analysis, or other tools could be used to predict a sufficient amount of resources beforehand.)
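The shape of a nested skeleton call can be sketched in plain standard C++ as follows. This is a sequential stand-in, not SkePU or Muesli syntax (which is one of the open questions above); the names map_skel, map_reduce_skel, and matvec are hypothetical. The outer map runs over the rows of a matrix, and its user-function itself invokes an inner map-reduce pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical outer skeleton: applies the user-function f to every element.
template <typename Out, typename In, typename F>
std::vector<Out> map_skel(F f, const std::vector<In>& in) {
    std::vector<Out> out;
    out.reserve(in.size());
    for (const In& x : in) out.push_back(f(x));
    return out;
}

// Hypothetical inner skeleton: pairwise map with m, then reduction with r.
template <typename T, typename M, typename R>
T map_reduce_skel(M m, R r, const std::vector<T>& a, const std::vector<T>& b, T init) {
    assert(a.size() == b.size());  // pairwise map needs equally sized operands
    T acc = init;
    for (std::size_t i = 0; i < a.size(); ++i) acc = r(acc, m(a[i], b[i]));
    return acc;
}

// Matrix-vector product: the user-function of the outer map contains a
// nested skeleton call that computes one row's dot product with x.
inline std::vector<double> matvec(const std::vector<std::vector<double>>& A,
                                  const std::vector<double>& x) {
    auto row_times_x = [&x](const std::vector<double>& row) {
        return map_reduce_skel(
            [](double p, double q) { return p * q; },
            [](double s, double t) { return s + t; },
            row, x, 0.0);
    };
    return map_skel<double>(row_times_x, A);
}
```

With nested parallelism, both the outer map over rows and the inner reduction per row would be candidates for parallel execution, which is where the resource allocation question arises.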
International collaboration: This project can, depending on the time frame, be conducted in collaboration with researchers from the University of Münster, Germany.
Prerequisites:
Mandatory: Advanced C/C++ programming; Good understanding of parallel programming concepts; Basic GPU programming with CUDA and/or OpenCL.
Useful: Prior experience with SkePU, e.g. through the TDDD56 lab series.
Contact August Ernstsson or Christoph Kessler for further information on this project.

[TAKEN] Software Testing Methodology and Framework for High-Level Parallel Programs (30hp, 2x30hp, or 16hp)
SkePU is an open-source programming framework for portable, high-level, single-source programming of heterogeneous parallel computer systems, such as systems with GPU-accelerated multicore CPUs. In SkePU programs, parallelism is expressed using so-called (algorithmic) skeletons, which are generic, high-level programming constructs derived from higher-order functions such as map, reduce, scan, stencil etc., that can be instantiated by customization in problem-specific sequential code, and for which efficient parallel and accelerator-specific implementations are provided. SkePU programs look like well-structured sequential C++ code; instantiated skeletons can be invoked like any manually written C++ function, but inherit all parallel implementations from the different generic parallel implementations (also known as back-ends) of the skeleton.
Unlike most other high-level parallel programming frameworks, the SkePU skeletons are variadic (can take any number of data-container operands) and polymorphic in both operand shape (accepting data-container operands of any shape, i.e., vectors, matrices, tensors) and element type. In addition, many skeletons can also be configured to specialize their behavior. Hence, many such combinations may occur in practice. However, only a few of these combinations are currently tested. For SkePU development, it is nevertheless desirable to automatically check that after changes made to a specific data-container shape or a specific skeleton type, SkePU still works consistently across all/many possible combinations. A possible approach to automating this is fuzz testing.
This thesis project will develop a methodology for systematically generating test cases for SkePU programs and, depending on scope, also realize distributed parallel testing on GPU clusters.
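One simple form of such test generation is differential fuzz testing: random inputs are fed both to the implementation under test and to a trusted sequential reference, and the results are compared. The sketch below shows the idea in standard C++ for a reduction. It does not use SkePU; sum_under_test is a hypothetical stand-in for a parallel backend, and the size and value ranges are arbitrary assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical implementation under test: sums two halves separately,
// mimicking a reduction that is split across two workers.
inline long long sum_under_test(const std::vector<int>& v) {
    const auto mid = static_cast<std::ptrdiff_t>(v.size() / 2);
    const long long lo = std::accumulate(v.begin(), v.begin() + mid, 0LL);
    const long long hi = std::accumulate(v.begin() + mid, v.end(), 0LL);
    return lo + hi;
}

// Trusted sequential reference.
inline long long sum_reference(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// Generate `rounds` random test cases (random size, random contents) from a
// fixed seed and return how many of them give a result that differs from
// the reference.
inline int fuzz_reduce(unsigned seed, int rounds) {
    assert(rounds >= 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> size_dist(0, 1000);
    std::uniform_int_distribution<int> value_dist(-1000, 1000);
    int mismatches = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<int> v(size_dist(rng));
        for (int& x : v) x = value_dist(rng);
        if (sum_under_test(v) != sum_reference(v)) ++mismatches;
    }
    return mismatches;
}
```

A call such as fuzz_reduce(42, 100) should report zero mismatches for a correct implementation. Fixing the seed keeps failures reproducible. For SkePU, the generator would additionally have to vary skeleton type, container shape, element type, and backend.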
The project scope and depth can be configured to match a 16hp, 30hp or 2x30hp project.
This is a research-oriented project. If the result looks publishable, we will encourage you to jointly write and submit a research paper to a conference and support your presentation.
Prerequisites: Multithreaded (OpenMP) and GPU (CUDA, OpenCL) programming (e.g. TDDD56), advanced C++ programming skills; good background in software engineering, esp. software testing. Linux.
Contact: Christoph Kessler.

[RESERVED (V.E.)] Skeleton computing as a service (30hp)
Background: High-level parallel programming aims to abstract challenging aspects of parallel and heterogeneous systems for non-expert programmers. Algorithmic skeletons are an interface approach based on computational patterns, such as map, reduce, and stencil operations. These patterns can be instantiated by providing a custom operator ("user-function"), which is then applied to a supplied dataset in parallel according to the particular pattern semantics. Skeleton programming frameworks and libraries such as SkePU implement skeletons as C++ constructs and provide "backends" for parallelism in multi-core CPUs, GPU accelerators, and multi-node clusters. The skeletons are typically provided as libraries, or in the case of SkePU, as a framework with both a library and a custom compiler toolchain. The SkePU library is implemented in modern C++ and involves template metaprogramming. SkePU is a long-term open-source effort at PELAB, Linköping University.
Task: This thesis project will develop a method for setting up SkePU skeleton instantiations and computations as microservices on heterogeneous parallel computing resources in the cloud or in edge computing resources for portable remote execution. This includes the specification and generation of efficient interfaces and efficient operand data transfer, the remote deployment of a SkePU microservice with skeleton instantiation and invocation mechanisms, and the evaluation of the implementation for performance, portability, ease of use, and for security weaknesses. The project should also elaborate on suitable (remotely verifiable) restrictions on user functions to be used with such services to avoid security loopholes, and implement a simple rule-based source code checker for user functions to statically verify absence of "dangerous" constructs, or at least avoid known attack patterns with high probability.
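A very simple rule-based checker of this kind could look like the sketch below, written in standard C++. The deny-list entries and the names find_violations and is_accepted are hypothetical assumptions for illustration. Plain substring matching is used only to keep the example short; it produces false positives and is easy to circumvent, so a realistic checker would rather inspect the parsed program.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return every deny-list pattern that occurs in the user-function source.
// The deny-list is a hypothetical example, not a vetted security policy.
inline std::vector<std::string> find_violations(const std::string& source) {
    static const std::vector<std::string> deny_list = {
        "system(", "popen(", "fopen(", "reinterpret_cast", "#include"};
    std::vector<std::string> hits;
    for (const std::string& pattern : deny_list) {
        if (source.find(pattern) != std::string::npos) hits.push_back(pattern);
    }
    return hits;
}

// A user-function is accepted only if no deny-list pattern was found.
inline bool is_accepted(const std::string& source) {
    return find_violations(source).empty();
}
```

A user-function such as float f(float x) { return x * x; } is accepted, whereas one that calls system(...) is rejected and the offending pattern is reported.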
Inspiration for the service implementation can be taken from CORBA and subsequent component-based frameworks, from MapReduce and Spark, and from a recent master thesis project that extended SkePU for execution of stream-parallel applications in distributed systems. An experimental testbed with a number of Raspberry Pi units and GPU-accelerated servers is available for the evaluation.
Prerequisites: TDDD56 Multicore and GPU Programming (mandatory), TDDD25 Distributed Systems (mandatory), Advanced Programming in C++ (mandatory), Linux, operating systems, network programming.
Contact: Christoph Kessler, August Ernstsson.

Parallel I/O for skeleton programs in SkePU (30hp or 16hp)
The C++ based portable skeleton programming framework SkePU for heterogeneous multicore systems and clusters is designed to work on data types usually residing in main memory, so-called data-containers.
This thesis project will investigate how SkePU data-containers can efficiently interface with the Hadoop Distributed File System HDFS in order to provide distributed parallel I/O on large distributed files. The solution will be prototypically implemented in the open-source SkePU framework and evaluated with several simple big-data analytics computations.
The project can be configured for Master or Bachelor thesis level.
Prerequisites: Advanced C++ (esp., template metaprogramming), Big-Data Analytics and/or parallel programming courses, some familiarity with Linux, git, cmake, HDFS.
Contact: Christoph Kessler.

Dynamic Task Migration in Generalized Stream Processing Pipelines for Heterogeneous Distributed Systems (30hp)
We consider distributed soft real-time applications that process continuous data streams such as sensor data or video contents. Such applications can be computationally expensive, and are often organized for parallel processing by pipelining. In heterogeneous parallel and distributed systems, such as the IoT device / edge / cloud continuum, the different pipeline tasks can be internally parallel again and be more or less suitable for running on accelerators such as GPUs. In general, we have for each task multiple equivalent implementations to choose from. These could, for example, be provided by the programmer or generated from a high-level specification, such as SkePU code. SkePU is a C++ based framework for portable high-level programming for heterogeneous parallel systems, developed in our group as an open-source effort.
In a previous master thesis project, a prototype framework for specifying and deploying pipelines of SkePU-defined tasks has been designed and implemented.
The purpose of this thesis project is to make this framework dynamic, i.e., to allow for changing deployment options at runtime while the application pipeline is processing data, without corrupting data or having to restart the computation from the beginning.
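The core idea of switching between equivalent implementations while a stream is being processed can be sketched as follows in standard C++. This is not the existing prototype framework; Stage, run_stream, and the two variants are hypothetical. The switch takes effect only between two stream items, so no item is processed by a half-migrated task.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Two equivalent implementations ("variants") of the same pipeline task.
inline int scale_cpu(int x) { return 2 * x; }
inline int scale_accel(int x) { return x + x; }  // stand-in for an accelerator variant

// A pipeline stage whose active variant can be changed at run time.
struct Stage {
    std::atomic<int> active_variant{0};
    int process(int x) const {
        return active_variant.load() == 0 ? scale_cpu(x) : scale_accel(x);
    }
};

// Process a stream and migrate the stage to variant 1 just before item
// `migrate_at`. The migration happens at an item boundary, so the output
// stream is the same as without migration.
inline std::vector<int> run_stream(Stage& stage, const std::vector<int>& items,
                                   std::size_t migrate_at) {
    std::vector<int> out;
    out.reserve(items.size());
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == migrate_at) stage.active_variant.store(1);
        out.push_back(stage.process(items[i]));
    }
    return out;
}
```

In a distributed setting the switch request would arrive asynchronously from a runtime system, and migrating a task may also require moving its state and in-flight data to another node, which this sketch leaves out.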
More information is available from us on request.
Prerequisites: TDDD56 Multicore and GPU Programming and/or TDDE65 Programming of parallel computers, TDDD25 Distributed systems, Advanced C++ (template metaprogramming), Linux. Also useful: TDDE31 Big data analytics.
Contact and more information: Christoph Kessler.

Further thesis projects for LiU-based students with an interest in compiler technology and/or parallel programming are available on request; please contact me.

External Thesis Projects

in cooperation with partners in industry or research institutes

For parallel computing related external thesis projects centered around real-time signal processing of LHC sensor data at CERN, Switzerland, please contact me for further information.

External thesis projects at Vector Sweden AB:
- WebAssembly for adaptive AUTOSAR applications
- Analysis of Feasibility of 10GB/s Ethernet connections for MICROSAR Adaptive
Open master thesis projects at J. Keller's group at FU Hagen, Germany, with topics in the areas of Internet security, parallel computing and fault tolerance (NB descriptions in German).

Responsible for this page: Christoph Kessler, IDA