OPEN MASTER THESIS PROJECTS
Research Group on Compiler Technology and Parallel Computing
The following master thesis projects are currently available in my group:
All projects on this page require a solid background in either compiler construction or parallel programming (preferably both); at least one major course (preferably at master level including programming labs) in these areas should be passed successfully.
Note to non-LIU students (FAQ): If you want to do a thesis project with us, you must be registered in a master's (or bachelor's) program at Linköping University. It is generally not possible to do such projects remotely.
Design and toolchain for a system architecture description language enabling holistic energy optimization (30hp)
XPDL is a novel XML-based system architecture description language for expressing high-level descriptions of the structure and relevant properties of computer systems (hardware components and system software). These descriptions support generic tools that access this information, e.g. for target-specific code generation or run-time optimizations. A particular focus of XPDL is on modeling hardware and system software features that are relevant for the energy efficiency of an application.
The purpose of this project is to elaborate the design of XPDL, in particular to specify its XML schema and semantics, and to develop a toolchain (in particular, a compiler) for parsing, representing, processing and querying XPDL models.
The development is done in the context of a running international research project concerned with portable, holistic optimization of energy efficiency across the entire application and system software stack.
Prerequisites: Background in XML technology e.g. XSD, XSLT; compiler construction (TDDB44 or similar); component-based software (TDDD05 or similar); some background in modern computer systems architecture and programming, including accelerator technology (e.g. TDDD56, TDDC78 or similar); good skills in C/C++ programming, Linux.
Support for generalized stencil computations in SkePU (30hp)
SkePU is an open-source C++ template library for portable and efficient high-level programming of GPU-based systems, using so-called skeletons. A skeleton is a generic software component modeling a specific pattern of computation; its implementation encapsulates platform-specific technical details such as parallelism and accelerator handling, communication, synchronization etc., while exposing a sequential programming interface to the programmer. SkePU currently provides one task-parallel and a number of data-parallel skeletons, including one that models stencil computations, i.e., computations that update each element of a matrix or image as a filter operation applied to its nearest neighbor elements.
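To make the skeleton idea above concrete, the following is a minimal sketch in plain C++ of a stencil skeleton on a dense matrix. It deliberately does not use the SkePU API: the names `Matrix`, `stencil2d` and `average4` are hypothetical and chosen only for illustration, and the 4-neighbor shape and the copy-through treatment of border elements are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: this is NOT the SkePU API.
// A dense matrix of floats stored in row-major order.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;

    Matrix(std::size_t r, std::size_t c, float init = 0.0f)
        : rows(r), cols(c), data(r * c, init) {}

    float& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Generic stencil "skeleton": the caller supplies only the sequential
// per-element function f(center, up, down, left, right). The loop nest
// below is the part that a skeleton library could replace by an OpenMP,
// CUDA or OpenCL implementation without changing the call site.
// Assumption: border elements are copied through unchanged.
template <typename F>
Matrix stencil2d(const Matrix& in, F f) {
    assert(in.rows >= 1 && in.cols >= 1);
    Matrix out = in;
    for (std::size_t i = 1; i + 1 < in.rows; ++i) {
        for (std::size_t j = 1; j + 1 < in.cols; ++j) {
            out.at(i, j) = f(in.at(i, j),
                             in.at(i - 1, j), in.at(i + 1, j),
                             in.at(i, j - 1), in.at(i, j + 1));
        }
    }
    return out;
}

// Example user function: average of the four nearest neighbors.
inline float average4(float /*center*/, float up, float down,
                      float left, float right) {
    return 0.25f * (up + down + left + right);
}
```

A call such as `stencil2d(image, average4)` then reads like sequential code, while the iteration strategy stays hidden inside the skeleton.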
This project will, as a case study, consider an open-source high-performance computing application from medical image processing that is currently implemented in C++ and CUDA, investigate the requirements for expressing its performance-critical parts with existing SkePU skeletons, and develop any extensions to the SkePU library that are needed to express the application more conveniently with SkePU skeletons.
Prerequisites: TDDD56 Multicore and GPU Programming, or similar course on parallel programming. Advanced C/C++ programming skills.
Contact: Christoph Kessler
Systematic Concurrent Debugging (30hp or 2x30hp)
Contact: Ahmed Rezine or Christoph Kessler
Performance modeling for multi-kernel GPU computing (30hp)
GPU programming using CUDA is becoming popular as GPUs increasingly become part of mainstream computing. Already, 62 systems in the TOP500 list are GPU-based (Nov. 2012 listing), and millions of GPUs are sold every year for mobile and traditional computing domains. Modern GPUs have already become general-purpose and task-parallel with the introduction of caches and the possibility of concurrent execution of multiple computations.
The goal of this master thesis project is to investigate the concurrent execution capabilities of modern NVIDIA (Fermi and Kepler) GPUs. The idea is to take different applications with different performance (computation, communication) characteristics and see how they behave when running concurrently with each other. Based on the experimental findings, we will try to build a model that predicts the execution behaviour of a computational kernel when it runs together with other computational kernels, given information about the GPU resources and the computational/communication needs of each kernel.
Prerequisites: TDDD56 Multicore and GPU Programming, or equivalent course that includes CUDA and OpenCL programming. Background in C/C++ programming and computer architecture.
Contact: Usman Dastgeer or Christoph Kessler.
- Dynamic Optimization of Interprocessor Communication in the MPI Back-End of the SkePU skeleton programming library (30hp)
By harnessing the computational power of modern GPUs via General-Purpose Computing on Graphics Processing Units (GPGPU), very fast calculations can be performed with a GPU cluster.
This thesis project is about extending an existing MPI cluster back-end implementation of the SkePU skeleton programming library by data types that allow for the dynamic optimization of inter-node communication, and evaluating the implementation with several test programs including a computationally intensive application.
The overall task includes developing methods for determining the optimal partitioning of the problem and for automated performance tuning for the best use of resources, possibly in a non-dedicated environment, as well as devising new SkePU skeletons for some computation/communication patterns in the considered scientific computing problem. An application from computational fluid dynamics will be used as a case study.
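As a baseline for the partitioning question above, a static block partitioning of n elements over p cluster nodes can be sketched as follows. This is only an illustrative starting point in plain C++ without any MPI calls; the function name `block_range` is hypothetical, and the dynamically tuned partitioning targeted by the project would generally differ from this even split.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of SkePU or MPI): returns the half-open
// index range [begin, end) owned by node `rank` when n elements are split
// as evenly as possible over p nodes. The first (n % p) nodes receive one
// extra element each.
inline std::pair<std::size_t, std::size_t>
block_range(std::size_t n, std::size_t p, std::size_t rank) {
    assert(p > 0 && rank < p);
    const std::size_t base  = n / p;   // minimum block size per node
    const std::size_t extra = n % p;   // number of nodes with one more element
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t size  = base + (rank < extra ? 1 : 0);
    return {begin, begin + size};
}
```

For example, 10 elements over 3 nodes yield the ranges [0, 4), [4, 7) and [7, 10). A tuned variant could replace the even split by weights derived from measured node performance.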
This Master thesis project covers the following tasks:
- Research survey of related work.
- Design and implementation of new skeleton backends in C/C++, MPI and CUDA/OpenCL.
- Skeleton-based refactoring of the given benchmark application and experimental evaluation.
- Documentation of the results in a thesis report.
Prerequisites: Courses in programming of parallel computers and GPU computing (TDDC78 and TDDD56 or equivalent). Good background in OpenCL, CUDA, MPI, C/C++, algorithms, Linux.
Contact: Christoph Kessler.
- Generating target-specific code for automatically detected algorithmic patterns in C source programs (30 ECTS)
The goal of this project is to combine and extend an already existing tool for automatic pattern recognition in C code with one or several code generators for specific target back-ends.
Motivation for using patterns: Using patterns to describe programs has three main goals. First, given combinations of pattern instances can easily be mapped to combinations of kernel implementations for a given architecture; this yields a high level of reuse of kernel implementations. Second, the patterns can be seen as a high-level programming abstraction for the architecture, leading to a component-based programming style. Third, legacy C program parts can be categorized automatically as occurrences of patterns, using pattern matching techniques on existing source code, which provides an automated migration path and improved portability.
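The mapping from a recognized pattern instance to a kernel implementation can be illustrated with a small sketch. The pattern name ("sum reduction") and the function names `legacy_sum` and `kernel_sum_reduce` are hypothetical and are not taken from the existing recognition tool; `std::accumulate` merely stands in for a target-specific kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Legacy-style loop as it might appear in a C source program.
// A pattern matcher could classify this loop as an instance of a
// "sum reduction" pattern over the array a[0..n-1].
inline double legacy_sum(const double* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        s = s + a[i];
    return s;
}

// Hypothetical kernel that a code generator could emit a call to once the
// pattern has been recognized. Here it is a plain std::accumulate; a real
// back-end might instead use OpenMP, a GPU kernel or a runtime system.
inline double kernel_sum_reduce(const double* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    return std::accumulate(a, a + n, 0.0);
}
```

Because both functions compute the same result, the generated code can substitute the kernel call for the loop, and the same kernel can be reused for every further occurrence of the pattern.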
Project work description
A prototype of a pattern recognition tool has already been developed in an earlier project. Your task is now to combine this tool with code generators in order to generate target-specific code. This work includes, among other things:
- Extend the set of already existing patterns to raise the recognition rate.
- Select targets, both high-level ones such as OpenMP or POSIX threads and low-level ones such as specific hardware architectures (e.g. GPUs). This can also involve runtime systems such as StarPU.
- Implement the code generators for the selected architectures.
- Introduce performance aware target components.
- Test and evaluate.
- (If the result is satisfactory: write, submit and possibly present a scientific paper about it at a scientific workshop or conference.)
Prerequisites: TDDB44 Compiler construction or similar course. Good programming skills in C and Java. Computer architecture course. Course in component based software. Parallel programming course.
Since this is a cross-domain project, you will probably be assigned two supervisors (handledare).
Contact for further information:
Supervisor Erik Hansson or examiner Christoph Kessler (christoph.kessler (at) liu.se)
- [taken] Sparse-Matrix support for the SkePU library for portable CPU/GPU programming (30hp)
This thesis project will extend the functionality of the SkePU library for high-level, portable programming of GPU-based systems, which was developed in our group.
A matrix is called sparse if most of its entries are zero, so that a compressed storage format is more time- and space-efficient than the traditional 2D array representation. In this master thesis project you will extend SkePU with support for sparse matrix computations. In particular, you will design a smart container data structure for the representation of generic 2D sparse matrices and implement several of the data-parallel skeletons of SkePU so that they can be applied to sparse matrices in the same way as to dense matrices, with back-ends in sequential C++, OpenMP, CUDA and OpenCL. The implementation will be evaluated quantitatively on several GPU-based platforms. Further information is available on request, see the contact information below.
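As an illustration of what a compressed storage format looks like, the sketch below uses the compressed sparse row (CSR) layout, which is one common choice and is assumed here only for the example; the project text does not prescribe a format. The names `CsrMatrix`, `to_csr` and `spmv` are hypothetical and do not describe SkePU's container design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative CSR container (an assumption, not SkePU's data structure).
struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<double> values;        // nonzero values, stored row by row
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<std::size_t> row_ptr;  // row i occupies [row_ptr[i], row_ptr[i+1])
};

// Builds a CSR matrix from a dense row-major array, keeping only nonzeros.
inline CsrMatrix to_csr(const std::vector<double>& dense,
                        std::size_t rows, std::size_t cols) {
    assert(dense.size() == rows * cols);
    CsrMatrix m;
    m.rows = rows;
    m.cols = cols;
    m.row_ptr.push_back(0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            const double v = dense[i * cols + j];
            if (v != 0.0) {
                m.values.push_back(v);
                m.col_idx.push_back(j);
            }
        }
        m.row_ptr.push_back(m.values.size());  // end of row i
    }
    return m;
}

// Sparse matrix-vector product y = A * x; only stored nonzeros are visited.
inline std::vector<double> spmv(const CsrMatrix& a,
                                const std::vector<double>& x) {
    assert(x.size() == a.cols);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k)
            y[i] += a.values[k] * x[a.col_idx[k]];
    return y;
}
```

The storage cost is proportional to the number of nonzeros plus the number of rows, rather than rows times columns, which is the time and space advantage mentioned above.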
The library is developed in C++ and OpenMP and has implementations for CUDA and OpenCL. The prerequisites for this Master thesis project are good C++ programming skills and knowledge of GPU and parallel programming (e.g., TDDD56 and TDDC78).
This is a research-oriented project.
Contact: Christoph Kessler.
- Source-to-source Translator from Fork to CUDA (30hp)
Modern graphics processing units (GPUs) such as those produced by NVIDIA and AMD/ATI offer massive computing power for data parallel computations with hundreds of parallel threads. At the same time, synchronization is fast.
These properties are in fact characteristic of the classical PRAM (Parallel Random Access Machine) model of parallel computation (see e.g. the book Practical PRAM Programming).
A previous thesis project in Germany described how the classical PRAM model of parallel execution can be mapped to CUDA GPUs and, in particular, how the PRAM programming language Fork (or a subset of it) could be mapped to CUDA.
This project will retarget the existing Fork compiler to generate code in CUDA, the current programming platform for modern NVIDIA GPUs, and develop optimizations in the translation process to improve performance.
Prerequisites: Programming in C, reading knowledge of German, compiler construction (e.g. TDDB44, TDDD16, TDDC86), programming of parallel computers (e.g. TDDC78).
Further thesis projects in compiler technology and parallel computing are available on request (chrke at ida.liu.se).
Responsible for this page: Christoph Kessler, IDA
Last updated: 2014-09-17