OPEN MASTER THESIS PROJECTS
Research Group on Compiler Technology and Parallel Computing
The following master thesis projects are currently available in my group:
All projects on this page require a solid background in either compiler construction
or parallel programming (preferably both); at least one major course (preferably
at master level including programming labs) in these areas should be passed successfully.
Note to non-LIU students (FAQ):
If you want to do a thesis project with us, you must be registered
in a master (or bachelor) program at Linköping University.
It is generally not possible to do such projects remotely.
Design and toolchain for a system architecture description language
enabling holistic energy optimization (30hp)
XPDL is a novel XML-based system architecture description language
that allows expressing high-level descriptions of
the structure and relevant properties of computer systems
(hardware components and system software) in order to support
generic tools that access this information e.g. for target-specific
code generation or run-time optimizations. A particular focus
of XPDL is on modeling hardware and system software features that are
relevant for the energy efficiency of an application.
The purpose of this project is to elaborate the design of XPDL,
in particular to specify its XML schema and semantics, and
to develop a toolchain (esp., a compiler) for parsing, representing,
processing and querying XPDL models.
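The concrete XPDL schema is to be specified as part of this project; purely as an illustration, a system model in an XML-based architecture description language might look like the following sketch (all element and attribute names here are hypothetical, not the actual XPDL schema):

```xml
<!-- Illustrative sketch only; not the actual XPDL schema -->
<system name="example-node">
  <cpu id="cpu0" cores="8" frequency_mhz="2600">
    <power idle_w="15" max_w="95"/>
  </cpu>
  <gpu id="gpu0" model="hypothetical-gpu" compute_units="16">
    <power idle_w="25" max_w="225"/>
  </gpu>
  <interconnect from="cpu0" to="gpu0" type="PCIe" bandwidth_gbs="8"/>
</system>
```

A generic tool could query such a model, e.g., for per-component power bounds, to drive energy-aware code generation or run-time optimization decisions.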
The development is done in the context of a running international
research project concerned with portable, holistic
optimization of energy efficiency across the entire
application and system software stack.
Prerequisites: background in XML technology, e.g. XSD, XSLT;
compiler construction (TDDB44 or similar);
component-based software (TDDD05 or similar);
some background in modern computer systems architecture and programming,
including accelerator technology (e.g. TDDD56, TDDC78 or similar);
good skills in C/C++ programming, Linux.
Support for generalized stencil computations in SkePU (30hp)
SkePU is an open-source C++ template library
for portable and efficient high-level programming
of GPU-based systems, using so-called skeletons.
A skeleton is a
generic software component modeling a specific pattern of computation;
its implementation encapsulates platform-specific technical
details such as parallelism and accelerator handling,
communication, synchronization etc., while exposing a sequential
programming interface to the programmer. SkePU currently provides
one task-parallel and a number of data-parallel skeletons, including
one that models stencil computations, i.e., computations that update
each element of a matrix or image as a filter operation applied to
its nearest neighbor elements.
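To make the stencil pattern concrete, here is a plain sequential C++ sketch of a 3x3 averaging stencil (note: this is not SkePU code; with the skeleton, the programmer would supply only the per-element filter operation, and the skeleton implementation would handle parallelization, accelerator offloading, and border handling):

```cpp
#include <vector>
#include <cstddef>

// Sequential 3x3 averaging stencil: each interior element of the matrix
// is replaced by the mean of itself and its 8 nearest neighbors.
// Border elements are copied unchanged for simplicity.
std::vector<float> stencil3x3(const std::vector<float>& in,
                              std::size_t rows, std::size_t cols) {
    std::vector<float> out(in);  // borders keep their input values
    for (std::size_t r = 1; r + 1 < rows; ++r) {
        for (std::size_t c = 1; c + 1 < cols; ++c) {
            float sum = 0.0f;
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc)
                    sum += in[(r + dr) * cols + (c + dc)];
            out[r * cols + c] = sum / 9.0f;
        }
    }
    return out;
}
```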
This project will, as a case study,
consider an open-source high-performance computing
application from medical image processing that is currently
implemented in C++ and CUDA, investigate the requirements for
expressing its performance-critical parts with existing (SkePU) skeletons,
and develop any extensions to the SkePU library required to
express the application more conveniently with SkePU skeletons.
Prerequisites: TDDD56 Multicore and GPU Programming, or
similar course on parallel programming. Advanced C/C++ programming skills.
Contact: Christoph Kessler
Systematic Concurrent Debugging (30hp or 2x30hp)
Contact: Ahmed Rezine or Christoph Kessler
Performance modeling for multi-kernel GPU computing (30hp)
GPU programming using CUDA is gaining popularity as GPUs increasingly become
part of mainstream computing.
Already, 62 systems in the TOP500 list
are GPU-based (Nov. 2012 listing), and millions of GPUs are sold
every year for mobile and traditional computing domains.
Modern GPUs have already become general-purpose and task-parallel with
the introduction of caches and the possibility of concurrent execution of multiple kernels.
The goal of this master thesis project is to investigate concurrent execution
capabilities of modern NVIDIA (Fermi and Kepler) GPUs.
The idea is to take different applications with different
performance (computation, communication) characteristics and see how
they behave when running concurrently with each other.
Based on experimental findings, we will try to build a model
to predict execution behaviour of a computational kernel when running with other
computational kernels, given the information about GPU resource, and
computational/communication needs of each computational kernel.
Prerequisites: TDDD56 Multicore and GPU Programming, or equivalent course
that includes CUDA and OpenCL programming.
Background in C/C++ programming and computer architecture.
Contact: Usman Dastgeer or Christoph Kessler.
- Dynamic Optimization of Interprocessor Communication
in the MPI Back-End of the SkePU skeleton programming library (30hp)
By harnessing the computational power of modern GPUs
via General-Purpose Computing on Graphics Processing Units (GPGPU),
very fast calculations can be performed with a GPU cluster.
This thesis project is about extending an existing MPI
cluster back-end implementation of the
SkePU skeleton programming library
with data types that allow for the dynamic optimization of
interprocessor communication,
and evaluating the implementation with several test programs
including a computationally intensive application.
The overall problem includes developing methods for
determining the optimal partitioning
of the problem, automated performance tuning for the best use of
resources, possibly in a non-dedicated environment;
also, devising new SkePU skeletons for
some computations / communication patterns
in the considered scientific computing problem.
An application from computational fluid dynamics
will be used as a case study.
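The partitioning subproblem can be illustrated with a simple 1D block distribution. The following C++ sketch (an illustration with no MPI dependency; in a real back-end, the rank would come from MPI_Comm_rank) splits n elements as evenly as possible over p processes:

```cpp
#include <cstddef>

// Even 1D block partitioning of n elements over p processes.
// The first (n % p) ranks receive one extra element, so block
// sizes differ by at most one.
struct Block {
    std::size_t offset;  // index of this rank's first element
    std::size_t size;    // number of elements owned by this rank
};

Block blockPartition(std::size_t n, std::size_t p, std::size_t rank) {
    std::size_t base = n / p, extra = n % p;
    Block b;
    b.size   = base + (rank < extra ? 1 : 0);
    b.offset = rank * base + (rank < extra ? rank : extra);
    return b;
}
```

For non-dedicated environments, the thesis would go beyond such static schemes toward tuned, possibly dynamic, partitionings.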
This Master thesis project covers the following tasks:
- Research survey of related work.
- Design and implementation of new skeleton backends in C/C++, MPI and CUDA/OpenCL.
- Skeleton-based refactoring of the given benchmark application and experimental evaluation.
- Documentation of the results in a thesis report.
Prerequisites: Courses in programming of parallel computers and
GPU computing (TDDC78 and TDDD56 or equivalent).
Good background in OpenCL, CUDA, MPI, C/C++, algorithms, Linux.
- Generating target-specific code for automatically detected
algorithmic patterns in C source programs (30 ECTS)
The goal of this project is to combine and extend an already existing
tool for automatic pattern recognition in C code with one or several
code generators for specific target back-ends.
Motivation for using patterns:
Using patterns to describe programs has three main goals.
The first is that given pattern instance combinations
can easily be mapped to combinations of kernel implementations
for a given architecture; this yields a high level of reuse
of kernel implementations. Secondly, the patterns can be seen
as a high-level programming abstraction for the architecture,
leading to a component-based programming style.
The third goal is to automatically categorize legacy C program
parts as occurrences of patterns, using pattern matching techniques
on existing source code to provide an automated migration path
and improved portability.
Project work description
A prototype of a pattern recognition tool has already been developed in an earlier project. Your task is now to combine this tool with code generators to be able to generate target-specific code. This work includes, among other things:
- Extend the set of already existing patterns to raise the recognition rate.
- Select target architectures, both high-level targets such as OpenMP and POSIX threads, and low-level targets such as specific hardware architectures (e.g., GPUs); this can also involve runtime systems such as StarPU.
- Implement the code generators for the selected architectures.
- Introduce performance aware target components.
- Test and evaluate.
(- If the result is satisfactory: write, submit and possibly present a scientific paper about it at a scientific workshop or conference.)
Prerequisites: TDDB44 Compiler Construction or a similar course.
Good programming skills in C and Java.
Computer architecture course.
Course in component based software.
Parallel programming course.
Since this is a cross-domain project, you will probably be assigned two supervisors (handledare).
Contact for further information:
Supervisor Erik Hansson
or examiner Christoph Kessler (christoph.kessler (at) liu.se)
- [taken] Sparse-Matrix support for the SkePU library for portable CPU/GPU programming (30hp)
This thesis project will extend the functionality of the
SkePU library for high-level, portable programming of
GPU-based systems, which was developed in our group.
A matrix is called sparse if most of its entries are zero,
so that a compressed storage format is more time- and space-efficient
than the traditional 2D array representation.
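For example, the widely used compressed sparse row (CSR) format stores only the nonzeros plus two index arrays; the smart container designed in this project might build on such a format. A C++ sketch (illustrative, not the actual SkePU container) with a sparse matrix-vector product, a typical skeleton workload:

```cpp
#include <vector>
#include <cstddef>

// Compressed Sparse Row (CSR) storage: 'values' holds the nonzeros
// row by row, 'colIndex[i]' gives the column of values[i], and
// rowPtr[r]..rowPtr[r+1] delimits row r's nonzeros. For nnz nonzeros
// this needs O(nnz + rows) storage instead of O(rows * cols).
struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<double> values;
    std::vector<std::size_t> colIndex;
    std::vector<std::size_t> rowPtr;  // size rows + 1
};

// Sparse matrix-vector product y = A * x.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t r = 0; r < A.rows; ++r)
        for (std::size_t i = A.rowPtr[r]; i < A.rowPtr[r + 1]; ++i)
            y[r] += A.values[i] * x[A.colIndex[i]];
    return y;
}
```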
In this master thesis project you will extend SkePU with
support for sparse matrix computations.
In particular, you will design a smart container data structure for
representation of generic 2D sparse matrices and implement several of the
data-parallel skeletons of SkePU so that they can be applied to sparse matrices
in the same way as to dense matrices, with back-ends in sequential C++,
OpenMP, CUDA and OpenCL.
The implementation will be evaluated quantitatively on several GPU-based platforms.
Further information is available on request, see the contact information below.
The library is developed in C++ and OpenMP, with back-end
implementations for CUDA and OpenCL. The prerequisites for this Master thesis
project are good C++ programming skills and knowledge of GPU and parallel
programming (e.g., TDDD56 and TDDC78).
This is a research oriented project.
- Source-to-source Translator from Fork to CUDA (30hp)
Modern graphics processing units (GPUs) such as those produced by
NVIDIA and AMD/ATI offer massive computing power for data parallel
computations with hundreds of parallel threads.
At the same time, synchronization is fast.
These are actually properties that are characteristic for the classical
PRAM (Parallel Random Access Machine) model of parallel computation
(see e.g. the book
Practical PRAM Programming).
A previous thesis project in Germany described
how the classical PRAM model
of parallel execution can be mapped to CUDA GPUs,
and how especially the PRAM programming language Fork
(or a subset of it) could be mapped to CUDA.
This project will retarget
the existing Fork compiler to generate code in CUDA,
the current programming platform for modern NVIDIA GPUs,
and develop optimizations in the translation process
to improve performance.
Prerequisites: programming in C, reading knowledge of German,
Compiler construction (e.g. TDDB44, TDDD16, TDDC86),
Programming parallel computers (e.g. TDDC78).
Further thesis projects in compiler technology and
parallel computing are available on request (chrke at ida.liu.se).
Responsible for this page: Christoph Kessler, IDA