Prof. Dr. Christoph Kessler
PELAB - Programming Environments Laboratory
Software and Systems Division
Department of Computer and Information Science (IDA)
S - 581 83
phone +46 13 28 2406
mobile +46 70 3666687
fax +46 13 28 58 99
email: Christoph.Kessler \at liu.se
URKUND-address: chrke55.liu \at analys.urkund.se
- Parallel computing
- Parallel programming models, languages, compilers, tools, libraries, algorithms
- especially for heterogeneous multi-/many-core platforms such as Cell and GPU-based systems
- Composition of parallel programs from parallel components
- Optimized composition, autotuning
- Mapping, resource allocation, scheduling of parallel computations
- Compiler technology
- Code generation for instruction-level parallel and embedded processors
- especially clustered VLIW DSP processors
- Optimization problems in code generation
- Program analysis and transformation
- Automatic and semiautomatic parallelization
List of publications
DSP Platform for Emerging Telecommunication and Multimedia (ePUMA)
Optimizing DSP streaming applications for memory access cost
on a new reconfigurable chip multiprocessor.
WP3: Classification of memory access patterns in DSP applications;
program analysis for memory access structures, and
automatic selection of most suitable network configuration for
parallel memory access.
Funded 2008-2011 by SSF
PRT Pattern Recognition Tool
Generic tool for automated recognition of computational patterns in legacy C programs,
e.g. for pattern-based automatic parallelization.
OPTIMIST: Optimization algorithms for integrated code generation
OPTIMIST is a retargetable, highly optimizing code generator
for superscalar, VLIW, clustered VLIW, DSP and embedded processor architectures.
To achieve high code quality,
it simultaneously considers the optimization problems for
instruction selection (including cluster assignment), instruction scheduling,
and register allocation.
Partially funded 2001-2007 by CENIIT
and 2004-2005 by SSF RISE.
Integrated Software Pipelining
Optimal code generation for loops, integrating instruction selection,
scheduling and register allocation, including optimal spill code generation and scheduling,
for embedded, VLIW and clustered VLIW processors.
Funded 2006-2008 and 2010-2012
by Vetenskapsrådet (VR)
and 2006-2011 by the CUGS
On-chip pipelining of memory-intensive computations
on the Cell/B.E. processor
Restructuring memory-intensive, streamable computations such as parallel mergesort
to use on-chip forwarding of intermediate data between Cell SPEs
reduces the overall volume of off-chip memory accesses,
making the application less memory bound and resulting in faster computation.
We develop mapping algorithms that optimize trade-offs between computational load balance,
on-chip buffer requirements and on-chip communication volume in on-chip pipelining.
Applied to mergesort on Cell, this speeds up the dominating global
merge phase of CellSort by up to 70% on QS-20 and up to 143% on PlayStation-3;
see our paper at Euro-Par 2010.
We are now extending the approach to other multicore architectures, such as Intel SCC.
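The idea of forwarding intermediate data instead of storing it can be illustrated with a small sketch. It is not the CellSort or Cell SPE implementation: Python generators stand in for merger tasks, and the names `merge2`, `merge_tree` and `offchip_transfers` are hypothetical. The transfer count assumes a simplified model in which every non-pipelined merge round reads and writes all n elements off chip exactly once.

```python
from typing import Iterable, Iterator, List, Sequence


def merge2(a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Lazily merge two sorted streams; nothing is buffered beyond one item each."""
    a, b = iter(a), iter(b)
    end = object()  # sentinel marking an exhausted stream
    x, y = next(a, end), next(b, end)
    while x is not end and y is not end:
        if x <= y:
            yield x
            x = next(a, end)
        else:
            yield y
            y = next(b, end)
    # Drain whichever stream still has elements.
    while x is not end:
        yield x
        x = next(a, end)
    while y is not end:
        yield y
        y = next(b, end)


def merge_tree(runs: Sequence[Iterable[int]]) -> Iterator[int]:
    """Chain mergers into a binary tree; inner results are streamed, never stored."""
    streams: List[Iterator[int]] = [iter(r) for r in runs]
    if not streams:
        return iter(())
    while len(streams) > 1:
        paired = [merge2(streams[i], streams[i + 1])
                  for i in range(0, len(streams) - 1, 2)]
        if len(streams) % 2:  # an odd stream out moves up one level unchanged
            paired.append(streams[-1])
        streams = paired
    return streams[0]


def offchip_transfers(n: int, k: int, pipelined: bool) -> int:
    """Element transfers to/from off-chip memory under the simplified model.

    Round by round: each of ceil(log2 k) rounds reads and writes n elements.
    Pipelined: the k runs are read once and the result is written once.
    """
    rounds = max(1, (k - 1).bit_length())  # ceil(log2 k) for k >= 2
    return 2 * n if pipelined else 2 * n * rounds
```

For example, `list(merge_tree([[1, 4, 7], [2, 5], [0, 9], [3, 6, 8]]))` yields the ten values in sorted order, and under this model merging 8 runs of 1000 elements in total costs 6000 element transfers round by round versus 2000 when pipelined. The sketch ignores the load-balancing and buffer-size trade-offs that the mapping algorithms address.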
SeRC-OpCoReS: Optimized Composition and Runtime Support for e-Science
Partially funded by the Swedish e-Science Research Center (SeRC),
Core section on Parallel and Distributed Algorithms and Tools, 2011.
Generic parallel components (skeletons):
Skeleton programming library for Cell.
Skeleton programming library for hybrid CPU/GPU and multi-GPU systems.
project (contract research).
This project realizes a CRCW PRAM on a chip. We are developing
high-level language and system support and a compiler backend
for the REPLICA architecture.
PELAB research group on compiler technology and parallel computing
Fork95 Language Definition and Compiler
a scalable, massively parallel shared memory MIMD computer
with uniform memory access time that works synchronously at the instruction level.
The complete project is described in my recent
The compiler and tools developed for the SB-PRAM are now used in programming
teaching parallel algorithms.
ForkLight: a Fork-like parallel programming
language for asynchronous shared-memory multiprocessors
A tool for automatic detection of sparse matrix computations and data structures
in application programs by static and dynamic pattern matching techniques,
which is very useful for automatic parallelization and aggressive program
(The successor of the former
project at Saarbrücken.)
Funded 1997-2000 by Deutsche Forschungsgemeinschaft (DFG)
Design and implementation of a MIMD parallel global address space (PGAS) language
based on the BSP (bulk-synchronous parallel) programming model,
supporting shared variables and nested parallelism
on top of message passing architectures.
NestStep provides deadlock-free, deterministic parallel execution with
BSP-compliant synchronicity and memory consistency.
NestStep has been implemented for MPI clusters and for the
heterogeneous multicore processor Cell/B.E.
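A generic BSP-style superstep with a deterministic combine can be sketched as follows. This is neither NestStep syntax nor its runtime system: it assumes, for illustration only, that each process holds a private copy of a shared variable that is reconciled at the end of the superstep, and the function `superstep` and its parameters are hypothetical.

```python
from typing import Callable, List


def superstep(local_copies: List[int],
              compute: Callable[[int, int], int],
              combine: Callable[[int, int], int]) -> List[int]:
    """Run one BSP-style superstep over per-process copies of a shared variable."""
    if not local_copies:
        raise ValueError("at least one process is required")
    # Computation phase: each process (identified by rank) works only on its
    # own copy, so no process observes another's intermediate writes.
    proposed = [compute(rank, value) for rank, value in enumerate(local_copies)]
    # Barrier and combine phase: contributions are folded in fixed rank order,
    # which makes the result independent of execution timing.
    combined = proposed[0]
    for value in proposed[1:]:
        combined = combine(combined, value)
    # After the barrier, every process holds the same consistent value.
    return [combined] * len(local_copies)
```

With four processes starting from `[0, 0, 0, 0]`, `superstep(copies, lambda rank, v: v + rank + 1, lambda a, b: a + b)` returns `[10, 10, 10, 10]`: the per-process contributions 1, 2, 3 and 4 are summed once at the superstep boundary, and all copies agree afterwards.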
Interactive Invasive Parallelization
User-guided composition of parallel software with an incremental
aspect-oriented parallelization tool.
Covers both automatic parallelization,
skeleton-based structured parallel programming and semiautomatic
Support for automatic roundtrip engineering in aspect weaving.
Part of the RISE project
funded 2002-2005 and 2006-2007
Extending NestStep to a shared-memory programming environment
for nested BSP parallelism on computational grids
and computational P2P systems.
Partially funded 2003-2006 by VINNOVA GridModelica.
Integrating NestStep constructs in the imperative part of the
Modelica modeling and simulation language.
Safe and Secure Modeling and Simulation on the GRID
Partially funded by VINNOVA, 2006-2009.
Some recent / upcoming events:
Dagstuhl research seminar 10191 on
Program Composition and Optimization:
Autotuning, Scheduling, Metaprogramming and Beyond, May 9-12, 2010
- PhD defence Mattias Eriksson: Integrated Code Generation,
Linköping, IDA/Visionen, June 7, 2011, 13:15
MCC-2011 Fourth Swedish Multicore Computing Workshop, Linköping, Sweden, 23-25 Nov 2011
MuCoCoS-2013 6th International Workshop on Multi-/Many-Core Computing Systems, Sep. 7 or 8 TBD, 2013, in connection with PACT'13, Edinburgh, Sep. 2013
List of all courses ever given
Master thesis projects
Multicore Lab (inauguration: 14/9/2012 15:30)
programvaruteknik och realtidssystem /
software engineering and realtime systems.
IEEE Computer Society
- TCSC Scalable Computing
HiPEAC European Network of Excellence on High Performance and Embedded Architecture and Compilation
EAPLS European Association
for Programming Languages and Systems
GI Gesellschaft für Informatik
- GI/ITG-Fachgruppe PARS Parallel-Algorithmen, -rechnerstrukturen und -systemsoftware
- GI-Fachgruppe 2.1.4 Programmiersprachen und Rechenkonzepte
- GI-Arbeitskreis Software Engineering für parallele Systeme (SEPARS)
VDI Verein Deutscher Ingenieure
The Swedish Multicore Initiative
SeRC Swedish E-Science Research Center
Christoph Kessler (chrke \at ida.liu.se)