CHRISTOPH KESSLER
FORMER PROJECTS
- Connecting Education and Research Communities for an Innovative Resource Aware Society (CERCIRAS)
EU COST action, about resource-aware parallel computing in cyberphysical systems, 2021-2024
- Enhancing Programmability and boosting Performance Portability for Exascale Computing Systems (EXA2PRO)
EU H2020 FETHPC project, May 2018-July 2021
My group developed the high-level parallel programming model and its toolchain components for portable and convenient programming of high-performance applications for efficient execution on heterogeneous and distributed parallel systems.
- HPC-Europa3 (H2020, 2017-2021) scientific host
- IC1406 High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)
EU H2020 ICT COST Action, 2015-2019
Vice leader of Working Group 2 (Parallel Programming Models)
- Execution Models for Energy-Efficient Computing Systems (EXCESS)
EU FP7 project, Sep. 2013-Aug. 2016.
- Language and tool infrastructure for energy-aware application synthesis, system modeling, performance and energy modeling, optimization techniques and autotuning for holistic energy optimization for heterogeneous multicore systems
- Partly based on our previous work for FP7 project PEPPHER
- Leading Workpackage 1 (Execution, Platform and Programming Models for Energy Optimization)
- SkePU: auto-tunable skeleton programming library for Multicore CPU and Multi-GPU systems
- MeterPU: generic, portable measurement abstraction library for Multicore CPU and Multi-GPU systems
- Global Composition Framework
- XPDL extensible platform description language
- Automated performance modeling for guiding automatic selection in multi-variant computations
- Short overview of our contributions to EXCESS (Proc. EXCESS workshop Gothenburg, Sweden, Aug. 2016)
- Performance Portability and Programmability for Heterogeneous Many-core Architectures (PEPPHER)
EU FP7 project, Jan. 2010 - Dec. 2012.
- Leading Workpackage 1 (Compositional parallel software development)
- SkePU - Auto-tunable, Multi-Backend Skeleton Programming Framework for Multicore and Multi-GPU Systems
- The PEPPHER Composition Tool
- Survey article in IEEE Micro, 2011
- Skeleton and Pattern Based Programming Environments
- BlockLib: Skeleton programming library for Cell/B.E.
- PRT Pattern Recognition Tool: Generic tool for automated recognition of computational patterns in legacy C programs, e.g. for pattern-based automatic parallelization.
- SeRC-OpCoReS: Optimized Composition and Runtime Support for e-Science, 2011-2018,
Swedish e-Science Research Center (SeRC), core section on Parallel and Distributed Algorithms and Tools (2011-2015) and Parallel Software and Data Engineering (2016-2018).
- Integrated Code Generation for Instruction-Level Parallel Architectures
- OPTIMIST: Optimization algorithms for integrated code generation
OPTIMIST is a retargetable, highly optimizing code generator for superscalar, VLIW, clustered VLIW, DSP and embedded processor architectures.
To achieve high code quality, it simultaneously considers the optimization problems for instruction selection (including cluster assignment and resource allocation), instruction scheduling, and register allocation.
Partially funded 2001-2007 by CENIIT and 2004-2005 by SSF RISE.
- Integrated Software Pipelining
Optimal code generation for loops, integrating instruction selection, cluster assignment, scheduling and register allocation, including optimal spill code generation and scheduling, for embedded, VLIW and clustered VLIW processors.
Funded 2006-2008 and 2010-2012 by Vetenskapsrådet (VR) and 2006-2011 by the CUGS graduate school.
- REPLICA project (contract research).
This VTT project developed a reconfigurable shared memory chip multiprocessor supporting strong memory consistency (CRCW PRAM on a chip). We developed a high-level parallel programming language, a compiler backend and system support for the REPLICA architecture.
- DSP Platform for Emerging Telecommunication and Multimedia (ePUMA)
Optimizing DSP streaming applications for memory access cost on a new reconfigurable chip multiprocessor.
WP3: Classification of memory access patterns in DSP applications; program analysis for memory access structures, and automatic selection of most suitable network configuration for parallel memory access.
Funded 2008-2011 by SSF
- PRT Pattern Recognition Tool
Generic tool for automated recognition of computational patterns in legacy C programs, e.g. for pattern-based automatic parallelization.
- On-chip pipelining of memory-intensive computations on multi-/manycore processors (Cell/B.E. and Intel SCC)
Restructuring memory-intensive, streamable computations such as parallel mergesort to forward intermediate data on-chip between Cell SPEs reduces the overall volume of off-chip memory accesses, which makes the application less memory-bound and speeds up the computation. We develop mapping algorithms that optimize trade-offs between computational load balance, on-chip buffer requirements and on-chip communication volume in on-chip pipelining.
Applied to mergesort on Cell, this speeds up the dominating global merge phase of CellSort by up to 70% on QS-20 and up to 143% on PlayStation-3; see our paper at Euro-Par 2010.
- Interactive Invasive Parallelization
User-guided composition of parallel software with an incremental aspect-oriented parallelization approach. Part of the RISE project 2002-2007 (SSF).
- NestStep
Design and implementation of a MIMD parallel global address space (PGAS) language based on the BSP (bulk-synchronous parallel) programming model, supporting shared variables and nested parallelism on top of message passing architectures.
NestStep provides deadlock-free, deterministic parallel execution with BSP-compliant synchronicity and memory consistency.
NestStep has been implemented for MPI clusters and for the heterogeneous multicore processor Cell/B.E.
- Fork: Fork95 Language Definition and Compiler for the SB-PRAM, a scalable, massively parallel shared memory MIMD computer with uniform memory access time that works synchronously at the instruction level. The complete project is described in a book. The compiler and tools developed for the SB-PRAM have been used for research purposes and in programming labs for teaching parallel algorithms.
- SPARAMAT: A tool for automatic detection of sparse matrix computations and data structures in application programs by static and dynamic pattern matching techniques, which can be used for automatic parallelization and aggressive program transformations. (The successor of the former PARAMAT project at Saarbrücken.) Funded 1997-2000 by Deutsche Forschungsgemeinschaft (DFG).