Multicore computing

DF21500, 2011VT

Status Archive
School National Graduate School in Computer Science (CUGS)
Division PELAB
Owner Christoph Kessler
Homepage http://www.ida.liu.se/~chrke/courses/MULTI/

Course plan


Lectures: ca. 32 h, given in block format over two intensive weeks in February and early March 2011.

Recommended for

Graduate (CUGS, CIS, ISY, ...) students interested in the areas of parallel computer architecture, parallel programming, software engineering, optimization, compiler construction, or algorithms and complexity.

The course was last given

in VT2009.

Goals


The course emphasizes fundamental aspects of shared-memory parallel programming, such as shared-memory parallel architecture concepts, programming models, performance models, parallel algorithmic paradigms, parallelization techniques and strategies, scheduling algorithms, optimization, composition of parallel programs, and concepts of modern parallel programming languages and systems. Practical exercises help students apply the theoretical concepts of the course to concrete problems on a real-world multicore system.

Prerequisites


Knowledge of data structures and algorithms is required; some knowledge of complexity theory and compiler construction is useful. Some basic knowledge of computer architecture is assumed. Basic courses in concurrent programming (e.g. TDDB68) and parallel programming (e.g. TDDC78 or TANA77) are recommended.
Programming experience in C and some familiarity with Linux (or a similar OS) are necessary for the practical exercises.

Contents


(Some advanced topics may be added or changed depending on the availability of guest lecturers.)

I. Architecture
* Multicore architecture issues (incl. SMT, SMP, CC-NUMA, NCC-NUMA)
* Brief review: cache locality and memory hierarchy
* Shared memory emulation and consistency issues
* Heterogeneous multicores
* GPU computing

II. Languages and environments
* pthreads
* Cilk
* OpenMP 3.0
* New HPC languages: X10, Chapel, Fortress
* Stream processing and GPU languages: Cg, Brook, CUDA, OpenCL
* Offload C++

III. Parallelization techniques
* Design patterns for concurrency / synchronization
* Dependence analysis
* Automatic parallelization
* Runtime parallelization and speculative parallelization
* Lock-free synchronization
* Transactional programming
* Task scheduling and clustering

IV. Optimizations
* Feedback-directed optimization
* Task mapping and on-chip pipelining
* Skeleton-based parallel programming
* Optimized composition of parallel programs from parallel components
* Scheduling malleable task graphs

Organization


Lectures (ca. 32 h), programming exercises, optional theoretical exercises for self-assessment, a programming lab assignment, and student presentations.
The lectures will be held in block format over two intensive weeks in Linköping.


To be announced on the course homepage.

Lecturers


Christoph Kessler, Linköpings universitet
Welf Löwe, Univ. Växjö, and further guest lecturers


Christoph Kessler, Linköpings universitet

Examination


TEN1: Written exam, 4.5hp.
UPG1: Programming exercise (lab report), 1.5hp.
UPG2: Paper presentation, opposition, and written summary (1-2 pages) of the presented paper, 1.5hp.

Credit


7.5hp if all examination components are passed.
Partial credit can also be given.

Organized by



The contents partially overlap with FDA125 Advanced Parallel Programming; the overlap corresponds to 1p = 1.5hp of the written exam.

In contrast to FDA125, this course focuses on shared-memory systems and adds more material on software engineering techniques for parallel systems, such as design patterns for synchronization and optimized composition of parallel components. It also gives more emphasis to parallelization techniques that are especially useful for multicore systems, such as speculative parallelization, transactional programming, and lock-free synchronization. Scheduling for shared-memory systems is considered in more detail, in particular scheduling algorithms for malleable task graphs.

Page responsible: Director of Graduate Studies
Last updated: 2012-05-03