Advanced Parallel Programming: Models, Languages, Algorithms (5p)FDA125, 2007VT
|
|
Course plan
Lectures
32 hours
Recommended for
Graduate (CUGS, ECSEL, ...) students interested in the areas of parallel
computer architecture, parallel programming, compiler construction, or
algorithms and complexity.
The course is hosted by the CUGS graduate school and classified as an Advanced
Course.
Nevertheless, interested undergraduate students with strong background in
software engineering are also welcome.
The course was last given
2003. Normally, this course is given every second year; it alternates with
FDA001 Advanced Compiler Construction.
This course complements the existing graduate course FDA101 on Parallel
Programming (4p), which is aliased to the undergraduate course TDDB78/TDDC78.
Goals
The course emphasizes fundamental aspects of parallel programming such as parallel architectures and programming models, performance models, parallel complexity classes, parallel algorithmic paradigms, parallelization strategies, scheduling methods, and concepts of parallel programming languages. Practical exercises help to apply the theoretical concepts of the course to solve concrete problems in different parallel programming models.
Prerequisites
Data structures and algorithms (e.g. TDDB57) are absolutely required; knowledge
in complexity theory and compiler construction is useful. Processprogramming
(e.g. TDDB63/68/72) and Parallel Programming (e.g. FDA101/TDDC78 or TANA77) are
recommended.
Programming in C is necessary for the practical exercises.
Contents
I. General introduction: Parallel computer architectures (*).
II. Basic theory of parallel computation: PRAM model. Time, work, cost. NC.
Speedup and Amdahl's Law (*), Self-simulation and Brent's Theorem, Scalability
and Gustafssons Law (*). Fundamental PRAM algorithms (reduction, parallel
prefix, list ranking). PRAM variants, simulation results and separation
theorems.
III. PRAM emulations on distributed-memory architectures: Hashed address space,
Pipelining and Multithreading. Ranade's Fluent Machine and its cost-effective
massively parallel realization in hardware. Low-level synchronization
mechanisms.
IV. Message passing models: Delay model, classical BSP model of Valiant,
BSP-model of McColl, LogP, LogGP.
V. Distributed shared memory realizations in CC-NUMA architectures and software
DSM emulations. Memory consistency models.
VI. Concepts of parallel programming languages and environments,
covering e.g. Fork, BSPlib, MPI (*), OpenMP (*), HPF (*), Split-C, NESL, ZPL,
Cilk, Linda, ...
VII. Parallel algorithmic paradigms and programming techniques, with example
problems: Parallel loops, static and dynamic loop scheduling. Data parallelism
(*). Parallel divide-and-conquer, recursive doubling. Synchronization
mechanisms in asynchronous computations and parallel data structures. Parallel
task queues. Synchronous and asynchronous pipelining. Domain decomposition and
irregular parallelism.
VIII. Generic parallel programming with skeletons. Skeleton-based programming
environments. Parallel program composition.
IX. Compiling for parallel computers: Dependence analysis, loop
transformations, loop parallelization, idiom recognition. Speculative
execution.
X. Optimization of data layout.
XI. Clustering and scheduling of tasks.
XII. Advanced issues as time permits.
Organization
Lectures (usually 2 intensive weeks in february / early march), programming exercises, optional theoretical exercises for self-assessment.
Literature
Keller, Kessler, Träff: Practical PRAM Programming. Wiley Interscience, 2000.
ISBN 0-471-35351-5.
Further literature is announced on the course homepage.
Lecturers
Christoph Kessler, Welf Löwe.
Examiner
Christoph Kessler.
Examination
TEN1 Written or oral exam (3p).
UPG1 Programming exercise Fork (1p).
UPG2 Programming exercise MPI (1p).
Credit
5 credits
Comments
There is a partial overlap in contents with FDA101 Parallel Programming /
TDDC78. Those topics, marked by an asterisk (*) in the Contents section above,
appear in the lectures to make the course self-contained but will not be taken
up in the exam (TEN1).
A main difference to FDA101 / TDDC78 / TANA77 is not only in depth but also in
scope: While FDA101 emphasizes scientific computing and numerical applications,
this course focuses on theory, concepts, and non-numerical algorithms.
Course homepage
See the course homepage
http://www.ida.liu.se/~chrke/courses/APP/
for announcements, schedule, and further literature.
Note on interest registration
OBS: We announce this course in parallel with FDA001 Advanced Compiler
Construction (6p). Due to the limited number of PhD students we have to assume
that at most one of the two courses will be given. Hence, unless you are
explicitly interested in only one of the two courses, we strongly recommend you
to sign up for both of them, as one of them is very likely to be cancelled.
Page responsible: Director of Graduate Studies