FDA125
Advanced Parallel Programming: Models, Languages, Algorithms (5p)

VT1/2003

Recommended for
Graduate (CUGS, ECSEL, ...) students interested in the areas of parallel computer architecture, parallel programming, compiler construction, or algorithms and complexity.
The course is hosted by CUGS as an Advanced Course.
Interested undergraduate students are also welcome.

Registration / List of participants (internal access only)

Organization
Lectures (ca. 32h), programming exercises, optional theoretical exercises for self-assessment.
The course will be held as block course with two intensive weeks in Linköping.

The course was last given
This is a new course. It complements the existing graduate course FDA101 on Parallel Programming (4p), which is aliased to the undergraduate course TDDB78.

Goals
The course emphasizes fundamental aspects of parallel programming such as parallel architectures and programming models, performance models, parallel complexity classes, parallel algorithmic paradigms, parallelization strategies, and the design and implementation of parallel programming languages. Practical exercises help to apply the theoretical concepts of the course to solve concrete problems in different parallel programming models.

Prerequisites
Data structures and algorithms (e.g. TDDB57) are absolutely required; knowledge in complexity theory and compiler construction is useful. Processprogramming (e.g. TDDB63/68/72) and Parallel Programming (e.g. TDDB78 or TANA77) are useful but not required. Most of the contents of FDA101, i.e. TDDB78, will be summarized during this course. Programming in C is necessary for the practical exercises.

Contents/Schedule

Date Time Location Topic (with links to slides (PS/PDF) where available) Lecturer
Monday
Feb 17, 2003
14-16 Linköping
Knuth
I. Parallel computer architecture concepts (*). PS C. Kessler
" 16-18 Linköping
Knuth
II. Basic theory of parallel computation
PRAM model. Time, work, cost. NC. Speedup and Amdahl's Law (*), Self-simulation and Brent's Theorem, Scalability and Gustafssons Law (*). Fundamental PRAM algorithms (reduction, parallel prefix, list ranking). PRAM variants, simulation results and separation theorems. PS
C. Kessler
Tuesday
Feb 18, 2003
09-11 Linköping Turing II. (remainder)
C. Kessler
" 11-13 Linköping Turing Lesson: An introduction to PRAM programming in Fork. Fork language principles. Using the compiler, PRAM simulator and system software. PS C. Kessler
Wednesday
Feb 19, 2003
09-11 Linköping
Knuth
IV. Message passing models: Delay model, classical BSP model of Valiant, BSP-model of McColl, LogP, LogGP. (PDF) W. Löwe
" 11-13 Linköping
Knuth
V. Distributed shared memory: realizations in CC-NUMA architectures and software DSM emulations. Memory consistency models. (PS) C. Kessler
Thursday
Feb 20, 2003
09-13 Linköping
Knuth
VI. Parallel programming languages and environments: MPI (*) (PS), OpenMP (*) (PS), HPF (*) (PS), C. Kessler
Friday
Feb 21, 2003
09-11 Linköping
Knuth
VI. (cont.) Cilk (PS), Linda (PS) C. Kessler
" 11-12 Linköping
Knuth
VI.2 BSP - LogP simulation (PDF), W. Löwe
Week 11:
Monday
Mar 10, 2003
14-18 Linköping
vNeumann
VII. Parallel algorithmic paradigms and programming techniques, with example problems.
Parallel loops: data dependences in loops, loop transformations, loop parallelization, static and dynamic loop scheduling. Data parallelism. Parallel divide-and-conquer, recursive doubling. Accelerated cascading. Synchronization mechanisms in asynchronous parallel data structures: parallel FIFO queue. Pipelining. Domain decomposition and irregular parallelism. (PS)
C. Kessler
Tuesday
Mar 11, 2003
09-11 Linköping
Knuth
VIII. Generic parallel programming with skeletons. Concepts and skeleton-based programming environments. C. Kessler
"11-13 Linköping
Knuth
III. Realization of PRAMs
PRAM emulations on distributed-memory architectures: Hashed address space, Pipelining and Multithreading. Ranade's Fluent Machine and its cost-effective massively parallel realization in hardware. Low-level synchronization mechanisms. PS
C. Kessler
Wednesday
Mar 12, 2003
09-11 Linköping
Knuth
IX. Compiling for parallel computers: Dependence analysis, loop transformations, loop parallelization, idiom recognition.
(PDF)
W. Löwe
" 11-13 " X. Optimization of data layout. (PDF) W. Löwe
Thursday
Mar 13, 2003
09-11 Linköping
Knuth
XI. Clustering and scheduling of processes. (PDF) W. Löwe
" 11-12 " Guest lecture (45min): Automatic parallelization of Modelica code. (PPT) P. Aronsson
" 12-13 " Wrap-up. Discussion. Evaluation of programming assignment. Self-evaluation. -
Lunch in Vallfarten.
all
Friday
Mar 21, 2003
13-173B:474Oral exam
13:00 Mikhail Chalabine
13:30 Peter Aronsson
14:00 Adrian Pop
14:30 Magnus Wahlström
15:30 Iakov Nakhimovski
all
Friday
Apr 4, 2003
13-153B:474Oral exam (late examination)
13:15 Håkan Mattsson
13:45 Vilhelm Dahllöf
14:15 Andrzej Bednarski
C. Kessler

Labs

For doing the labs you need (remote) access to a SUN computer running Solaris.
OBS:
The Makefile used to build the compiler and its documentation uses dvips to create postscript files. However, IDAs version of dvips automatically sends the ps file to the printer, which is not intended. In my own configuration I locally redefined dvips to match this behaviour. Another simple workaround is to comment out these lines in the Makefile or to use the IDA version's flags -o to specify the output file name. - C.K.

Exercises

  1. Exercises for Lecture 1 / Solution proposal
  2. Exercises for Lecture 2 / Solution proposal
  3. Exercises for Lecture 3 / Solution proposal
  4. Exercises for Lecture 5 / Solution proposal
  5. Exercises for Lecture 6 a (MPI) / Solution proposal
  6. Exercises for Lecture 6 b (Cilk) / Solution proposal
  7. Exercises for Lecture 7 / Solution proposal
These exercises are for self-assessment. Printed copies of the current exercises will be distributed at each lecture. Students are encouraged to meet and solve exercises together in small, self-organizing working groups after the lectures. There is no submission of solutions and no correction. We will publish solution proposals later during the course.

Literature

More references

Teachers
Christoph Kessler (course leader, examiner)
Welf Löwe (examiner)
Peter Aronsson (guest lecturer)

Examination
TEN1: Written or oral exam (Welf, Christoph) 4p
(if there should be many students in the course we will have a written exam, otherwise an oral exam will be arranged.)
Time and place for the exam will be announced later; probably it will be end of March.
UPG1: Programming exercise (lab report) (Christoph) 1p. Deadline: 31/03/2003.

Credit
5 credits

Comments
There is a partial overlap in contents with FDA101 Parallel Programming / TDDB78. These topics are part of the lectures to make the course self-contained but will not be taken up in the exam (TEN1). Accordingly, some of the lectures are optional for those students who already took FDA101; these are marked by an asterisk (*) in the Contents section above.

A main difference to FDA101 / TDDB78 / TANA77 is not only in depth but also in scope: While FDA101 emphasizes scientific computing, numerical applications, and tools, this course focuses on theory and non-numerical algorithms.


This page is maintained by Christoph Kessler (chrke \at ida.liu.se)