TDDD56 Multicore and GPU Programming
Timetable and Lecture Plan
Schedule (as available on the LiU schedule server)
Certain lecture notes and other handouts with restricted access
are located here.
The lecture notes and other material may be updated during the course as appropriate.
Lecturers: Christoph Kessler (CK), Ingemar Ragnemalm (IR).
Assistants: August Ernstsson (AE), Ingemar Ragnemalm (IR)
Those lectures marked by asterisks overlap fully (**) or partly (*) with similar lectures in TDDC78 Programming of Parallel Computers - Methods and Tools. These lectures are optional for those who have already taken TDDC78, but might be a useful repetition anyway. This repetition of the common core topics is necessary to allow you to take the courses individually or in arbitrary order.
All lecture slides can be found here.
- Lecture 1:
Motivation: The Multicore Challenge. Multicore Architecture Concepts. (CK)
- Lecture 2: (**)
Shared memory architecture concepts and performance issues. (CK)
- Lecture 3a:
Parallel Programming with Threads. (CK)
- Lecture 4:
Non-blocking synchronization. (CK)
- Lecture 3b: (45min)
Parallel Programming with Tasks. (CK)
- Lesson 1: (45min)
Introduction to Lab1 and Lab2
- Lecture 5: (*)
Theory: Parallel programming and cost models. Analysis of parallel algorithms.
- Lecture 6: (*)
Theory (cont.): Brent's Theorem. Speedup anomalies. Amdahl's Law. Fundamental parallel algorithms: parallel prefix sums, parallel list ranking. (CK)
- Lecture 7:
Parallel sorting algorithms: simple parallel quicksort, fully parallel quicksort, parallel samplesort, bitonic sort, parallel mergesort. (CK)
- Lecture 8: (45min)
Parallel algorithmic design patterns: Towards skeleton programming. (AE)
- Lesson 2: (45min)
Introduction to skeleton programming in SkePU, and to CPU Lab 3. (AE)
- Lecture 9:
GPU architecture and trends (IR)
- Lecture 10:
Introduction to CUDA programming. (IR)
- Lecture 11:
CUDA programming. GPU lab introduction. (IR)
- Lecture 12:
Sorting on GPU. Advanced CUDA issues. (IR)
- Lecture 13:
Introduction to OpenCL. (IR)
- Lesson 3:
OpenCL. Shader programming.
Selected exercises. (IR)
- Lesson 4:
Selected CPU/theory exercises. (AE)
Please solve suggested exercises in advance to be prepared. See our compendium on Design and Analysis of Parallel Algorithms for background information, important definitions, and further exercises.
- Lecture 14: (**)
Parallelization of sequential programs. (CK)
We have two lab passes,
see the schedule.
During each lab pass there are 2 lab groups in parallel.
- Group_A: up to 32 students in room Olympen, split into subgroups A1 and A2 (max 16 each),
jointly supervised by August Ernstsson (A1) and Johan Ahlqvist (A2) for the CPU labs (weeks 46-48),
and by Ingemar Ragnemalm and Johan Ahlqvist for the GPU labs (weeks 49-51, A1+A2).
- Group_B: 32 students in room Olympen, supervised by August Ernstsson (B1) and Johan Ahlqvist (B2) for both the CPU and GPU labs (weeks 46-51).
Remarks: Groups in pass A are recommended for Norrköping-based students (Wednesday afternoons, 13-17). Subgroups A1 and A2 run in parallel in the same room. Note that A1 and A2 have different assistants for the CPU and GPU parts.
Register for one of these groups in webreg by Friday of the first week;
thereafter, remaining places will be given to those on the waiting list.
The maximum course capacity is 64 students.
Find a lab partner; because the course is fully booked, we will merge any singleton groups and move students between groups as necessary.
Presence in the lab sessions is mandatory.
Deadlines: see the lab page.
Page responsible: Christoph W Kessler
Last updated: 2021-10-29