TDDD56 Multicore and GPU Programming
Timetable and Lecture Plan
Schedule (as available on the LiU schedule server)
Certain lecture notes and other handouts with restricted access
are located here.
The lecture notes and other material may be updated during the course as appropriate.
Lecturers: Christoph Kessler (CK), Ingemar Ragnemalm (IR).
Assistants: August Ernstsson (AE), Ingemar Ragnemalm (IR)
Those lectures marked by asterisks overlap fully (**) or partly (*) with similar lectures in TDDC78 Programming of Parallel Computers - Methods and Tools. These lectures are optional for those who have already taken TDDC78, but might be a useful repetition anyway. This repetition of the common core topics is necessary to allow you to take the courses individually or in arbitrary order.
All lecture slides can be found here.
- Lecture 1:
Organization, Overview. (2022)
Motivation: The Multicore Challenge. Multicore Architecture Concepts. (CK)
- Lecture 2: (**)
Shared memory architecture concepts and performance issues. (CK)
- Lecture 3a:
Parallel Programming with Threads. (CK)
- Lecture 4:
Non-blocking synchronization. (CK)
- Lecture 3b: (45min)
Parallel Programming with Tasks. (CK)
- Lesson 1: (45min)
Introduction to Lab1 and Lab2
- Lecture 5: (*)
Theory: Parallel programming and cost models. Analysis of parallel algorithms.
- Lecture 6: (*)
Theory (cont.): Brent's Theorem. Speedup anomalies. Amdahl's Law. Fundamental parallel algorithms: parallel prefix sums, parallel list ranking. (CK)
- Lecture 7:
Parallel sorting algorithms: Simple parallel quicksort, Fully parallel quicksort, Parallel samplesort, Bitonic sort, Parallel Mergesort. (CK)
- Lecture 8: (45min)
Parallel algorithmic design patterns: Towards skeleton programming. (AE)
- Lesson 2: (45min)
Introduction to skeleton programming in SkePU, and to CPU Lab 3. (AE)
- Lecture 9:
GPU architecture and trends (IR)
- Lecture 10:
Introduction to CUDA programming. (IR)
- Lecture 11:
CUDA programming. GPU lab introduction. (IR)
- Lecture 12:
Sorting on GPU. Advanced CUDA issues. (IR)
- Lecture 13:
Introduction to OpenCL. (IR)
- Lesson 3:
Exam training: Selected CPU/theory exercises. (AE)
See also our compendium on Design and Analysis of Parallel Algorithms for background information, important definitions, and further exercises.
- Lesson 4:
OpenCL. Shader programming.
Selected exercises. (IR)
- Lecture 14: (**)
Parallelization of sequential programs. (CK)
We have two lab passes; see the schedule.
During each lab pass there are 2 lab groups in parallel.
- Group_A: up to 32 students in total, room Olympen, supervised by August Ernstsson (CPU labs v45-47) and by Ingemar Ragnemalm (GPU labs v48-50).
- Group_B: up to 32 students, room Olympen, supervised by August Ernstsson (CPU+GPU v45-50).
Remarks: Groups in pass A are recommended for Norrköping-based students (Wednesday afternoons 13-17).
Register for one of these groups in webreg by Friday of the first week; thereafter, remaining places will be given to those on the waiting list.
The maximum course capacity is 64 students.
Find a lab mate; we will merge any singleton groups and migrate between groups as necessary, as the course is fully booked.
Presence in the lab sessions is mandatory.
Deadlines: see the lab page.
Page responsible: Christoph W Kessler
Last updated: 2022-12-06