TDDD56 Multicore and GPU Programming

Timetable and Lecture Plan

Schedule

Schedule (as available on the LiU schedule server)

Lecture/Lesson Plan

Certain lecture notes and other handouts with restricted access are located here.
The lecture notes and other material may be updated during the course as appropriate.

Lecturers: Christoph Kessler (CK), Ingemar Ragnemalm (IR).
Assistants: Sehrish Qummar (SQ), Sajad Khosravi (SK), Ingemar Ragnemalm (IR).

Those lectures marked by asterisks overlap fully (**) or partly (*) with similar lectures in TDDC78 Programming of Parallel Computers - Methods and Tools. These lectures are optional for those who have already taken TDDC78, but might be a useful repetition anyway. This repetition of the common core topics is necessary to allow you to take the courses individually or in arbitrary order.

All lecture slides can be found here.

Lecture 1:
Organization, Overview. (2023)
Motivation: The Multicore Challenge. Multicore Architecture Concepts. (CK)
Lecture 2: (**)
Shared memory architecture concepts and performance issues. (CK)
Lecture 3a:
Parallel Programming with Threads. (CK)
Lecture 4:
Non-blocking synchronization. (CK)
Lecture 3b: (45min)
Parallel Programming with Tasks. (CK)
Lesson 1: (45min)
Introduction to Lab1 and Lab2 (SQ/CK)
Lecture 5: (*)
Theory: Parallel programming and cost models. Analysis of parallel algorithms.
Lecture 6: (*)
Theory (cont.): Brent's Theorem. Speedup anomalies. Amdahl's Law. Fundamental parallel algorithms: parallel prefix sums, parallel list ranking. (CK)
Lecture 7:
Parallel sorting algorithms: Simple parallel quicksort, Fully parallel quicksort, Parallel samplesort, Bitonic sort, Parallel Mergesort. (CK)
Mid-term evaluation.
Lecture 8: (45min)
Parallel algorithmic design patterns: Towards skeleton programming. (CK)
Lesson 2: (45min)
Introduction to skeleton programming in SkePU, and to CPU Lab 3. (SQ/CK)
Lecture 9:
GPU architecture and trends (IR)
Lecture 10:
Introduction to CUDA programming. (IR)
Lecture 11:
CUDA programming. GPU lab introduction. (IR)
Lecture 12:
Sorting on GPU. Advanced CUDA issues. (IR)
Lecture 13:
Introduction to OpenCL. (IR)
Lesson 3:
Exam training: Selected CPU/theory exercises. (CK)
See also our compendium on Design and Analysis of Parallel Algorithms for background information, important definitions, and further exercises.
Lesson 4:
OpenCL. Shader programming.
Selected exercises. (IR)
Lecture 14: (**)
Parallelization of sequential programs. (CK)
Outlook.

Lab schedule

We have two lab passes; see the schedule.
During each lab pass there are 2 lab groups in parallel.

Group_A (A1, A2): up to 32 students in total, room Olympen,
jointly supervised by Sehrish Qummar (SQ) and Sajad Khosravi (SK) (CPU labs v45-47),
and by Ingemar Ragnemalm (IR) and Sajad Khosravi (SK) (GPU labs v48-50).
Group_B (B1, B2): up to 32 students, room Olympen,
jointly supervised by Sehrish Qummar (SQ) and Sajad Khosravi (SK) (CPU+GPU v45-50).

Remarks: Groups in pass A are recommended for Norrköping-based students (mostly Wednesday afternoons, 13-17).

Find a lab mate and register for one of these groups in webreg by Friday of the first week. After that, remaining places will be given to the persons on the waiting list. The maximum course capacity is 64 students.
We reserve the right to merge singleton groups and to move teams between groups/passes as necessary in order to balance the groups.

Attendance at the lab sessions is mandatory.

Deadlines

See the lab page.

Page responsible: Christoph W Kessler
Last updated: 2023-11-14

IDA - Department of Computer and Information Science