Multicore Computing

2017HT

Status	Cancelled
School	National Graduate School in Computer Science (CUGS)
Division	PELAB
Owner	Christoph Kessler
Homepage	http://www.ida.liu.se/~chrke/courses/MULTI/



Course plan

Lectures

Ca. 32h, usually given in block format over 2 intensive weeks in the first half of September; next time in 2017.

Recommended for

Graduate (CUGS, CIS, ISY, ...) students interested in the areas of parallel computer architecture, parallel programming, general-purpose GPU programming, software engineering, optimization, compiler construction, or algorithms and complexity.

The course was last given

in HT2013.

Goals

The course emphasizes fundamental aspects of shared-memory parallel programming and accelerator (GPU) programming, such as shared-memory parallel architecture concepts, programming models, performance models, parallel algorithmic paradigms, parallelization techniques and strategies, scheduling algorithms, optimization, composition of parallel programs, and concepts of modern parallel programming languages and systems. Practical exercises help students apply the theoretical concepts of the course to concrete problems on a real-world multicore system.

Prerequisites

Data structures and algorithms are absolutely required; some knowledge of complexity theory and compiler construction is useful. Some basic knowledge of computer architecture is assumed. Basic courses in concurrent programming (e.g. TDDB68) and parallel programming (e.g. TDDC78 or TANA77) are recommended.
Programming in C and some familiarity with Linux (or a similar OS) are necessary for the practical exercises.

I. Architecture
* Multicore architecture issues (incl. SMT, SMP, CC-NUMA, NCC-NUMA)
* Short repetition: Cache locality and memory hierarchy
* Shared memory emulation and consistency issues
* Heterogeneous multicores
* GPU and GPGPU architectures

II. Languages and environments
* pthreads
* Task-based parallel programming; futures
* Cilk
* OpenMP 4.x
* High-level parallel programming languages and frameworks, skeleton programming
* Stream processing and GPU languages: CUDA, OpenCL
* Frameworks for parallel data mining: MapReduce, Spark

III. Parallel Algorithms
* Theory: Parallel time, work, cost, speedup. Speedup anomalies. Amdahl's Law. Work-time rescheduling and Brent's Theorem.
* Core parallel algorithmic building blocks: Parallel reduction, parallel prefix, list ranking.
* Parallel algorithmic design patterns
* Parallel sorting

IV. Parallelization techniques
* Design patterns for concurrency / synchronization
* Dependence analysis
* Automatic parallelization
* Runtime parallelization and speculative parallelization
* Lock-free synchronization
* Transactional programming
* Task scheduling and clustering

V. Optimizations
* Autotuning for performance and energy efficiency
* Task mapping and on-chip pipelining
* Optimized composition of parallel programs from multi-variant parallel building blocks

Organization

Lectures (ca. 32h), programming exercises, optional theoretical exercises for self-assessment, programming lab assignment, student presentations.
The lecture series of the course will be held in block format with two intensive weeks in Linköping.

Literature

To be announced on the course homepage.

Lecturers

Christoph Kessler, Linköpings universitet,
and guest lecturer(s)

Examiner

Christoph Kessler, Linköpings universitet

Examination

TEN1: Written exam 3.5p, mandatory.
UPG3: Programming exercise (lab report) 3p, optional.
UPG2: Paper presentation, opposition, and written summary (1-2 pages) of the presented paper, 1.5p, optional.

Credit

7.5hp if all examination components are passed.
Partial credit can also be given, but the written exam must be passed.
Admission to the exam requires attendance at no less than 50% of the lectures and lessons.

Organized by

CUGS

Comments

Lecture and lab contents mostly overlap with TDDD56 Multicore and GPU Programming.

Page responsible: Anne Moe

IDA - Department of Computer and Information Science