TDDE78 Reinforcement Learning

Examination Structure

The examination consists of two components:

Module	Credits	Description
LAB1	4 hp	5 lab assignments - group code submission
LAB2	2 hp	Individual written report

Both components are graded on a scale of U / 3 / 4 / 5. The final course grade is computed in two steps:

LAB1 grade = mean of the 5 individual lab grades (rounded to the nearest integer)
Final grade = mean of the LAB1 grade and the LAB2 report grade (rounded to the nearest integer)

Important: A grade of U in any single lab or the report results in a U for the entire course. All components must receive at least a 3 to pass.
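
The two-step computation above can be sketched in Python. This is only an illustration, not an official grading tool: the function names are hypothetical, and the handling of exact .5 means (rounded up here) is an assumption, since the rule above only says "rounded to nearest integer".

```python
import math


def round_half_up(value: float) -> int:
    # ASSUMPTION: a mean ending in exactly .5 is rounded up.
    # The course page does not specify how such ties are resolved.
    return math.floor(value + 0.5)


def final_grade(lab_grades, report_grade):
    """Return the final course grade as "U" or an integer 3-5.

    lab_grades: the 5 individual lab grades, each "U", 3, 4 or 5.
    report_grade: the LAB2 report grade, "U", 3, 4 or 5.
    """
    if len(lab_grades) != 5:
        raise ValueError("expected exactly 5 lab grades")
    # A U in any single lab or in the report gives U for the course.
    if "U" in lab_grades or report_grade == "U":
        return "U"
    # Step 1: LAB1 grade is the rounded mean of the 5 lab grades.
    lab1 = round_half_up(sum(lab_grades) / len(lab_grades))
    # Step 2: final grade is the rounded mean of LAB1 and LAB2.
    return round_half_up((lab1 + report_grade) / 2)
```

For example, lab grades of 3, 4, 4, 5, 5 have a mean of 4.2, which gives a LAB1 grade of 4; combined with a report grade of 4, the final grade is 4.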

General Rules

Labs may be completed individually or in groups of 2-3. Please sign up in WebReg before April 14.
All group members must be able to explain and defend all submitted code and results.
Each lab consists of two parts:
- Part A - Implementation: Build the algorithm(s) from scratch in PyTorch following the TODO markers in the starter notebook.
- Part B - Experiments: Run systematic ablation studies and comparisons, produce plots, and write a short analysis.
Clone the course repository and follow the README.md for setup instructions: https://gitlab.liu.se/amaso88/tdde78lab
Each lab's starter_code/ directory contains:
- A Jupyter notebook with TODO placeholders for the algorithm(s)
- networks.py : Neural Network architectures with TODO placeholders to implement
- utils.py : Fully implemented helper functions

LAB1 Overview

Lab	Topic	Core Methods	Environment(s)
Lab 1	Value-Based Deep RL	DQN, Double DQN, Dueling DQN	CartPole-v1, LunarLander-v3
Lab 2	Policy Gradient	REINFORCE (with/without baseline), PPO	CartPole-v1
Lab 3	Actor-Critic	A2C (with GAE), SAC	LunarLanderContinuous-v3
Lab 4	Model-Based Deep RL	Dyna-Q (neural WorldModel), MCTS	CliffWalking-v1
Lab 5	Multi-Agent Deep RL	MAPPO, MADDPG	simple_spread_v3 (PettingZoo)

Code Submission

The code submission is per group - submit one .zip archive named labX_[liu-id1]_[liu-id2]_[liu-id3].zip containing:

labX_[liu-ids]/
 - labX_[name].ipynb      # Completed notebook, all cells executed
 - networks.py            # Your implementation
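
As a convenience, the archive layout above could be produced with a short Python script. This is a minimal sketch under stated assumptions: the function name build_submission and its parameters are hypothetical, and the notebook keeps whatever file name it already has, so rename it to labX_[name].ipynb beforehand if needed.

```python
import zipfile
from pathlib import Path


def build_submission(lab_number, liu_ids, notebook, networks, out_dir="."):
    """Create labX_[liu-ids].zip holding one folder with both files.

    All names here are illustrative; only the archive and folder
    naming pattern is taken from the submission instructions.
    """
    folder = f"lab{lab_number}_" + "_".join(liu_ids)
    archive = Path(out_dir) / f"{folder}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Completed notebook, stored under the labX_[liu-ids]/ folder.
        zf.write(notebook, arcname=f"{folder}/{Path(notebook).name}")
        # networks.py is always stored under its required name.
        zf.write(networks, arcname=f"{folder}/networks.py")
    return archive
```

Calling build_submission(1, ["abcde123", "fghij456"], "lab1_dqn.ipynb", "networks.py") would write lab1_abcde123_fghij456.zip to the current directory; the LiU IDs and the notebook name in this call are made-up placeholders.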

Important: Before submitting, make sure the Summary section at the end of the notebook is fully filled in. All experiment questions from Part B must be answered there.

Submit via email to amath.sow@liu.se.

Deadlines

All labs are available from the start of the course (Wed 1 Apr).

Lab	Submission Deadline
Lab 1	Fri 17 Apr, 23:59
Lab 2	Fri 1 May, 23:59
Lab 3	Fri 8 May, 23:59
Lab 4	Fri 15 May, 23:59
Lab 5	Fri 29 May, 23:59

Late submissions will not be accepted without prior approval from the course responsible.

Grading

Each lab is graded separately on a scale of U / 3 / 4 / 5. The LAB1 grade is the mean of all 5 lab grades, rounded to the nearest integer.

Grade	Criteria
3	Part A (implementation) correctly completed
4	Part A and Part B (experiments) both completed
5	Part A and Part B completed + Summary section fully and insightfully answered

LAB2 - Individual Written Report

The report is individual - every student submits their own report regardless of lab group.

Each student chooses one of the following two options:

Option A - Lab Summary Report

Write a report summarising your work and findings across all 5 labs.

The report must be at most 6 pages (excluding figures and references) and cover:

Overview - briefly describe each algorithm implemented and the environment it was tested on
Key results - one or two main findings per lab (learning curves, ablation conclusions)
Comparative analysis - compare methods across labs (e.g., sample efficiency, stability, scalability)
Reflection - what worked well, what was challenging, and what you learned

Option B - Paper Presentation: RL for LLMs

Select a published research paper on the application of reinforcement learning to large language models (e.g., RLHF, PPO for alignment, GRPO, reward modelling).

The report must be at most 6 pages (excluding figures and references) and include:

Paper summary - problem setting, proposed method, key contributions
Connection to course content - which algorithms from the labs are used or related
Critical analysis - strengths, limitations, and open questions
References - at least 3 published papers from established venues

The paper must be approved by the course responsible before you start writing. Send a short proposal (one paragraph) that includes the paper title to amath.sow@liu.se.

Submission and Deadline

Submit your individual report as report.pdf via email to amath.sow@liu.se.

Deadline: Fri 5 Jun, 23:59

Grading

The report is graded on a scale of U / 3 / 4 / 5. The resulting LAB2 grade is averaged with the LAB1 grade to give the final course grade.

Grade	Criteria
3	Report submitted, requirements of the chosen option followed, findings clearly described
4	Grade 3 criteria met + good depth of analysis and well-structured writing
5	Grade 4 criteria met + critical insight, strong argumentation, and excellent writing

Contact

For any questions, send an email to amath.sow@liu.se.

Page responsible: Fredrik Heintz
Last updated: 2026-03-26

IDA - Department of Computer and Information Science
