TDDE78 Reinforcement Learning


Examination Structure

The examination consists of two components:

Module   ECTS   Description
LAB1     4 hp   5 lab assignments - group code submission
LAB2     2 hp   Individual written report

Both components are graded on a scale of U / 3 / 4 / 5. The final course grade is computed in two steps:

  1. LAB1 grade = mean of the 5 individual lab grades (rounded to nearest integer)
  2. Final grade = mean of LAB1 grade and LAB2 report grade (rounded to nearest integer)

Important: A grade of U in any single lab or the report results in a U for the entire course. All components must receive at least a 3 to pass.
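The two-step computation and the U rule above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function names are hypothetical, and the source does not say how ties such as 4.5 are rounded, so the sketch assumes that halves round up.

```python
def round_half_up(total: int, count: int) -> int:
    """Round total/count to the nearest integer.

    ASSUMPTION: ties (x.5) round up. The course page only says
    "rounded to nearest integer". Integer arithmetic avoids float error.
    """
    return (2 * total + count) // (2 * count)


def final_grade(lab_grades, report_grade):
    """Return the final course grade: "U", 3, 4 or 5.

    lab_grades   -- the 5 individual lab grades, each "U", 3, 4 or 5
    report_grade -- the LAB2 report grade, "U", 3, 4 or 5
    """
    if len(lab_grades) != 5:
        raise ValueError("expected exactly 5 lab grades")
    all_grades = list(lab_grades) + [report_grade]
    # A single U in any lab or in the report fails the whole course.
    if any(g == "U" for g in all_grades):
        return "U"
    if any(g not in (3, 4, 5) for g in all_grades):
        raise ValueError("grades must be 'U', 3, 4 or 5")
    # Step 1: LAB1 grade = rounded mean of the 5 lab grades.
    lab1 = round_half_up(sum(lab_grades), 5)
    # Step 2: final grade = rounded mean of LAB1 and the report grade.
    return round_half_up(lab1 + report_grade, 2)
```

For example, lab grades 3, 4, 4, 5, 5 give a LAB1 grade of 4 (mean 4.2), and with a report grade of 4 the final grade is 4. Ties can only occur in step 2, since the mean of five integers never ends in .5.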


General Rules

  • Labs may be completed individually or in groups of 2-3. Please sign up in WebReg before April 14.
  • All group members must be able to explain and defend all submitted code and results.
  • Each lab consists of two parts:
    • Part A - Implementation: Build the algorithm(s) from scratch in PyTorch following the TODO markers in the starter notebook.
    • Part B - Experiments: Run systematic ablation studies and comparisons, produce plots, and write a short analysis.
  • Clone the course repository and follow the README.md for setup instructions: https://gitlab.liu.se/amaso88/tdde78lab
  • Each lab's starter_code/ directory contains:
    • A Jupyter notebook with TODO placeholders for the algorithm(s)
    • networks.py : Neural Network architectures with TODO placeholders to implement
    • utils.py : Fully implemented helper functions

LAB1 Overview

Lab     Topic                  Core Methods                              Environment(s)
Lab 1   Value-Based Deep RL    DQN, Double DQN, Dueling DQN              CartPole-v1, LunarLander-v3
Lab 2   Policy Gradient        REINFORCE (with/without baseline), PPO    CartPole-v1
Lab 3   Actor-Critic           A2C (with GAE), SAC                       LunarLanderContinuous-v3
Lab 4   Model-Based Deep RL    Dyna-Q (neural WorldModel), MCTS          CliffWalking-v1
Lab 5   Multi-Agent Deep RL    MAPPO, MADDPG                             simple_spread_v3 (PettingZoo)

Code Submission

The code submission is per group - submit one .zip archive named labX_[liu-id1]_[liu-id2]_[liu-id3].zip containing:

labX_[liu-ids]/
 - labX_[name].ipynb      # Completed notebook, all cells executed
 - networks.py            # Your implementation

Important: Before submitting, make sure the Summary section at the end of the notebook is fully filled in. All experiment questions from Part B must be answered there.

Submit via email to amath.sow@liu.se.
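The archive layout described above can be produced with Python's standard zipfile module. The sketch below is one possible way to package a submission, not a required tool. The function name build_submission, its parameters, and the example LiU IDs are hypothetical. It assumes that X in the archive name is the lab number and that the notebook keeps its starter-code file name.

```python
import zipfile
from pathlib import Path


def build_submission(lab_number, liu_ids, notebook, networks, out_dir="."):
    """Create labX_[liu-id1]_[liu-id2]_[liu-id3].zip and return its path.

    lab_number -- 1 to 5 (ASSUMPTION: "X" in the file name is the lab number)
    liu_ids    -- list of 1 to 3 LiU IDs, one per group member
    notebook   -- path to the completed, fully executed notebook
    networks   -- path to your networks.py
    """
    if lab_number not in range(1, 6):
        raise ValueError("lab_number must be between 1 and 5")
    if not 1 <= len(liu_ids) <= 3:
        raise ValueError("a submission has 1 to 3 group members")

    root = f"lab{lab_number}_" + "_".join(liu_ids)
    archive = Path(out_dir) / f"{root}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Both files go inside a single top-level folder named like the archive.
        zf.write(notebook, arcname=f"{root}/{Path(notebook).name}")
        zf.write(networks, arcname=f"{root}/networks.py")
    return archive
```

A call such as build_submission(1, ["abcde123", "fghij456"], "lab1_dqn.ipynb", "networks.py") would write lab1_abcde123_fghij456.zip to the current directory. The notebook name here is a placeholder. Check that the Summary section is filled in before packaging.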

Deadlines

All labs are available from the start of the course (Wed 1 Apr).

Lab     Submission Deadline
Lab 1   Fri 17 Apr, 23:59
Lab 2   Fri 1 May, 23:59
Lab 3   Fri 8 May, 23:59
Lab 4   Fri 15 May, 23:59
Lab 5   Fri 29 May, 23:59

Late submissions will not be accepted without prior approval from the course responsible.

Grading

Each lab is graded separately on a scale of U / 3 / 4 / 5. The LAB1 grade is the mean of the 5 lab grades, rounded to the nearest integer.

Grade   Criteria
3       Part A (implementation) correctly completed
4       Part A and Part B (experiments) both completed
5       Part A and Part B completed + Summary section fully and insightfully answered

LAB2 - Individual Written Report

The report is individual - every student submits their own report regardless of lab group.

Each student chooses one of the following two options:

Option A - Lab Summary Report

Write a report summarising your work and findings across all 5 labs.

The report must be at most 6 pages (excluding figures and references) and cover:

  1. Overview - briefly describe each algorithm implemented and the environment it was tested on
  2. Key results - one or two main findings per lab (learning curves, ablation conclusions)
  3. Comparative analysis - compare methods across labs (e.g., sample efficiency, stability, scalability)
  4. Reflection - what worked well, what was challenging, and what you learned

Option B - Paper Presentation: RL for LLMs

Select a published research paper on the application of reinforcement learning to large language models (e.g., RLHF, PPO for alignment, GRPO, reward modelling).

The report must be at most 6 pages (excluding figures and references) and include:

  1. Paper summary - problem setting, proposed method, key contributions
  2. Connection to course content - which algorithms from the labs are used or related
  3. Critical analysis - strengths, limitations, and open questions
  4. References - at least 3 published papers from established venues

The paper must be approved by the course responsible before you start writing. Send a short paragraph proposal with the paper title to amath.sow@liu.se.

Submission and Deadline

Submit your individual report.pdf via email to amath.sow@liu.se.

Deadline: Fri 5 Jun, 23:59

Grading

The report is graded on a scale of U / 3 / 4 / 5. Its grade is averaged with the LAB1 grade to give the final course grade.

Grade   Criteria
3       Report submitted, requirements of the chosen option followed, findings clearly described
4       Grade 3 criteria met + good depth of analysis and well-structured writing
5       Grade 4 criteria met + critical insight, strong argumentation, and excellent writing

Contact

For any questions, send an email to amath.sow@liu.se.



Page responsible: Fredrik Heintz
Last updated: 2026-03-26