TDDE19 Advanced Project Course - AI and Machine Learning
Projects
Students will be divided into groups of around 4-5 students. Each group is assigned a project (according to the preferences and skills of each group).
At the end of the project, the students are expected to provide:
- Source code of library/program in a gitlab repository
- Documentation of how to use and install the software (API, command line...)
- A group report describing the work that has been accomplished (which algorithms are used, what kind of results were obtained...)
- An individual list of contributions to the project.
Important: the deadline for the selection of projects is Wednesday the 4th of September 2024 at 13:00. By then, you need to send me an email (mattias.tiger at liu.se) with the following information:
- The group number from webreg.
- A ranked list of all projects (details on the projects below)
Project list
Semantic Mapping
Customer/Supervisor: Piotr Rudol

(Image from WARA-PS)
Context: The goal of this project is to explore the use of machine learning to build a semantic map of the environment using a ground robotic system, such as the Boston Dynamics Spot robot, operating in an indoor environment. Such a map would include the locations of detected objects, for example TVs, chairs, desks, vehicles, etc. The task could be achieved using vision-based CNN models in combination with sensor fusion algorithms such as the Extended Kalman Filter. Semantic information would be stored in the robot's knowledge database using Resource Description Framework (RDF) graphs. Initially, the work would be performed on pre-recorded log files, with the possibility of running and testing the final source code on a real Spot robot.
Task suggestions:
- Study literature.
- Review and evaluate available ML models for extraction of semantic information.
- Use sensor fusion techniques for estimation and tracking of object locations.
- Deploy and evaluate the project results on a real robotic system.
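As a rough sketch of the sensor-fusion step, fusing repeated noisy detections of a (static) object can be done with a Kalman filter measurement update; the Extended Kalman Filter mentioned above generalizes this with Jacobians for nonlinear measurement models. All names and numbers below are illustrative, not part of the project code:

```python
import numpy as np

def kf_update(mu, P, z, R):
    """One Kalman filter measurement update for a static 2D object position.
    mu: current position estimate (2,), P: its covariance (2,2),
    z: new detection position (2,), R: measurement noise covariance (2,2).
    With an identity measurement model H = I, the innovation is z - mu."""
    S = P + R                      # innovation covariance
    K = P @ np.linalg.inv(S)       # Kalman gain
    mu_new = mu + K @ (z - mu)     # fused position estimate
    P_new = (np.eye(2) - K) @ P    # shrunken uncertainty
    return mu_new, P_new

# Fuse two equally uncertain detections of the same chair:
# the estimate moves to the midpoint and the covariance halves.
mu, P = np.array([2.0, 1.0]), np.eye(2) * 0.5
mu, P = kf_update(mu, P, np.array([2.2, 1.1]), np.eye(2) * 0.5)
```

The fused position could then be written into the RDF knowledge database as triples linking the detected object class to its estimated location.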
References:
- YOLOv10
- Toward General-Purpose Robots via Foundation Models
- Hastily formed knowledge networks and distributed situation awareness for collaborative robotics
- Previous related student project
Solve Sokoban with AlphaZero
Customer/Supervisor: Markus Fritzsche


(Right image by Carloseow)
Context: The project is about adapting AlphaZero to train an agent to solve Sokoban puzzles. Instead of creating a single model for various Sokoban instances, we aim to develop a unique model for each instance. The hope is that this approach solves very difficult puzzles faster than traditional search algorithms.
Task suggestions:
- Set up a Sokoban gym environment
- Generate a benchmark of Sokoban puzzles of different difficulty (board size, number of boxes)
- Implement MCTS for the benchmark
- Implement a baseline neural network, e.g. a CNN, for predicting a heuristic value given a Sokoban state
- Conduct literature research on Sokoban heuristics to evaluate the sampled states during the simulation steps of MCTS
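To connect the MCTS and neural-network tasks above: AlphaZero uses the network's policy prior and value estimates inside the tree search via the PUCT selection rule. A minimal sketch of that rule (statistics and actions here are made up for illustration):

```python
import math

def puct_select(children, c_puct=1.5):
    """AlphaZero-style PUCT selection: pick the action maximizing
    Q(s,a) + c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a)).
    children: dict mapping action -> (N visits, W total value, P prior)."""
    total_n = sum(n for n, _, _ in children.values())
    def score(stats):
        n, w, p = stats
        q = w / n if n > 0 else 0.0                    # mean backed-up value
        u = c_puct * p * math.sqrt(total_n) / (1 + n)  # exploration bonus
        return q + u
    return max(children, key=lambda a: score(children[a]))

# Example: the under-visited move with a high network prior is
# selected over the well-explored move with a higher mean value.
children = {"up": (10, 6.0, 0.3), "down": (2, 0.5, 0.5), "left": (5, 2.0, 0.2)}
best = puct_select(children)
```

In the per-instance setting proposed here, the priors and values would come from the small model being trained on that single puzzle.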
References:
- Balancing Exploration and Exploitation in Classical Planning
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- gym-sokoban
- AlphaZero Explained
Safe Autonomous Systems
Customer/Supervisor: Leo Jarhede | Emil Wiman

Context: In order for robots to patrol the streets, make us coffee from the couch, and solve all mundane household tasks, they must first be safe enough to be trusted. This project focuses on building an autonomous-systems basis and integrating it with state-of-the-art AI methods for monitoring, planning, and execution of structured motion. You will use a Turtlebot 4 platform and assist researchers in integrating and evaluating techniques to realize safe operations in human environments.
Task suggestions:
- Read up on and set up simulation of the Turtlebot 4
- Integrate various SLAM alternatives from GitHub
- Integrate monitoring of safety constraints written in probabilistic logic over uncertain and predicted information
- Integrate 3D exploration and motion planning methods
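To give a flavor of monitoring probabilistic safety constraints over uncertain state estimates: a constraint like "the probability that the robot is closer than d_min to an obstacle must stay below p_max" can be checked against a Gaussian distance estimate. This toy monitor is only a sketch of the idea; the actual project uses probabilistic signal temporal logic (see the reference below), and all thresholds here are invented:

```python
import math

def prob_violation(mean_dist, std_dist, d_min):
    """P(d < d_min) for a Gaussian distance estimate d ~ N(mean, std^2),
    i.e. the Gaussian CDF evaluated at (d_min - mean) / std."""
    z = (d_min - mean_dist) / std_dist
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def monitor(mean_dist, std_dist, d_min=0.5, p_max=0.05):
    """Flag a violation when P(d < d_min) exceeds the allowed risk p_max."""
    return prob_violation(mean_dist, std_dist, d_min) > p_max

# A confident 2 m estimate is safe; an uncertain 0.6 m estimate is flagged,
# since P(d < 0.5) ~= 0.31 exceeds the 5% risk budget.
safe_flagged = monitor(2.0, 0.1)
risky_flagged = monitor(0.6, 0.2)
```

A temporal logic layer would then combine such instantaneous checks over time windows and over predicted future states.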
References:
- Turtlebot 4
- Turtlebot 4 manual
- Incremental reasoning in probabilistic signal temporal logic
- 3D Exploration with Dynamical Obstacles
LLM From Scratch
Customer/Supervisor: Kevin Glocker


Context: Large Language Models (LLMs) have shown strong fluent language generation, task-solving, and reasoning capabilities. However, proprietary APIs hinder companies and governments from applying them to sensitive data, and they cannot be efficiently adapted for specialized tasks, making self-hosting the preferred option. Recent research suggests that models with as few as 10 million parameters can still produce coherent English. The feasibility for other languages remains unknown, and scaling the approach for solving more complex tasks is an open challenge. Improvements in this area would facilitate model customization and deployment at scale, including on edge devices, while reducing the carbon footprint. Possible research questions include whether recent successes of highly parameter-efficient language models can be achieved in Swedish. You will apply state-of-the-art techniques to train and evaluate models at different scales.
Task suggestions:
- Implement pre-training code optimized for our high-performance compute clusters based on, e.g., the Nanotron framework
- Integrate MLOps workflows and tools for experiment monitoring and tracking
- Collect or synthesize Swedish text data
- Train parameter-efficient transformer language models
- Evaluate your LLM using intrinsic metrics and benchmarks
- Set up an interactive demo
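To make the "as few as 10 million parameters" claim concrete, a quick back-of-the-envelope parameter count for a decoder-only transformer can guide the choice of model scale. The configuration below is an illustrative TinyStories-scale example, not a prescribed setup for the project:

```python
def transformer_params(vocab, d_model, n_layers, d_ff=None, tie_embeddings=True):
    """Rough parameter count for a decoder-only transformer, ignoring
    biases, layer norms, and positional embeddings.
    Per layer: 4*d^2 for attention (Q, K, V, O) + 2*d*d_ff for the MLP."""
    d_ff = d_ff or 4 * d_model
    per_layer = 4 * d_model**2 + 2 * d_model * d_ff
    embed = vocab * d_model * (1 if tie_embeddings else 2)
    return n_layers * per_layer + embed

# A small configuration (8k vocab, d_model=256, 8 layers) lands
# in the ~10M-parameter regime discussed above.
n = transformer_params(vocab=8000, d_model=256, n_layers=8)
```

Note how, at this scale, the tied embedding matrix is a substantial fraction of the total, which is one reason small models often use small vocabularies.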
References:
- Detailed project intro
- SmolLM - blazingly fast and remarkably powerful
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
- The Llama 3 Herd of Models
- Nanotron
- MLflow
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
- ScandEval
- OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models
Natural Language to Query
Customer/Supervisor: Piotr Rudol

Context: The goal of this project is to transform a question in English into a query in a custom query language (called scQL). We have a small dataset for training, which needs to be extended. This project has been run for a few years now, and the goal for this year is to expand on the work done during previous years. We would also like to investigate the possibility of using an LLM to generate such queries. This is not a trivial task, as the LLM is unlikely to have ever seen such queries; one possibility is to constrain the output of the LLM with a formal grammar definition.
Task suggestions:
- Study literature.
- Apply the HydraNet model to scQL.
- Use an LLM for generating scQL queries.
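The grammar-constraint idea works by only allowing the LLM to emit strings a formal grammar accepts (llama.cpp does this with GBNF grammars during sampling; see the reference below). The toy grammar and query syntax here are entirely made up, since the real scQL syntax differs; the sketch only illustrates accepting well-formed queries and rejecting free-form output:

```python
import re

# A made-up, scQL-like toy grammar (the real scQL will differ):
#   query := "select" "(" IDENT ")" "where" IDENT "=" STRING
# In llama.cpp, such rules would be written in GBNF and passed to the
# sampler, so the model can only generate strings the grammar accepts.
QUERY_RE = re.compile(
    r'^select\(\s*(\w+)\s*\)\s*where\s+(\w+)\s*=\s*"([^"]*)"$'
)

def accepts(candidate: str) -> bool:
    """Check whether a generated string is a well-formed toy query."""
    return QUERY_RE.match(candidate.strip()) is not None

ok = accepts('select(altitude) where platform = "dji01"')
bad = accepts('SELECT * FROM platforms')   # free-form SQL is rejected
```

Checking post hoc, as above, can also be used to filter or re-sample outputs when true constrained decoding is not available.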
References:
- [Text-to-SQL] Learning to query tables with natural language
- WikiSQL
- SPBERT: A Pre-trained Model for SPARQL Query Language
- Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
- Natural Language to SQL Query using an Open Source LLM
- HydraNet
- Previous year project (2022): Source code and Report
- Previous year project (2023): Source code and Report
- Use grammars with llama.cpp
Deep Fakes in Practice
Customer/Supervisor: Mattias Tiger

(Image by Budiey)
Context: The techniques for, and use of, deep fakes have exploded in recent years. They can be used for both good and bad purposes. In order to make the public and decision makers aware of the potential consequences of their use, it is important that there exist accessible demos that showcase what is currently possible (and not) with available techniques. This project is focused on developing a number of such demos, to demonstrate deep fakes over modalities such as images, video, and sound.
Task suggestions:
- Survey closed source projects for inspiration
- Survey open source projects (Github/Arxiv) for interesting candidates
- Replicate locally
- Evaluate strength/weaknesses
- Integrate and develop an easy-to-use demo for non-technical persons
- Investigate offline vs online deep fakes (photos vs video, audio file vs live audio)
- Get it to work for both Swedish and English
References:
- Deepfake Generation and Detection: A Benchmark and Survey
- Commercial deepfake video
- Deepfake of live video
- Voice cloning
Previous years projects:
- 2023
- 2022 (with source code and reports)
- 2021 (with source code and reports)
- 2020 (with source code and reports)
- 2019 (with source code and reports)
- 2018
- 2017
Page responsible: infomaster
Last updated: 2024-09-04