
Understanding and Building Large Language Models

6FIDA19, 2026VT, 6.0 credits

Status Active - open for registrations
School IDA-gemensam (IDA)
Division ReaL
Owner Fredrik Heintz
Homepage https://www.ida.liu.se/~frehe08/llm/

Course plan

No of lectures

10

Recommended for

Everyone interested in how large language models work and are built.

The course was last given

Spring 2025

Goals

The aim of the course is to explain how large generative AI models, such as large language models, work, and to explore how to build them. The focus is on technical aspects, i.e. methods and techniques. The course is thus more about machine learning than about natural language processing.

Knowledge and understanding
After completing the course, the student should be able to:
* Explain the technical underpinnings of large language models.
* Explain the processes involved in training a large language model.

Competence and skills
After completing the course, the student should be able to:
* Implement and train a basic large language model from scratch in PyTorch.
* Read and comprehend recent academic papers on LLMs and know the common terms used in them (alignment, scaling laws, RLHF, prompt engineering, instruction tuning, etc.).

Judgement and approach
After completing the course, the student should be able to:
* Understand and discuss concepts and terminology of state-of-the-art LLMs.
* Develop an ability to distinguish fact from fantasy in this fast-moving field.

Prerequisites

Students are expected to have enough CS and AI background to be able to follow technical descriptions of deep learning methods. Basic understanding of deep learning is expected. Students are also expected to be able to implement deep learning solutions using Python and PyTorch.

Organization

The course consists of 10 lectures where the course material is presented. The students are expected to complete lab assignments on their own, either individually or in pairs.

Lectures are expected to be given on Monday afternoons, 15-17, in hybrid mode: on site in Linköping or online through Zoom. The lectures will be recorded and made available afterwards.

Content

* Overview - NLP tasks, historical development
* Large language models - overview and architectures (encoder only BERT, decoder only GPT, encoder-decoder T5)
* Learning probability distributions - Generative AI - VAE, GAN, diffusion models, NeRF, Gaussian splatting
* Learning sequence to sequence mappings - LSTM, GRU, Seq2Seq, Transformers
* Learning embeddings (representation learning, vector semantics) - word2vec
* Data pre-processing and tokenization
* Pre-training, scaling laws, fine-tuning
* Alignment, RLHF, RLAIF
* Distillation, continued pre-training, unlearning and editing
* Privacy and security, including attacks on LLMs and how to prevent leaks
* Inference, prompting, in-context learning, RAG, test-time computing (scaling) and reasoning
* Evaluation and benchmarking
* Interpretability and explainability
* Multi-modal models, world models
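As a small taste of the data pre-processing and tokenization topic above, here is a minimal byte-pair-encoding merge loop in plain Python. This is an illustrative sketch, not course material: the function name, toy corpus, and merge count are all made up, and real tokenizers handle bytes, special tokens, and efficiency concerns this omits.

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn byte-pair-encoding merges from a toy corpus of words.

    Each word starts as a tuple of characters; the most frequent
    adjacent symbol pair is repeatedly merged into one symbol.
    Returns the list of learned merge pairs, in order.
    """
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is a single symbol already
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with the merged symbol.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

corpus = ["low", "low", "lower", "newest", "newest", "widest"]
print(bpe_merges(corpus, 3))
```

On this corpus the first merge is ('l', 'o'), since "lo" is among the most frequent adjacent pairs; production tokenizers (e.g. for GPT-style models) follow the same idea at byte level over far larger corpora.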

Literature

Large Language Models: A Deep Dive - Bridging Theory and Practice by Uday Kamath, Kevin Keenan, Garrett Somers, and Sarah Sorenson, 2024. (https://link.springer.com/book/10.1007/978-3-031-65647-7)

Building a Large Language Model (from Scratch) by Sebastian Raschka, 2024. (https://www.manning.com/books/build-a-large-language-model-from-scratch)

Lectures

LE1 – Introduction, NLP and Large Language Models
LE2 – Basics (learning probability distributions, sequence to sequence mappings and embeddings)
LE3 – Data curation and processing
LE4 – Pre-training and scaling
LE5 – Fine-tuning, aligning and distillation
LE6 – Inference, in-context learning and retrieval-augmented generation
LE7 – Benchmarking and evaluation
LE8 – Building LLMs in practice part 1
LE9 – Building LLMs in practice part 2
LE10 – Advanced topics (trustworthiness, reasoning, multi-modal models, world models)

Examination

Develop an LLM from scratch, conduct at least two experiments related to the lectures, and write a short report. The lab is expected to have four parts:
1. Develop a simple data pre-processing pipeline
2. Pre-train a GPT-style LLM
3. Fine-tune the LLM
4. Evaluate the LLM
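At toy scale, the shape of this pipeline (pre-process, train, evaluate) can be sketched in plain Python, with a counts-based character bigram model standing in for the GPT-style network. This is only an illustration of the pipeline structure: the actual lab uses PyTorch, fine-tuning is omitted here, and all function names and the sample text are invented for this sketch.

```python
import math
from collections import Counter, defaultdict

# Part 1. Pre-process: lowercase and split into character tokens.
def preprocess(text):
    return list(text.lower())

# Part 2. "Pre-train": estimate bigram probabilities by counting,
# a stand-in for gradient-based training of a GPT-style model.
def train_bigram(tokens, alpha=1.0):
    vocab = sorted(set(tokens))
    counts = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    def prob(a, b):
        # Add-alpha smoothing so unseen bigrams get nonzero probability.
        return (counts[a][b] + alpha) / (sum(counts[a].values()) + alpha * len(vocab))
    return prob

# Part 4. Evaluate: perplexity, the standard intrinsic LM metric
# (part 3, fine-tuning, has no analogue at this toy scale).
def perplexity(prob, tokens):
    nll = sum(-math.log(prob(a, b)) for a, b in zip(tokens, tokens[1:]))
    return math.exp(nll / (len(tokens) - 1))

train = preprocess("the quick brown fox jumps over the lazy dog")
prob = train_bigram(train)
print(round(perplexity(prob, train), 2))
```

Swapping the counting step for a Transformer trained with cross-entropy loss in PyTorch, and the character tokens for BPE tokens, turns this same skeleton into the actual lab.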

Examiner

Fredrik Heintz

Credits

6 ECTS

Comments


Page responsible: Anne Moe