Understanding and Building Large Language Models
6FIDA19, Spring 2026, 6.0 credits
Course plan
No of lectures
10
Recommended for
Everyone interested in how large language models work and are built.
The course was last given
Spring 2025
Goals
The aim of the course is to explain how large generative AI models such as
large language models work, and to explore how to build them. The focus is on
technical aspects such as methods and techniques; the course is thus more
about machine learning than about natural language processing.
Knowledge and understanding
After completing the course, the student should be able to:
* Explain the technical underpinnings of large language models.
* Explain the processes involved in training a large language model.
Competence and skills
After completing the course, the student should be able to:
* Implement and train a basic large language model from scratch in PyTorch.
* Read and comprehend recent academic papers on LLMs and know the common
terms used in them (alignment, scaling laws, RLHF, prompt engineering,
instruction tuning, etc.).
Judgement and approach
After completing the course, the student should be able to:
* Understand and discuss concepts and terminology of state-of-the-art LLMs.
* Develop an ability to distinguish fact from fantasy in this fast-moving
field.
Prerequisites
Students are expected to have sufficient CS and AI background to follow technical descriptions of deep learning methods; a basic understanding of deep learning is assumed. Students are also expected to be able to implement deep learning solutions using Python and PyTorch.
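As a rough, purely illustrative indication of the expected level (this snippet is not course material; the model and data are made up), students should be comfortable reading and writing basic PyTorch code along these lines:

    import torch
    import torch.nn as nn

    # A single gradient step on a toy regression problem with random data.
    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 4), torch.randn(32, 1)  # dummy batch

    loss = nn.functional.mse_loss(model(x), y)     # forward pass and loss
    opt.zero_grad()
    loss.backward()                                # backpropagation
    opt.step()                                     # parameter update
    print(loss.item())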
Organization
The course consists of 10 lectures where the course material is presented. The
students are expected to complete lab assignments on their own, either
individually or in pairs.
Lectures are expected to be given Monday afternoons 15-17 in a hybrid mode,
either on site in Linköping or online through Zoom. The lectures will be
recorded and made available afterwards.
Content
* Overview - NLP tasks, historical development
* Large language models - overview and architectures (encoder-only BERT,
decoder-only GPT, encoder-decoder T5)
* Learning probability distributions - generative AI - VAEs, GANs, diffusion
models, NeRF, Gaussian splatting
* Learning sequence to sequence mappings - LSTM, GRU, Seq2Seq, Transformers
(a minimal attention sketch follows this list)
* Learning embeddings (representation learning, vector semantics) - word2vec
* Data pre-processing and tokenization
* Pre-training, scaling laws, fine-tuning
* Alignment, RLHF, RLAIF
* Distillation, continued pre-training, unlearning and editing
* Privacy and security, including attacks on LLMs and how to prevent leaks
* Inference, prompting, in-context learning, RAG, test-time compute scaling
and reasoning
* Evaluation and benchmarking
* Interpretability and explainability
* Multi-modal models, world models
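As a concrete illustration of the Transformer item above, the following is a minimal, self-contained sketch of causally masked scaled dot-product self-attention, the core operation of decoder-only (GPT-style) models. All sizes and the single-head, single-layer setup are illustrative assumptions, not a full implementation:

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    batch, seq_len, d_model = 2, 8, 16                  # illustrative sizes

    x = torch.randn(batch, seq_len, d_model)            # token embeddings
    W_q = torch.randn(d_model, d_model) / d_model**0.5  # query projection
    W_k = torch.randn(d_model, d_model) / d_model**0.5  # key projection
    W_v = torch.randn(d_model, d_model) / d_model**0.5  # value projection

    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.transpose(-2, -1) / d_model**0.5     # scaled dot products

    # Causal mask: each position attends only to itself and earlier
    # positions, which is what makes decoder-only models autoregressive.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))

    attn = F.softmax(scores, dim=-1)                    # attention weights
    out = attn @ v                                      # weighted sum of values
    print(out.shape)                                    # torch.Size([2, 8, 16])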
Literature
Large Language Models: A Deep Dive - Bridging Theory and Practice by Uday
Kamath, Kevin Keenan, Garrett Somers, and Sarah Sorenson, 2024.
(https://link.springer.com/book/10.1007/978-3-031-65647-7)
Building a Large Language Model (from Scratch) by Sebastian Raschka, 2024.
(https://www.manning.com/books/build-a-large-language-model-from-scratch)
Lectures
LE1 – Introduction, NLP and Large Language Models
LE2 – Basics (learning probability distributions, sequence to sequence mappings
and embeddings)
LE3 – Data curation and processing
LE4 – Pre-training and scaling
LE5 – Fine-tuning, aligning and distillation
LE6 – Inference, in-context learning and retrieval-augmented generation
LE7 – Benchmarking and evaluation
LE8 – Building LLMs in practice part 1
LE9 – Building LLMs in practice part 2
LE10 – Advanced topics (trustworthiness, reasoning, multi-modal models, world
models)
Examination
Develop an LLM from scratch, conduct at least two experiments related to the
lectures, and write a short report on the work. The lab is expected to have
four parts:
1. Develop a simple data pre-processing pipeline
2. Pre-train a GPT-style LLM (see the sketch after this list)
3. Fine-tune the LLM
4. Evaluate the LLM
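As a rough indication of what part 2 might involve, here is a minimal, hypothetical sketch of next-token-prediction pre-training on a toy character-level corpus. The TinyGPT module, the corpus, and all hyperparameters are placeholder assumptions, not the lab specification:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    text = "hello world, hello large language model "  # toy corpus stand-in
    vocab = sorted(set(text))
    stoi = {c: i for i, c in enumerate(vocab)}
    data = torch.tensor([stoi[c] for c in text])
    block, d_model, vocab_size = 8, 32, len(vocab)

    class TinyGPT(nn.Module):
        def __init__(self):
            super().__init__()
            self.tok = nn.Embedding(vocab_size, d_model)  # token embeddings
            self.pos = nn.Embedding(block, d_model)       # learned positions
            layer = nn.TransformerEncoderLayer(
                d_model, nhead=4, dim_feedforward=64, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, vocab_size)    # next-token logits

        def forward(self, idx):
            T = idx.size(1)
            h = self.tok(idx) + self.pos(torch.arange(T))
            causal = nn.Transformer.generate_square_subsequent_mask(T)
            return self.head(self.blocks(h, mask=causal))

    model = TinyGPT()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-3)
    for step in range(200):
        # Sample random windows; the target is the input shifted by one token.
        i = torch.randint(0, len(data) - block - 1, (4,)).tolist()
        x = torch.stack([data[j:j + block] for j in i])          # inputs
        y = torch.stack([data[j + 1:j + block + 1] for j in i])  # targets
        loss = F.cross_entropy(model(x).view(-1, vocab_size), y.view(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    print(loss.item())  # should fall well below the initial ~ln(vocab_size)

In this framing, fine-tuning (part 3) would continue training from the pre-trained weights on task-specific data, and evaluation (part 4) would measure, for example, held-out perplexity.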
Examiner
Fredrik Heintz
Credits
6 ECTS