This page links to the video lectures and to the study materials for the interactive sessions (notebooks and additional reading), and it lists the central concepts, models, and algorithms that you are expected to master after each unit.

Course introduction

Welcome to the course! This unit introduces you to natural language processing and to written language as a type of data, presents the course logistics, and reviews basic concepts from linguistics and machine learning. You will also learn how to implement a simple sentiment classifier based on the bag-of-words representation and softmax regression.
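To make this concrete, here is a minimal sketch of such a classifier using scikit-learn (an illustrative choice, not course code; the toy reviews are invented, and multinomial logistic regression is softmax regression under another name):

    # Bag-of-words sentiment classifier: a minimal sketch.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy training data, invented for illustration.
    train_texts = ["a great movie", "an awful, boring film",
                   "loved every minute", "hated every minute"]
    train_labels = ["pos", "neg", "pos", "neg"]

    # Bag of words: each text becomes a vector of word counts.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(train_texts)

    # Softmax (multinomial logistic) regression over the count vectors.
    classifier = LogisticRegression()
    classifier.fit(X, train_labels)

    print(classifier.predict(vectorizer.transform(["a boring movie"])))  # e.g. ['neg']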

Teaching session

Video lectures (review)

Reading

Concepts, models, and algorithms

Unit 1: Word representations

To process words with neural networks, we need to represent them as vectors of numerical values. In this unit you will learn several methods for deriving these representations from data, including the widely used skip-gram model. The unit also introduces the idea of subword representations, in particular character-level representations, which can be learned using convolutional neural networks.
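As a concrete illustration of the skip-gram idea, the following PyTorch sketch embeds a target word and scores every vocabulary word as a possible context word (the vocabulary size, embedding dimension, and word ids are invented; real implementations typically add negative sampling):

    # Skip-gram with a full softmax over the vocabulary: a minimal sketch.
    import torch
    import torch.nn as nn

    vocab_size, embed_dim = 1000, 50   # illustrative sizes

    class SkipGram(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)  # target-word vectors
            self.out = nn.Linear(embed_dim, vocab_size)       # context-word scores

        def forward(self, target_ids):
            # Score every vocabulary word as a context word of the target.
            return self.out(self.embed(target_ids))

    model = SkipGram()
    optimizer = torch.optim.Adam(model.parameters())
    loss_fn = nn.CrossEntropyLoss()

    # One training step on a toy (target, context) pair of word ids.
    target, context = torch.tensor([17]), torch.tensor([42])
    loss = loss_fn(model(target), context)
    loss.backward()
    optimizer.step()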

Video lectures and quizzes

Reading

Concepts, models, and algorithms

Unit 2: Language modelling

Language modelling is the task of predicting which word comes next in a sequence of words. This unit presents two types of language models: n-gram models and neural models, with a focus on models based on recurrent neural networks. You will also learn how these language models can be used to learn more powerful, contextualized word representations.
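As a concrete example of the n-gram approach (shown here for n = 2, with the simplifying assumption of add-one smoothing; the toy corpus is invented), a bigram model can be estimated directly from counts:

    # Bigram language model with add-one (Laplace) smoothing: a minimal sketch.
    from collections import Counter

    corpus = "the cat sat on the mat . the dog sat on the cat .".split()
    vocab = set(corpus)

    unigram_counts = Counter(corpus)
    bigram_counts = Counter(zip(corpus, corpus[1:]))

    def prob(word, prev):
        # P(word | prev), smoothed so that unseen bigrams get non-zero probability.
        return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + len(vocab))

    print(prob("cat", "the"))   # seen bigram: relatively high
    print(prob("mat", "dog"))   # unseen bigram: low but non-zero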

Lectures

Reading

Concepts, models, and algorithms

Unit 3: Large language models

Machine translation is one of the classical problems in artificial intelligence. In this unit you will learn about neural machine translation and one of its standard models, the encoder–decoder architecture. A crucial ingredient in this architecture is the attention mechanism. Attention is also the key to the Transformer architecture, which underlies today's large language models and which we will cover in the last lectures of this unit.
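The core of the attention mechanism fits in a few lines. The sketch below implements scaled dot-product attention, the variant used in the Transformer, with random tensors standing in for real encoder and decoder states (the shapes are illustrative assumptions):

    # Scaled dot-product attention: a minimal sketch.
    import torch
    import torch.nn.functional as F

    def attention(Q, K, V):
        d_k = Q.size(-1)
        scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # query-key similarities
        weights = F.softmax(scores, dim=-1)            # attention distribution
        return weights @ V                             # weighted sum of the values

    Q = torch.randn(1, 5, 64)   # 5 query positions (e.g. the target sentence)
    K = torch.randn(1, 7, 64)   # 7 key/value positions (e.g. the source sentence)
    V = torch.randn(1, 7, 64)
    print(attention(Q, K, V).shape)  # torch.Size([1, 5, 64])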

Lectures

Reading

Concepts, models, and algorithms

Unit 4: Sequence labelling

Sequence labelling is the task of assigning a class label to each item in an input sequence. Many tasks in natural language processing can be cast as sequence labelling problems over different sets of output labels, including part-of-speech tagging, word segmentation, and named entity recognition. This unit introduces several models for sequence labelling, with both local and global search.
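To see what global search means in this setting, the sketch below runs Viterbi decoding over made-up emission and transition scores: rather than picking the highest-scoring tag at each position independently, it finds the highest-scoring tag sequence as a whole:

    # Viterbi decoding for sequence labelling: a minimal sketch with invented scores.
    import numpy as np

    tags = ["NOUN", "VERB"]
    # emission[i, t]: score of tag t at position i; transition[s, t]: score of s -> t.
    emission = np.array([[2.0, 0.5], [0.3, 1.8], [1.5, 0.6]])
    transition = np.array([[0.4, 1.0], [1.2, 0.1]])

    n, T = emission.shape
    viterbi = np.zeros((n, T))
    backptr = np.zeros((n, T), dtype=int)
    viterbi[0] = emission[0]
    for i in range(1, n):
        # scores[s, t]: best path ending in tag s at i-1, extended with tag t at i.
        scores = viterbi[i - 1][:, None] + transition + emission[i]
        backptr[i] = scores.argmax(axis=0)
        viterbi[i] = scores.max(axis=0)

    # Recover the best sequence by following the backpointers.
    best = [int(viterbi[-1].argmax())]
    for i in range(n - 1, 0, -1):
        best.append(int(backptr[i][best[-1]]))
    print([tags[t] for t in reversed(best)])  # ['NOUN', 'VERB', 'NOUN']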

Lectures

Reading

Concepts, models, and algorithms

Unit 5: Syntactic analysis

Syntactic analysis, also called syntactic parsing, is the task of mapping a sentence to a formal representation of its syntactic structure. In this unit you will learn about two approaches to dependency parsing, where the target representations take the form of dependency trees: the Eisner algorithm, which casts dependency parsing as combinatorial optimisation over graphs, and transition-based dependency parsing, which builds the tree through a sequence of parser actions and is also the approach behind Google's SyntaxNet parser.
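As a minimal illustration of the transition-based approach, the sketch below runs one arc-standard derivation over a toy sentence; in a real parser a trained classifier predicts the next transition, whereas here the action sequence is given by hand:

    # Arc-standard transition-based dependency parsing: a minimal sketch.
    def parse(n_words, actions):
        stack, buffer, arcs = [], list(range(n_words)), []
        for action in actions:
            if action == "SHIFT":
                stack.append(buffer.pop(0))
            elif action == "LEFT-ARC":    # top of stack becomes head of second-from-top
                dependent = stack.pop(-2)
                arcs.append((stack[-1], dependent))
            elif action == "RIGHT-ARC":   # second-from-top becomes head of top
                dependent = stack.pop()
                arcs.append((stack[-1], dependent))
        return arcs

    words = ["she", "saw", "stars"]
    actions = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"]
    for head, dependent in parse(len(words), actions):
        print(words[head], "->", words[dependent])
    # saw -> she
    # saw -> stars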

Lectures

(Note that there is no Lecture 5.6.)

Reading

Concepts, models, and algorithms