729A27 Natural Language Processing


This page contains the study materials for the lectures and specifies the central concepts and procedures that you are supposed to master after each lecture. For more information about how these contents are examined, see the page on Examination.

Course introduction

Welcome to the course! This introductory module consists of two one-hour lectures that introduce you to natural language processing as an application area, the content and organisation of the course, and some basic concepts in text segmentation and linguistics.

Materials

Not yet updated for 2018!

Detailed information about the course organisation and examination is available on this webpage.

Contents

After this lecture you should be able to explain and apply the following concepts:

  • ambiguity, contextuality, multilinguality, combinatorial explosion
  • tokenisation, word tokens, word types, normalisation, stop words
  • morpheme, lexeme, lemma
  • part-of-speech, constituent, syntactic head, phrase structure tree, dependency tree, treebank
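
To make the segmentation concepts in this list concrete, here is a minimal Python sketch; the sentence and the stop-word list are invented for the example, and real pipelines use more careful tokenisation than whitespace splitting:

    # Minimal sketch: whitespace tokenisation, normalisation by lowercasing,
    # word tokens vs. word types, and stop-word removal. Toy data throughout.
    text = "The cat sat on the mat because the mat was warm ."

    tokens = text.split()                      # word tokens (occurrences)
    normalised = [t.lower() for t in tokens]   # normalisation: lowercasing
    types = set(normalised)                    # word types (distinct words)

    stop_words = {"the", "on", "was", "because", "."}
    content = [t for t in normalised if t not in stop_words]

    print(len(tokens), "tokens,", len(types), "types")
    print("after stop word removal:", content)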

Topic 1: Text classification

Text classification is the task of categorising text documents into predefined classes. In this module you will be introduced to text classification and its applications, and learn about two effective classification algorithms: the Naive Bayes classifier and the multi-class perceptron. You will also learn how to evaluate text classifiers using standard validation methods.

Materials

Not yet updated for 2018!

Contents

After this lecture you should be able to explain and apply the following concepts:

  • Naive Bayes classifier
  • maximum likelihood estimation, additive smoothing
  • multi-class perceptron classifier
  • perceptron learning algorithm, averaging trick
  • accuracy, precision, recall
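
To make the Naive Bayes side of this list concrete, here is a minimal Python sketch; the tiny training set and the class names are invented for the example:

    # Minimal Naive Bayes sketch: class priors by maximum likelihood
    # estimation, add-k (additive) smoothing for the word probabilities,
    # and the classification rule evaluated in log space. Toy data only.
    import math
    from collections import Counter

    train = [("pos", "good great fun"), ("pos", "great plot"),
             ("neg", "boring plot"), ("neg", "bad boring bad")]

    classes = {c for c, _ in train}
    vocab = {w for _, doc in train for w in doc.split()}

    prior = {c: sum(1 for c2, _ in train if c2 == c) / len(train)
             for c in classes}
    counts = {c: Counter(w for c2, doc in train if c2 == c
                         for w in doc.split())
              for c in classes}

    def word_prob(w, c, k=1.0):
        # additive smoothing; k = 1 gives add-one (Laplace) smoothing
        return (counts[c][w] + k) / (sum(counts[c].values()) + k * len(vocab))

    def classify(doc):
        # classification rule: argmax_c log P(c) + sum_w log P(w | c)
        return max(classes, key=lambda c: math.log(prior[c])
                   + sum(math.log(word_prob(w, c)) for w in doc.split()))

    print(classify("great fun plot"))          # pos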

After this lecture you should be able to perform the following procedures:

  • evaluate a text classifier based on accuracy, precision, and recall
  • apply the classification rule of the Naive Bayes classifier and the perceptron classifier to a text
  • learn the probabilities of a Naive Bayes classifier using maximum likelihood estimation and additive smoothing
  • learn the weights of a multi-class perceptron using the perceptron learning algorithm
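
For the perceptron side, here is a matching minimal sketch with bag-of-words features; the toy data is invented, and the averaging trick is only noted below:

    # Minimal multi-class perceptron sketch: predict with the current
    # weights, and on an error promote the gold class and demote the
    # predicted one. Toy data only.
    from collections import Counter, defaultdict

    train = [("pos", "good great fun"), ("neg", "bad boring bad")]
    classes = ["pos", "neg"]
    weights = {c: defaultdict(float) for c in classes}

    def features(doc):
        return Counter(doc.split())            # bag-of-words feature vector

    def predict(doc):
        # classification rule: argmax_c of the dot product w_c . f(doc)
        f = features(doc)
        return max(classes,
                   key=lambda c: sum(weights[c][w] * n for w, n in f.items()))

    for epoch in range(5):                     # perceptron learning algorithm
        for gold, doc in train:
            guess = predict(doc)
            if guess != gold:
                for w, n in features(doc).items():
                    weights[gold][w] += n
                    weights[guess][w] -= n

    print(predict("great fun"))                # pos

The averaging trick, not shown above, returns the average of the weight vectors over all updates rather than the final vector, which usually makes the classifier more stable.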

Topic 2: Language modelling

Language modelling is about building models of which words are more or less likely to occur in some language. This module focuses on n-gram models, which have a wide range of applications such as predictive text input, language identification, and machine translation. High-quality models require advanced smoothing techniques, which will be a central topic of this module. You will also learn how to evaluate language models using perplexity. The last part of the module is on edit distance.

Materials

Not yet updated for 2018!

Contents

After this lecture you should be able to explain and apply the following concepts:

  • n-gram model
  • add-k smoothing, Witten–Bell smoothing, absolute discounting
  • perplexity, entropy
  • Levenshtein distance, Wagner–Fischer algorithm (advanced)
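
The first three concepts fit into a few lines of Python. The following minimal sketch, with an invented toy corpus and an arbitrary choice of k, trains a bigram model with add-k smoothing and evaluates it with perplexity:

    # Minimal sketch: a bigram model with add-k smoothing, evaluated by
    # perplexity. Corpus and k are toy choices for illustration.
    import math
    from collections import Counter

    corpus = ["<s> the cat sat </s>", "<s> the dog sat </s>"]
    bigrams, contexts = Counter(), Counter()
    for sent in corpus:
        words = sent.split()
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    V = len(set(w for s in corpus for w in s.split()))

    def prob(w1, w2, k=0.5):
        # add-k smoothing: (count(w1 w2) + k) / (count(w1) + k * |V|)
        return (bigrams[(w1, w2)] + k) / (contexts[w1] + k * V)

    def perplexity(sent, k=0.5):
        words = sent.split()
        pairs = list(zip(words, words[1:]))
        log_prob = sum(math.log2(prob(w1, w2, k)) for w1, w2 in pairs)
        return 2 ** (-log_prob / len(pairs))   # 2 to the cross-entropy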

After this lecture you should be able to perform the following procedures:

  • learn an n-gram model using additive smoothing and absolute discounting
  • evaluate an n-gram model using perplexity or entropy
  • compute the Levenshtein distance between two words using the Wagner–Fischer algorithm (advanced)
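
For the last procedure, here is a minimal sketch of the Wagner–Fischer algorithm, which fills a dynamic-programming table of edit distances between prefixes:

    # Wagner-Fischer: d[i][j] is the Levenshtein distance between the
    # prefixes a[:i] and b[:j].
    def levenshtein(a, b):
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i                        # i deletions
        for j in range(len(b) + 1):
            d[0][j] = j                        # j insertions
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,        # deletion
                              d[i][j - 1] + 1,        # insertion
                              d[i - 1][j - 1] + cost) # substitution or match
        return d[len(a)][len(b)]

    print(levenshtein("kitten", "sitting"))    # 3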

Topic 3: Part-of-speech tagging

A part-of-speech tagger is a computer program that tags each word in a sentence with its part of speech, such as noun, adjective, or verb. In this section you will learn how to evaluate part-of-speech taggers, and be introduced to two methods for part-of-speech tagging: exhaustive search in hidden Markov models (with the Viterbi algorithm), and greedy search with multi-class perceptrons.

Materials

Not yet updated for 2018!

Contents

After this lecture you should be able to explain and apply the following concepts:

  • part of speech, part-of-speech tagger
  • accuracy, precision, recall
  • hidden Markov model, Viterbi algorithm
  • multi-class perceptron, feature window
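
To see how a hidden Markov model assigns a probability to a tagged sentence (the second procedure below), here is a minimal sketch; all probability values are invented for the example:

    # Probability of a tagged sentence in an HMM: the product of one
    # transition and one emission probability per word, plus the final
    # transition into the end-of-sentence marker. Toy values only.
    trans = {("<s>", "DT"): 0.6, ("DT", "NN"): 0.7, ("NN", "</s>"): 0.3}
    emit = {("DT", "the"): 0.5, ("NN", "cat"): 0.1}

    def tagged_prob(words, tags):
        p, prev = 1.0, "<s>"
        for w, t in zip(words, tags):
            p *= trans[(prev, t)] * emit[(t, w)]
            prev = t
        return p * trans[(prev, "</s>")]

    print(tagged_prob(["the", "cat"], ["DT", "NN"]))  # 0.6*0.5 * 0.7*0.1 * 0.3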

After this lecture you should be able to perform the following procedures:

  • evaluate a part-of-speech tagger based on accuracy, precision, and recall
  • compute the probability of a tagged sentence in a hidden Markov model
  • simulate the Viterbi algorithm
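
As a concrete illustration of the last procedure, here is a minimal sketch of the Viterbi algorithm; the transition and emission probabilities are invented toy values, and end-of-sentence transitions are omitted for brevity:

    # Viterbi: best[i][t] is the probability of the best tag sequence for
    # words[:i+1] that ends in tag t; back[i][t] is the previous tag on
    # that sequence, used to recover the best path.
    def viterbi(words, tags, trans, emit):
        best = [{t: trans.get(("<s>", t), 0) * emit.get((t, words[0]), 0)
                 for t in tags}]
        back = [{}]
        for i in range(1, len(words)):
            best.append({})
            back.append({})
            for t in tags:
                prev = max(tags,
                           key=lambda s: best[i - 1][s] * trans.get((s, t), 0))
                best[i][t] = (best[i - 1][prev] * trans.get((prev, t), 0)
                              * emit.get((t, words[i]), 0))
                back[i][t] = prev
        last = max(tags, key=lambda t: best[-1][t])
        path = [last]
        for i in range(len(words) - 1, 0, -1):
            path.append(back[i][path[-1]])
        return list(reversed(path))

    tags = ["DT", "NN", "VB"]
    trans = {("<s>", "DT"): 0.6, ("DT", "NN"): 0.7, ("NN", "VB"): 0.4}
    emit = {("DT", "the"): 0.5, ("NN", "cat"): 0.1, ("VB", "sleeps"): 0.2}
    print(viterbi("the cat sleeps".split(), tags, trans, emit))
    # ['DT', 'NN', 'VB']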

Topic 4: Syntactic analysis

Syntactic analysis, also called syntactic parsing, is the task of mapping a sentence to a formal representation of its syntactic structure. In this lecture you will learn about two approaches to dependency parsing, where the target representations take the form of dependency trees: the Eisner algorithm, which casts dependency parsing as combinatorial optimisation over graphs, and transition-based dependency parsing, an approach also used by Google.

Materials

Not yet updated for 2018!

Contents

After this lecture you should be able to explain and apply the following concepts:

  • dependency tree, projectivity
  • Collins’ algorithm, Eisner algorithm (advanced)
  • structured perceptron training
  • transition-based dependency parser

After this lecture you should be able to perform the following procedures:

  • simulate the Eisner algorithm (advanced)
  • simulate a transition-based dependency parser
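
Here is a minimal sketch of the second procedure. It assumes the arc-standard transition system, one common choice that may differ from the system used in the lectures; the sentence and the transition sequence are toy examples:

    # Simulating a transition-based dependency parser (arc-standard):
    # apply a given transition sequence to a stack, a buffer, and an arc set.
    def simulate(n_words, transitions):
        stack, buffer, arcs = [0], list(range(1, n_words + 1)), []
        for t in transitions:
            if t == "SH":                      # shift: next word onto stack
                stack.append(buffer.pop(0))
            elif t == "LA":                    # left-arc: top heads second
                arcs.append((stack[-1], stack[-2]))
                del stack[-2]
            elif t == "RA":                    # right-arc: second heads top
                arcs.append((stack[-2], stack[-1]))
                stack.pop()
        return arcs                            # (head, dependent); 0 = root

    # "the cat sleeps": the <- cat <- sleeps <- root
    print(simulate(3, ["SH", "SH", "LA", "SH", "LA", "RA"]))
    # [(2, 1), (3, 2), (0, 3)]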

Topic 5: Semantic analysis

In this lecture you will learn about word senses and the problems they pose for language technology, as well as about two important tasks in semantic analysis: word sense disambiguation and word similarity. For each task you will learn about both knowledge-based and data-driven methods, including the popular continuous bag-of-words model used in Google’s word2vec software.

Materials

Not yet updated for 2018!

Contents

After this lecture you should be able to explain and apply the following concepts:

  • word sense, homonymy, polysemy
  • synonymy, antonymy, hyponymy, hypernymy, WordNet
  • Simplified Lesk algorithm
  • word similarity, distributional hypothesis, co-occurrence matrix
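
To make the distributional concepts concrete, here is a minimal sketch that derives a co-occurrence matrix from a toy document collection; counting co-occurrence within the same document is just one simple choice, and a fixed context window is another common one:

    # Deriving a word-by-word co-occurrence matrix: count how many
    # documents each pair of distinct words shares. Toy documents only.
    from collections import Counter
    from itertools import permutations

    docs = ["the cat sleeps", "the dog sleeps", "the cat purrs"]
    matrix = Counter()
    for doc in docs:
        for w1, w2 in permutations(set(doc.split()), 2):
            matrix[(w1, w2)] += 1

    print(matrix[("cat", "sleeps")])           # 1: one shared document
    print(matrix[("the", "sleeps")])           # 2: two shared documents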

After this lecture you should be able to perform the following procedures:

  • simulate the Simplified Lesk algorithm
  • compute the path length-based similarity of two words
  • derive a co-occurrence matrix from a document collection
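
For the first procedure, here is a minimal sketch of the Simplified Lesk algorithm, which picks the sense whose gloss shares the most words with the context; the two senses and their glosses are invented stand-ins for a real sense inventory such as WordNet:

    # Simplified Lesk: score each sense by the overlap between its gloss
    # and the context, and return the highest-scoring sense. Toy glosses.
    def simplified_lesk(context, senses):
        context_words = set(context.split())
        return max(senses,
                   key=lambda s: len(set(senses[s].split()) & context_words))

    senses = {"bank/river": "sloping land beside a body of water",
              "bank/finance": "an institution that accepts deposits of money"}
    print(simplified_lesk("he sat on the land beside the water", senses))
    # bank/river (overlap: land, beside, water)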

Page responsible: Marco Kuhlmann
Last updated: 2017-12-14