# 732A91 Bayesian Learning

### Course information

#### Aims

The course aims to give a solid introduction to the Bayesian approach to statistical inference, with a view towards applications in data mining and machine learning. After an introduction to the subjective probability concept that underlies Bayesian inference, the course moves on to the mathematics of prior-to-posterior updating in basic statistical models, such as the Bernoulli, normal and multinomial models. Linear regression and spline regression are also analyzed using a Bayesian approach. The course subsequently shows how complex models can be analyzed with simulation methods such as Markov Chain Monte Carlo (MCMC). Bayesian prediction and marginalization of nuisance parameters are explained, and introductions to Bayesian model selection and Bayesian decision theory are also given.
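As a small illustration of prior-to-posterior updating in the Bernoulli model, the sketch below applies the standard conjugate result: a Beta(α, β) prior combined with s successes and f failures gives a Beta(α + s, β + f) posterior. It is written in Python to be self-contained (the course code itself is in R), and the function names and example numbers are assumptions chosen for illustration, not course material.

```python
def beta_bernoulli_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters for Bernoulli data.

    Prior: theta ~ Beta(alpha, beta).
    Data: `successes` ones and `failures` zeros.
    Posterior: theta | data ~ Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical example: Beta(2, 2) prior, 7 successes and 3 failures.
prior = (2.0, 2.0)
posterior = beta_bernoulli_update(*prior, successes=7, failures=3)

print("Posterior parameters:", posterior)
print("Prior mean:", beta_mean(*prior))
print("Posterior mean:", beta_mean(*posterior))
```

The posterior mean lies between the prior mean and the sample proportion, which is the usual way to read the update as a compromise between prior and data.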

#### Contents

- Introduction to subjective probability and the basic ideas behind Bayesian inference
- Prior-to-posterior updating in basic statistical models, such as the Bernoulli, normal and multinomial models
- Bayesian analysis of linear and nonlinear regression models
- Shrinkage, variable selection and other regularization priors
- Bayesian analysis of more complex models with simulation methods, e.g. Markov Chain Monte Carlo (MCMC)
- Bayesian prediction and marginalization of nuisance parameters
- Introduction to Bayesian model selection
- Introduction to Bayesian decision theory

#### Intended audience and admission requirements

This course is given primarily for students on the Master's programme *Statistics and Machine Learning*. It is also offered to Master's students in other subjects and to interested Ph.D. students (with a more advanced examination).

Students admitted to the Master's programme in Statistics and Machine Learning fulfill the admission requirements for the course.

Students not admitted to the Master's programme in Statistics and Machine Learning should have passed:

- an intermediate course in probability and statistical inference
- a basic course in mathematical analysis
- a basic course in linear algebra
- a basic course in programming

#### Course plan

The TimeEdit schedule for the course is available here.

The labs should be done in pairs of students whereas the project is individual work.

All labs and the project should be submitted as PDFs through LISAM.

#### Module 1 - The Bayesics

**Lecture 1**: Basic concepts. Likelihood. Bayesian inference. The Bernoulli model.

**Read**: BDA Ch. 1, 2.1-2.5 | Slides

**Code**: Beta density | Bernoulli model | One-parameter Gaussian model

**Lecture 2**: The Normal model. The Poisson model. Conjugate priors. Prior elicitation.

**Read**: BDA Ch. 2.6-2.9 | Slides

**Lecture 3**: Multi-parameter models. Marginalization. Multinomial model. Multivariate normal model.

**Read**: BDA Ch. 3. | Slides

**Code**: Two-parameter Gaussian model | Prediction with two-parameter Gaussian model | Multinomial model R notebook

**Math exercises 1**: One-parameter models.

Problem set 1 | Solution Problem 2 and 4 | Solution Problem 1 and 3 (Problem 3 is marked as Problem 2 in this solution)

**Lab 1**: Exploring posterior distributions in one-parameter models by simulation and direct numerical evaluation.

Lab 1 | Computer room instructions | LISAM Submission
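The two approaches named in the Lab 1 title, simulation and direct numerical evaluation, can be compared on a one-parameter posterior. The sketch below is in Python rather than the R used in the labs, and it is not the lab solution: the Beta(9, 5) posterior, the function names, the number of draws and the grid size are all assumptions made for illustration.

```python
import random


def posterior_mean_by_simulation(alpha, beta, n_draws, seed=0):
    """Estimate the mean of a Beta(alpha, beta) posterior from random draws."""
    rng = random.Random(seed)
    draws = [rng.betavariate(alpha, beta) for _ in range(n_draws)]
    return sum(draws) / n_draws


def posterior_mean_on_grid(alpha, beta, n_points=10_000):
    """Compute the same mean by evaluating the unnormalized density on a grid."""
    step = 1.0 / n_points
    # Midpoints of equal-width cells covering (0, 1).
    grid = [(i + 0.5) * step for i in range(n_points)]
    # Unnormalized Beta density; the normalizing constant cancels in the ratio.
    density = [t ** (alpha - 1) * (1 - t) ** (beta - 1) for t in grid]
    total = sum(density)
    return sum(t * d for t, d in zip(grid, density)) / total


# Hypothetical posterior: Beta(9, 5), whose exact mean is 9 / 14.
exact = 9.0 / 14.0
simulated = posterior_mean_by_simulation(9.0, 5.0, n_draws=20_000)
gridded = posterior_mean_on_grid(9.0, 5.0)

print("Exact:", exact)
print("Simulation:", simulated)
print("Grid:", gridded)
```

Simulation carries Monte Carlo error that shrinks as the number of draws grows, while the grid result is limited by the grid resolution. Grid evaluation only scales to very low-dimensional parameters.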

#### Module 2 - Bayesian Regression and Classification

**Lecture 4**: Prediction. Making Decisions.

**Read**: BDA Ch. 9.1-9.2. | Slides

**Lecture 5**: Linear Regression. Nonlinear regression. Regularization priors.

**Read**: BDA Ch. 14 and Ch. 20.1-20.2 | Slides

**Lecture 6**: Classification. Posterior approximation. Logistic regression. Naive Bayes.

**Read**: BDA Ch. 16.1-16.3 | Slides

**Code**: Logistic and Probit Regression

**Math exercises 2**: Predictive distributions and decisions.

Problem set 2 | Solutions

**Lab 2**: Polynomial regression and classification with logistic regression

Lab 2 | Linköping temperature data | Women work data | LISAM Submission

#### Module 3 - More Advanced Models, MCMC and Variational Bayes

**Lecture 7**: Bayesian computations. Monte Carlo simulation. Gibbs sampling. Data augmentation.

**Read**: BDA Ch. 10-11 | Slides

**Code**: Gibbs sampling for a bivariate normal | Gibbs sampling for a mixture of normals

**Lecture 8**: MCMC and Metropolis-Hastings

**Read**: BDA Ch. 11 | Slides

**Code**: Simulating Markov Chains | Effective Sample Size

**Lecture 9**: HMC, Variational Bayes and Stan.

**Read**: BDA Ch. 12.4 and Ch. 13.7 and RStan vignette | Slides

**Code**: RStan - Three Plants | RStan - Bernoulli model | RStan - Logistic regression | RStan - Logistic regression with random effects | RStan - Poisson model

**Math exercises 3**: Comparing Bayes and Frequentist. Posterior approximation. Naive Bayes.

Problem set 3 | Solutions | R-code for exercise 3

**Lab 3**: MCMC using Gibbs sampling and Metropolis-Hastings

Lab 3 | HowToCodeRWM | Rainfall data | eBay data | LISAM Submission
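For orientation, a random walk Metropolis sampler (a special case of Metropolis-Hastings with a symmetric proposal) can be sketched in a few lines. This is a generic Python illustration, not the contents of the HowToCodeRWM note and not a lab solution; the function names, the Beta(9, 5) target, the step size and the burn-in length are all assumptions.

```python
import math
import random


def random_walk_metropolis(log_post, theta0, step_sd, n_iter, seed=0):
    """Draw from a one-dimensional target given its unnormalized log density."""
    rng = random.Random(seed)
    theta = theta0
    current = log_post(theta)
    draws = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal centred on the current value.
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_post(proposal)
        # Accept with probability min(1, exp(candidate - current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        log_u = math.log(1.0 - rng.random())
        if log_u < candidate - current:
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


def log_beta_kernel(theta):
    """Unnormalized log density of Beta(9, 5); -inf outside (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return 8.0 * math.log(theta) + 4.0 * math.log(1.0 - theta)


draws = random_walk_metropolis(log_beta_kernel, theta0=0.5, step_sd=0.3,
                               n_iter=20_000, seed=1)
kept = draws[1_000:]  # discard an initial burn-in period
rwm_mean = sum(kept) / len(kept)
print("RWM estimate of the posterior mean:", rwm_mean)
```

Because only differences of log densities enter the acceptance step, the normalizing constant of the posterior is never needed. Proposals outside (0, 1) get log density -inf and are always rejected.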

#### Module 4 - Model Inference and Variable Selection

**Lecture 10**: Bayesian model comparison

**Read**: BDA Ch. 7 | Slides

**Code**: Comparing models for count data

**Lecture 11**: Computing the marginal likelihood, Bayesian variable selection, model averaging.

**Read**: Article on variable selection for additional reading | Slides

**Lecture 12**: Model evaluation and course summary.

**Read**: BDA 6.1-6.4 | Slides

**Math exercises 4**: Model comparison.

Problem set 4 | Solutions to problem set 4

**Lab 4**: Hamiltonian Monte Carlo with Stan

Lab 4 | Campylobacter data | LISAM Submission

#### Literature

- **Bayesian Data Analysis** by Gelman, Carlin, Stern, and Rubin, Chapman & Hall, Third edition. The book's web site can be found here.
- My **slides**.

#### Examination

The examination for the course Bayesian Learning, 6 credits, consists of

- Written reports on the four computer labs (3 credits)
- Computer exam (3 credits)

**Computer lab instructions:**

Load the course module and start Rstudio in the computer lab rooms by executing the following lines in a terminal window:

$ module add courses/732A91/2020-02-10.1

$ rstudio &

**Computer exam instructions:**

Information about the take-home exam is available in LISAM. Contact per.siden@liu.se if you need access to LISAM.

The following material can be useful during the exam:

- Slides from all the 12 lectures in PDF format
- The four computer lab exercises in PDF format
- Four pages with distributions from the Appendix in the course book.
- Some page with useful probability and math results
- Base R cheat sheet and R markdown cheat sheet.
- Your solutions to the four computer labs.

The course TDDE07 will be graded on the (U,3,4,5) scale.

Here is a picture that shows the percentage of the maximum score for each grade (732A91 to the left, TDDE07 to the right).

#### Old exams with solutions

- Exam 2017-05-30 | Solutions: Paper and Code.
- Exam 2017-08-16 | Solutions: Paper and Code.
- Exam 2017-10-27 | Solutions: Paper and Code.
- Exam 2018-06-01 | Solutions: Paper and Code.
- Exam 2018-08-22 | Solutions: Paper and Code.
- Exam 2018-11-01 | Solutions: Paper and Code.
- Exam 2019-06-04 | Solutions: Paper and Code.
- Exam 2019-08-21 | Solutions: Paper and Code.
- Exam 2019-10-31 | Solutions: Paper and Code.
- Exam 2020-06-04 | Solutions: Paper and Code.
- Exam 2020-08-19 | Solutions: Paper and Code.

#### R stuff

- The main page with links to downloads for the programming language R
- RStudio - a very nice development environment for R.
- Short introduction to R | A little longer introduction | John Cook's intro to R for programmers.

#### Other material

- Informative clickable chart with relations between distributions: http://www.johndcook.com/distribution_chart.html.
- Learning about the prior-to-posterior mapping in:
  - Bernoulli model with Beta prior: Google Docs | RStudio manipulate
  - Normal model with normal prior: Google Docs
  - Poisson model with Gamma prior: Google Docs
  - Normal model with Laplace prior: Google Docs
- The Feynman technique for learning: 5 min Youtube video.

Page responsible: Mattias Villani

Last updated: 2020-08-21