# 732A46 Bayesian Learning

### Course information

#### Aims

The course aims to give a solid introduction to the Bayesian approach to statistical inference, with a view towards applications in data mining and machine learning. After an introduction to the subjective probability concept that underlies Bayesian inference, the course moves on to the mathematics of prior-to-posterior updating in basic statistical models, such as the Bernoulli, normal and multinomial models. Linear regression and spline regression are also analyzed using a Bayesian approach. The course subsequently shows how complex models can be analyzed with simulation methods like Markov chain Monte Carlo (MCMC). Bayesian prediction and marginalization of nuisance parameters are explained, and introductions to Bayesian model selection and Bayesian decision theory are also given.

#### Contents

- Introduction to subjective probability and the basic ideas behind Bayesian inference
- Prior-to-posterior updating in basic statistical models, such as the Bernoulli, normal and multinomial models
- Bayesian analysis of linear and nonlinear regression models
- Shrinkage, variable selection and other regularization priors
- Bayesian analysis of more complex models with simulation methods, e.g. Markov chain Monte Carlo (MCMC)
- Bayesian prediction and marginalization of nuisance parameters
- Introduction to Bayesian model selection
- Introduction to Bayesian decision theory
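
As a small illustration of the prior-to-posterior updating listed above, consider the Bernoulli model with a Beta prior. This is a standard textbook result, not an excerpt from the course material, and the symbols $\theta$, $\alpha$, $\beta$, $n$ and $s$ are notation chosen here.

$$
x_1, \dots, x_n \mid \theta \overset{iid}{\sim} \mathrm{Bern}(\theta), \qquad \theta \sim \mathrm{Beta}(\alpha, \beta), \qquad s = \sum_{i=1}^{n} x_i
$$

$$
p(\theta \mid x_1, \dots, x_n) \propto \theta^{s} (1-\theta)^{n-s} \cdot \theta^{\alpha-1} (1-\theta)^{\beta-1} = \theta^{\alpha+s-1} (1-\theta)^{\beta+n-s-1}
$$

The right-hand side is the kernel of a Beta density, so $\theta \mid x_1, \dots, x_n \sim \mathrm{Beta}(\alpha + s, \beta + n - s)$. For instance, a uniform $\mathrm{Beta}(1, 1)$ prior combined with $s = 7$ successes in $n = 10$ trials gives a $\mathrm{Beta}(8, 4)$ posterior with mean $8/12 \approx 0.67$.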

#### Intended audience and admission requirements

This course is given primarily for students on the Master's programme *Statistics and Data Mining*. It is also offered to Master's students in other subjects and to interested Ph.D. students (with a more advanced examination).

Students admitted to the Master's programme in Statistics and Data Mining fulfill the admission requirements for the course.

Students not admitted to the Master's programme in Statistics and Data Mining should have passed:

- an intermediate course in probability and statistical inference
- a basic course in mathematical analysis
- a basic course in linear algebra
- a basic course in programming

#### Organization

The course is organized into

- Lectures. 12 x 2 hours
- Classroom exercises. 4 x 2 hours
- Computer labs. 4 x 2 hours.

The computer labs give the student an opportunity to deepen their understanding of the theory and its applications in a practical computer-aided setting.

Mathematical problems are solved during the exercises.

A detailed plan of the lectures and computer labs is given on the Timetable page.

The schedule is designed to be travel-friendly, so that students from other universities can attend without too much travelling.

#### Literature

- **Bayesian Data Analysis** by Gelman, Carlin, Stern, and Rubin, Chapman & Hall, Second edition, 2004. The book's web site can be found here.
- My **lecture notes**
- My **slides**

#### Examination

The examination for the course Bayesian Learning, 6 hp, consists of

- written reports on the four computer labs (2 hp)
- an individual written report on a project that applies Bayesian methods for data analysis (4 hp)

#### R code

- NormalNonInfoPrior.R Sampling from the joint posterior of mu and sigma2 in the normal model.
- PostAndPredIIDNormalNonInfoPrior.R Simulating from the posterior and predictive distribution in the normal model.
- DirichletSampling.R Sampling from the posterior distribution of the parameters of the multinomial model based on a Dirichlet prior.
- OptimExample1.R Simple optimization example to illustrate the use of R's optimization routine optim.
- OptimizeSpam.zip Finding the posterior mode and approximate covariance matrix by numerical optimization methods. This code fits a logistic or probit regression model to the spam data from the book *Elements of Statistical Learning*. It's a good example since the optimization for the logistic model is very stable, but this is not the case for the probit model.
- NormalMixtureGibbs.R Simulates from the posterior distribution of the parameters in a mixture-of-normals model.
- SimulateDiscreteMarkovChain.R Simulates from a Markov chain with three states.
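
To give a flavour of what sampling from the joint posterior of mu and sigma2 in the normal model involves, here is a minimal sketch. It is not one of the course's R scripts: it is written in Python with only the standard library, the function name `sample_normal_posterior` and the data values are made up for illustration, and it assumes the usual noninformative prior p(mu, sigma2) proportional to 1/sigma2.

```python
import math
import random
import statistics


def sample_normal_posterior(data, n_draws, rng):
    """Draw (mu, sigma2) pairs from the joint posterior of an iid normal
    model, assuming the prior p(mu, sigma2) proportional to 1/sigma2."""
    n = len(data)
    xbar = statistics.fmean(data)
    s2 = statistics.variance(data)  # sample variance, divisor n - 1
    draws = []
    for _ in range(n_draws):
        # sigma2 | data is scaled inverse chi-square(n - 1, s2):
        # draw X ~ chi-square(n - 1) as Gamma(shape=(n - 1)/2, scale=2)
        # and set sigma2 = (n - 1) * s2 / X.
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma2 = (n - 1) * s2 / chi2
        # mu | sigma2, data ~ N(xbar, sigma2 / n)
        mu = rng.gauss(xbar, math.sqrt(sigma2 / n))
        draws.append((mu, sigma2))
    return draws


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is reproducible
    data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.3, 4.7, 5.0]  # made-up observations
    draws = sample_normal_posterior(data, 5000, rng)
    print("posterior mean of mu:", statistics.fmean(m for m, _ in draws))
    print("posterior mean of sigma2:", statistics.fmean(v for _, v in draws))
```

Drawing sigma2 first and then mu given sigma2 yields independent draws from the joint posterior, so no MCMC is needed for this particular model.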

#### RStan example code

- Måns Magnusson's RStan example code.
- Måns Magnusson's RStan presentation slides.

#### Bugs code

- BernBeta.R Bernoulli model
- BernBetaHierarchy.R Bernoulli model with estimated prior hyperparameters
- HeightWeight.R Linear regression
- LogisticRegRandEffects.R Logistic regression with random effects

#### Other material

- Informative clickable chart with relations between distributions: http://www.johndcook.com/distribution_chart.html.
- Learning about the prior-to-posterior mapping in:
  - Bernoulli model with Beta prior: Google Docs | RStudio manipulate
  - Normal model with normal prior: Google Docs
  - Poisson model with Gamma prior: Google Docs
  - Normal model with Laplace prior: Google Docs
- A collection with hundreds of machine learning datasets.
- An old exam with solutions from a course I gave in 2007. Have a look at Question 2 early in the course.

Page responsible: Mattias Villani

Last updated: 2013-11-21