# IDA Machine Learning Seminars - Spring 2021

### Wednesday, February 24, 3.15 pm, 2021

**Differentiating through Optimal Transport**

Marco Cuturi, Google Brain and CREST - ENSAE, Institut Polytechnique de Paris, France

*Abstract*: Computing or approximating an optimal transport cost is rarely the sole goal when using OT in applications. In most cases one instead needs to approximate the optimal transport plan (or its application to another vector) and to obtain its differential properties w.r.t. its inputs. I will present in this talk recent applications that highlight this need, as well as possible algorithmic and programmatic solutions to handle such issues.
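A common route to a differentiable OT solver, which the talk touches on, is entropic regularization solved by Sinkhorn iterations. The sketch below is a standard textbook version, not the speaker's implementation; the function name and the choice of `eps` are illustrative.

```python
import numpy as np

def sinkhorn_plan(C, a, b, eps=0.5, iters=1000):
    """Entropy-regularized OT plan via Sinkhorn iterations (a generic
    sketch of the standard algorithm, not the speaker's code)."""
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)               # scale to match row marginals
        v = b / (K.T @ u)             # scale to match column marginals
    # The dual potential eps*log(u) gives (up to an additive constant) the
    # gradient of the regularized OT cost w.r.t. the weights a, which is one
    # way the cost becomes differentiable in its inputs.
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
C = rng.random((3, 4))                # toy cost matrix
a = np.full(3, 1 / 3)                 # source weights
b = np.full(4, 1 / 4)                 # target weights
P = sinkhorn_plan(C, a, b)            # transport plan with marginals a, b
```

Because every step is a smooth elementwise operation or matrix product, the whole loop can also be unrolled through an autodiff framework, which is one of the "programmatic solutions" the abstract alludes to.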

Location: You can join the seminar via this Zoom link: https://liu-se.zoom.us/j/69240032654

Passcode: 326937

Organizer: Fredrik Lindsten

### Wednesday, March 24, 3.15 pm, 2021

**Target Aware Bayesian Inference: How to Beat Optimal Conventional Estimators**

Tom Rainforth, University of Oxford, UK

*Abstract*: Standard approaches for Bayesian inference focus solely on approximating the posterior distribution. Typically, this approximation is, in turn, used to calculate expectations for one or more target functions, a computational pipeline that is inefficient when the target function(s) are known upfront. We address this inefficiency by introducing a framework for target-aware Bayesian inference (TABI) that estimates these expectations directly. While conventional Monte Carlo estimators have a fundamental limit on the error they can achieve for a given sample size, our TABI framework is able to breach this limit; it can theoretically produce arbitrarily accurate estimators using only three samples, while we show empirically that it can also breach this limit in practice. We utilize our TABI framework by combining it with adaptive importance sampling approaches and show both theoretically and empirically that the resulting estimators are capable of converging faster than the standard O(1/N) Monte Carlo rate, potentially producing rates as fast as O(1/N^2). We further combine our TABI framework with amortized inference methods, to produce a method for amortizing the cost of calculating expectations. Finally, we show how TABI can be used to convert any marginal likelihood estimator into a target aware inference scheme and demonstrate the substantial benefits this can yield.

Based on the paper of the same name by Rainforth, Golinski, Wood, and Zaidi, published in the Journal of Machine Learning Research 2020 and "Amortized Monte Carlo Integration" by Golinski, Wood, and Rainforth, ICML 2019 (Best Paper Honorable Mention).
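The core TABI idea is to estimate the numerator and denominator of E[f] = Z_f / Z with separate, individually tuned estimators rather than reusing one posterior approximation. The toy below sketches this with plain importance sampling on an unnormalized Gaussian target; the proposals and their scales are arbitrary choices of ours, and since f >= 0 here the negative-part estimator of the full TABI split (f = f+ - f-) vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Unnormalized target: gamma(x) = exp(-x^2/2), a standard normal with
# unknown normalizer Z = sqrt(2*pi). Target function f(x) = x^2, so the
# true expectation is E[f] = 1.
gamma = lambda x: np.exp(-0.5 * x**2)
f = lambda x: x**2

# Numerator Z_f = integral gamma(x) f(x) dx, with a wider proposal
# loosely matched to gamma*f (target-aware choice, scale is arbitrary).
s1 = 1.5
x1 = rng.normal(0.0, s1, N)
q1 = np.exp(-0.5 * (x1 / s1) ** 2) / (s1 * np.sqrt(2 * np.pi))
Z_f = np.mean(gamma(x1) * f(x1) / q1)

# Denominator Z = integral gamma(x) dx, with its own proposal.
s2 = 1.2
x2 = rng.normal(0.0, s2, N)
q2 = np.exp(-0.5 * (x2 / s2) ** 2) / (s2 * np.sqrt(2 * np.pi))
Z = np.mean(gamma(x2) / q2)

estimate = Z_f / Z   # should be close to E[f] = 1
```

Tailoring each proposal to its own integrand is what lets TABI beat estimators that only target the posterior; in the idealized case where each proposal matches its integrand exactly, each estimator is exact from a single sample.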

Location: TBA

Organizer: Fredrik Lindsten

### Wednesday, April 21, 3.15 pm, 2021

**Monte Carlo integration with repulsive point processes**

Rémi Bardenet, CNRS & CRIStAL, Université de Lille, France

*Abstract*: Monte Carlo integration is the workhorse of Bayesian inference, but the mean square error of Monte Carlo estimators decreases slowly, typically as 1/N, where N is the number of integrand evaluations. This becomes a bottleneck in Bayesian applications where evaluating the integrand can take tens of seconds, like in the life sciences, where evaluating the likelihood often requires solving a large system of differential equations. I will present two approaches to faster Monte Carlo rates using interacting particle systems. First, I will show how results from random matrix theory lead to a stochastic version of Gaussian quadrature in any dimension d, with mean square error decreasing as 1/N^{1+1/d}. This quadrature is based on determinantal point processes, which can be argued to be the kernel machine of point processes. Second, I will show how to further take this error rate down assuming the integrand is smooth. In particular, I will give a tight error bound when the integrand belongs to any arbitrary reproducing kernel Hilbert space, using a mixture of determinantal point processes tailored to that space. This mixture is reminiscent of volume sampling, a randomized experimental design used in linear regression.

Joint work with Adrien Hardy, Ayoub Belhadji, Pierre Chainais
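The 1/N baseline that the talk improves on is easy to check numerically. The snippet below only illustrates that baseline for i.i.d. sampling (it does not implement the determinantal point process quadrature): quadrupling N should cut the mean square error by roughly a factor of four. All names here are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def mc_mse(N, reps=2000):
    """Empirical MSE of the plain Monte Carlo estimate of E[U], U ~ Uniform(0,1),
    whose true value is 0.5 and whose theoretical MSE is 1/(12*N)."""
    means = rng.random((reps, N)).mean(axis=1)
    return np.mean((means - 0.5) ** 2)

mse_small = mc_mse(250)
mse_large = mc_mse(1000)
ratio = mse_small / mse_large   # ~4 under the 1/N rate
```

Under the DPP quadrature of the talk, the same experiment would give a ratio closer to 4^{1+1/d}, since the mean square error decreases as N^{-(1+1/d)}.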

Location: You can join the seminar via this Zoom link: https://liu-se.zoom.us/j/69011766298

Passcode: 742124

Organizer: Fredrik Lindsten

### Wednesday, May 19, 3.15 pm, 2021

**Adaptive gradient descent without descent**

Yura Malitsky, Linköping University

*Abstract*: In this talk I will present some recent results for the most classical optimization method: gradient descent. We will show that a simple zero-cost rule is sufficient to completely automate gradient descent. The method adapts to the local geometry, with convergence guarantees depending only on the smoothness in a neighborhood of a solution. The presentation is based on joint work with K. Mishchenko; see https://arxiv.org/abs/1910.09529.
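The "zero-cost rule" of the paper sets each step size from a local smoothness estimate built out of quantities gradient descent already computes. The sketch below is our reading of the rule in arXiv:1910.09529 (no line search, no extra function or gradient evaluations); the function names and the toy problem are ours.

```python
import numpy as np

def adgd(grad, x0, lam0=1e-3, iters=500):
    """Adaptive gradient descent without descent, as described in
    Malitsky & Mishchenko (arXiv:1910.09529): the step size is the minimum
    of a capped growth term and a local inverse-smoothness estimate."""
    x_prev, g_prev = x0, grad(x0)
    x = x_prev - lam0 * g_prev
    lam_prev, theta = lam0, np.inf
    for _ in range(iters):
        g = grad(x)
        diff = np.linalg.norm(g - g_prev)
        if diff > 0:
            # ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||) estimates 1/(2L) locally;
            # sqrt(1 + theta) caps how fast the step may grow.
            lam = min(np.sqrt(1 + theta) * lam_prev,
                      np.linalg.norm(x - x_prev) / (2 * diff))
        else:
            lam = np.sqrt(1 + theta) * lam_prev
        x_prev, g_prev = x, g
        x = x - lam * g
        theta, lam_prev = lam / lam_prev, lam
    return x

# Toy strongly convex quadratic: f(x) = 0.5 x^T A x - b^T x, minimizer A^{-1} b
A = np.diag([2.0, 10.0])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)
x = adgd(lambda x: A @ x - b, np.zeros(2))
```

Note that only the gradient at the current iterate is needed per step, so the adaptivity is free; this is what the abstract means by "zero cost".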

Location: You can join the seminar via this Zoom link: https://liu-se.zoom.us/j/69011766298

Organizer: Fredrik Lindsten

Page responsible: Fredrik Lindsten

Last updated: 2022-03-22