# The LiU Seminar Series in Statistics and Mathematical Statistics

Spring 2013

### Tuesday, January 15, 3.15 pm, 2013. Seminar in Statistics

**Supervised Link Prediction in Dynamic Networks Using Multiple Sources of Information**

Berkant Savas, Division of Computational Mathematics, LiU.

*Abstract*: Link prediction is a fundamental problem in social network analysis and modern-day commercial applications. Most existing research approaches this problem by exploring the topological structure of a social network using only one source of information. However, in many application domains, in addition to the social network of interest, there are a number of auxiliary social networks and/or derived proximity networks available. We will use an exponential random graph model (ERGM) to describe the transition probability for the network dynamics and present: (1) a supervised learning framework that can effectively and efficiently learn the dynamics of social networks in the presence of auxiliary networks; (2) a feature design scheme for constructing a rich variety of path-based features using multiple sources, and an effective feature selection strategy based on structured sparsity. Extensive experiments on three real-world collaboration networks show that our model can effectively learn to predict new links using multiple sources, yielding higher prediction accuracy than unsupervised and single-source supervised models.

Location: Alan Turing

### Tuesday, January 29, 3.15 pm, 2013. Seminar in Mathematical Statistics

**Ergodic properties for the conditional distribution of partially observed Markov chains**

Thomas Kaijser, Mathematical Statistics, LiU.

*Abstract*: Suppose we want to investigate the properties of a stochastic process using some kind of observation system. As a model for the stochastic process we use a Markov kernel, and similarly for the observation process. (A model of this type is nowadays often called a Hidden Markov Model (HMM) or State Space Model; as a special case we have the so-called Kalman filter.) Interest in such models has been very great over the last two to three decades, and HMMs have, for example, been applied to speech recognition and gene finding in DNA. The problem that I have been interested in is to find conditions which imply ergodic properties for the conditional distribution of the Markov chain, given the observations. (This problem is mainly of theoretical interest, with no immediate practical applications.) My plan is to give a historical overview of this problem and to present some recent results.

Location: Hopningspunkten.

### Tuesday, February 12, 3.15 pm, 2013. Seminar in Statistics

**Learning models of nonlinear dynamical systems**

Thomas Schön, Department of Electrical Engineering (ISY), LiU.

*Abstract*: Learning nonlinear dynamical models typically results in problems lacking analytical solutions. These problems can be attacked using computational methods that aim to solve the problem approximately, as well as possible. In this talk we will show how Monte Carlo methods can be used to devise powerful algorithms for learning nonlinear dynamical models. More specifically, we will make use of sequential Monte Carlo (SMC) methods such as the particle filter and the particle smoother, Markov chain Monte Carlo (MCMC) methods, and the combination of MCMC and SMC commonly referred to as particle MCMC (PMCMC). We will study both maximum likelihood solutions and Bayesian solutions. The maximum likelihood estimates are computed using the expectation maximisation algorithm invoking a particle smoother, solving the inherent nonlinear state smoothing problem. The Bayesian solution is obtained using a particle Gibbs algorithm (one of the members of the PMCMC family) with backward simulation. The Wiener model (consisting of a linear dynamical system followed by a static nonlinearity) is used as a running example throughout the talk to illustrate the algorithms.

Location: Alan Turing

### Thursday, February 14, 1.15 pm, 2013. Seminar in Mathematical Statistics

**Bridging between the block-circularity and the compound symmetry tests**

Carlos A. Coelho, Mathematics Department, Faculdade de Ciencias e Tecnologia, Universidade Nova de Lisboa, Portugal

*Abstract*: Using a suitable diagonalization of the block-circulant structure and by adequately splitting the null hypothesis of block-circularity, it is possible to easily define the likelihood ratio test statistic to test for compound symmetry, once the block-circulant structure of the covariance matrix is assumed. This approach also enables us to easily build a similar test for complex multivariate normal random variables. Near-exact distributions, which lie very close to the exact distribution, are developed for the likelihood ratio test statistic.

Keywords: characteristic function, composition of hypotheses, decomposition of the null hypothesis, distribution of likelihood ratio statistics, near-exact distributions, product of independent Beta random variables, sum of independent Gamma random variables.

Location: Kompakta rummet.

### Tuesday, February 26, 3.15 pm, 2013. Seminar in Mathematical Statistics

**Large deviations for weighted empirical measures arising in importance sampling**

Henrik Hult, Mathematical Statistics, KTH.

*Abstract*: Importance sampling is a popular method for efficient computation of various properties of a distribution such as probabilities, expectations, quantiles, etc. The output of an importance sampling algorithm can be represented as a weighted empirical measure, where the weights are given by the likelihood ratio between the original distribution and the sampling distribution. In this talk the efficiency of an importance sampling algorithm is studied by means of large deviations for the weighted empirical measure. The main result, which is stated as a Laplace principle for the weighted empirical measure arising in importance sampling, can be viewed as a weighted version of Sanov's theorem. The main result is applied to quantify the efficiency of an importance sampling algorithm over a collection of subsets as well as quantiles.
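As a small illustration of the weighted empirical measure mentioned in the abstract (not taken from the talk), the sketch below estimates a normal tail probability by importance sampling. The target distribution N(0, 1), the shifted sampling distribution N(threshold, 1), and the function name `importance_sample_tail` are all assumptions chosen for the example.

```python
import math
import random


def importance_sample_tail(threshold, n, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from N(threshold, 1).

    Illustrative sketch only: the target and sampling distributions are
    assumptions made for this example.
    """
    total = 0.0
    for _ in range(n):
        # Draw from the sampling distribution N(threshold, 1).
        x = rng.gauss(threshold, 1.0)
        # Likelihood ratio between the original density N(0, 1) and the
        # sampling density N(threshold, 1), evaluated at x.
        weight = math.exp(-threshold * x + 0.5 * threshold ** 2)
        # The weighted empirical measure puts mass weight / n at x;
        # here it is evaluated on the set (threshold, infinity).
        if x > threshold:
            total += weight
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = importance_sample_tail(4.0, 50_000, rng)
    exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
    print(f"importance sampling estimate: {estimate:.3e}")
    print(f"exact tail probability:       {exact:.3e}")
```

Plain Monte Carlo with the same number of draws would typically see only one or two samples beyond the threshold, which is why shifting the sampling distribution towards the rare set helps in this example.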

Location: Kompakta rummet.

### Tuesday, March 12, 3.15 pm, 2013. Seminar in Statistics

**Model averaging and variable selection in VAR models**

Shutong Ding, Statistics, Örebro University.

*Abstract*: Bayesian model averaging and model selection are based on the marginal likelihoods of the competing models. These cannot, however, be used directly in VAR models when one of the issues is which, and how many, variables to include in the model, since the likelihoods will be for different groups of variables and not directly comparable. One possible solution is to consider the marginal likelihood for a core subset of variables that are always included in the model. This is similar in spirit to a recent proposal for forecast combination based on the predictive likelihood. The two approaches are contrasted and their performance is evaluated in a simulation study and a forecasting exercise.

Location: Alan Turing

### Tuesday, March 26, 3.15 pm, 2013. Seminar in Mathematical Statistics

**Statistical aspects of adaptive designs in clinical studies**

Frank Miller, Mathematical Statistics, Stockholm University.

*Abstract*: Adaptive designs have become popular in recent years in the context of clinical studies. Midway through an ongoing study, changes to the design can be made based on the data collected until then. This design adaptation offers the opportunity to handle uncertainties in the planning phase of the study. However, if data-dependent changes are made during a study, statistical inference is more complicated. Unexpected effects on the properties of estimators and tests can be introduced. We consider a sample size re-estimation design, which is an example of a specific adaptive design. We explain for this design why a bias for an estimator and a test occurs and how one can correct for it.

Location: Kompakta rummet.

### Tuesday, April 9, 3.15 pm, 2013. Seminar in Statistics

**On evolution of one application: a Bayesian data augmentation analysis**

Andriy Andreev, Statistics, Stockholm University.

*Abstract*: Complex biological systems are almost always only partially observable, and standard statistical modeling, based solely on observed data, can sometimes result in misleading conclusions about relationships of interest. Data augmentation has in its various forms proved to be useful by adding the unobservable part to the likelihood. This talk will present the research evolution of one non-parametrically defined hierarchical intensity model. Predictive distributions are used to assess causal influences, while MCMC disentangles conditioning on interdependent random processes.

Location: Von Neumann

### Tuesday, April 23, 3.15 pm, 2013. Seminar in Mathematical Statistics

**Bootstrap percolation on random graphs and its application in neuromodelling**

Tatyana Turova, Mathematical Statistics, Lund University.

*Abstract*: We consider random processes on random graphs relevant for modelling the propagation of activation in neural networks. Bootstrap percolation is a process in discrete time. The state of the process at each time is a vector with entries indexed by the vertices of the underlying graph. The value at each entry is binary, zero or one. The value zero at a vertex becomes one at time $t+1$ if the vertex has at least $r$ neighbours with value one at time $t$, and the value one does not change with time. We study the dynamics of the set of vertices with value one. If the underlying structure is a classical random graph, it is known (joint work with Janson, Luczak and Vallier) that typically either the initial set of ones increases only a few times, or it percolates through almost the entire graph. To model the spread of activation in a neural network we consider percolation on a random graph with both local and random connections on a lattice. We also introduce a function of inhibition to model an effect of "self-organization". We derive sufficient conditions which allow the percolation process to stabilize at an intermediate level (whose order is strictly higher than the initial one, but which does not yield complete percolation).
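The update rule described in the abstract can be sketched in a few lines. This is a minimal illustration on a classical random graph only, not the lattice model with inhibition from the talk; the function names `random_graph` and `bootstrap_percolation` and the parameter values in the usage example are assumptions made for the example.

```python
import random


def random_graph(n, p, rng):
    """Adjacency lists of a classical random graph on n vertices,
    where each edge is present independently with probability p."""
    neighbours = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbours[u].append(v)
                neighbours[v].append(u)
    return neighbours


def bootstrap_percolation(neighbours, initial_active, r):
    """Run bootstrap percolation until no vertex changes.

    A vertex with value zero switches to one when at least r of its
    neighbours have value one; a value of one never changes.
    Returns the final set of vertices with value one.
    """
    active = set(initial_active)
    while True:
        # Synchronous update: decide all switches from the state at time t.
        newly_active = {
            v
            for v in range(len(neighbours))
            if v not in active
            and sum(1 for u in neighbours[v] if u in active) >= r
        }
        if not newly_active:
            return active
        active |= newly_active


if __name__ == "__main__":
    rng = random.Random(1)
    graph = random_graph(n=300, p=0.05, rng=rng)
    start = rng.sample(range(300), 20)
    final = bootstrap_percolation(graph, start, r=2)
    print(f"initially active: {len(start)}, finally active: {len(final)}")
```

Varying the size of the initial set in the usage example is one way to observe the two typical regimes mentioned in the abstract: the active set either stalls quickly or spreads through most of the graph.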

Location: Algorithmen, ISY.

### Thursday, May 16, 3.15 pm, 2013. Seminar in Statistics

**Dimension reduction: modern aspects**

Nickolay Trendafilov, Department of Mathematics and Statistics, The Open University, UK

*Abstract*: The talk is divided into two parts: principal component analysis and exploratory factor analysis. Principal component analysis (PCA) is a well-known technique for dimension reduction. For PCA, the basic formulations and features of the technique will be recalled. Next, the classic ways to interpret PCA solutions will be discussed. Their limitations and shortcomings are overcome by adopting a new approach to PCA interpretation, called sparse PCA. This approach produces sparse component loadings, in the sense that each principal component is composed of only a few original variables. This makes the interpretation easier and more objective, especially when a large number of variables is involved. Exploratory factor analysis (EFA) is a less popular technique for dimension reduction than PCA. It will be explained why this is the case, and why it makes sense to reconsider EFA. The basic EFA formulations and features will be explained. Next, a modern generalization of classic EFA will be described. This new development makes it possible to analyze data with more variables than observations, which is the typical data format in modern applications such as climate and gene data analysis.

Location: Alan Turing

### Tuesday, June 4, 10.15 am, 2013. Seminar in Mathematical Statistics

**Addition and multiplication laws for free non-hermitian ensembles**

Maciej A. Nowak, Jagiellonian University, Poland

*Abstract*: We recall the generalization of the additive Voiculescu R-transform to non-Hermitian ensembles, using planar diagrammatic techniques. Next, using similar techniques, we derive a multiplication law for free non-Hermitian random matrices, allowing for an easy reconstruction of the two-dimensional eigenvalue distribution of the product ensemble from the characteristics of the individual ensembles. We provide examples which illustrate our construction.

Location: Hopningspunkten.

Page responsible: Mattias Villani

Last updated: 2013-05-29