# The LiU Seminar Series in Statistics and Mathematical Statistics

### Thursday, September 1, 3.15 pm, 2016. Seminar in Statistics.

**An Introduction to Fuzzy Statistics**

Ronei M. de Moraes, Department of Statistics, Federal University of Paraiba, Brazil.

*Abstract*: Fuzzy sets were proposed by Lotfi A. Zadeh in 1965 to formalize the imprecision present in some types of information. Fuzzy numbers, operations on them, and two probability measures were proposed afterwards: Zadeh defined probability on fuzzy events, and Buckley defined probability in which the parameters are fuzzy numbers. In recent years our research group has been working on the design of classifiers based on fuzzy statistics. Some approaches use Zadeh's probability definition, and on that basis we have proposed classifiers based on the Gaussian, Poisson and Binomial distributions. We have also designed classifiers based on Buckley's probability definition, and on that basis we have proposed a classifier based on the Exponential distribution. Some examples of using these classifiers, and comparisons with classical classifiers, will be presented.
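As a rough illustration of Zadeh's definition mentioned in the abstract, the probability of a fuzzy event A can be written as the expected membership degree, P(A) = E[μ_A(X)]. The sketch below is not taken from the talk: the triangular membership function, the standard normal distribution, and all function names are assumptions chosen only to make the idea concrete.

```python
import math

def triangular_membership(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (illustrative choice)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used here as the assumed law of X."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fuzzy_event_probability(membership, pdf, lower, upper, steps=20000):
    """Approximate P(A) = integral of membership(x) * pdf(x) dx (midpoint rule)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += membership(x) * pdf(x)
    return total * width

# Fuzzy event "X is about zero" versus the crisp event "-1 <= X <= 1".
def about_zero(x):
    return triangular_membership(x, -1.0, 0.0, 1.0)

def crisp_interval(x):
    return 1.0 if -1.0 <= x <= 1.0 else 0.0

def standard_normal(x):
    return normal_pdf(x, 0.0, 1.0)

p_fuzzy = fuzzy_event_probability(about_zero, standard_normal, -8.0, 8.0)
p_crisp = fuzzy_event_probability(crisp_interval, standard_normal, -8.0, 8.0)
```

With a crisp (0/1) membership function the formula reduces to the ordinary probability of the interval, so `p_fuzzy` can never exceed `p_crisp` here.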

Location: Alan Turing.

### Tuesday, September 20, 3.15 pm, 2016. Seminar in Statistics.

**Instrumental Variables in Structural Equation Modelling with Ordinal Data**

Fan Yang Wallentin, Statistics, Uppsala University

*Abstract*: Data collected from questionnaires are often on an ordinal scale. Unweighted least squares (ULS), diagonally weighted least squares (DWLS) and normal-theory maximum likelihood (ML) are commonly used methods to fit structural equation models. Consistency of these estimators requires that there is no structural misspecification. This simulation study compares the equation-by-equation polychoric instrumental variable (PIV) estimation with ULS, DWLS, and ML. Accuracy of PIV for the correctly specified model and robustness of PIV for misspecified models are investigated through a confirmatory factor analysis (CFA) model and a structural equation model with ordinal indicators. The effects of sample size and nonnormality of the underlying continuous variables are also examined. The simulation results show that PIV produces robust factor loading estimates in the CFA model and in structural equation models. PIV also produces robust path coefficient estimates in the model where valid instruments are used. However, robustness depends strongly on the validity of the instruments.

Location: Alan Turing.

### Tuesday, October 4, 3.15 pm, 2016. Seminar in Mathematical Statistics.

**The Minimal Hoppe-Beta Prior Distribution for Directed Acyclic Graphs and Structure Learning**

Timo Koski, Mathematical Statistics, KTH

*Abstract*: This talk gives a new prior distribution over directed acyclic graphs intended for structured Bayesian networks, where the structure is given by an ordered block model. That is, the nodes of the graph are objects which fall into categories or blocks; the blocks have a natural ordering or ranking. The presence of a relationship between two objects is denoted by a directed edge, from the object in the category of lower rank to the object in the category of higher rank. Hoppe's urn scheme is invoked to generate a random block scheme. The prior in its simplest form has three parameters that control the sparsity of the graph in two ways: implicitly in terms of the maximal directed path, and explicitly by controlling the edge probabilities. We consider the situation where the nodes of the graph represent random variables, whose joint probability distribution factorizes along the DAG. We use a minimal layering of the DAG to express the prior. We describe Monte Carlo schemes, with a generative scheme similar to the one used for the prior, for finding the optimal a posteriori structure given a data matrix. This is joint work with J. Noble (Univ. of Warsaw) and Felix Rios (KTH).
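For readers unfamiliar with Hoppe's urn, the following is a minimal simulation sketch of the urn itself, not of the prior presented in the talk. The function name, the parameter `theta` (mass of the black ball), and the convention of labelling blocks by order of first appearance are assumptions made for illustration.

```python
import random

def hoppe_urn(n_draws, theta, rng):
    """Simulate Hoppe's urn and return the block label assigned at each draw.

    The urn starts with a single black ball of mass theta. Every coloured
    ball has mass 1. A ball is drawn with probability proportional to mass:
    drawing the black ball adds a ball of a brand-new colour, and drawing a
    coloured ball adds one more ball of that same colour.
    """
    labels = []       # labels[i] = colour (block) of the ball added at draw i
    n_colours = 0
    for _ in range(n_draws):
        p_new = theta / (theta + len(labels))
        if rng.random() < p_new:
            # Black ball drawn: open a new block.
            labels.append(n_colours)
            n_colours += 1
        else:
            # Coloured ball drawn: picking a past ball uniformly at random
            # selects each colour with probability proportional to its count.
            labels.append(rng.choice(labels))
    return labels

rng = random.Random(0)  # fixed seed so the run is reproducible
block_labels = hoppe_urn(50, theta=1.5, rng=rng)
n_blocks = max(block_labels) + 1
```

Larger values of `theta` tend to produce more blocks, since the black ball is drawn more often.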

Location: Hopningspunkten.

### Tuesday, October 18, 3.15 pm, 2016. Seminar in Statistics.

**A likelihood-free version of the stochastic approximation EM algorithm (SAEM) for parameter estimation in complex models**

Umberto Picchini, Mathematical Statistics, Lund University.

*Abstract*: I will present an approximate maximum likelihood methodology for the parameters of incomplete-data models, that is, models having some latent (unobserved) component. A likelihood-free version of the stochastic approximation expectation-maximization (SAEM, Delyon et al. 1999) algorithm is constructed to maximize the likelihood function of model parameters. While SAEM is best suited for models having a tractable "complete likelihood" function, its application to moderately complex models is difficult, and becomes impossible for models having a so-called "intractable likelihood". Intractable likelihoods are those unavailable in closed form or too expensive to approximate. These are typically treated using approximate Bayesian computation (ABC) algorithms or synthetic likelihoods, where information from the data is carried by a set of summary statistics. While ABC is considered the state-of-the-art methodology for intractable likelihoods, its algorithms are often difficult to tune. On the other hand, synthetic likelihoods (SL) is a more recent methodology which is less general than ABC: it requires stronger assumptions but also less tuning. By exploiting the Gaussian assumption set by SL on data summaries, we can construct a likelihood-free version of SAEM. Our method is completely plug-and-play and available for both static and dynamic models, the ability to simulate realizations from the model being the only requirement.

Location: Alan Turing.

### Tuesday, November 8, 3.15 pm, 2016. Seminar in Mathematical Statistics.

**Some extensions of linear prediction/approximation problems for stationary processes**

Mikhail Lifshits, Mathematical Statistics, Linköping University and St. Petersburg State University

*Abstract*: In this talk, we consider a problem of linear approximation for stationary random processes and for processes with stationary increments, in discrete or continuous time. This setting extends the classical problem of linear prediction: along with prediction quality, the optimization has to take into account some other properties of the approximating process, such as, for example, the amount of kinetic energy spent in its approximation efforts. In this generalized setting, we also obtain extensions of the classical Kolmogorov-Krein results on error-free prediction and of Kolmogorov's result on error-free interpolation.

Location: Hopningspunkten.

### Tuesday, November 15, 3.15 pm, 2016. Seminar in Statistics.

**Non-stationary Smoothing Splines: Post-processing of Satellite data**

Johan Lindström, Mathematical Statistics, Lund University.

*Abstract*: Post-processing of satellite remote sensing data is often done to reduce noise and remove artifacts due to atmospheric (and other) disturbances. Here we focus specifically on satellite-derived vegetation indices, which are used for large-scale monitoring of vegetation cover, plant health, and plant phenology. These indices often exhibit strong seasonal patterns, where rapid changes during spring and fall contrast with relatively stable behaviour during the summer and winter seasons. The goal of the post-processing is to extract smooth seasonal curves that describe how the vegetation varies during the year. This is, however, complicated by missing data and observations with large biases. Here a method for post-processing of satellite-based time series is presented. The method combines a seasonally non-stationary smoothing spline with observational errors that are modelled using a normal-variance mixture. The seasonal non-stationarity allows us to capture the different behaviour during the year, and the error structure accounts for the biased and heavy-tailed errors induced by atmospheric disturbances. The model is formulated using Gaussian Markov processes and fitted using MCMC.

Location: Alan Turing.

### Tuesday, December 6, 3.15 pm, 2016. Seminar in Mathematical Statistics.

**Spatio-temporal model for wind speed variability**

Igor Rychlik, Mathematical Science, Chalmers University.

*Abstract*: Investments in wind energy harvesting facilities are often high. At the same time, uncertainties in the corresponding energy gains are large. Therefore a reliable model describing the variability of wind speed is needed to estimate the expected available wind power and other statistics of interest, e.g. the coefficient of variation or the expected length of wind conditions favorable for wind-energy harvesting. Wind speeds are modeled by means of a spatio-temporal transformed Gaussian field. Its dependence structure is localized by introducing time- and space-dependent parameters in the field. The model has the advantage of having a relatively small number of parameters. These parameters have a natural physical interpretation and are statistically fitted to represent the variability of observed wind speed in the ERA-Interim reanalysis data set. Some validations and applications of the model will be presented.

Location: Hopningspunkten.

Page responsible: Mattias Villani

Last updated: 2017-01-30