The week of April 22–25, 2014 will be a statistics week, with four statistics events held over these days.

All the talks will be in room V-157 in VR-II on the UI campus.

- April 22 at 14:00. **Emtiyaz Khan** from École Polytechnique Fédérale de Lausanne: *Machine Learning*
- April 23 at 12:00. **R-Iceland**: *Opening meeting of R-Iceland, a community of R-users in Iceland*
- April 23 at 13:00. **Andrew Gelman** from Columbia University: *The statistics package Stan*
- April 25 at 12:00. **Håvard Rue** from Norwegian University of Science and Technology: *Penalising model component complexity: A principled practical approach to constructing priors*

### Speaker: Emtiyaz Khan, Researcher at École Polytechnique Fédérale de Lausanne

Title: Scalable Bayesian Collaborative Preference Learning

**Abstract:** Learning about users’ utilities from preference, discrete choice or implicit feedback data is of integral importance in e-commerce, targeted advertising and web search. Due to the sparsity and diffuse nature of data, Bayesian approaches hold much promise, yet most prior work does not scale up to realistic data sizes. In this talk, I will first discuss why inference for such settings is computationally difficult for standard machine learning methods, most of which focus on predicting explicit ratings only. To address this difficulty, I will present a novel expectation maximization algorithm, driven by expectation propagation approximate inference, which scales to very large datasets without requiring strong factorization assumptions. Our proposed utility model uses both latent bilinear collaborative filtering and non-parametric Gaussian process (GP) regression. In experiments on large real-world datasets, our method gives substantially better results than either matrix factorization or GPs in isolation, and converges significantly faster.

### Speaker: Prof. Andrew Gelman from Columbia University

Title: Stan: A platform for Bayesian inference

**Abstract:** Stan is a free and open-source C++ program under development (the first version was released in late 2012 and we are now at version 2.2) that we intend ultimately to use for most of our statistical modeling applications. Stan works by requiring the user to declare data and parameters and specify a log probability function. The program then computes the log density and its gradients; these can then be used by various statistical algorithms to optimize, approximate, or draw samples from the associated posterior distribution. We have so far focused on Hamiltonian Monte Carlo using the No-U-turn sampler, but we are also working on variational approximation, expectation propagation, marginal and conditional approximations, and other fast algorithms that should be useful in moderate-sized to large problems. Stan is developer-friendly in allowing direct access to the function evaluations and their gradients and with a modular structure that allows one to plug in alternative sampling, approximation, and optimization algorithms, and is computationally efficient in that it is compiled rather than interpreted and uses an optimized algorithmic auto-differentiation procedure. We discuss what motivated us to develop Stan, how it works, our current struggles, and our future plans.

The slides for this talk can be found at http://www.stat.columbia.edu/~gelman/presentations/stantalk2014_handout.pdf
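For readers new to the idea in the abstract, here is a minimal sketch of the workflow it describes: the user supplies a log density and its gradient, and a generic algorithm uses only those two quantities to draw posterior samples. This is not Stan's implementation. It uses plain Hamiltonian Monte Carlo with a fixed number of leapfrog steps rather than the No-U-turn sampler, the gradient is written by hand rather than obtained by automatic differentiation, and the toy model (normal observations with unknown mean `mu`, known `sigma`, flat prior), the function names, and the tuning values are all assumptions made for illustration.

```python
import math
import random


def log_density_and_grad(mu, y, sigma=1.0):
    """Unnormalised log posterior of mu and its derivative.

    Toy model (an assumption for illustration): y_i ~ Normal(mu, sigma)
    with a flat prior on mu.
    """
    residuals = [yi - mu for yi in y]
    logp = -0.5 * sum(r * r for r in residuals) / sigma**2
    grad = sum(residuals) / sigma**2
    return logp, grad


def hmc_step(mu, y, rng, step_size=0.05, n_leapfrog=7):
    """One HMC transition: leapfrog integration, then accept or reject."""
    p0 = rng.gauss(0.0, 1.0)  # fresh momentum draw
    logp0, grad = log_density_and_grad(mu, y)
    q, p = mu, p0
    p += 0.5 * step_size * grad  # initial half step for momentum
    for i in range(n_leapfrog):
        q += step_size * p  # full step for position
        logp_q, grad = log_density_and_grad(q, y)
        if i < n_leapfrog - 1:
            p += step_size * grad  # full step for momentum
    p += 0.5 * step_size * grad  # final half step for momentum
    # Metropolis correction based on the change in total energy.
    log_accept = (logp_q - 0.5 * p * p) - (logp0 - 0.5 * p0 * p0)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return q
    return mu


def sample(y, n_draws=2000, seed=0):
    """Run the sampler from mu = 0 and return all draws."""
    rng = random.Random(seed)
    mu = 0.0
    draws = []
    for _ in range(n_draws):
        mu = hmc_step(mu, y, rng)
        draws.append(mu)
    return draws


if __name__ == "__main__":
    data_rng = random.Random(1)
    y = [data_rng.gauss(1.5, 1.0) for _ in range(20)]  # simulated data
    kept = sample(y)[200:]  # drop early draws as warm-up
    print("posterior mean estimate:", sum(kept) / len(kept))
    print("sample mean of the data:", sum(y) / len(y))
```

In this toy model the posterior for `mu` is normal, centred at the sample mean with standard deviation `sigma / sqrt(n)`, so the two printed numbers should be close. The sampler only ever calls `log_density_and_grad`, which is the separation between model and algorithm that the abstract describes.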

### Speaker: Prof. Håvard Rue from Norwegian University of Science and Technology

Title: Penalising model component complexity: A principled practical approach to constructing priors

**Abstract:** Setting prior distributions on model parameters, or attributing uncertainty to model parameters, is a difficult issue in applied Bayesian statistics. Although the prior distribution should ideally encode the users’ prior knowledge about the parameters, this level of knowledge transfer seems to be unattainable in practice and applied statisticians are forced to search for a “default” prior. Despite the development of objective priors, which are only available explicitly for a small number of highly restricted model classes, the applied statistician has few practical guidelines to follow when choosing the priors. An easy way out of this dilemma is to re-use prior choices of others, with an appropriate reference.

In this talk, I will introduce a new concept for constructing prior distributions. We exploit the natural nested structure inherent to many model components, which defines the model component to be a flexible extension of a base model. Proper priors are defined to penalise the complexity induced by deviating from the simpler base model and are formulated after the input of a user-defined scaling parameter for that model component.

These priors are invariant to reparameterisations, have a natural connection to Jeffreys’ priors, are designed to support Occam’s razor and seem to have excellent robustness properties, all of which are highly desirable and allow us to use this approach to define default prior distributions.

This is joint work with Thiago G. Martins, Daniel P. Simpson, Andrea Riebler (NTNU) and Sigrunn H. Sørbye (Univ. of Tromsø).

A copy of the paper can be found at http://arxiv.org/abs/1403.4630
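The construction outlined in the abstract can be sketched in formulas. The notation below is not part of the abstract; it follows my reading of the linked paper and should be checked against it. Here ξ is the parameter that extends the base model (ξ = 0), KLD is the Kullback-Leibler divergence, λ is the rate set through the user-defined scaling, and (U, α) are the user's scale and tail probability.

```latex
% Distance from the base model (\xi = 0) to the more flexible model
d(\xi) = \sqrt{2\,\mathrm{KLD}\!\left(\pi(x \mid \xi)\,\middle\|\,\pi(x \mid \xi = 0)\right)}

% Constant-rate penalisation: an exponential prior on the distance scale
\pi(d) = \lambda \exp(-\lambda d), \qquad \lambda > 0

% Change of variables back to the original parameter
\pi(\xi) = \lambda \exp\!\left(-\lambda\, d(\xi)\right)
           \left| \frac{\partial d(\xi)}{\partial \xi} \right|

% Example: precision \tau of a Gaussian random effect, with the base model
% at \tau = \infty (no random effect)
\pi(\tau) = \frac{\lambda}{2}\, \tau^{-3/2} \exp\!\left(-\lambda\, \tau^{-1/2}\right),
\qquad
\lambda = -\frac{\ln \alpha}{U}
\quad \text{from} \quad \Pr\!\left(1/\sqrt{\tau} > U\right) = \alpha
```

In the precision example the prior is exponential on the standard deviation σ = 1/√τ, so the mode sits at the base model and the user only has to state how unlikely (α) a standard deviation above U is.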