## Posts tagged: statistics

## Guðmundur Helgason (01/06/17)

Thesis presentation in Master of Applied Statistics (MAS)

### Guðmundur Helgason

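Title: How long do I have to wait? Prediction models for waiting times at the CCP customer service center (Hversu lengi þarf ég að bíða? Forspárlíkön fyrir biðtíma í þjónustuveri CCP)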

Location: V-157, VRII

Time: Thursday, 1 June, at 14:00.

### Abstract:

In this study, we use a variety of statistical methods to predict the waiting time for a reply to an email, using data from the customer service center of CCP, the developer of the computer game EVE Online. For the most part, binary statistical models are used, which predict whether or not a reply is given before a certain point in time. Continuous methods are also used, both to predict the waiting time itself and to predict whether or not a reply is given before a certain time. In addition to analytical prediction methods, simpler empirical methods are used to estimate the distribution of waiting times and the probability of a reply after a certain time. Available research on customer service centers, service quality, the effects of waiting for service, and methods that have been used to predict waiting times is reviewed. The methods used for waiting-time prediction are compared, and their advantages and disadvantages are discussed, along with their potential practical uses.

Advisors:

Anna Helga Jónsdóttir

Matthías Kormáksson

Examiner: Thor Aspelund

## Stella Kristín Hallgrímsdóttir

Thesis presentation in Master of Applied Statistics (MAS)

### Stella Kristín Hallgrímsdóttir

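Title: The relationship between weather and the number of arrivals at the emergency departments of Landspítali (Samband veðurs og komufjölda á bráðamóttökur Landspítala)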

Location: V-157, VRII

Time: Monday, 29 May, at 14:00.

### Abstract:

The objective of this project is to study the seasonal and weekly fluctuations in the number of arrivals to the emergency departments of the University Hospital of Iceland and to assess the influence of weather on the number of arrivals. Four emergency departments were examined: the Emergency Department in Fossvogur; the Emergency Unit in the Children's Hospital Department; Hjartagátt, the emergency department for people with suspected acute heart problems; and the Psychiatry Emergency Department. The weather variables examined most closely are temperature, wind speed, precipitation and cloudiness. Seasonal fluctuations were modeled with sine and cosine waves, and linear regression was used to construct a new variable that describes the seasonal fluctuations and the linear increase in the number of arrivals. Several ARIMA models were built to predict the number of arrivals in the Emergency Department in Fossvogur and in the Children's Emergency Department. The models were compared to find the best prediction model for each department. To assess whether weather affects the number of arrivals in the emergency departments, the weather variables were added one by one to the best prediction model for each department to see whether the model's prediction root-mean-squared error (RMSE) decreases when information about weather is added. Principal component analysis was also used to combine the weather variables into fewer new variables. The new variables were then added to the ARIMA models to assess their effect on the quality of the models. The results show that adding the weather information slightly decreases prediction RMSE in the Emergency Department in Fossvogur but increases it for the Children's Emergency Department. This holds both when each weather variable was examined separately and when the principal components were used.

Therefore, it can be concluded that weather does not affect the number of arrivals to the Children's Emergency Department but has a minor effect on the number of arrivals to the Emergency Department in Fossvogur. Furthermore, the results show that a good prediction model for the number of arrivals to the emergency departments can be developed using only calendar variables.
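As a rough illustration of the seasonal step described in the abstract, the sketch below fits a linear trend plus one annual sine/cosine pair by ordinary least squares. It is not the thesis code: the function names, the single annual harmonic, the period of 365.25 days and the synthetic arrival counts are all assumptions made for this example.

```python
import numpy as np


def harmonic_design(day_index, period=365.25):
    """Design matrix with intercept, linear trend and one sine/cosine pair.

    The single annual harmonic and the period are assumptions for this sketch.
    """
    t = np.asarray(day_index, dtype=float)
    angle = 2.0 * np.pi * t / period
    return np.column_stack([np.ones_like(t), t, np.sin(angle), np.cos(angle)])


def fit_seasonal_trend(day_index, arrivals, period=365.25):
    """Least-squares fit; returns the coefficients and the fitted values."""
    X = harmonic_design(day_index, period)
    y = np.asarray(arrivals, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef


# Synthetic daily arrivals (hypothetical numbers, not hospital data).
rng = np.random.default_rng(0)
days = np.arange(3 * 365)
true_coef = np.array([150.0, 0.01, 8.0, -5.0])
arrivals = harmonic_design(days) @ true_coef + rng.normal(0.0, 3.0, size=days.size)

coef, seasonal_trend = fit_seasonal_trend(days, arrivals)
print("estimated coefficients:", np.round(coef, 3))
```

The fitted values in `seasonal_trend` play the role of the single derived variable that summarises seasonality and linear growth, which could then be passed as a regressor to a time-series model.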

Advisors: Dr. Sigrún Helga Lund and Dr. Tryggvi Helgason

Examiner: Dr. Ólafur Pétur Pálsson

## Okan Bulut (28/10/15)

Statistics Colloquium

### Speaker: Okan Bulut

Title: Profile Analysis of Multivariate Data Using the profileR Package

Location: Room 5, Háskólabíó.

Time: Wednesday, October 28, at 11:00-12:00.

### Abstract:

Profile analysis is a psychometric clustering technique that is the equivalent of a repeated measures extension of the multivariate analysis of variance model. Profile analysis is used by researchers and practitioners to identify whether two or more groups of individuals have significantly distinct or similar profiles based on a set of continuous variables (e.g., test scores on a battery of tests). Profile analysis involves the quantification of the elevation, variation, and parallelism of multiple variables across groups. The profileR package (Bulut & Desjardins, 2015) in R can perform several profile analytic methods, including criterion-related profile analysis, profile analysis via multidimensional scaling, moderated profile analysis, profile analysis by group, and a within-person factor model to derive score profiles. This presentation will provide a brief introduction to common profile analytic techniques and demonstrate their application using the profileR package in R.

## Doctoral defence in Statistics – Anna Helga Jónsdóttir

The defence will take place in the Aula at the Main Building and starts at 14:00.

The opponents are Dr. Per B. Brockhoff, Professor at the Technical University of Denmark, and Dr. Robert C. delMas, Associate Professor at the University of Minnesota.

The supervisor is Dr. Gunnar Stefánsson, Professor at the Faculty of Physical Sciences at the University of Iceland. The doctoral committee also includes Dr. Freyja Hreinsdóttir, Associate Professor at the University of Iceland, and Dr. Auðbjörg Björnsdóttir, Director of the Centre of Teaching at the University of Akureyri.

The ceremony will be chaired by Dr. Oddur Ingólfsson, Professor and Vice Head of the Faculty of Physical Sciences at the University of Iceland.


## Ólafur Birgir Davíðsson (17/12/14)

Master's thesis presentation

### Ólafur Birgir Davíðsson

Title: Bayesian Flood Frequency Analysis Using Monthly Maxima

Location: VR-II, V-157.

Time: Wednesday, December 15, at 14:00-15:00.

### Abstract:

In this thesis a statistical flood frequency analysis model is proposed that works fully within the framework of Bayesian hierarchical models and latent Gaussian models. The model uses monthly maxima, as opposed to the almost exclusive use of annual maxima in the field, in an attempt to make better use of data in a field where reliable data is hard to come by. At the latent level a generalized linear mixed model is incorporated that accounts for seasonal dependence of parameters and provides a mechanism that allows the model to be extrapolated to river catchments where little or no data is available. The observed data comes from twelve river catchments around Iceland.

The data distribution is based on the Gumbel distribution, a special case of the Generalized Extreme Value distribution. The result is a complex, high-dimensional model that comes with high computational costs. The Markov chain Monte Carlo (MCMC) inference methods make use of a newly developed sampling scheme called the split-sampler, pioneered by Óli Páll Geirsson at the University of Iceland, to make the sampling process efficient. The specification of prior distributions makes use of Penalizing Complexity Priors to introduce a robust method to infer the latent parameters.
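For readers unfamiliar with the relationship between the two distributions named in the abstract, the standard textbook form is sketched below. The symbols $\mu$ (location), $\sigma > 0$ (scale) and $\xi$ (shape) are notation assumed for this note and are not taken from the thesis.

$$
G_{\mathrm{GEV}}(x) = \exp\left\{-\left[1 + \xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}\right\}, \qquad 1 + \xi\,\frac{x-\mu}{\sigma} > 0,
$$

$$
\lim_{\xi \to 0} G_{\mathrm{GEV}}(x) = \exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\} = G_{\mathrm{Gumbel}}(x).
$$

The Gumbel distribution is the GEV distribution in the limit where the shape parameter goes to zero, which leaves only a location and a scale parameter to model.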

The results indicate that the use of monthly maxima is a viable option in flood frequency analysis and that the latent linear mixed model for the likelihood parameters serves as a solid foundation for models of this type.

Advisors: Birgir Hrafnkelsson and Sigurður Magnús Garðarsson

Faculty Representative: Sigrún Helga Lund

## Sigrún Helga Lund (10/11/14)

Math Colloquium

### Speaker: Sigrún Helga Lund, University of Iceland

Title: Multiple spots with the same probe on tiled RNA expression microarrays

Location: V-157, VRII

Time: Monday, November 10 at 15:00-16:00.

### Abstract:

Tiled microarrays are a technology to target non-coding RNAs. In this thesis, custom arrays are designed with the same probe at multiple locations. This allows measuring sources of error in microarray data that are otherwise neglected, as well as measuring the consistency of selection methods. The analyses performed in this thesis can broadly be split into two categories: analysing sources of variation in microarray data, and developing selection methods for targeting expressed and differentially expressed regions. These analyses are the substance of the four papers.

Paper I estimates the difference in signal intensities both within and between probe pairs that contain a Single Nucleotide Polymorphism and differ only by the varying allele. The majority of probe pairs with sufficiently high expression have significant differences in expression levels within the pair that are consistent with the genotype. By using the expression level of the probe within the probe pair that has the higher value, one obtains more accurate estimates.

Paper II shows that most RNA sequences are pairwise significantly differently expressed, including randomly generated ones. A search for sequences with expression levels that are significantly different from the population of random ones is therefore proposed. The analysis of within-array replicates indicates that within-array variability can be considerable and can be reduced by replicating probes within the array.

Paper III proposes a selection method to identify relatively long regions of moderate expression. The method is used to search for candidate long non-coding RNAs (lncRNAs) at locus 8q24.2 and is run on three independent experiments. The method shows high consistency between experiments that used the same samples but different probe layouts. There is statistically significant consistency between experiments on different samples.

Paper IV evaluates the TileShuffle method as a method of finding lncRNAs at 8q24.2. The method is run on three microarrays which all contained the same sample and repeated copies of tiled probes. Monte Carlo simulations showed poor consistency in the areas selected between arrays. A crude application of the method can result in most of the region being selected.

This thesis shows how repeating the same probe on multiple spots on a microarray can greatly improve the accuracy of expression estimates. It also shows that new methods for locating expressed regions can be applied that show greater consistency than conventional methods. Finally, guidelines for the design of tiled microarray experiments are proposed that may be beneficial for all users of tiled microarray experiments.

## Daníel F. Guðbjartsson (03/10/14)

Statistics Colloquium

### Speaker: Daníel F. Guðbjartsson, Department of Statistics, deCODE genetics

Title: Estimating the effect of a sequence variant on correlated phenotypes

Location: Lögberg, 201

Time: Friday October 3 at 12:00-13:00.

### Abstract:

Some variations in the human genome associate with multiple correlated phenotypes. This leads naturally to questions about conditional independence. For example, given the association between a sequence variant and a phenotype, is the association between the variant and a second phenotype significant? It is relatively easy to create statistical tests for conditional independence, but drawing conclusions about biological mechanisms from the results of these tests must be done with great care. This is demonstrated through several important examples.

## Anna Helga Jónsdóttir (29/09/14)

Math Colloquium

### Speaker: Anna Helga Jónsdóttir, University of Iceland

Title: The performance of first year students on a diagnostic test of basic mathematical skills

Location: V-157, VRII

Time: Monday, September 29, at 15:00-16:00.

### Abstract:

Dropout rates and poor performance of students in first-year calculus courses in the School of Engineering and Natural Sciences are of great concern. In order to examine the students' background in mathematics, a diagnostic test has been administered annually since 2008. The main purpose of the diagnostic test is to provide immediate feedback to the students on their ability, but also to enable instructors to identify common areas of difficulty within the group.

The results of the test will be described in the talk: how the results have changed over time, which variables are linked to the grades, and how well the test results predict performance on final exams in Calculus IA, IB, IC and N. Ways forward will also be discussed.

## Douglas P. Wiens (12/09/14)

Statistics Colloquium

### Speaker: Douglas P. Wiens, Department of Mathematical and Statistical Sciences, University of Alberta

Title: Robustness of Design: A Survey

Location: V-147, VRII

Time: Friday September 12, at 12:00-13:00

### Abstract:

When an experiment is conducted for purposes which include fitting a particular model to the data, then the ‘optimal’ experimental design is highly dependent upon the model assumptions – linearity of the response function, independence and homoscedasticity of the errors, etc. When these assumptions are violated the design can be far from optimal, and so a more robust approach is called for. We should seek a design which behaves reasonably well over a large class of plausible models.

I will review the progress which has been made on such problems, in a variety of experimental and modelling scenarios – prediction, extrapolation, discrimination, survey sampling, dose-response, etc.