Advanced search

Advanced search is divided into two main parts, each containing one or more groups. The main parts are the "Search for" (including) part and the "Remove from search" (excluding) part. (The excluding part may not be visible until you hit "NOT" for the first time.) You can add new groups to the including and excluding parts with the "OR" and "NOT" buttons, respectively, and you can add more search options to any group through the drop-down menu on its last row.

For a result to be included in the search result, it must fit all parameters of at least one including group and must not fit all parameters of any excluding group. This system of two main parts and their groups makes it possible to combine two or more distinct searches into one search result, while remaining flexible in removing results from the final list.
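The inclusion rule above can be sketched as a small predicate (a hypothetical illustration; the field names and conditions below are invented and are not the system's actual implementation):

```python
def matches(result, including_groups, excluding_groups):
    """Keep a result if it satisfies every condition in at least one including
    group, and does not satisfy every condition of any excluding group."""
    included = any(all(cond(result) for cond in group) for group in including_groups)
    excluded = any(all(cond(result) for cond in group) for group in excluding_groups)
    return included and not excluded

# Hypothetical search: (title contains "pine" AND year >= 2000) OR (author == "Lynch"),
# NOT (language == "sv")
including = [
    [lambda r: "pine" in r["title"], lambda r: r["year"] >= 2000],
    [lambda r: r["author"] == "Lynch"],
]
excluding = [[lambda r: r["language"] == "sv"]]

r = {"title": "Scots pine traits", "year": 2011, "author": "X", "language": "en"}
print(matches(r, including, excluding))  # True
```

Each inner list is one group of ANDed conditions; the groups themselves are ORed, matching the "OR"/"NOT" buttons described above.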

Associations between heterozygosity and fitness traits have typically been investigated in populations characterized by low levels of inbreeding. We investigated the associations between standardized multilocus heterozygosity (stMLH) in mother trees (obtained from 12 nuclear microsatellite markers) and five fitness traits measured in progenies from an inbred Scots pine population. The traits studied were proportion of sound seed, mean seed weight, germination rate, mean family height of one-year-old seedlings under greenhouse conditions (GH) and mean family height of three-year-old seedlings under field conditions (FH). The relatively high average inbreeding coefficient (F) in the population under study corresponds to a mixture of trees with different levels of co-ancestry, potentially resulting from a recent bottleneck. We used both frequentist and Bayesian methods of polynomial regression to investigate the presence of linear and non-linear relations between stMLH and each of the fitness traits. No significant associations were found for any of the traits except GH, which displayed a negative linear effect of stMLH. A negative heterozygosity-fitness correlation for GH could potentially be explained by heterosis caused by the mating of two inbred mother trees (Lippman and Zamir 2006), or by outbreeding depression in the most heterozygous trees and its negative impact on the fitness of the progeny; their simultaneous action is also possible (Lynch 1991). However, since this effect was not detected for FH, we cannot rule out that the greenhouse conditions introduce artificial effects that disappear under more realistic field conditions.
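The frequentist side of such an analysis can be sketched with simulated data (the numbers, the numpy polynomial fit, and the effect size below are invented for illustration and are not the study's actual data or code):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
stmlh = rng.normal(0.0, 1.0, n)                      # standardized multilocus heterozygosity
gh = 10.0 - 0.8 * stmlh + rng.normal(0.0, 1.0, n)    # simulated greenhouse height (GH)

# Fit first- and second-degree polynomials (coefficients in increasing degree)
lin = np.polynomial.polynomial.polyfit(stmlh, gh, 1)
quad = np.polynomial.polynomial.polyfit(stmlh, gh, 2)
rss_lin = np.sum((gh - np.polynomial.polynomial.polyval(stmlh, lin)) ** 2)
rss_quad = np.sum((gh - np.polynomial.polynomial.polyval(stmlh, quad)) ** 2)

# Partial F-test for the added quadratic term (1 extra parameter, n - 3 residual df)
F = (rss_lin - rss_quad) / (rss_quad / (n - 3))

print(lin[1])   # estimated linear slope (negative here by construction)
print(F)        # F statistic for a non-linear (quadratic) effect
```

A significant negative linear coefficient with a non-significant quadratic term would correspond to the pattern reported for GH.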

In this paper the moments of the maximum of a finite number of random values are analyzed. The analysis focuses largely on extremes of dependent normal values. For the case of the normal distribution, the moments of the maximum of dependent values are expressed through the moments of independent values.

The Swedish Transport Administration manages a database containing information on the condition of all paved, government-operated Swedish roads. The purpose of the database is to support the Pavement Management System (PMS). The PMS is used to identify road sections in need of treatment, to allocate resources, and to get a general picture of the condition of the road network. All major treatments should be reported, but this has not always been done.

The road condition is measured using a number of indicators of, e.g., the road's unevenness. Rut depth is an indicator of the road's transverse unevenness. When a treatment has been carried out the condition changes drastically, which is reflected by these indicators.

The purpose of this master's thesis is to use existing indicators to find points in time when a road has been treated.

We have created a SAS program based on simple linear regression to analyze rut depth changes over time. The program finds level changes in the rut depth trend: a drastic negative change indicates that a treatment has been carried out.
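The idea can be sketched as follows (an illustrative numpy version, not the thesis's actual SAS program; the 3 mm drop threshold and the data are invented):

```python
import numpy as np

def detect_treatments(t, rut, drop=3.0):
    """Flag time points where the fitted rut-depth level falls sharply.

    Fits a straight line to the observations before and after each candidate
    split point; a fitted level that drops by more than `drop` mm is taken
    as a sign of a treatment. Thresholds are purely illustrative."""
    found = []
    for k in range(2, len(t) - 2):
        b_pre = np.polyfit(t[:k], rut[:k], 1)     # trend before the split
        b_post = np.polyfit(t[k:], rut[k:], 1)    # trend after the split
        level_pre = np.polyval(b_pre, t[k])
        level_post = np.polyval(b_post, t[k])
        if level_post - level_pre < -drop:
            found.append(t[k])
    return found

# Rut depth growing ~1 mm/year, road resurfaced after year 5
t = np.arange(10.0)
rut = np.where(t < 5, 2 + t, 1 + (t - 5))
print(detect_treatments(t, rut))
```

The real program works on noisy measurement series, so in practice the threshold and window lengths would have to be tuned.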

The proportion of roads with an alleged date for the latest treatment earlier than the program's latest detected date was 37 percent. There were differences between regions in the proportions of possible treatments found by the program versus actually reported treatments; the North and Central regions had the highest proportions. There are also differences between road groups with different amounts of traffic. The differences between the regions are not entirely explained by the fact that the proportion of heavily trafficked roads is greater in some regions.

Evaluation of forensic evidence is a process lined with decisions and balancing, often with a substantial degree of subjectivity. Already at the crime scene, many decisions must be made about search strategies and about the amount of evidence and traces to recover, prioritise and send on to the forensic laboratory. Within the laboratory there must be several criteria (often in terms of numbers) for how much, and which parts, of the material should be analysed. In addition, there is often a restricted timeframe for delivery of a statement to the commissioner, which in reality might influence the work done. The path of DNA evidence from the recovery of a trace at the crime scene to the interpretation and evaluation made in court involves several decisions based on cut-offs of different kinds. These include quality assurance thresholds like limits of detection and quantitation, but also less strictly defined thresholds like upper limits on the prevalence of alleles not observed in DNA databases. In a verbal scale of conclusions there are lower limits on likelihood ratios for DNA evidence above which the evidence can be said to strongly support, very strongly support, etc., a proposition about the source of the evidence. Such thresholds may be arbitrarily chosen or based on logical reasoning with probabilities. However, likelihood ratios for DNA evidence depend strongly on the population of potential donors, and this may not be understood among the end-users of such a verbal scale. Even apparently strong DNA evidence against a suspect may be reported on either side of a threshold in the scale depending on whether a close relative is part of the donor population or not. In this presentation we review the use of thresholds and cut-offs in DNA analysis and interpretation and investigate the sensitivity of the final evaluation to how such rules are defined.
In particular, we show the effects of cut-offs when multiple propositions about alternative sources of a trace cannot be avoided, e.g. when there are close relatives of the suspect with high propensities to have left the trace. Moreover, we discuss the possibility of including costs (in terms of time or money) in a decision-theoretic approach in which expected values of information could be analysed.
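Such a verbal scale can be sketched as a simple mapping (the cut-off values below are invented for the example; actual thresholds differ between laboratories and scales):

```python
# Map a likelihood ratio (LR) to a verbal scale of conclusions.
# The cut-offs are illustrative only, not any laboratory's actual scale.
SCALE = [
    (1e6, "extremely strong support"),
    (1e4, "very strong support"),
    (1e3, "strong support"),
    (1e2, "moderately strong support"),
    (1e1, "moderate support"),
    (1e0, "weak support"),
]

def verbal(lr):
    """Return the verbal conclusion for a given likelihood ratio."""
    for cutoff, phrase in SCALE:
        if lr >= cutoff:
            return phrase
    return "support for the alternative proposition"

# The same genotype match can fall on different sides of a threshold
# depending on the assumed population of potential donors:
print(verbal(5e6))   # e.g. unrelated donor population
print(verbal(2e3))   # e.g. a close relative among the potential donors
```

The two calls illustrate the sensitivity discussed above: changing the donor population changes the LR, and hence possibly the verbal category reported to the court.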

The Swedish Transport Agency has for a long time collected monthly data on variables used to make both short and long projections. They have used SAS to produce statistical models for air transport, and the model with the largest coefficient of determination has long been the method of choice. The Swedish Transport Agency felt it was time to evaluate their models and methods for estimating projections, and also to explore the possibility of using completely new models for forecasting air travel. This Bachelor thesis examines how the Holt-Winters method compares with SARIMA; error measures such as RMSE, MAPE, R2, AIC and BIC are compared between the methods.

The results show that there may be a risk that the Holt-Winters models adapt a little too well to a few of the variables to which they have been fitted. Overall, however, the Holt-Winters method generates the better forecasts.
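Two of the error measures used in such a comparison can be computed as in this sketch (the forecast numbers are invented; a real comparison would use the fitted Holt-Winters and SARIMA models):

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def mape(actual, forecast):
    """Mean absolute percentage error (assumes no zeros in `actual`)."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((a - f) / a)) * 100)

actual    = [100, 110, 120, 130]
hw_fc     = [102, 108, 123, 128]   # hypothetical Holt-Winters forecasts
sarima_fc = [97, 114, 115, 135]    # hypothetical SARIMA forecasts

print(rmse(actual, hw_fc), rmse(actual, sarima_fc))
print(mape(actual, hw_fc), mape(actual, sarima_fc))
```

The method with the smaller out-of-sample RMSE and MAPE would be preferred, which is how the thesis ranks Holt-Winters against SARIMA.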

By means of ab initio molecular dynamics (AIMD) simulations we carry out a detailed study of the electronic and atomic structure of Mo upon the thermal stabilization of its dynamically unstable face-centered cubic (fcc) phase. We calculate how the atomic positions, radial distribution function, and electronic density of states of fcc Mo evolve with temperature. The results are compared with those for the dynamically stable body-centered cubic (bcc) phase of Mo, as well as with bcc Zr, which is dynamically unstable at T = 0 K but (in contrast to fcc Mo) becomes thermodynamically stable at high temperature. In particular, we emphasize the difference between the local positions of atoms in the simulation boxes at a particular step of the AIMD simulation and the average positions around which the atoms vibrate, and show that the former are solely responsible for the electronic properties of the material. We observe that while the average atomic positions in fcc Mo correspond perfectly to the ideal structure at high temperature, the electronic structure of the metal calculated from AIMD differs substantially from the canonical shape of the density of states for the ideal fcc crystal. From a comparison of our results for fcc Mo and bcc Zr, we advocate the use of electronic structure calculations, complemented with studies of radial distribution functions, as a sensitive test of the degree of temperature-induced stabilization of phases that are dynamically unstable at T = 0 K.

Using transoesophageal echocardiography (TEE), we investigated the occurrence of regional wall motion abnormalities (WMA) and diastolic dysfunction, and their association with raised troponin concentrations, during fluid resuscitation in patients with severe burns.

Patients and methods

Ten consecutive adults (aged 36–89 years, two women) with burns exceeding 20% of total body surface area who needed mechanical ventilation were studied. Their mean Baux index was 92.7, and they were resuscitated according to the Parkland formula. Thirty series of TEE examinations and simultaneous laboratory tests for myocyte damage were done 12, 24, and 36 h after the burn.

Results

Half of the patients (n = 5) had varying grades of marker leakage that correlated with changing WMA at 12, 24 and 36 h after the burn (p ≤ 0.001, 0.044 and 0.02, respectively). No patient had WMA with normal concentrations of biomarkers or vice versa. The mitral deceleration time was short, but left ventricular filling velocity increased together with stroke volume.

Conclusion

Acute myocardial damage, recorded both by echocardiography and by leakage of troponin, was common, and there was a close correlation between the two. This was true even when global systolic function was not impaired. The mitral flow Doppler pattern suggested restrictive left ventricular diastolic function.

Background: The Parkland formula (2-4 mL/kg per % of total body surface area burned), with urine output and mean arterial pressure (MAP) as endpoints, is recommended worldwide for fluid resuscitation in burns. There has recently been discussion about whether central circulatory endpoints should be used instead, and whether the fluid volumes should be larger. Despite this, few central hemodynamic data are available in the literature on the results when the formula is used correctly.

Methods: Ten burn patients, admitted to our unit early and with a burned area of >20% of total body surface area, were investigated at 12, 24, and 36 hours after injury. Using transesophageal echocardiography, pulmonary artery catheterization, and transpulmonary thermodilution to monitor them, we evaluated the cardiovascular coupling when urinary output and MAP were used as endpoints.

Results: Oxygen transport variables, heart rate, MAP, and left ventricular fractional area did not change significantly during fluid resuscitation. Left ventricular end-systolic and end-diastolic areas and the global end-diastolic volume index increased from subnormal values at 12 hours to normal ranges at 24 hours after the burn. The extravascular lung water to intrathoracic blood volume ratio was increased 12 hours after the burn.

Conclusions: Preload variables, global systolic function, and oxygen transport recorded simultaneously by three separate methods showed no need to increase the total fluid volume within 36 hours of a major burn. Early (12 hours) signs of central circulatory hypovolemia, however, support more rapid infusion of fluid at the beginning of treatment.

Bayesian networks (BNs) are advantageous when representing a single independence model; however, they do not allow us to model changes in the relationships among the random variables over time. Due to such regime changes, it may be necessary to use different BNs at different times in order to have an appropriate model over the random variables. In this paper we propose two extensions to the traditional hidden Markov model, allowing us to represent both the different regimes, using different BNs, and potential driving forces behind the regime changes, by modelling potential dependence between state transitions and some observable variables. We show how expectation maximisation can be used to learn the parameters of the proposed model, and run both synthetic and real-world experiments to show the model's potential.
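A minimal sketch of the driving-force idea, assuming a two-regime model in which the probability of staying in the current regime depends on an observable driver through a logistic link (this is a simplification of the proposed model, with invented parameters, dummy observation likelihoods, and no EM step):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_loglik(obs_loglik, driver, a, b):
    """Forward pass of a two-regime hidden Markov model whose probability of
    staying in the current regime depends on an observable driving variable:
        P(stay at time t) = sigmoid(a + b * driver[t]).
    obs_loglik is a (T, 2) array of per-regime observation log-likelihoods.
    Returns the log-likelihood of the whole sequence."""
    log_alpha = np.log(0.5) + obs_loglik[0]          # uniform initial regime
    for t in range(1, obs_loglik.shape[0]):
        stay = sigmoid(a + b * driver[t])
        log_a = np.log(np.array([[stay, 1.0 - stay],
                                 [1.0 - stay, stay]]))
        # alpha_new[j] = obs[t, j] + logsumexp_i(alpha[i] + log_a[i, j])
        log_alpha = obs_loglik[t] + np.logaddexp(log_alpha[0] + log_a[0],
                                                 log_alpha[1] + log_a[1])
    return float(np.logaddexp(log_alpha[0], log_alpha[1]))

T = 5
obs_ll = np.zeros((T, 2))            # dummy observation model
driver = np.linspace(-1.0, 1.0, T)   # observed variable driving the regime changes
print(forward_loglik(obs_ll, driver, a=0.0, b=2.0))
```

In the paper's setting, each regime would carry its own BN over the observed variables, and EM would estimate both the BN parameters and the transition parameters (a, b here).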

The Swedish Transport Administration (Trafikverket) has been in charge of the maintenance of the railway system since 2010. The railway requires regular maintenance in order to keep the tracks in good condition for the safety of passengers and other transports. To ensure this safety it is important to measure the geometrical condition of the tracks. The gauge is one of the most important geometrics; it can be neither too wide nor too narrow.

The aim of this report is to create a model that can simulate the deviation from normal gauge based on track geometrics and properties.

The deviation from normal gauge is a random quantity that we modeled as a generalized linear model (GLM) or a generalized additive model (GAM). The models can be used to simulate the possible values of the deviation. This study demonstrated that a GAM was able to model most of the variation in the deviation from normal gauge using information from some track geometrics and properties.
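A crude sketch of the modelling idea, substituting an ordinary polynomial least-squares fit for the GAM smooth (the data, variable names, and noise level are all invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
curvature = rng.uniform(0.0, 1.0, n)                            # an invented track geometric
dev = 0.5 * np.sin(3.0 * curvature) + rng.normal(0.0, 0.1, n)   # deviation from normal gauge

# Stand-in for a GAM smooth: least squares on a cubic polynomial basis
X = np.vander(curvature, 4)
beta, *_ = np.linalg.lstsq(X, dev, rcond=None)
fitted = X @ beta
sigma = float(np.std(dev - fitted))      # residual spread of the fitted model

# Simulate possible deviations at two new curvature values
new_curv = np.array([0.2, 0.8])
simulated = np.vander(new_curv, 4) @ beta + rng.normal(0.0, sigma, 2)
print(sigma)
```

A real GAM would use penalized spline smooths and possibly a non-Gaussian response, but the simulation step (fitted mean plus random noise) is the same in spirit.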

Information on weather extremes is essential for risk awareness in the planning of infrastructure and agriculture, and it may also play a key role in our ability to adapt to recurrent or more or less unique extreme events. This thesis reports new statistical methodologies that can aid climate risk assessment under conditions of climate change. Increasing access to data of high temporal resolution is a central factor in developing novel techniques for this purpose. In particular, a procedure is introduced for analysis of long-term changes in daily and sub-daily records of observed or modelled weather extremes. Extreme value theory is employed to enhance the power of the proposed statistical procedure, and inter-site dependence is taken into account to enable regional analyses. Furthermore, new methods are presented to summarize and visualize spatial patterns in the temporal synchrony and dependence of weather events such as heavy precipitation at a network of meteorological stations. The work also demonstrates the significance of accounting for temporal synchrony in the diagnostics of inter-site asymptotic dependence.

Abstract [en]

Temporal trends in meteorological extremes are often examined by first reducing daily data to annual index values, such as the 95th or 99th percentiles. Here, we report how this idea can be elaborated to provide an efficient test for trends at a network of stations. The initial step is to make separate estimates of tail probabilities of precipitation amounts for each combination of station and year by fitting a generalised Pareto distribution (GPD) to data above a user-defined threshold. The resulting time series of annual percentile estimates are subsequently fed into a multivariate Mann-Kendall (MK) test for monotonic trends. We performed extensive simulations using artificially generated precipitation data and noted that the power of tests for temporal trends was substantially enhanced when ordinary percentiles were substituted for GPD percentiles. Furthermore, we found that the trend detection was robust to misspecification of the extreme value distribution. An advantage of the MK test is that it can accommodate non-linear trends, and it can also take into account the dependencies between stations in a network. To illustrate our approach, we used long time series of precipitation data from a network of stations in The Netherlands.
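The two steps can be sketched as follows; this illustration uses a simple method-of-moments GPD fit and the univariate MK statistic on simulated data, which is simpler than the estimation and multivariate test used above:

```python
import numpy as np

def gpd_percentile(x, u, p):
    """Estimate the p-th percentile of x from a generalised Pareto fit to the
    exceedances over the threshold u (simple method-of-moments estimates)."""
    exc = x[x > u] - u
    pu = exc.size / x.size                  # empirical exceedance probability
    m, v = exc.mean(), exc.var()
    xi = 0.5 * (1.0 - m * m / v)            # shape estimate
    sigma = m * (1.0 - xi)                  # scale estimate
    return u + sigma / xi * ((pu / (1.0 - p)) ** xi - 1.0)

def mann_kendall(y):
    """Mann-Kendall S statistic and z-score for a monotonic trend (no ties)."""
    n = len(y)
    s = sum(np.sign(y[j] - y[i]) for i in range(n) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    return s, (s - np.sign(s)) / np.sqrt(var)   # with continuity correction

# Thirty years of simulated daily precipitation with a slowly increasing scale
rng = np.random.default_rng(0)
q99 = [gpd_percentile(rng.gamma(0.5, 8.0 * (1.0 + 0.02 * year), 365), 10.0, 0.99)
       for year in range(30)]
print(mann_kendall(np.array(q99)))
```

The annual GPD-based percentile series plays the role of the index values fed into the (in the paper, multivariate) MK test.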

Grimvall, Anders

Brömssen, Claudia von

Swedish University of Agricultural Sciences, Uppsala, Sweden.

(English)Manuscript (preprint) (Other academic)


Extreme precipitation events vary with regard to duration, and hence sub-daily data do not necessarily exhibit the same trends as daily data. Here, we present a framework for a comprehensive yet easily undertaken statistical analysis of long-term trends in daily and sub-daily extremes. A parametric peaks-over-threshold model is employed to estimate annual percentiles for data of different temporal resolution. Moreover, a trend-duration-frequency table is used to summarize how the statistical significance of trends in annual percentiles varies with the temporal resolution of the underlying data and the severity of the extremes. The proposed framework also includes nonparametric tests that can integrate information about nonlinear monotonic trends at a network of stations. To illustrate our methodology, we use climate model output data from Kalmar, Sweden, and observational data from Vancouver, Canada. In both these cases, the results show different trends for moderate and high extremes, and also a clear difference in the statistical evidence of trends for daily and sub-daily data.


We develop new techniques to summarize and visualize spatial patterns of coincidence in weather events such as more or less heavy precipitation at a network of meteorological stations. The cosine similarity measure, which has a simple probabilistic interpretation for vectors of binary data, is generalized to characterize spatial dependencies of events that may reach different stations with a variable time lag. More specifically, we reduce such patterns into three parameters (dominant time lag, maximum cross-similarity, and window-maximum similarity) that can easily be computed for each pair of stations in a network. Furthermore, we visualize such three-parameter summaries by using colour-coded maps of dependencies to a given reference station and distance-decay plots for the entire network. Applications to hourly precipitation data from a network of 93 stations in Sweden illustrate how this method can be used to explore spatial patterns in the temporal synchrony of precipitation events.
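The three-parameter summary can be sketched for a single pair of stations as follows (illustrative code with invented hourly event data; the exact definition of the window-maximum similarity here is an assumption):

```python
import numpy as np

def cosine(x, y):
    """Cosine similarity of two 0/1 event vectors: the number of joint events
    divided by the geometric mean of the event counts."""
    denom = np.sqrt(x.sum() * y.sum())
    return float((x * y).sum() / denom) if denom else 0.0

def summarize(a, b, max_lag=3, window=3):
    """Reduce the lagged dependence between two stations to three numbers:
    (dominant time lag, maximum cross-similarity, window-maximum similarity)."""
    sims = {}
    for k in range(-max_lag, max_lag + 1):
        if k > 0:
            sims[k] = cosine(a[:-k], b[k:])      # a[t] vs b[t + k]
        elif k < 0:
            sims[k] = cosine(a[-k:], b[:k])      # a[t - k] vs b[t]
        else:
            sims[k] = cosine(a, b)
    lag = max(sims, key=sims.get)
    # events matched if they occur anywhere within the same window-wide block
    n = (len(a) // window) * window
    aw = a[:n].reshape(-1, window).max(axis=1)
    bw = b[:n].reshape(-1, window).max(axis=1)
    return lag, sims[lag], cosine(aw, bw)

# Rain reaches station b one hour after station a
a = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0])
b = np.array([0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
print(summarize(a, b))
```

Computing this triple for every pair of stations gives the colour-coded maps and distance-decay plots described above.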


Weather extremes often occur along fronts passing different sites with some time lag. Here, we show how such temporal patterns can be taken into account when exploring inter-site dependence of extremes. We incorporate time lags into existing models and into measures of extremal associations and their relation to the distance between the investigated sites. Furthermore, we define summarizing parameters that can be used to explore tail dependence for a whole network of stations in the presence of fixed or stochastic time lags. Analysis of hourly precipitation data from Sweden showed that our methods can prevent underestimation of the strength and spatial extent of tail dependencies.

Monotonic regression is a nonparametric method for estimation of models in which the expected value of a response variable y increases or decreases in all coordinates of a vector of explanatory variables x = (x1, …, xp). Here, we examine statistical and computational aspects of our recently proposed generalization of the pool-adjacent-violators (PAV) algorithm from one to several explanatory variables. In particular, we show how the goodness-of-fit and accuracy of obtained solutions can be enhanced by presorting observed data with respect to their level in a Hasse diagram of the partial order of the observed x-vectors, and we also demonstrate how these calculations can be carried out to save computer memory and computational time. Monte Carlo simulations illustrate how rapidly the mean square difference between fitted and expected response values tends to zero, and how quickly the mean square residual approaches the true variance of the random error, as the number of observations increases up to 10^4.
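For reference, the classical one-dimensional PAV algorithm that is generalized here can be sketched as:

```python
def pav(y, w=None):
    """Pool-adjacent-violators for a nondecreasing least-squares fit to y.

    Maintains a stack of blocks (value, weight, count); whenever two adjacent
    block values violate monotonicity, the blocks are pooled into their
    weighted average. Returns the fitted values."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    vals, wts, cnts = [], [], []
    for i in range(n):
        vals.append(y[i]); wts.append(w[i]); cnts.append(1)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            v2, w2, c2 = vals.pop(), wts.pop(), cnts.pop()
            v1, w1, c1 = vals.pop(), wts.pop(), cnts.pop()
            wt = w1 + w2
            vals.append((w1 * v1 + w2 * v2) / wt)
            wts.append(wt)
            cnts.append(c1 + c2)
    fit = []
    for v, c in zip(vals, cnts):
        fit.extend([v] * c)
    return fit

print(pav([1, 3, 2, 4]))  # [1, 2.5, 2.5, 4]
```

The multivariate generalization replaces the single chain of adjacent blocks by blocks over a partial order, which is where the Hasse-diagram presorting comes in.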

In this paper, the monotonic regression problem (MR) is considered. We have recently generalized the well-known Pool-Adjacent-Violators algorithm (PAV) for MR from the case of completely to partially ordered data sets. The new algorithm, called GPAV, combines high accuracy with low computational complexity, which grows quadratically with the problem size; the actual growth observed in practice is typically far lower than quadratic. The fitted values of the exact MR solution compose blocks of equal values, and its GPAV approximation also has a block structure. We present here a technique for refining the blocks produced by the GPAV algorithm to bring the new blocks closer to those in the exact solution. This substantially improves the accuracy of the GPAV solution without deteriorating its computational complexity; the computational time for the new technique is approximately triple the time of running the GPAV algorithm. Its efficiency is demonstrated by the results of our numerical experiments.

The isotonic regression problem (IR) has numerous important applications in statistics, operations research, biology, image and signal processing, and other areas. IR in the L2-norm is a minimization problem in which the objective function is the squared Euclidean distance from a given point to a convex set defined by monotonicity constraints of the form: the i-th component of the decision vector is less than or equal to its j-th component. Unfortunately, conventional optimization methods are unable to solve IR problems originating from large data sets.

The existing IR algorithms, such as the minimum lower sets algorithm by Brunk, the min-max algorithm by Lee, the network flow algorithm by Maxwell & Muckstadt and the IBCR algorithm by Block et al., are able to find the exact solution to the IR problem for at most a few thousand variables. The IBCR algorithm, which proved to be the most efficient of them, is not robust enough. An alternative approach is to solve the IR problem approximately. Following this approach, Burdakov et al. developed an algorithm, called GPAV, whose block refinement extension, GPAVR, is able to solve IR problems with very high accuracy in a far shorter time than the exact algorithms. Apart from this, GPAVR is a very robust algorithm, and it allows us to solve IR problems with over a hundred thousand variables.

In this talk, we introduce new exact IR algorithms, which can be viewed as active set methods. They use the approximate solution produced by the GPAVR algorithm as a starting point. We present results of our numerical experiments demonstrating the high efficiency of the new algorithms, especially for very large-scale problems, and their robustness. They are able to solve problems which all existing exact IR algorithms fail to solve.

In this talk we consider the isotonic regression (IR) problem which can be
formulated as follows. Given a vector $\bar{x} \in R^n$,
find $x_* \in R^n$ which solves the problem:
\begin{equation}\label{ir2}
\begin{array}{cl}
\mbox{min} & \|x-\bar{x}\|^2 \\
\mbox{s.t.} & Mx \ge 0.
\end{array}
\end{equation}
The set of constraints $Mx \ge 0$ represents here the monotonicity
relations of the form $x_i \le x_j$ for a given set of pairs of the
components of $x$. The corresponding row of the matrix $M$ is composed
mainly of zeros, except for its $i$th and $j$th elements, which are equal
to $-1$ and $+1$, respectively. The most challenging applications of
(\ref{ir2}) are
characterized by very large values of $n$. We introduce new IR
algorithms. Our numerical experiments demonstrate the high efficiency
of our algorithms, especially for very large-scale problems, and their
robustness. They are able to solve some problems which all existing IR
algorithms fail to solve. We also outline our new algorithms for
monotonicity-preserving interpolation of scattered multivariate data.
In this talk we focus on the application of our IR algorithms to the
postprocessing of FE solutions.
Non-monotonicity of
the numerical solution is a typical drawback of the conventional
methods of approximation, such as finite elements (FE), finite volumes,
and mixed finite elements. The problem of monotonicity is
particularly important in cases of highly anisotropic diffusion tensors
or distorted unstructured meshes. For instance, in the nuclear waste
transport simulation, the non-monotonicity results in the presence of
negative concentrations, which may lead to unacceptable concentrations
and failure of the chemistry calculations. Another drawback of the conventional
methods is a possible violation of the discrete maximum principle, which
establishes lower and upper bounds for the solution.
We suggest here a least-change correction to the available FE solution
$\bar{x} \in R^n$. This postprocessing procedure is aimed at recovering
the monotonicity and some other important properties that may not be
exhibited by $\bar{x}$.
The mathematical formulation of the postprocessing problem
is reduced to the following convex quadratic programming problem
\begin{equation}\label{ls2}
\begin{array}{cl}
\mbox{min} & \|x-\bar{x}\|^2 \\
\mbox{s.t.} & Mx \ge 0, \quad
l \le x \le u, \quad
e^Tx = m,
\end{array}
\end{equation}
where $e=(1,1, \ldots ,1)^T \in R^n$.
The set of constraints $Mx \ge 0$ represents here the monotonicity
relations between some of the adjacent mesh cells.
The constraints $l \le x \le u$ originate from
the discrete maximum principle.
The last constraint formulates the conservativity requirement.
The postprocessing based on (\ref{ls2}) is typically a large scale
problem. We introduce here algorithms for solving this problem. They
are based on the observation that, in the presence of the monotonicity
constraints only, problem (\ref{ls2}) is the classical monotonic
regression problem, which can be solved efficiently by some of the
available monotonic regression algorithms. This solution is then used
for producing the optimal solution to problem (\ref{ls2}) in the
presence of all the constraints. We present results of numerical
experiments to illustrate the efficiency of our algorithms.

Monotonic (isotonic) regression is a powerful tool used for solving a wide range of important applied problems. One of its features, which poses a limitation on its use in some areas, is that it produces a piecewise constant fitted response. To smooth the fitted response, we introduce a regularization term in the monotonic regression, formulated as a least distance problem with monotonicity constraints. The resulting smoothed monotonic regression is a convex quadratic optimization problem. We focus on the case where the set of observations is completely (linearly) ordered. Our smoothed pool-adjacent-violators algorithm is designed for solving the regularized problem. It belongs to the class of dual active-set algorithms. We prove that it converges to the optimal solution in a finite number of iterations that does not exceed the problem size. One of its advantages is that the active set is progressively enlarged by including one or, typically, more constraints per iteration. As a result, large-scale test problems were solved in a few iterations, whereas their size was prohibitively large for conventional quadratic optimization solvers. Although the complexity of our algorithm grows quadratically with the problem size, its running time grew almost linearly in our computational experiments.

Monotonic (isotonic) Regression (MR) is a powerful tool used for solving a wide range of important applied problems. One of its features, which poses a limitation on its use in some areas, is that it produces a piecewise constant fitted response. For smoothing the fitted response, we introduce a regularization term in the MR, formulated as a least distance problem with monotonicity constraints. The resulting Smoothed Monotonic Regression (SMR) is a convex quadratic optimization problem. We focus on the SMR where the set of observations is completely (linearly) ordered. Our Smoothed Pool-Adjacent-Violators (SPAV) algorithm is designed for solving the SMR. It belongs to the class of dual active-set algorithms. We proved its finite convergence to the optimal solution in, at most, n iterations, where n is the problem size. One of its advantages is that the active set progressively enlarges, including one or, typically, more constraints per iteration. As a result, large-scale SMR test problems were solved in a few iterations, whereas their size was prohibitively large for conventional quadratic optimization solvers. Although the complexity of the SPAV algorithm is O(n^2), its running time was growing in our computational experiments in proportion to n^1.16.
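
The classical pool-adjacent-violators idea that underlies the SPAV algorithm can be sketched for the unsmoothed, completely ordered case. This is a minimal illustration only, not the authors' SPAV (it has no regularization term), and the function name and block representation are our own.

```python
import numpy as np

def pav(y, w=None):
    """Pool-Adjacent-Violators: weighted least-squares fit of a
    nondecreasing sequence to y. Adjacent blocks whose means violate
    monotonicity are merged into one block carrying their pooled mean."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    blocks = []  # each block: [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # merge while the last two blocks violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    # expand blocks back to a piecewise-constant fitted response
    return np.concatenate([np.full(n, m) for m, _, n in blocks])
```

The output is piecewise constant, which is exactly the limitation the smoothing term in the SMR formulation above is designed to address.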

We consider the problem of minimizing the distance from a given n-dimensional vector to a set defined by constraints of the form x_i <= x_j. Such constraints induce a partial order of the components x_i, which can be illustrated by an acyclic directed graph. This problem is known as the isotonic regression (IR) problem. It has important applications in statistics, operations research and signal processing. Most applied IR problems are characterized by a very large value of n. For such large-scale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly. The existing optimization-based algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the Pool-Adjacent-Violators (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n^2) and high accuracy. This allows us to obtain sufficiently accurate solutions to IR problems with thousands of observations.

Large-Scale Nonlinear Optimization reviews and discusses recent advances in the development of methods and algorithms for nonlinear optimization and its applications, focusing on the large-dimensional case, the current forefront of much research.

The chapters of the book, authored by some of the most active and well-known researchers in nonlinear optimization, give an updated overview of the field from different and complementary standpoints, including theoretical analysis, algorithmic development, implementation issues and applications.

We propose a novel method for MAP parameter inference in nonlinear state space models with intractable likelihoods. The method is based on a combination of Gaussian process optimisation (GPO), sequential Monte Carlo (SMC) and approximate Bayesian computations (ABC). SMC and ABC are used to approximate the intractable likelihood by using the similarity between simulated realisations from the model and the data obtained from the system. The GPO algorithm is used for the MAP parameter estimation given noisy estimates of the log-likelihood. The proposed parameter inference method is evaluated in three problems using both synthetic and real-world data. The results are promising, indicating that the proposed algorithm converges fast and with reasonable accuracy compared with existing methods.
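
The SMC ingredient of such a pipeline can be illustrated with a bootstrap particle filter that produces noisy estimates of a state-space model's log-likelihood; it is precisely such noisy estimates that a GPO step can then optimize over. This is a generic sketch under our own naming, not the authors' GPO-SMC-ABC method.

```python
import numpy as np

def bootstrap_pf_loglik(y, f_sample, g_logpdf, x0_sample,
                        n_particles=2000, seed=0):
    """Bootstrap particle filter estimate of the log-likelihood:
    propagate particles through the state dynamics, weight them by the
    observation density, resample, and accumulate the log of the mean
    weight at each time step."""
    rng = np.random.default_rng(seed)
    x = x0_sample(rng, n_particles)
    ll = 0.0
    for yt in y:
        x = f_sample(rng, x)            # propagate through the dynamics
        logw = g_logpdf(yt, x)          # weight by observation density
        m = logw.max()
        w = np.exp(logw - m)            # numerically stabilized weights
        ll += m + np.log(w.mean())      # log p(y_t | y_{1:t-1}) estimate
        idx = rng.choice(n_particles, n_particles, p=w / w.sum())
        x = x[idx]                      # multinomial resampling
    return ll
```

On a linear-Gaussian model the estimate can be checked against the exact Kalman-filter log-likelihood.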

Establishing what variables affect learning rates in experimental auctions can be valuable in determining how competitive bidders in auctions learn. This study aims to be a foray into this field. The differences, both absolute and actual, between participant bids and optimal bids are evaluated in terms of the effects from a variety of variables such as age, sex, etc. An optimal bid in the context of an auction is the best bid a participant can place to win the auction without paying more than the value of the item, thus maximizing their revenues. This study focuses on how two opponent types, humans and computers, affect the rate at which participants learn to optimize their winnings.

A Bayesian multilevel model for time series is used to model the learning rate of actual bids from participants in an experimental auction study. The variables examined at the first level were auction type, signal, round, interaction effects between auction type and signal and interaction effects between auction type and round. At a 90% credibility interval, the true value of the mean for the intercept and all slopes falls within an interval that also includes 0. Therefore, none of the variables are deemed to be likely to influence the model.

The variables on the second level were age, IQ, sex and answers from a short quiz about how participants felt when they won or lost auctions. The posterior distributions of the second-level variables were also found to be unlikely to influence the model at a 90% credibility interval.

This study shows that more research is required to determine what variables affect the learning rate in competitive bidding auction studies.

Cancer is a major health problem in Sweden and one of the leading causes of death. Recently there has been increasing interest in the cancer research community in how chronic inflammation influences the emergence of cancer tumors. In this paper we investigate the relationship between markers of chronic inflammation and lung cancer. We also examine the possibility that sialic acid may be a marker of lung cancer. Lung cancer is one of the most common types of cancer in Sweden and has a high mortality rate compared to other cancers, both because it is usually discovered at a later stage and because it is located in a sensitive organ, which makes it harder to treat. We use sialic acid as a marker of inflammation from a cohort study conducted during the early sixties, cross-referenced with the Swedish Cancer Registry from Socialstyrelsen. BMI and age are also used as variables of interest. The method of the paper uses the Cox proportional hazards model to estimate the risk of lung cancer associated with sialic acid. Sensitivity analysis is used to consider the effects of smoking as a confounding variable. The results show that there are reasons to believe that chronic inflammation may play a role in the emergence of lung cancer tumors. There is no evidence in this study to suggest that sialic acid is a good marker for an individual having lung cancer. Furthermore, BMI seems to have a protective effect against lung cancer, but this would need further study before any real conclusions can be drawn.

Change-point detection aims to reveal sudden changes in sequences of data. Special attention has been paid to the detection of abrupt level shifts, and applications of such techniques can be found in a great variety of fields, such as monitoring of climate change, examination of gene expressions and quality control in the manufacturing industry. In this work, we compared the performance of two methods representing frequentist and Bayesian approaches, respectively. The frequentist approach involved a preliminary search for level shifts using a tree algorithm followed by a dynamic programming algorithm for optimizing the locations and sizes of the level shifts. The Bayesian approach involved an MCMC (Markov chain Monte Carlo) implementation of a method originally proposed by Barry and Hartigan. The two approaches were implemented in R and extensive simulations were carried out to assess both their computational efficiency and ability to detect abrupt level shifts. Our study showed that the overall performance regarding the estimated location and size of change-points was comparable for the Bayesian and frequentist approaches. However, the Bayesian approach performed better when the number of change-points was small, whereas the frequentist approach became stronger when the change-point proportion increased. The latter method was also better at detecting simultaneous change-points in vector time series. Theoretically, the Bayesian approach has a lower computational complexity than the frequentist approach, but suitable settings for the combined tree and dynamic programming algorithms can greatly reduce the processing time.
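
What "detecting an abrupt level shift" means can be made concrete with a brute-force least-squares search for a single shift. This is neither the tree/dynamic-programming method nor the Barry-Hartigan sampler discussed above, just an illustrative baseline with names of our own choosing.

```python
import numpy as np

def best_level_shift(x):
    """Least-squares location of a single level shift: the split point
    that minimizes the total within-segment sum of squares."""
    x = np.asarray(x, dtype=float)
    best_k, best_sse = None, np.inf
    for k in range(1, len(x)):          # shift between x[k-1] and x[k]
        left, right = x[:k], x[k:]
        sse = ((left - left.mean()) ** 2).sum() + \
              ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k, best_sse
```

Applied recursively to each resulting segment (binary segmentation), this extends to multiple change-points; a dynamic programming step instead optimizes all shift locations jointly.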

Analysis of functional magnetic resonance imaging (fMRI) data is becoming ever more computationally demanding as temporal and spatial resolutions improve, and large, publicly available data sets proliferate. Moreover, methodological improvements in the neuroimaging pipeline, such as non-linear spatial normalization, non-parametric permutation tests and Bayesian Markov Chain Monte Carlo approaches, can dramatically increase the computational burden. Despite these challenges, there do not yet exist any fMRI software packages which leverage inexpensive and powerful graphics processing units (GPUs) to perform these analyses. Here, we therefore present BROCCOLI, a free software package written in OpenCL (Open Computing Language) that can be used for parallel analysis of fMRI data on a large variety of hardware configurations. BROCCOLI has, for example, been tested with an Intel CPU, an Nvidia GPU, and an AMD GPU. These tests show that parallel processing of fMRI data can lead to significantly faster analysis pipelines. This speedup can be achieved on relatively standard hardware, but further, dramatic speed improvements require only a modest investment in GPU hardware. BROCCOLI (running on a GPU) can perform non-linear spatial normalization to a 1 mm3 brain template in 4–6 s, and run a second level permutation test with 10,000 permutations in about a minute. These non-parametric tests are generally more robust than their parametric counterparts, and can also enable more sophisticated analyses by estimating complicated null distributions. Additionally, BROCCOLI includes support for Bayesian first-level fMRI analysis using a Gibbs sampler. The new software is freely available under GNU GPL3 and can be downloaded from github (https://github.com/wanderine/BROCCOLI/).

We propose a voxel-wise general linear model with autoregressive noise and heteroscedastic noise innovations (GLMH) for analyzing functional magnetic resonance imaging (fMRI) data. The model is analyzed from a Bayesian perspective and has the benefit of automatically down-weighting time points close to motion spikes in a data-driven manner. We develop a highly efficient Markov Chain Monte Carlo (MCMC) algorithm that allows for Bayesian variable selection among the regressors to model both the mean (i.e., the design matrix) and variance. This makes it possible to include a broad range of explanatory variables in both the mean and variance (e.g., time trends, activation stimuli, head motion parameters and their temporal derivatives), and to compute the posterior probability of inclusion from the MCMC output. Variable selection is also applied to the lags in the autoregressive noise process, making it possible to infer the lag order from the data simultaneously with all other model parameters. We use both simulated data and real fMRI data from OpenfMRI to illustrate the importance of proper modeling of heteroscedasticity in fMRI data analysis. Our results show that the GLMH tends to detect more brain activity, compared to its homoscedastic counterpart, by allowing the variance to change over time depending on the degree of head motion.

The software packages SPM, FSL and AFNI are the most widely used packages for the analysis of functional magnetic resonance imaging (fMRI) data. Despite this fact, the validity of the statistical methods has only been tested using simulated data. By analyzing resting state fMRI data (which should not contain specific forms of brain activity) from 396 healthy controls, we here show that all three software packages give inflated false positive rates (4%-96%, compared to the expected 5%). We isolate the sources of these problems and find that SPM mainly suffers from a too simple noise model, while FSL underestimates the spatial smoothness. These results highlight the need for validating the statistical methods being used for fMRI.

The most widely used task functional magnetic resonance imaging (fMRI) analyses use parametric statistical methods that depend on a variety of assumptions. In this work, we use real resting-state data and a total of 3 million random task group analyses to compute empirical familywise error rates for the fMRI software packages SPM, FSL, and AFNI, as well as a nonparametric permutation method. For a nominal familywise error rate of 5%, the parametric statistical methods are shown to be conservative for voxelwise inference and invalid for clusterwise inference. Our results suggest that the principal cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the nonparametric permutation test is found to produce nominal results for voxelwise as well as clusterwise inference. These findings speak to the need of validating the statistical methods being used in the field of neuroimaging.
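
The nonparametric permutation test referred to here can be sketched for the simplest two-sample case; group-level neuroimaging tests permute subject labels in the same spirit. This is a generic illustration, not SnPM, randomise or 3dttest++, and the function name is ours.

```python
import numpy as np

def permutation_pvalue(a, b, n_perm=10000, seed=0):
    """Two-sample permutation test on the difference in means.
    Returns a two-sided p-value under random relabelling."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = abs(perm[:len(a)].mean() - perm[len(a):].mean())
        count += diff >= observed
    # add-one correction keeps the p-value strictly positive
    return (count + 1) / (n_perm + 1)
```

Because the null distribution is built from the data itself, no Gaussian shape assumption on the test statistic is needed, which is the point made in the abstract above.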

We are glad that our paper (1) has generated intense discussions in the fMRI field (2–4), on how to analyze fMRI data, and how to correct for multiple comparisons. The goal of the paper was not to disparage any specific fMRI software, but to point out that parametric statistical methods are based on a number of assumptions that are not always valid for fMRI data, and that nonparametric statistical methods (5) are a good alternative. Through AFNI’s introduction of nonparametric statistics in the function 3dttest++ (3, 6), the three most common fMRI software packages now all support nonparametric group inference [SPM through the toolbox SnPM (www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/nichols/software/snpm), and FSL through the function randomise].

Cox et al. (3) correctly point out that the bug in the AFNI function 3dClustSim only had a minor impact on the false-positive rate (FPR). This was also covered in our original paper (1): “We note that FWE [familywise error] rates are lower with the bug-fixed 3dClustSim function. As an example, the updated function reduces the degree of false …

Simple models and algorithms based on restrictive assumptions are often used in the field of neuroimaging for studies involving functional magnetic resonance imaging, voxel based morphometry, and diffusion tensor imaging. Nonparametric statistical methods or flexible Bayesian models can be applied rather easily to yield more trustworthy results. The spatial normalization step required for multisubject studies can also be improved by taking advantage of more robust algorithms for image registration. A common drawback of algorithms based on weaker assumptions, however, is the increase in computational complexity. In this short overview, we will therefore present some examples of how inexpensive PC graphics hardware, normally used for demanding computer games, can be used to enable practical use of more realistic models and accurate algorithms, such that the outcome of neuroimaging studies really can be trusted.

This report gives a background description of the data collection company Norstat and how they implement a tracking survey with a panel via the internet. Furthermore, connections between variables describing persons in the survey and the way these persons answer the survey will be investigated. The report also intends to find out how long a survey needs to be running and whether received answers differ depending on when a person has answered. A detailed description of the processing and variables included in the data material being used will also be given. Earlier research concerning panels and web surveys is covered to give the reader a nuanced picture of the pros and cons of opinion surveys.

Logistic regression methods have been used to examine which variables influence whether a person will answer the survey or not, and which variables make a person answer the survey early or late. Other methods used are descriptive statistics and a χ²-test.
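
A χ²-test of independence on a cross-tabulation of this kind can be run in a few lines. The table below is entirely hypothetical (invented counts, not the Norstat data), and the grouping variable is only an example.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 cross-tabulation: response status by sex
# (all counts are made up for illustration).
table = np.array([[420, 380],    # answered: women, men
                  [310, 390]])   # did not answer: women, men
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

A small p-value would indicate that response behaviour and the grouping variable are associated, which is the kind of question the descriptive analysis above addresses.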

The results show that factors influencing how much spare time a person has have the greatest impact on whether and how early the survey gets completed. A field period of up to 6 days after the invitation to the survey has been sent out is often enough for all categories of persons to be relatively equally represented. The optimal field period differs depending on whether a study aims to provide a picture of the entire country's population or only specific categories of it. For one special category of persons, it can sometimes be enough to let the field period run until the day after the invitation to the survey was sent out for enough answers to be submitted.

The market for streaming services is growing rapidly. With a growing number of subscribers, companies are looking for efficient ways to find the underlying factors that affect when a subscriber will churn. The purpose of this paper is to use viewings per month, type of package and number of packages in the time period to see whether the risk of churn differs between the groups.

The data contains information about all started subscriptions for four different types of subscription-based packages for a single streaming company. All subscribers' starting dates are known. Because not every subscriber has churned by the end of the study, the data is partially censored. We use the Cox proportional hazards model to handle this problem. Multiple-failure models are used because some subscribers have multiple subscriptions during the study.
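
The Cox model handles the censoring described here through its partial likelihood: censored subscribers contribute only to the risk sets, never an event term of their own. A minimal sketch of that likelihood and its maximization (Breslow form, no tie handling; variable names ours, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

def cox_negloglik(beta, X, time, event):
    """Negative Cox partial log-likelihood: for each observed event,
    its linear predictor minus the log-sum-exp of the linear
    predictors over the subjects still at risk."""
    order = np.argsort(-np.asarray(time))      # descending event times
    eta = (X @ beta)[order]
    ev = np.asarray(event, dtype=bool)[order]
    running = np.logaddexp.accumulate(eta)     # log-sum-exp over risk set
    return -(eta[ev] - running[ev]).sum()

def fit_cox(X, time, event):
    """Maximize the partial likelihood with a generic optimizer."""
    res = minimize(cox_negloglik, np.zeros(X.shape[1]),
                   args=(X, time, event), method="BFGS")
    return res.x
```

In practice a dedicated survival library would also provide standard errors and the multiple-failure extensions mentioned above; this sketch only shows the core estimating equation.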

The study shows that with increased activity on the site, the risk of churn increases for all four types of packages. Kwong-Kay Wong’s (2011) study shows that a subscriber who has changed plan to a more optimal one has a lower risk of churn. In that study the switch was to a known better plan. In our study we see that most subscribers who return after churn come back to the same type of package that they left, and the results are therefore interpreted differently. A subscriber who has his or her second or third subscription in package 4 has an increased risk of churn, given the same amount of activity on the website, compared to a first subscription. We also find that package 4 has an increased risk of churn compared to package 1, given that the two subscribers have the same amount of activity.

There is always a need to study the properties of complex input–output systems, properties that may be very difficult to determine. Two such properties are the output’s sensitivity to changes in the inputs and the output’s uncertainty if the inputs are uncertain.

A system can be formulated as a model—a set of functions, equations and conditions that describe the system. We ultimately want to study and learn about the real system, but with a model that approximates the system well, we can study the model instead, which is usually easier. It is often easier to build a model as a set of combined sub-models, but good knowledge of each sub-model does not immediately lead to good knowledge of the entire model. Often, the most attractive approach to model studies is to write the model as computer software and study datasets generated by that software.

Methods for sensitivity analysis (SA) and uncertainty analysis (UA) cannot be expected to be exactly the same for all models. In this thesis, we want to determine suitable SA and UA methods for a road traffic emission model, methods that can also be applied to any other model of similar structure. We examine parts of a well-known emission model and suggest a powerful data-generating tool. By studying generated datasets, we can examine properties in the model, suggest SA and UA methods and discuss the properties of these methods. We also present some of the results of applying the methods to the generated datasets.

Eriksson, Olle


Emissions from road traffic are hard to measure and are therefore usually estimated in models. In this paper the construction of the widely used COPERT III model is examined, and the model is rewritten in mathematical notation. The original COPERT III software is easily handled but is not suitable as an emission-data generating tool for fast calculations over a broad variety of driving conditions, which is required for sensitivity analysis. An alternative program has been developed to meet the desired properties of such a tool. The construction of the alternative program is discussed together with its new abilities and restrictions. Some differences between the results from the original COPERT III software and the alternative program are analyzed and discussed.

Eriksson, Olle


Many complex systems are simulated in models, and these models are hard to evaluate completely. For example, the sensitivity of the model output to different inputs is seldom analyzed. This paper provides an introduction to sensitivity analysis and proposes some basic sensitivity analysis methods. The results are important, for example, in choosing a sampling scheme. The methods are applied to the COPERT III model, and the sensitivity results are compared and discussed. Certain theoretical results simplify the sensitivity analysis in a special case, and the paper ends by recommending a method.

Eriksson, Olle


Sensitivity analyses may be local or global, one at a time or all at a time, or classified in other ways. This paper examines some response surface methods for instant calculations of many sensitivities at the same time. The appropriateness of replacing the original model with a simpler response surface is discussed. The methods are also compared with one-at-a-time methods. Sensitivity results are calculated by applying the methods to the COPERT III model. The conclusion is that a simple response surface method should be preferred.

Eriksson, Olle


This paper discusses sensitivity analysis based on methods that divide the total variation of a particular response into components associated with the explaining variables or factors. Two methods for estimating variance components in no-interaction multiway designs are discussed, and their dependence on balance in the sampling design is studied. The methods are applied to datasets generated from a road traffic emission model.

Eriksson, Olle


The output of a model or an experiment is often fairly uncertain because the underlying conditions have to some extent been guessed at or estimated. Such uncertainty is usually described by examining the distribution of the output given an input of known distribution. In this paper, we discuss why this approach may not be suitable for some problems, and consider an alternative approach in cases in which we have only vague information concerning the distributions of the inputs, regardless of whether these inputs are continuous or categorical. Our approach is well suited for a mean output that represents the weighted sum of the outputs of categories, if there is a large dataset with an uncertain input distribution. In this study, the method is applied to a road traffic emission scenario.

This paper concerns design of contingent valuation experiments when interest is in knowing whether respondents have positive willingness to pay and, if so, if they are willing to pay a certain amount for a specified good. A trinomial spike model is used to model the response. Locally D- and c-optimal designs are derived and it is shown that any locally optimal design can be deduced from the locally optimal design for the case when one of the model parameters is standardized. It is demonstrated how information about the parameters, e.g., from pilot studies, can be used to construct minimax and maximin efficient designs, for which the best guaranteed value of the criterion function or the efficiency function is sought under the assumption that the parameter values are within certain regions. The proposed methodology is illustrated on an application where the value of the environmentally friendly production of clothes is evaluated.

This paper demonstrates how to plan a contingent valuation experiment to assess the value of ecologically produced clothes. First, an appropriate statistical model (the trinomial spike model) that describes the probability that a randomly selected individual will accept any positive bid, and if so, will accept the bid A, is defined. Secondly, an optimization criterion that is a function of the variances of the parameter estimators is chosen. However, the variances of the parameter estimators in this model depend on the true parameter values. Pilot study data are therefore used to obtain estimates of the parameter values and a locally optimal design is found. Because this design is only optimal given that the estimated parameter values are correct, a design that minimizes the maximum of the criterion function over a plausible parameter region (i.e. a minimax design) is then found.

Some people will follow a different educational path despite having the intellectual ability to do well in school. This study explored how educational achievers and underachievers were different from each other in middle adulthood as well as examined which individual and contextual factors in adolescence were important to educational underachievement in middle adulthood. Participants are a school cohort followed from age 10 to middle adulthood (N = 1,326) and are from the Swedish longitudinal research program entitled Individual Development and Adaptation. This study focuses on a subgroup of Individual Development and Adaptation participants (n = 304) with above average intelligence (Mean IQ = 119.39, SD = 5.97). Study findings showed that a minority of adolescents in the study focal group (26%) did not complete high school, and women were more likely to educationally underachieve than men. A simultaneous multilevel logistic regression, with school class accounted for in the analysis, showed that for those of above average intelligence, parents' socioeconomic status and school grades were the strongest predictors of educational attainment. Finally, in midlife, underachievers had lower incomes and occupational levels, drank less frequently, and rated their health as worse than achievers. Study implications are discussed in terms of ways to advance the field of gifted underachievement and in relation to Swedish gifted educational policy.

In 1997, a unique project began: the ABIS study (All Babies in Southeast Sweden) at the Faculty of Health Sciences in Linköping (Linköping University). Of all babies born between 1 October 1997 and 1 October 1999 in the counties Blekinge, Småland, Öland and Östergötland, about 17 000 have been followed up at regular intervals over the years: at birth, after one year, after 2-3 years, after 5-6 years and after eight years.

Children/families have at each stage of the study submitted biological samples and responded to questionnaires. The questionnaires contain questions of varying types; this paper considers the socio-demographic variables and the variables that were used to "measure" stress in the parents and, to some extent, in the children.

Through the years the number of participants declined sharply, from 16 051 filling out the first questionnaire to 4 030 at the eight-year follow-up. In this essay we investigate whether the nonresponse group has specific characteristics and whether the cause of the nonresponse can be explained. As a basis we work with the data recorded by the questionnaires, taking the birth questionnaire as the starting point: all who responded there are assumed to be the population, and those who then leave the study before the subsequent follow-ups constitute the nonresponse group.

In order to tackle the problem, multidimensional scaling, cluster analysis and logistic regression have been used. None of the methods, however, made it possible to distinguish the observations into two groups corresponding to the respondents and the dropouts. Therefore, we cannot describe or explain the nonresponse from the variables that were chosen, i.e. the socio-demographic and stress variables.

The objective of this master thesis is to identify “key-drivers” embedded in customer satisfaction data. The data was collected by a large transportation sector corporation during five years and in four different countries. The questionnaire involved several different sections of questions and ranged from demographical information to satisfaction attributes with the vehicle, dealer and several problem areas. Various regression, correlation and cooperative game theory approaches were used to identify the key satisfiers and dissatisfiers. The theoretical and practical advantages of using the Shapley value, Canonical Correlation Analysis and Hierarchical Logistic Regression have been demonstrated and applied to market research.

This thesis is focused on investigating the predictability of exchange rate returns at monthly and daily frequency using models that have mostly been developed in the machine learning field. The forecasting performance of these models is compared to the Random Walk, which is the benchmark model for financial returns, and the popular autoregressive process. The machine learning models used are Regression trees, Random Forests, Support Vector Regression (SVR), Least Absolute Shrinkage and Selection Operator (LASSO) and Bayesian Additive Regression Trees (BART). A characterizing feature of financial returns data is the presence of volatility clustering, i.e. the tendency of persistent periods of low or high variance in the time series. This is in disagreement with the machine learning models, which implicitly assume a constant variance. We therefore extend these models with the most widely used model for volatility clustering, the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) process. This allows us to jointly estimate the time-varying variance and the parameters of the machine learning models using an iterative procedure. These GARCH-extended machine learning models are then applied to make one-step-ahead predictions by recursive estimation, so that the estimated parameters are updated with the new information. In order to predict returns, information related to economic variables and the lagged variable is used. This study is repeated on three different exchange rate returns: EUR/SEK, EUR/USD and USD/SEK in order to obtain robust results. Our results show that machine learning models are capable of forecasting exchange rate returns both at daily and monthly frequency. The results were mixed, however. Overall, it was the GARCH-extended SVR that showed the greatest potential for improving the predictive performance of exchange rate return forecasts.
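
The GARCH(1,1) building block of such an extension can be sketched as a standalone Gaussian maximum-likelihood fit. This is only the volatility half of the iterative procedure described above (parameter names are ours), not the full GARCH-extended machine learning estimator.

```python
import numpy as np
from scipy.optimize import minimize

def garch_negloglik(params, r):
    """Gaussian negative log-likelihood of a GARCH(1,1):
    sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return np.inf                    # reject non-stationary proposals
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()                  # a common initialization choice
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + r ** 2 / sigma2)

def fit_garch(r):
    """Fit (omega, alpha, beta) by direct likelihood maximization."""
    x0 = np.array([0.1 * r.var(), 0.1, 0.8])
    res = minimize(garch_negloglik, x0, args=(r,), method="Nelder-Mead")
    return res.x
```

In the iterative scheme described above, the fitted conditional-variance path would then be used to rescale the returns before refitting the mean model, alternating until convergence.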

Consistent negative correlations between sibship size and cognitive performance (as measured by intelligence quotient and other mental aptitude tests) have been observed in past empirical studies. However, parental decisions on family size may correlate with variables affecting child cognitive performance. The aim of this study is to demonstrate how selection bias in studies of sibship size effects can be adjusted for. We extend existing knowledge in two aspects: as factors affecting decisions to increase family size may vary across the number and composition of current family size, we propose a sequential probit model (as opposed to binary or ordered models) for the propensity to increase family size; to disentangle selection and causality we propose multilevel multiprocess modelling where a continuous model for performance is estimated jointly with a sequential probit model for family size decisions. This allows us to estimate and adjust for the correlation between unmeasured heterogeneity affecting both family size decisions and child cognitive performance. The issues are illustrated through analyses of scores on Peabody individual achievement tests among children of the US National Longitudinal Survey of Youth 1979. We find substantial between-family heterogeneity in the propensity to increase family size. Ignoring such selection led to overestimation of the negative effects of sibship size on cognitive performance for families with 1-3 children, when known sources of selection were accounted for. However, the multiprocess modelling proposed could efficiently identify and control for such bias due to adverse selection.

We demonstrate improvements in predictive power when introducing spline functions to account for highly nonlinear relationships between firm failure and leverage, earnings, and liquidity in a logistic bankruptcy model. Our results show that modeling these strong nonlinearities yields substantially improved bankruptcy predictions, on the order of 70%-90%, compared with a standard logistic model. The spline model provides several important and surprising insights into nonmonotonic bankruptcy relationships. We find that low-leveraged as well as highly profitable firms are riskier than a standard model suggests, possibly a manifestation of credit rationing and excess cash-flow volatility.

Department of Neurology, Boston Children’s Hospital, Boston, Massachusetts, United States of America.

Craddock, R. Cameron

Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America, Center for the Developing Brain, Child Mind Institute, New York, New York, United States of America.

Department of Psychology, Stanford University, Stanford, California, United States of America.

Flandin, Guillaume

Wellcome Trust Centre for Neuroimaging, London, United Kingdom.

Ghosh, Satrajit S.

McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, Department of Otolaryngology, Harvard Medical School, Boston, Massachusetts, United States of America.

Guntupalli, J. Swaroop

Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States of America.

The rate of progress in human neurosciences is limited by the inability to easily apply a wide range of analysis methods to the plethora of different datasets acquired in labs around the world. In this work, we introduce a framework for creating, testing, versioning and archiving portable applications for analyzing neuroimaging data organized and described in compliance with the Brain Imaging Data Structure (BIDS). The portability of these applications (BIDS Apps) is achieved by using container technologies that encapsulate all binary and other dependencies in one convenient package. BIDS Apps run on all three major operating systems with no need for complex setup and configuration, and, thanks to the comprehensiveness of the BIDS standard, they require little manual user input. Previous containerized data-processing solutions were limited to single-user environments and were not compatible with most multi-tenant High Performance Computing systems. BIDS Apps overcome this limitation by taking advantage of the Singularity container technology. As a proof of concept, this work is accompanied by 22 ready-to-use BIDS Apps packaging a diverse set of commonly used neuroimaging algorithms.
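BIDS Apps share a common command-line convention: positional arguments for the input dataset, the output directory, and the analysis level (participant or group). A hedged invocation example using Docker, with placeholder paths and the `bids/example` demonstration image (substitute any published BIDS App):

```shell
# Run a BIDS App on a local BIDS-formatted dataset via Docker.
# /path/to/bids_dataset and /path/to/outputs are placeholders.
docker run -ti --rm \
    -v /path/to/bids_dataset:/bids_dataset:ro \
    -v /path/to/outputs:/outputs \
    bids/example \
    /bids_dataset /outputs participant --participant_label 01
```

On multi-tenant HPC systems the same image would instead be executed through Singularity, which is the compatibility point the abstract highlights.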

Multiple time series of environmental quality data with similar, but not necessarily identical, trends call for multivariate methods for trend detection and adjustment for covariates. Here, we show how an additive model in which the multivariate trend function is specified in a nonparametric fashion (and the adjustment for covariates is based on a parametric expression) can be used to estimate how the human impact on an ecosystem varies with time and across components of the observed vector time series. More specifically, we demonstrate how a roughness penalty approach can be utilized to impose different types of smoothness on the function surface that describes trends in environmental quality as a function of time and vector component. Compared to other tools used for this purpose, such as Gaussian smoothers and thin plate splines, an advantage of our approach is that the smoothing pattern can easily be tailored to different types of relationships between the vector components. We give explicit roughness penalty expressions for data collected over several seasons or representing several classes on a linear or circular scale. In addition, we define a general separable smoothing method. A new resampling technique that preserves statistical dependencies over time and across vector components enables realistic calculations of confidence and prediction intervals.
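The roughness-penalty idea can be illustrated with a one-dimensional Whittaker-style smoother, a deliberately simplified stand-in for the paper's multivariate method. The sketch below (all data simulated) shows how a second-order difference penalty is modified to wrap around for components on a circular scale, such as seasons:

```python
import numpy as np

def second_diff_matrix(m, circular=False):
    """Second-order difference operator D.  With circular=True the
    differences wrap around, which suits components on a circular scale
    (e.g. months of the year); otherwise the scale is treated as linear."""
    if circular:
        D = np.zeros((m, m))
        for i in range(m):
            D[i, (i - 1) % m] = 1.0
            D[i, i] = -2.0
            D[i, (i + 1) % m] = 1.0
    else:
        D = np.zeros((m - 2, m))
        for i in range(m - 2):
            D[i, i], D[i, i + 1], D[i, i + 2] = 1.0, -2.0, 1.0
    return D

def penalized_smooth(y, lam, circular=False):
    """Whittaker-style smoother: argmin_f ||y - f||^2 + lam ||D f||^2,
    solved in closed form as f = (I + lam D'D)^{-1} y."""
    D = second_diff_matrix(len(y), circular)
    return np.linalg.solve(np.eye(len(y)) + lam * D.T @ D, y)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
y = np.sin(t) + 0.3 * rng.standard_normal(60)   # noisy seasonal signal
f = penalized_smooth(y, lam=10.0, circular=True)
```

Choosing between the circular and linear versions of `D` is a one-dimensional analogue of tailoring the penalty to the relationship between vector components; the paper's separable method applies such penalties jointly over time and component.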

Halje, Karin

et al.

Young Adults Centre, Linköping, Sweden.

Timpka, Toomas

Linköping University, Department of Medical and Health Sciences, Division of Community Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Center for Health and Developmental Care, Center for Public Health.

Objectives: To examine a self-referral psychological service provided to young adults with regard to effects on anxiety, depression and psychological distress, and to explore client factors predicting non-adherence and non-response. Design: Observational study over a 2-year period. Setting: Young Adults Centre providing psychological services by self-referral (pre-primary care) to Linköping, Åtvidaberg and Kinda municipalities (combined population 145 000) in Östergötland county, Sweden. Participants: 607 young adults (16-25 years of age); 71% females (n = 429). Intervention: Individually scheduled cognitive behavioural therapy delivered in up to six 45 min sessions structured according to an assessment of the client's mental health problems: anxiety, depression, anxiety and depression combined, or decreased distress without specific anxiety or depression. Primary outcome measures: Pre-post intervention changes in psychological distress (General Health Questionnaire-12, GHQ-12) and in anxiety and depression (Hospital Anxiety and Depression Scale, HADS-A/D). Results: 192 clients (32.5%) discontinued the intervention on their own initiative and 39 clients (6.6%) were referred to a psychiatric clinic during the course of the intervention. Intention-to-treat analyses including all clients showed a medium treatment effect size (d = 0.64) with regard to psychological distress, and small effect sizes with regard to anxiety (d = 0.58) and depression (d = 0.57). When the analyses were restricted to clients who adhered to the agreed programme, a large effect size (d = 1.26) was observed for psychological distress, and medium effect sizes for anxiety (d = 1.18) and depression (d = 1.19). Lower age and a high initial HADS-A score were the strongest risk factors for non-adherence, and inability to concentrate and thinking of oneself as a worthless person increased the risk of discontinuation.
Conclusions: We conclude that provision of psychological services to young people through a self-referral centre has the potential to improve long-term mental health in communities, but management of non-adherence remains a central challenge.
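The reported effect sizes are standardized pre-post changes (Cohen's d). One common convention for within-subject designs divides the mean change by the standard deviation of the change scores; a minimal sketch with made-up GHQ-12 scores (not the study's data):

```python
import numpy as np

def cohens_d_paired(pre, post):
    """Standardized pre-post effect size: mean improvement divided by the
    standard deviation of the change scores (one common convention for
    within-subject designs; other denominators, e.g. baseline SD, are
    also used in the literature)."""
    diff = np.asarray(pre, float) - np.asarray(post, float)
    return diff.mean() / diff.std(ddof=1)

# Made-up GHQ-12 scores (higher = more distress), purely illustrative.
pre = [10, 12, 9, 11, 14, 8]
post = [7, 9, 8, 8, 10, 6]
d = cohens_d_paired(pre, post)
print(f"d = {d:.2f}")
```

Under the usual rules of thumb, d around 0.2 is small, 0.5 medium, and 0.8 or more large, which is the scale against which the abstract's intention-to-treat and per-protocol estimates are labelled.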