The evolution of "informatics" technologies has the potential to generate massive databases, but the extent to which personalized medicine may be realized depends on the extent to which these rich databases can be used to advance understanding of disease molecular profiles and, ultimately, be integrated for treatment selection, necessitating robust methodology for dimension reduction. Yet statistical methods proposed to address the challenges arising from the high dimensionality of omics-type data rely predominantly on linear models and emphasize associations deriving from prognostic biomarkers...

We discuss causal mediation analyses for survival data and propose a new approach based on the additive hazards model. The emphasis is on a dynamic point of view, that is, understanding how the direct and indirect effects develop over time. Hence, importantly, we allow for a time-varying mediator. To define direct and indirect effects in such a longitudinal survival setting, we take an interventional approach (Didelez, 2018) in which treatment is separated into one aspect affecting the mediator and a different aspect affecting survival...

The concordance correlation coefficient (CCC) and the probability of agreement (PA) are two frequently used measures for evaluating the degree of agreement between measurements generated by two different methods. In this paper, we consider the CCC and the PA using the bivariate normal distribution for modeling the observations obtained by two measurement methods. The main aim of this paper is to develop diagnostic tools for detecting observations that are influential on the maximum likelihood estimators of the CCC and the PA, using the local influence methodology rather than likelihood displacement...
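As a point of reference for the two measures themselves (not the local influence diagnostics, which are the paper's contribution), the sample CCC and a normal-model PA can be sketched as follows; the function names are illustrative:

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    # standard normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def ccc(x, y):
    # Lin's concordance correlation coefficient from ML (biased) moments
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    sxy = ((x - mx) * (y - my)).mean()
    return 2.0 * sxy / (vx + vy + (mx - my) ** 2)

def prob_agreement(x, y, delta):
    # P(|X - Y| < delta), treating the paired difference as normal,
    # as it is under the bivariate normal model
    d = np.asarray(x, float) - np.asarray(y, float)
    mu, sd = d.mean(), d.std()
    return Phi((delta - mu) / sd) - Phi((-delta - mu) / sd)
```

Both quantities are simple functionals of the bivariate normal parameters, which is what makes influence diagnostics on their ML estimators tractable.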

Different cure fraction models have been used in the analysis of lifetime data in the presence of cured patients. This paper considers mixture and nonmixture models based on the discrete Weibull distribution to model recurrent event data in the presence of a cure fraction. The novelty of this study is the use of a discrete lifetime distribution in place of the usual continuous lifetime distributions for lifetime data in the presence of a cured fraction, censored data, and covariates. To verify the fit of the proposed model, the use of randomized quantile residuals is proposed...
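For intuition, the discrete Weibull survival function (in the Nakagawa-Osaki parameterization, an assumption here) and the population survival implied by a mixture cure model can be sketched as follows; parameter names are illustrative:

```python
def dw_survival(t, q, beta):
    # Discrete Weibull (Nakagawa-Osaki): S(t) = P(T >= t) = q^(t^beta),
    # for t = 0, 1, 2, ... with 0 < q < 1 and beta > 0
    return q ** (t ** beta)

def dw_pmf(t, q, beta):
    # P(T = t) as the difference of consecutive survival values
    return dw_survival(t, q, beta) - dw_survival(t + 1, q, beta)

def mixture_cure_survival(t, pi_cure, q, beta):
    # Mixture cure model: a cured fraction pi_cure never experiences
    # the event, so population survival plateaus at pi_cure
    return pi_cure + (1.0 - pi_cure) * dw_survival(t, q, beta)
```

The plateau at `pi_cure` is the defining feature of the mixture formulation; the nonmixture variant reaches a plateau through a different construction.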

The popularity of penalized regression in high-dimensional data analysis has led to a demand for new inferential tools for these models. False discovery rate control is widely used in high-dimensional hypothesis testing, but has only recently been considered in the context of penalized regression. Almost all of this work, however, has focused on lasso-penalized linear regression. In this paper, we derive a general method for controlling the marginal false discovery rate that can be applied to any penalized likelihood-based model, such as logistic regression and Cox regression...
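The flavor of marginal false discovery rate estimation can be conveyed in the special orthonormal-design, known-sigma case, where the lasso reduces to coordinatewise soft-thresholding and the expected number of null features crossing the threshold has a closed form. This is only a sketch of the idea under those simplifying assumptions, not the paper's general method:

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

rng = np.random.default_rng(1)
n, p, sigma = 200, 50, 1.0
X, _ = np.linalg.qr(rng.normal(size=(n, p)))   # orthonormal columns, X'X = I
beta = np.zeros(p)
beta[:5] = 6.0                                  # five true signals
y = X @ beta + sigma * rng.normal(size=n)

lam = 2.0
z = X.T @ y                                     # z_j = beta_j + N(0, sigma^2)
bhat = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)  # lasso = soft-threshold
selected = np.flatnonzero(bhat)

# If all p features were null, the expected number selected would be
# p * P(|N(0, sigma^2)| > lam); dividing by the observed selection count
# gives a (conservative) marginal FDR estimate.
efd = p * 2.0 * Phi(-lam / sigma)
mfdr = efd / max(len(selected), 1)
```

In the general penalized-likelihood setting the same logic applies with the score statistics in place of `z`, which is what allows extension to logistic and Cox regression.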

In linear mixed-effects models, random effects are used to capture the heterogeneity and variability between individuals due to unmeasured covariates or unknown biological differences. Testing for the need for random effects is a nonstandard problem because it requires testing on the boundary of the parameter space, where the asymptotic chi-squared distribution of classical tests such as the likelihood ratio and score tests is incorrect. Several tests have been proposed in the literature to overcome this difficulty; however, all of them rely on the restrictive assumption of i...
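For a single variance component, the standard remedy on the boundary is to compare the likelihood ratio statistic to a 50:50 mixture of chi-squared distributions with 0 and 1 degrees of freedom (as in Self and Liang, 1987). A minimal sketch of the resulting p-value, using the identity P(chi2_1 > t) = 2*Phi(-sqrt(t)):

```python
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def boundary_lrt_pvalue(lrt_stat):
    """p-value for testing a single variance component sigma^2 = 0 with the
    50:50 mixture of chi2_0 and chi2_1; the chi2_0 part puts mass at zero,
    so for a positive statistic the p-value is half the chi2_1 tail."""
    if lrt_stat <= 0.0:
        return 1.0
    return 0.5 * 2.0 * Phi(-sqrt(lrt_stat))
```

The naive chi2_1 p-value is exactly twice this, so ignoring the boundary makes the test conservative; the tests surveyed in the abstract refine this picture under weaker assumptions.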

Designs incorporating more than one endpoint have become popular in drug development. One such design allows the incorporation of short-term information in an interim analysis if the long-term primary endpoint has not yet been observed for some of the patients. First, we consider a two-stage design with binary endpoints allowing for futility stopping only, based on conditional power under both fixed and observed effects. Design characteristics of three estimators: using the primary long-term endpoint only, the short-term endpoint only, and combining data from both are compared...
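A minimal sketch of conditional power for a generic one-sided z-test, which underlies this kind of futility rule: the "fixed effect" version plugs the planned effect into `theta`, while the "observed effect" version plugs in the interim estimate. The function and its parameterization are illustrative assumptions, not the paper's estimators:

```python
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

Z_ALPHA = 1.959963984540054  # one-sided alpha = 0.025 critical value

def conditional_power(z1, t, theta, info_max):
    """Conditional power of the final one-sided z-test, given the interim
    z-statistic z1 at information fraction t, assuming the remaining data
    accrue with drift theta. info_max is the total planned information; for
    a difference in proportions it is roughly 1 / Var(p1_hat - p0_hat)."""
    return Phi((z1 * sqrt(t) - Z_ALPHA) / sqrt(1.0 - t)
               + theta * sqrt(info_max * (1.0 - t)))
```

A futility rule then stops the trial when this quantity falls below a prespecified threshold; incorporating the short-term endpoint changes how `z1` and `theta` are estimated at the interim, not the formula itself.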

Marginal tests based on individual SNPs are routinely used in genetic association studies. Studies have shown that haplotype-based methods may provide more power in disease mapping than methods based on single markers when, for example, multiple disease-susceptibility variants occur within the same gene. A limitation of haplotype-based methods is that the number of parameters increases exponentially with the number of SNPs, inducing a commensurate increase in the degrees of freedom and weakening the power to detect associations...

The analysis of cause of death is becoming an increasingly important topic in oncology. A distinction is usually drawn between disease-related and disease-unrelated death. A frequently used approach is to define a death as disease-related when progression to an advanced phase has occurred beforehand, and as disease-unrelated otherwise. The data are often analyzed as competing risks, although a progressive illness-death model might in fact describe the situation more precisely. In this study, we investigated the circumstances under which this misspecification leads to biased estimates of the state occupation probabilities...

In clinical trials, sample size reestimation is a useful strategy for mitigating the risk of uncertainty in design assumptions and ensuring sufficient power for the final analysis. In particular, sample size reestimation based on the unblinded interim effect size can often lead to a sample size increase, and statistical adjustment is usually needed for the final analysis to ensure that the type I error rate is appropriately controlled. In the current literature, sample size reestimation and the corresponding type I error control are discussed in the context of maintaining the original randomization ratio across treatment groups, which we refer to as "proportional increase...

We consider the estimation of the prevalence of a rare disease, and of the log-odds ratio for two specified groups of individuals, from group testing data. For a low-prevalence disease, the maximum likelihood estimate of the log-odds ratio is severely biased. However, the Firth correction to the score function leads to a considerable improvement in the estimator. Also, for a low-prevalence disease, if the diagnostic test is imperfect, group testing is found to yield a more precise estimate of the log-odds ratio than individual testing...
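For intuition about the setting, here is the prevalence MLE under group testing with a perfect assay, together with a Jeffreys-style pool-level adjustment. The adjustment is shown only as a rough stand-in for the score-function correction in the abstract, which operates on the log-odds ratio; the function names are illustrative:

```python
def prevalence_mle(T, n, s):
    """ML estimate of prevalence p from group testing: n pools of size s,
    T of which test positive (perfect test assumed). A pool is positive
    with probability 1 - (1 - p)^s, whose MLE is T/n; invert for p."""
    return 1.0 - (1.0 - T / n) ** (1.0 / s)

def prevalence_firth_style(T, n, s):
    """Illustrative small-sample adjustment: apply the Jeffreys/Firth-type
    correction (T + 1/2) / (n + 1) at the pool level before transforming.
    This is a stand-in, not the paper's correction to the score function."""
    pi_adj = (T + 0.5) / (n + 1.0)
    return 1.0 - (1.0 - pi_adj) ** (1.0 / s)
```

Note that the plain MLE degenerates to 0 when no pool is positive, exactly the low-prevalence regime where the bias problem is most acute, while the adjusted estimate stays strictly positive.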

If the number of treatments in a network meta-analysis is large, it may be possible and useful to model the main effect of treatment as random, that is to say, as random realizations from a normal distribution of possible treatment effects. This then constitutes a third sort of random effect that may be considered in connection with such analyses. The first and most common treats the treatment-by-trial interaction as random, and the second, rather rarer, treats the main effects of trial as random, thus permitting the recovery of intertrial information...

We argue that the term "relative risk" should not be used as a synonym for "hazard ratio" and encourage the use of the probabilistic index as an alternative effect measure for Cox regression. The probabilistic index is the probability that the event time of an exposed or treated subject exceeds the event time of an unexposed or untreated subject, conditional on the other covariates. It arises as a well-known and simple transformation of the hazard ratio and nicely reveals the latter's interpretational limitations...
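The transformation is simple enough to state in one line: under proportional hazards with hazard ratio HR, the probabilistic index equals 1/(1 + HR), whatever the baseline hazard. A small sketch with a Monte Carlo check using exponential event times (illustrative names):

```python
import numpy as np

def probabilistic_index(hazard_ratio):
    # Under proportional hazards, P(T_exposed > T_unexposed | covariates)
    # = 1 / (1 + HR), independent of the baseline hazard
    return 1.0 / (1.0 + hazard_ratio)

# Monte Carlo check with exponential event times (baseline hazard 1):
rng = np.random.default_rng(0)
hr = 2.0
t_unexposed = rng.exponential(1.0, size=100_000)
t_exposed = rng.exponential(1.0 / hr, size=100_000)  # hazard scaled by hr
mc_estimate = (t_exposed > t_unexposed).mean()        # close to 1/(1 + 2)
```

A hazard ratio of 2 thus corresponds to only a 1/3 chance that the exposed subject outlives the unexposed one, which illustrates why reading the hazard ratio as a relative risk overstates the effect.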

In this paper, we investigate K-group comparisons of survival endpoints for observational studies. In clinical databases for observational studies, treatments for patients are chosen with probabilities that vary with the patients' baseline characteristics. This often results in noncomparable treatment groups because of imbalance in the baseline characteristics of patients among the treatment groups. To overcome this issue, we conduct a propensity analysis and match subjects with similar propensity scores across treatment groups, or compare weighted group means (or weighted survival curves for censored outcome variables) using inverse probability weighting (IPW)...
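A minimal sketch of the IPW comparison for the binary-treatment case (the abstract considers K groups and censored outcomes), with the propensity score fitted by a hand-rolled Newton-Raphson logistic regression; all names are illustrative:

```python
import numpy as np

def fit_propensity(x, a, iters=25):
    """Logistic regression of treatment a on covariate(s) x by
    Newton-Raphson; returns fitted propensity scores P(A = 1 | x)."""
    Z = np.column_stack([np.ones(len(x)), x])
    b = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ b))
        H = Z.T @ (Z * (p * (1 - p))[:, None]) + 1e-8 * np.eye(Z.shape[1])
        b += np.linalg.solve(H, Z.T @ (a - p))
    return 1.0 / (1.0 + np.exp(-Z @ b))

def ipw_means(y, a, ps):
    """Hajek-type IPW group means for a binary treatment indicator a."""
    w1, w0 = a / ps, (1 - a) / (1 - ps)
    return (w1 * y).sum() / w1.sum(), (w0 * y).sum() / w0.sum()

# Confounded example: x drives both treatment assignment and outcome,
# while the true treatment effect is zero.
rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=n)
a = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-x))).astype(float)
y = 2.0 * x + rng.normal(size=n)

naive_diff = y[a == 1].mean() - y[a == 0].mean()
m1, m0 = ipw_means(y, a, fit_propensity(x, a))
ipw_diff = m1 - m0
```

Here the naive group comparison is badly biased while the IPW contrast is close to the true null effect; for censored outcomes the same weights enter a weighted Kaplan-Meier or weighted log-rank comparison instead.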

In the estimation of proportions by group testing, unequal group sizes result in an ambiguous ordering of the sample space, which complicates the construction of exact confidence intervals. The total number of positive groups is shown to be a suitable statistic for ordering outcomes, provided its ties are broken by the MLE. We propose an interval estimation method based on this quantity, with a mid-P correction. Coverage is evaluated using group testing problems in plant disease assessment and virus transmission by insect vectors...
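For the simpler equal-group-size case, where the ordering by the number of positive groups is unambiguous, a mid-P interval can be obtained by inverting the binomial tails at the pool level. This sketch deliberately omits the unequal-size setting and the MLE tie-breaking that are the abstract's contribution:

```python
from math import comb

def pool_prob(p, s):
    # probability a pool of size s tests positive (perfect assay assumed)
    return 1.0 - (1.0 - p) ** s

def binom_pmf(k, n, pi):
    return comb(n, k) * pi ** k * (1.0 - pi) ** (n - k)

def midp_ci(t, n, s, alpha=0.05, tol=1e-8):
    """Mid-P confidence interval for prevalence p from group testing with
    n equal-sized pools of size s, of which t test positive."""
    def upper_tail(p):   # P(T >= t) - 0.5 * P(T = t), increasing in p
        pi = pool_prob(p, s)
        return (sum(binom_pmf(k, n, pi) for k in range(t, n + 1))
                - 0.5 * binom_pmf(t, n, pi))
    def lower_tail(p):   # P(T <= t) - 0.5 * P(T = t), decreasing in p
        pi = pool_prob(p, s)
        return (sum(binom_pmf(k, n, pi) for k in range(0, t + 1))
                - 0.5 * binom_pmf(t, n, pi))
    def bisect(f, target, increasing):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if (f(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0
    lower = 0.0 if t == 0 else bisect(upper_tail, alpha / 2, increasing=True)
    upper = 1.0 if t == n else bisect(lower_tail, alpha / 2, increasing=False)
    return lower, upper
```

With unequal pool sizes the statistic T no longer determines the likelihood, which is exactly why the abstract's ordering needs the MLE to break ties among outcomes with the same T.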

The hierarchical metaregression (HMR) approach is a multiparameter Bayesian approach for meta-analysis, which generalizes standard mixed effects models by explicitly modeling the data collection process in the meta-analysis. The HMR makes it possible to investigate the potential external validity of experimental results as well as to assess the internal validity of the studies included in a systematic review. The HMR automatically identifies studies presenting conflicting evidence and downweights their influence in the meta-analysis...

Next-generation sequencing (NGS) experiments are often performed in biomedical research nowadays, leading to methodological challenges related to the high-dimensional and complex nature of the recorded data. In this work we review some of the issues that arise in disorder detection from NGS experiments, that is, when the focus is the detection of deletion and duplication disorders for homozygosity and heterozygosity in DNA sequencing. A statistical model to cope with guanine/cytosine bias and phasing and prephasing phenomena at base level is proposed, and a goodness-of-fit procedure for disorder detection is derived...

Data Monitoring Committees (DMCs) are an integral part of clinical drug development. Their use has evolved along with changing study designs and regulatory expectations, with associated statistical and ethical implications. Although there is guidance from the different regulatory agencies, there are opportunities to bring more consistency to the practical issues of establishing and operating a DMC. Challenging issues include defining the scope of DMC decisions, the regulatory requirements and expectations, the perceived independence of DMCs, the specific focus primarily on safety, etc...

Successful pharmaceutical drug development requires finding correct doses. The issues that conventional dose-response analyses consider, namely whether responses are related to doses, which doses have responses differing from a control dose response, the functional form of a dose-response relationship, and the dose(s) to carry forward, do not need to be addressed simultaneously. Determining if a dose-response relationship exists, regardless of its functional form, and then identifying a range of doses to study further may be a more efficient strategy...

We present a method to fit a mixed effects Cox model with interval-censored data. Our proposal is based on a multiple imputation approach that uses the truncated Weibull distribution to replace the interval-censored data by imputed survival times and then uses established mixed effects Cox methods for right-censored data. Interval-censored data were encountered in a database corresponding to a compilation of retrospective data from eight analytical treatment interruption (ATI) studies in 158 human immunodeficiency virus (HIV) positive combination antiretroviral treatment (cART) suppressed individuals...
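The imputation step can be sketched as inverse-CDF sampling from a Weibull distribution truncated to the censoring interval; the function name is illustrative, and in practice the shape and scale would be estimated from the observed data:

```python
import numpy as np

def weibull_cdf(t, shape, scale):
    return 1.0 - np.exp(-((t / scale) ** shape))

def impute_interval_censored(left, right, shape, scale, rng):
    """Draw one imputed event time from a Weibull(shape, scale) truncated to
    (left, right], by inverting the CDF on a uniform draw restricted to
    (F(left), F(right)); right may be np.inf for right-censored records."""
    a = weibull_cdf(left, shape, scale)
    b = 1.0 if np.isinf(right) else weibull_cdf(right, shape, scale)
    u = rng.uniform(a, b)
    return scale * (-np.log1p(-u)) ** (1.0 / shape)
```

Repeating the draw per subject yields one completed right-censored dataset; fitting the mixed effects Cox model to each completed dataset and pooling the fits across imputations completes the multiple imputation scheme.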