Logistic regression - overview

This page offers structured overviews of one or more selected methods.

For logistic regression, the assumptions are:

In the population, the relationship between the independent variables and the log odds $\ln (\frac{\pi_{y=1}}{1 - \pi_{y=1}})$ is linear

The residuals are independent of one another

Often ignored additional assumption:

Variables are measured without error

Also pay attention to:

Multicollinearity

Outliers

For the Pearson correlation (the comparison method), the assumptions are:

In the population, the two variables are jointly normally distributed (this covers the normality, homoscedasticity, and linearity assumptions)

Sample of pairs is a simple random sample from the population of pairs. That is, pairs are independent of one another

Note: these assumptions are only important for the significance test and confidence interval, not for the correlation coefficient itself. The correlation coefficient just measures the strength of the linear relationship between two variables.

Test statistic

Model chi-squared test for the complete regression model:

$X^2 = D_{null} - D_K = \mbox{null deviance} - \mbox{model deviance} $
$D_{null}$, the null deviance, is conceptually similar to the total variance of the dependent variable in OLS regression analysis. $D_K$, the model deviance, is conceptually similar to the residual variance in OLS regression analysis.
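As a sketch of how this statistic could be computed from scratch, here is simulated data with a minimal Newton-Raphson logistic fit in NumPy (all data and names are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
# simulate y with log odds linear in x (true intercept 0.5, slope 1.2)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x))))

X = np.column_stack([np.ones(n), x])   # design matrix: intercept + predictor

# Newton-Raphson fit of the logistic regression coefficients
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-(X @ beta)))
    beta += np.linalg.solve(X.T @ (X * (p * (1 - p))[:, None]), X.T @ (y - p))

def deviance(eta, y):
    """-2 times the Bernoulli log-likelihood, given the linear predictor eta."""
    return -2 * np.sum(y * eta - np.log1p(np.exp(eta)))

D_K = deviance(X @ beta, y)   # model deviance
p_bar = y.mean()              # intercept-only MLE: fitted probability = mean of y
D_null = deviance(np.full(n, np.log(p_bar / (1 - p_bar))), y)   # null deviance

X2 = D_null - D_K   # model chi-squared; df = number of predictors (1 here)
```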

Wald test for individual $\beta_k$:
The Wald statistic can be defined in two ways:

Wald $ = \dfrac{b_k^2}{SE^2_{b_k}}$

Wald $ = \dfrac{b_k}{SE_{b_k}}$

SPSS uses the first definition. Under the null hypothesis, the first form approximately follows a chi-squared distribution with 1 degree of freedom; the second form approximately follows a standard normal distribution.
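With a hypothetical coefficient estimate and standard error (illustrative numbers, not from the source), the two definitions are equivalent: the squared $z$ form equals the chi-squared form, and both yield the same two-sided p-value:

```python
import math

b_k, se_bk = 0.8, 0.25      # hypothetical estimate and standard error

wald_chi2 = b_k ** 2 / se_bk ** 2   # first definition (SPSS): chi-squared, 1 df
wald_z = b_k / se_bk                # second definition: standard normal

# Both forms give the same two-sided p-value:
p_from_z = math.erfc(abs(wald_z) / math.sqrt(2))    # 2 * P(Z > |z|)
p_from_chi2 = math.erfc(math.sqrt(wald_chi2 / 2))   # P(chi2_1 > z^2)
```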

Likelihood ratio chi-squared test for individual $\beta_k$:

$X^2 = D_{K-1} - D_K$
$D_{K-1}$ is the model deviance, where independent variable $k$ is excluded from the model. $D_{K}$ is the model deviance, where independent variable $k$ is included in the model.
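A minimal sketch of this likelihood ratio test on simulated data, using a small NumPy Newton-Raphson fitter (all data and names are illustrative, not from the source):

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; return coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-(X @ beta)))
        H = X.T @ (X * (p * (1 - p))[:, None])        # observed information
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

def deviance(X, y, beta):
    """Model deviance: -2 times the Bernoulli log-likelihood."""
    eta = X @ beta
    return -2 * np.sum(y * eta - np.log1p(np.exp(eta)))

rng = np.random.default_rng(0)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 1.0 * x1 + 0.8 * x2))))

X_full = np.column_stack([np.ones(n), x1, x2])   # includes variable k (= x2)
X_red = X_full[:, :2]                            # excludes variable k

D_K = deviance(X_full, y, fit_logistic(X_full, y))     # variable k included
D_Km1 = deviance(X_red, y, fit_logistic(X_red, y))     # variable k excluded

X2 = D_Km1 - D_K   # compare to a chi-squared distribution with 1 df
```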

Approximate $C$% confidence interval for the Pearson correlation $\rho$, based on the Fisher transformation:

$r_{Fisher} \pm z^* \times \dfrac{1}{\sqrt{N - 3}}$

where $r_{Fisher} = \frac{1}{2} \times \log\Bigg(\dfrac{1 + r}{1 - r} \Bigg)$ and $z^*$ is the value under the normal curve with the area $C / 100$ between $-z^*$ and $z^*$ (e.g. $z^*$ = 1.96 for a 95% confidence interval).
Then transform the interval endpoints back, via $r = \dfrac{e^{2 r_{Fisher}} - 1}{e^{2 r_{Fisher}} + 1}$, to get an approximate $C$% confidence interval for $\rho$.
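A small worked sketch of this interval in Python, with an illustrative sample correlation and sample size (the numbers are assumptions, not from the source):

```python
import math

r, n = 0.45, 50          # illustrative sample correlation and sample size
z_star = 1.96            # for a 95% confidence interval

r_fisher = math.atanh(r)             # = 0.5 * log((1 + r) / (1 - r))
se = 1 / math.sqrt(n - 3)            # standard error on the Fisher scale
lo = r_fisher - z_star * se
hi = r_fisher + z_star * se

# transform the endpoints back to the correlation scale
ci = (math.tanh(lo), math.tanh(hi))
```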

A likelihood-based goodness of fit measure is the pseudo $R^2$:

$R^2_L = \dfrac{D_{null} - D_K}{D_{null}}$

There are several other goodness of fit measures for logistic regression; no single measure is generally agreed upon.
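As a toy check of the formula, with made-up deviance values (not from the source):

```python
# Illustrative deviances for a hypothetical fitted model
D_null, D_K = 250.0, 180.0

# Proportional reduction in deviance relative to the null model
R2_L = (D_null - D_K) / D_null   # = 70 / 250 = 0.28
```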

The Pearson correlation coefficient is a measure for the linear relationship between two quantitative variables.

The Pearson correlation coefficient squared reflects the proportion of variance explained in one variable by the other variable.

The Pearson correlation coefficient can take on values between -1 (perfect negative relationship) and 1 (perfect positive relationship). A value of 0 means no linear relationship.

The absolute size of the Pearson correlation coefficient is not affected by any linear transformation of the variables. However, the sign of the Pearson correlation will flip when the scores on one of the two variables are multiplied by a negative number (reversing the direction of measurement of that variable). For example:

the correlation between $x$ and $y$ is equivalent to the correlation between $3x + 5$ and $2y - 6$.

the absolute value of the correlation between $x$ and $y$ is equivalent to the absolute value of the correlation between $-3x + 5$ and $2y - 6$. However, the signs of the two correlation coefficients will be in opposite directions, due to the multiplication of $x$ by $-3$.
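These invariance properties can be checked numerically, for instance with NumPy on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.6 * x + rng.normal(size=100)   # x and y positively related

r = np.corrcoef(x, y)[0, 1]
r_lin = np.corrcoef(3 * x + 5, 2 * y - 6)[0, 1]    # positive linear transforms
r_neg = np.corrcoef(-3 * x + 5, 2 * y - 6)[0, 1]   # one negative multiplier

# correlation is unchanged by positive linear transforms,
# and only its sign flips under a negative multiplier
```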

The Pearson correlation coefficient does not say anything about causality.

If you also have code (dummy) variables as independent variables, you can put these in the box below Covariates as well.

Instead of transforming your categorical independent variable(s) into code variables, you can also put the untransformed categorical independent variables in the box below Factors. Jamovi will then make the code variables for you 'behind the scenes'.