Statistical Decision Tree


Transcript of Statistical Decision Tree

Interval/Ratio Dependent Variable Statistical Decision Tree

This is a decision tree designed to explain each step of selecting an appropriate statistic. If you are unsure how to make a decision, just zoom in and look a bit closer to see more details. Also, feel free to zoom out and try to get a better idea of how all these statistical analyses relate. :) Enjoy!

Node labels from the tree: Sources; Nominal Dependent Variable; Ordinal Dependent Variable; 1 Independent Variable with 2+ levels; 2+ Independent Variables; 0 Independent Variables; One-Way MANOVA; Factor Analysis; Between subjects design; Not a between subjects design; 2 groups; 3+ groups; Friedman; Wilcoxon or Sign Test; Kruskal-Wallis; >20 per cell; <20 per cell; rank-sum; Mann-Whitney U; Looking for the difference between expected and observed?; Looking at the relationship between two variables?; Chi square goodness of fit; Chi square test of independence for Inferential Statistics; No Independent Variables; Independent Variables; Known sigma; Unknown sigma; independent samples t-test or 1-way between subjects ANOVA; dependent samples t-test or 1-way within subjects ANOVA; Multiple Dependent Variables; 2+ Independent Variables; 1 Independent Variable; Independent Variables are all between subjects; Independent Variables are all within subjects; Independent Variables are all between subjects and

True zero

A true zero typically refers to the absence of something. It is the point from which positive or negative numerical quantities can be measured.

Between Subjects

An experimental design with 2+ groups of subjects, each group being tested under a different treatment condition.

Not Between Subjects

An experimental design that is not between subjects is referred to as a within-subjects (aka repeated measures) design. In within-subjects designs, the participants are exposed to multiple levels of the treatment conditions. This is also the type of measure used in longitudinal studies.

Per cell

Per cell refers to per condition.

Sigma

Sigma shows how much variation from the average there is (standard deviation). A low sigma indicates the data points are very close to the mean, whereas a high sigma indicates that the data points are spread out over a large range. The deviation is how far a given point is from the average. Image source: David Chandler, MIT: http://web.mit.edu/newsoffice/2012/explained-sigma-0209.html

Independent Variables

Independent variables are the causes or predictor variables in the experiment. Technically, they are only referred to as independent variables in a true experiment; in all other types of experimental design they are called predictor variables, as they predict the dependent variables.

This one gives great info on how to interpret statistical analyses in SPSS, R, SAS, and Stata (this is my favorite statistical website, I think; you should check it out) :) http://www.ats.ucla.edu/stat/mult_pkg/whatstat/

Here is my second favorite (as this one is obviously my favorite) interactive decision tree: http://www.microsiris.com/Statistical%20Decision%20Tree/default.htm

This is a glossary from the above interactive decision tree: http://www.microsiris.com/Statistical%20Decision%20Tree/Glossary.htm#STATISTICAL MEASURE

I found this website particularly useful for a variety of things. It does a great job explaining power, effect size, and such very succinctly: http://onlinestatbook.com/2/introduction/levels_of_measurement.html

These are a few of my favorites.

Chi square goodness of fit

What it does: With a chi-square goodness-of-fit test, we can see whether the observed (actual experimental) results differ from the expected (hypothesized/predicted) results.

What to report: You'll want to report the chi-square value, degrees of freedom, and the p value.
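As a minimal sketch (the die-roll counts are made up, and using scipy.stats is an assumption about tooling; SPSS/R would report the same numbers):

```python
from scipy.stats import chisquare

# Observed counts from 88 rolls of a die; expected defaults to equal
# frequencies (88/6 per face), i.e. the hypothesized uniform distribution.
observed = [16, 18, 16, 14, 12, 12]
stat, p = chisquare(observed)

# Report: chi-square value, degrees of freedom (k - 1 = 5), and p value.
print(f"chi2({len(observed) - 1}) = {stat:.2f}, p = {p:.3f}")
```

A large p here would mean the observed counts are consistent with the expected (hypothesized) distribution.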

Wilcoxon-Mann-Whitney

What it does: The Wilcoxon-Mann-Whitney test is the nonparametric version of the independent samples t-test. It is handy because you don't have to assume that the dependent variable is normally distributed. (The dependent variable does not have to be interval, but it does have to be ordinal.)

What to report: Make sure to report the z score and p value.
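A minimal sketch (made-up ordinal ratings for two independent groups; scipy.stats is assumed):

```python
from scipy.stats import mannwhitneyu

# Ordinal ratings from two independent groups (made-up data).
group_a = [3, 4, 2, 5, 3, 4, 2, 3]
group_b = [6, 7, 8, 6, 7, 9, 8, 7]

# Two-sided test: is one group's distribution shifted relative to the other?
u, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u}, p = {p:.4f}")
```

Here every value in group_b exceeds every value in group_a, so U is 0 and the p value is very small.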

Chi square test of independence

What it does: With a chi-square test of independence, you can check for a relationship between two categorical variables. This test assumes that the expected count in each cell is at least 5. If you don't meet this criterion, use Fisher's exact test (only for a 2x2 table).

What to report: You'll want to report the chi-square value, degrees of freedom, and the p value. Hint: In SPSS, the chi-square test is under the crosstabs command.
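A minimal sketch (the contingency table is made up; scipy.stats is assumed), including the Fisher's exact fallback mentioned above:

```python
from scipy.stats import chi2_contingency, fisher_exact

# 2x2 table: rows = treatment/control, columns = improved / not improved.
table = [[30, 10],
         [10, 30]]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.5f}")

# If expected counts fall below 5 in a 2x2 table, use Fisher's exact test.
odds_ratio, p_exact = fisher_exact(table)
```

Note that `expected` gives the expected counts, so you can check the at-least-5 assumption directly from the output.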

One-way ANOVA

What it does: With a one-way ANOVA, we can look at the differences in the means of the dependent variable (normally distributed, interval) broken down by levels of the independent variable (2+ categories, categorical).

What to report: The F value is important (as well as its p value). The means and standard deviations of the variables may also be worth reporting.
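A minimal sketch with three made-up groups (scipy.stats is an assumption about tooling):

```python
from scipy.stats import f_oneway

# Interval dependent variable measured in three independent groups (made up).
low = [12, 14, 11, 13, 12]
medium = [15, 16, 14, 17, 15]
high = [24, 26, 25, 27, 24]

# One-way between-subjects ANOVA across the three group means.
f_stat, p = f_oneway(low, medium, high)

# Report: F with its degrees of freedom (2, 12 here), the p value, and
# perhaps the group means and standard deviations.
print(f"F(2, 12) = {f_stat:.2f}, p = {p:.6f}")
```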

Sign test

What it does: You will want to use the sign test when the variable carries just direction information (positive or negative) rather than magnitude. The Wilcoxon is used when there is direction (positive or negative) as well as magnitude information in the data.

(Like the Wilcoxon, this is also a nonparametric test)
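Because the sign test uses only the signs of the differences, it can be sketched with the standard library alone (the difference scores below are made up):

```python
from math import comb

# Difference scores (after - before); only the sign of each one matters.
diffs = [2, 1, 3, -1, 4, 2, 1, -2, 3, 5]
pos = sum(d > 0 for d in diffs)
neg = sum(d < 0 for d in diffs)   # zero differences would be dropped
n = pos + neg

# Two-sided binomial probability of a split at least this lopsided
# under the null hypothesis that positive and negative are equally likely.
k = min(pos, neg)
p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)
print(f"{pos} positive vs {neg} negative, p = {p_value:.4f}")
```

With 8 positives against 2 negatives out of 10, the two-sided p value works out to 0.1094, so this made-up sample would not reach significance.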

Wilcoxon signed rank sum test

What it does: The Wilcoxon signed rank sum test is the nonparametric version of a paired samples t-test (so you don't have to assume that the difference between the two variables is normally distributed or interval). You can use it to compare the differences between two variables.

What to report: You'll want to report the z score and p value.

Independent samples t-test

What it does: There are several types of t-tests: independent samples, paired samples (dependent), and one-sample. This category is specific to the independent samples t-test. Using it, we can test whether a mean (for a normally distributed, interval dependent variable) is the same for two independent groups.

What to report: You'll want to report the t value and p value, and perhaps the raw means and standard deviations.

Dependent (paired) samples t-test

What it does: Using the dependent (paired) samples t-test, we can test whether two means (normally distributed) differ from one another.

What to report: You'll want to report the t value and p value.

One-sample t-test

What it does: Using the one-sample t-test, we can test whether a sample mean is significantly different from a hypothesized value. The t-test follows a Student's t-distribution (versus the z-test, which follows a normal distribution). The t-test is also better for smaller samples than the z-test, and it is more commonly used.

What to report: You'll want to report the t value and p value.

One-Way MANOVA

What it does: The one-way MANOVA is similar to the ANOVA except that there are 2+ dependent variables and one categorical independent variable. It allows us to look at the differences between the dependent variables broken down by level of the independent variable.

What to report: F values, degrees of freedom, and p values for each of the relationships. This test also provides the Type III Sum of Squares in the output.

Factor Analysis

What it does: Factor analysis is used to either decrease the number of variables or find relationships among variables. With this analysis, we can describe the variability among the variables in terms of factors (a factor is something that may encompass one or more of the variables together, helping to decrease the number of variables overall). This analysis shows the interdependence between variables, and it can also be used to test the validity of a test.

*Note: Factor analysis is related to principal component analysis (PCA), although PCA is a descriptive statistical technique.
*Note: This is a complicated analysis to explain, and I would highly recommend looking at a book or taking some time online to really understand it well. With that said, it is a highly useful and often-used analysis.

What to report: Variance accounted for by each factor, communalities (what the variables have in common, i.e., the opposite of uniqueness) for the factors, and rotated factor loadings. http://www.ats.ucla.edu/stat/mult_pkg/whatstat/

Friedman

What it does: When using a within-subjects independent variable with multiple levels, you use the Friedman test to look for a difference among the scores. The null hypothesis is that there is no difference between the scores.

What to report: You'll want to report the chi-square value and p value.

z-test

What it does: Using the z-test, we can test whether a sample mean is significantly different from a provided constant or provided mean when we know sigma. The z-test follows a normal distribution (versus the t-test, which follows a Student's t-distribution). The z-test is also better for larger samples than the t-test; however, it is less commonly used.
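The various t-tests, their nonparametric counterparts (Wilcoxon signed rank, Friedman), and the known-sigma z-test described in these notes can be sketched together (all data are made up; scipy.stats is an assumption about tooling, and the z-test is computed by hand with the standard library):

```python
from math import sqrt
from statistics import NormalDist, mean
from scipy.stats import (ttest_ind, ttest_rel, ttest_1samp,
                         wilcoxon, friedmanchisquare)

# Made-up scores: the same 8 subjects before and after a treatment,
# plus a separate, independent group for between-subjects comparisons.
before = [10, 12, 9, 11, 13, 10, 12, 11]
after = [13, 15, 11, 14, 16, 12, 15, 14]
other_group = [15, 17, 14, 16, 18, 15, 17, 16]

t_ind, p_ind = ttest_ind(before, other_group)   # independent samples t-test
t_rel, p_rel = ttest_rel(before, after)         # dependent (paired) t-test
t_one, p_one = ttest_1samp(before, 10)          # one-sample t-test vs mu = 10
w, p_w = wilcoxon(before, after)                # nonparametric paired analogue
chi_f, p_f = friedmanchisquare(before, after, other_group)  # Friedman test

# z-test for the case where sigma is known (say sigma = 2, hypothesized
# mean 10): compare the standardized sample mean to a normal distribution.
z = (mean(before) - 10) / (2 / sqrt(len(before)))
p_z = 2 * (1 - NormalDist().cdf(abs(z)))
```

Note how the paired tests (`ttest_rel`, `wilcoxon`) use the same two variables measured on the same subjects, while `ttest_ind` compares two separate groups.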
What to report: You'll want to report the z value and p value.

Two-way between-subjects ANOVA

What it does: In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A between-subjects 2-way ANOVA just means that both the independent variables are between subjects.
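A minimal by-hand sketch of a balanced between-subjects 2x2 ANOVA (the data are made up; the sums-of-squares partitioning is the standard one, and scipy is used only for the F distribution):

```python
import numpy as np
from scipy.stats import f

# Balanced 2x2 between-subjects design: data[a, b, :] holds 5 scores per cell.
data = np.array([[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]],             # A level 0
                 [[11, 12, 13, 14, 15], [11, 12, 13, 14, 15]]])  # A level 1
a_levels, b_levels, reps = data.shape
grand = data.mean()

# Partition the variability: main effects, interaction, and error.
ss_a = b_levels * reps * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()
ss_b = a_levels * reps * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()
ss_cells = reps * ((data.mean(axis=2) - grand) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b                       # interaction A x B
ss_error = ((data - data.mean(axis=2, keepdims=True)) ** 2).sum()

df_a, df_b = a_levels - 1, b_levels - 1
df_error = a_levels * b_levels * (reps - 1)

f_a = (ss_a / df_a) / (ss_error / df_error)
p_a = f.sf(f_a, df_a, df_error)                      # main effect of A
f_b = (ss_b / df_b) / (ss_error / df_error)
p_b = f.sf(f_b, df_b, df_error)                      # main effect of B
```

In this made-up data only factor A shifts the scores, so its F is large while factor B's F is zero.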

What to report: You'll want to report the F values (for main effects and the interaction), degrees of freedom, and p values.

Two-way within-subjects ANOVA

What it does: In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A within-subjects 2-way ANOVA just means that both the independent variables are within subjects.

What to report: You'll want to report the F values (for main effects and the interaction), degrees of freedom, and p values.

Two-way mixed ANOVA

What it does: In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A mixed-subjects 2-way ANOVA just means that one independent variable is between subjects and the other is within subjects.

What to report: You'll want to report the F values (for main effects and the interaction), degrees of freedom, and p values.

Multivariate Multiple Linear Regression

What it does: This is used to predict 2 or more dependent variables from 2 or more independent variables. It estimates a single regression model with multiple outcome (dependent) variables.

What to report: F values, degrees of freedom, and p values for each of the relationships. This test also provides the Type III Sum of Squares in the output.
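A minimal sketch of fitting one regression model with two outcome variables (the data are generated from known coefficients with no noise, so the fit recovers them exactly; numpy is an assumption about tooling, and a real analysis would also want the F tests mentioned above):

```python
import numpy as np

# Made-up predictors: an intercept column plus two independent variables.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Two dependent variables generated from known coefficients (no noise).
B_true = np.array([[1.0, -2.0],    # intercepts for y1 and y2
                   [3.0, 0.5],     # x1 coefficients
                   [-1.0, 2.0]])   # x2 coefficients
Y = X @ B_true

# One least-squares fit estimates both outcome equations at once:
# each column of B_hat is the coefficient vector for one dependent variable.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```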