Chi-Squared for Dummies

Transcript of Chi-Squared for Dummies

by Isabel Kraus-Liang

The Many Uses of Chi-Squared

1. Test for Goodness of Fit
   Determines whether a population distribution differs from a specified distribution.
   Conditions: all expected counts must be at least 1, and no more than 20% of them may be less than 5.
2. Test for Homogeneity
   Compares observed and expected values. As the name suggests, it checks whether several populations are similar with regard to a specific variable.
3. Test for Association/Independence
   Checks whether there is an association between 2 categorical variables.
   Requires a two-way table.

Components of a Chi-Squared Test

1. Hypotheses
2. Matrices
3. Z-score
4. Finding chi-squared
5. Interpretation

Hypotheses

Stating the hypotheses is the first step in any test. The point of the test is to decide whether the data give enough evidence to reject the null hypothesis. If the null hypothesis is rejected, the data support the alternative hypothesis; neither hypothesis is ever proved outright.

Two kinds:
1. Null
2. Alternative

Matrices

The data are in r x c form, which is row-by-column form. You do not count the sum row or column. The observed and expected data go in separate matrices. You usually need to add a sum row and column.

Z-scores

When comparing two proportions, chi-squared is the square of the z statistic. This relationship holds only for a comparison of two proportions. The greater the z-score, the larger the share of the distribution that falls below it, so a z-score far from 0 in either direction gives a small p-value.
Null hypothesis: the proportions are equal.
Alternative hypothesis: the proportions are not equal.

Finding Chi-Squared

1. Make sure there is a sum row and column.
2. Find each expected count by multiplying its row total by its column total and then dividing by the total number of observations.
3. Plug the observed and expected counts into the equation, or into your calculator.
4. You're done.

Null Hypotheses

1. Goodness of Fit
   The population proportions are equal to the expected proportions.
2. Homogeneity
   The distribution of the response variable (y-value) is the same in all the populations.
3. Association/Independence
   There is no association between the 2 categorical variables.

Alternative Hypotheses

1. Goodness of Fit
   The population proportions differ from the expected proportions.
2. Homogeneity
   The distribution of the response variable (y-value) is not the same in all the populations.
3. Association/Independence
   There is an association between the 2 categorical variables.

Interpretation

Our Example (chi-squared)

Let's say you surveyed 200 people about their opinion on the nation's economic status. The sample consisted of 123 women and 77 men. Twenty-three women and 10 men had no opinion, and 33 women and 15 men were satisfied with the economic status.

Equation

$\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is an observed count, E is the matching expected count, and the sum runs over every cell of the table.

Our Example (z-score)

Suppose a student gets a 60 on a numerical reasoning test. The class scores for the exam are normally distributed. The mean score on the test is 50 and the standard deviation is 12. How did this student perform relative to everyone else?
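
One way to answer this question is to compute the z-score and then the area of the normal curve above it. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative and are not part of the original presentation.

```python
from statistics import NormalDist

# Values from the example: class mean 50, standard deviation 12, student score 60.
mean, sd, score = 50, 12, 60

# z-score: how many standard deviations the score lies above the mean.
z = (score - mean) / sd

# Share of a normal distribution above this z-score,
# i.e. the share of students who scored higher.
share_above = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"share scoring higher = {share_above:.1%}")
```

Under the normality assumption stated in the example, this puts roughly one fifth of the class above the student.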

The z-score for the numerical test is (60 - 50)/12 = 0.83. About 80% of a normal distribution lies below z = 0.83, so roughly 20% of the students in the class did better than this student.

For the chi-squared test, you would put the data in your calculator as matrices [A] and [B].

At this point, you need to find the degrees of freedom (a.k.a. df), which is (r - 1)(c - 1). Find this value in the far-left column of the table of chi-square distribution critical values. The top row gives p-values (which are probability values). In the row for your df, find the number in the table that is closest to your chi-squared value; the p-value is at the top of that column. If your chi-squared value falls evenly between two entries, the p-value lies between the two column headings, and their average is a rough estimate.

Compare the p-value to the significance level (alpha), which is commonly 0.05. If the p-value is less than alpha, reject the null hypothesis; otherwise, you fail to reject it.
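
The whole procedure can be sketched in a few lines of Python for the survey example. Two things here are assumptions rather than facts from the presentation. First, the third response column ("other") is simply everyone not counted as "no opinion" or "satisfied", which works out to 67 women and 52 men. Second, the p-value shortcut exp(-x/2) is exact only because this table has 2 degrees of freedom; any other table needs a chi-square table or a statistics library.

```python
import math

# Observed counts from the survey example (rows: women, men).
# Columns: no opinion, satisfied, other. The "other" column is an assumption:
# it is the remainder of each row (123 - 23 - 33 and 77 - 10 - 15).
observed = [
    [23, 33, 123 - 23 - 33],
    [10, 15, 77 - 10 - 15],
]

# Sum row and sum column.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row total * column total / number of observations.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-squared = sum over all cells of (observed - expected)^2 / expected.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (r - 1)(c - 1); the sum row and column are not counted.
df = (len(observed) - 1) * (len(observed[0]) - 1)

# With df = 2 the upper-tail probability has the closed form exp(-x / 2).
assert df == 2, "the closed-form p-value below is only valid for df = 2"
p_value = math.exp(-chi_squared / 2)

# Compare the p-value to the significance level.
alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"
print(f"chi-squared = {chi_squared:.2f}, df = {df}, p-value = {p_value:.3f}")
print(f"{decision} the null hypothesis at alpha = {alpha}")
```

With these counts the statistic comes out near 3.36 on 2 degrees of freedom, and the p-value is well above 0.05, so the sketch fails to reject the null hypothesis of no association between gender and opinion.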