Wednesday, 14 March 2018

Chi-Square Test for Independence - Question 17 (A research team...)

Question 17
A research team investigated whether there was any significant correlation between the severity of the progression of a certain disease and the age of the patients. During the study, data for n = 200 patients were collected and grouped according to the severity of the disease and the age of the patient. The table below shows the results.

                        Age
Progression    below 40    40 - 60    above 60
slight            41          34          9
average           25          25         12
serious            6          33         15

Let us decide whether there is a correlation between the age of the patients and the severity of disease progression.

Solution Steps
As usual, we need to understand the problem and decide on which particular test to carry out.

In this case, the question asks whether there is any significant correlation between severity and age. Both variables are grouped into categories, so the appropriate test is the chi-square test for independence. For this test the null hypothesis is that 'there is no correlation between the age and the severity', that is, the two are independent. That is the hypothesis we are going to test.

Step 1: State the null and alternate hypotheses
H0: there is no significant correlation between the severity and the age
Ha: there is a significant correlation between the severity and the age

Step 2: Calculate the totals
In this step we calculate the total for each row and each column, as well as the grand total. I did this using Excel formulas, as you can see in Table 2.

Table 2
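
If you prefer code to a spreadsheet, the totals can be reproduced with a few lines of Python. This is only a sketch of an equivalent calculation, not the Excel sheet used in this post; the variable names (observed, row_totals, col_totals, grand_total) are my own assumptions.

```python
# Observed counts from the question: rows = severity, columns = age group.
observed = [
    [41, 34, 9],   # slight
    [25, 25, 12],  # average
    [6, 33, 15],   # serious
]

# Row totals: sum across the age groups for each severity level.
row_totals = [sum(row) for row in observed]

# Column totals: zip(*observed) transposes the table so each tuple is a column.
col_totals = [sum(col) for col in zip(*observed)]

# Grand total: should equal the number of patients, n = 200.
grand_total = sum(row_totals)

print(row_totals, col_totals, grand_total)
# prints: [84, 62, 54] [72, 92, 36] 200
```

The grand total of 200 matches the n = 200 given in the question, which is a quick check that the table was typed in correctly.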

Step 3: Calculate the expected values
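Each expected value is calculated by multiplying the corresponding row total and column total and dividing by the grand total. For example, the first expected value, which corresponds to Slight and Below 40, uses the row total for Slight (41 + 34 + 9 = 84), the column total for Below 40 (41 + 25 + 6 = 72) and the grand total (200):

E = (84 × 72) / 200 = 30.24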

Do this for all 9 observed values. I used Excel to generate these values automatically, and they are shown in Table 3.

Table 3
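
The same rule (row total times column total, divided by the grand total) can be sketched in Python for all 9 cells at once. As before, this is an illustrative equivalent of the spreadsheet calculation, and the name expected is an assumption of mine.

```python
# Observed counts: rows = slight, average, serious; columns = age groups.
observed = [
    [41, 34, 9],
    [25, 25, 12],
    [6, 33, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print([round(e, 2) for e in row])
```

A useful sanity check is that each row of expected values adds up to the same row total as the observed values.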

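Step 4: Calculate the Squared Difference (O-E)^2
Where O is the observed value in Table 2 and E is the corresponding expected value calculated in Table 3. The first squared difference (Slight and Below 40, with O = 41 and E = 30.24) would be:

(O-E)^2 = (41 - 30.24)^2 = 10.76^2 = 115.7776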

Do this for all the observed values and the corresponding expected values. The resulting set of values is given in Table 4.

Table 4
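
As a rough Python sketch of this step (again an assumed equivalent of the spreadsheet, with squared_diff as a name I made up), the squared differences for every cell can be computed by pairing each observed value with its expected value:

```python
# Observed counts: rows = slight, average, serious; columns = age groups.
observed = [
    [41, 34, 9],
    [25, 25, 12],
    [6, 33, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# (O - E)^2 for each cell, pairing observed and expected row by row.
squared_diff = [
    [(o - e) ** 2 for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

for row in squared_diff:
    print([round(d, 4) for d in row])
```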

Step 5: Calculate the Component
This is the squared difference you calculated in Step 4 divided by the corresponding expected value. For the first value (Slight and Below 40) it would be:

(O-E)^2 / E = (41 - 30.24)^2 / 30.24 = 115.7776 / 30.24 ≈ 3.8286

If you repeat this for all the values, the resulting table is Table 5.
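
For readers following along in code, here is a minimal Python sketch of the component calculation for all 9 cells. It assumes the same observed table as above, and the name components is my own choice.

```python
# Observed counts: rows = slight, average, serious; columns = age groups.
observed = [
    [41, 34, 9],
    [25, 25, 12],
    [6, 33, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Component for each cell = (O - E)^2 / E.
components = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

for row in components:
    print([round(c, 4) for c in row])
```

The largest component belongs to the cell Serious and Below 40, where only 6 patients were observed against an expected count of 19.44.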

Step 6: Calculate the Test Statistic

This is the sum of all the components calculated in Table 5. I calculated this using the SUM() formula in Excel, but you can also do it by hand to verify.
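
If you want a third way to verify the sum, the whole calculation from Steps 2 to 6 can be wrapped in one small Python function. This is a sketch under the assumption that the observed table is entered as a list of rows; the function name chi_square_statistic is hypothetical and not taken from any library.

```python
def chi_square_statistic(observed):
    """Return the sum of (O - E)^2 / E over all cells of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count for cell (i, j) under independence.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


# Observed counts: rows = slight, average, serious; columns = age groups.
observed = [
    [41, 34, 9],
    [25, 25, 12],
    [6, 33, 15],
]

print(round(chi_square_statistic(observed), 2))
```

Keeping the steps in one function makes it easy to rerun the check on a different table without rebuilding the spreadsheet.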