Calculate Standard Error over averaged responses or raw responses?

New Member

I would like to calculate the standard error (SE) of my dependent variable. However, I have psycholinguistic data, which involve multiple participants (ppts) and multiple items. Thus, there are two ways of doing this: either by calculating the SE over averaged responses (Example 1) or by calculating the SE over raw responses (Example 2).

1. Average over items by participant (as in Table 1). Then calculate mean and SE from this table.
Table 1.
...which gives mean = 62.67 and SE = 16.63.

2. List each observation on its own row (as opposed to averaging the observations by ppt), as in Table 2.
Table 2:

Because N is greater in Example 2, the SE will be different (mean = 62.67; SE = 17.3). The example I give is very simple, but the fact that you can get different SEs is important, especially if I were using the SE to compare means in two different conditions. My question is: which method is best for calculating the standard error?
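To make the two calculations concrete, here is a minimal sketch in Python. The numbers are hypothetical (they are not the values from Table 1 or Table 2), and the names `responses` and `standard_error` are made up for illustration. It assumes a balanced design, where every participant responds to the same number of items, so both methods give the same mean and differ only in the SE.

```python
import math
import statistics

# Hypothetical data: participant -> one response per item.
# These are NOT the values from the tables in this post.
responses = {
    "PPT1": [40.0, 50.0, 60.0],
    "PPT2": [70.0, 80.0, 90.0],
    "PPT3": [55.0, 65.0, 75.0],
}

def standard_error(values):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Example 1: average over items within each participant,
# then compute the mean and SE over the participant means (N = participants).
ppt_means = [statistics.mean(items) for items in responses.values()]
mean_by_ppt = statistics.mean(ppt_means)
se_by_ppt = standard_error(ppt_means)

# Example 2: treat every raw observation as its own row
# (N = participants x items). This treats all rows as independent.
raw = [x for items in responses.values() for x in items]
mean_raw = statistics.mean(raw)
se_raw = standard_error(raw)

print(f"Example 1: mean = {mean_by_ppt:.2f}, SE = {se_by_ppt:.2f}")
print(f"Example 2: mean = {mean_raw:.2f}, SE = {se_raw:.2f}")
```

The means agree because the design is balanced, but the two SEs differ because they use a different N and a different spread: Example 1 uses the spread of the participant means, Example 2 the spread of all individual observations.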

Many thanks! Although this is a simple question, I really appreciate any insights.
Ryan

New Member

Thank you for your response - it makes sense! PPT refers to participant (i.e., PPT 1 is data observed from the first participant, whereas PPT 2 is data from the second participant, and so on). I expected this was the answer, but it is interesting because a lot of standard textbooks don't discuss this issue: they simply present summary data with one averaged response per participant (i.e., they have already averaged over items) and calculate the SE from there, as in Example 1 above. I understand that this would be the wrong approach and that the approach used in Example 2 is best practice. It would be good if such textbooks provided a footnote to explain this.

New Member

OK, so it's Example 1? Or is there even a 'correct' choice? Example 2 might be more appropriate because it doesn't throw any data away, i.e., it calculates the variance of every observation around the mean. On the other hand, if we aren't allowed to use multiple responses from the same participant, then Example 1 would be more appropriate.
