It's not about what you guess the distribution in the population to be; it's about how you represent the distribution you get from the actual test results. Someone who does the IQ test and gets a result 1 standard deviation above the mean of all the test-takers gets an IQ score of 124 from a test-giver using the old scale and an IQ score of 115 from a test-giver using the current scale.
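A rough sketch of that arithmetic, assuming both conventions fix the mean at 100 and differ only in the standard deviation they assign (the function name `iq_from_z` is made up for illustration):

```python
def iq_from_z(z, sd, mean=100.0):
    """Map a z-score (standard deviations above the test-taker mean) onto an IQ scale.

    Assumes a linear scale of the form mean + sd * z; mean=100 is an assumption
    shared by both conventions discussed here.
    """
    return mean + sd * z


z = 1.0  # one standard deviation above the mean of all test-takers
print(iq_from_z(z, sd=24))  # 124.0 on the SD-24 scale
print(iq_from_z(z, sd=15))  # 115.0 on the SD-15 scale
```

Same performance, same position in the distribution, two different numbers.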

No, it doesn't work that way. The test delivers a specific IQ score for a given performance regardless of the traits of the population of test-takers. One's score doesn't rely on the scores of other test-takers or the statistics of the population as a whole.

If this were not true, people's IQ scores would change depending on the makeup of the tested population, but in fact, it's the other way around -- the statistical makeup of the population, the mean and standard deviation, depends on assessing many individual scores, each of which is immutable and unrelated to the population's overall statistics.

If your position were valid, if one person took one IQ test on a desert island, he would not be able to get a score at all, for lack of a population to give a context to the test result. But this is not how IQ testing works.

Let's simplify this. Let's say it's an arithmetic test of 200 questions of gradually increasing difficulty. An average test-taker can answer 100 of the 200 questions. A very smart person can answer 150 questions. Do you really think an individual's score depends on the average score and distribution of the population of which he is a part?

I must add that, if IQ scores really depended on the population's traits, then IQ testing would really deserve its present terrible reputation. Apropos:

> Some scoring systems use an initial standardization where the standard deviation is 15 points, others use 24 points, so the same test performance can get you IQ 115 or IQ 124 depending on whose test you take.

Yes, I know. But that debate has been settled, and AFAIK the result is population mean 100, standard deviation 15.

Unless, of course, different groups of testers are using different assumptions, but without being driven by the analysis of the largest possible collection of standardized test scores. If so, it casts IQ testing into doubt as a reliable tool.

Evidence for this being a settled issue is the fact that workers in this field report a gradual increase in IQ over the decades:

If mean IQ really was adjusted to agree with current test scores, the mean would always be 100, regardless of test score changes over time.
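To make that hypothetical concrete, here is a minimal sketch of what re-centering against each era's norm sample would look like. It assumes scaled scores are computed as 100 + 15 * z against a norm sample; the raw scores and the `scale` helper are invented for illustration and are not taken from any real test.

```python
from statistics import mean, pstdev


def scale(raw, norm_sample, sd=15.0, center=100.0):
    """Scale a raw score against a norm sample (hypothetical helper)."""
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)
    return center + sd * (raw - mu) / sigma


# Made-up raw scores: the later sample has the same spread, shifted up by 10.
old_norms = [80, 90, 100, 110, 120]
new_norms = [90, 100, 110, 120, 130]

raw = 110  # one fixed raw performance
old_scaled = scale(raw, old_norms)
new_scaled = scale(raw, new_norms)

print(old_scaled > new_scaled)  # True: same raw score, lower number under the later norms
print(new_scaled)               # 100.0: the later sample's average performance
```

Under this assumption, each sample averages 100 when scored against itself, so a rise only shows up when one raw performance is scored against two different norm samples.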

> Some scoring systems use an initial standardization where the standard deviation is 15 points, others use 24 points, so the same test performance can get you IQ 115 or IQ 124 depending on whose test you take. [emphasis added]

The conclusion is still false -- the test itself doesn't change, only the scoring assumptions. Those who assume σ = 15 could acquire the tests from those who assume σ = 24 and add them to their own dataset, and vice versa. Also, I have to say, either the standard deviation can't change the test scores, or the test scores have no meaning.

One more thing -- the standard deviation shouldn't be an assumption, with one group arbitrarily choosing 15 and another choosing 24. The value should be derived from a large set of test scores, not a committee casting a vote.

Your argument seems to be that one's IQ score depends on the population result, along with some arbitrary assumptions like σ = 15 or σ = 24. But that's the reverse of normal statistical practice, in which the mean and standard deviation derive from test scores, not the other way around.

Obviously I'm not doubting that what you say may be so, only that it shouldn't be so -- the standard deviation shouldn't be based on anything but the analysis of a large set of standardized test scores.

My argument is that there are two* different units for IQ points, like inches and centimeters, which are both called "IQ points" by psychologists because psychologists suck at units.

I'm not making an argument about problems with the actual process of measurement. I'm arguing that the confusion between the two reported values sounds a lot like the confusion you'd get between two reported lengths of the same object, one measured with a centimeter ruler and the other with an inch ruler, with both labeled just "length" in the report.

You can obviously, and trivially, convert a value to the modern scale once you know it was reported with a different SD. But when values just get thrown around as "x IQ", you don't know whether they are on the old scale.
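The conversion is a one-liner once the source SD is known. A sketch, assuming every convention shares a mean of 100 and differs only in SD (`convert_iq` is a hypothetical name):

```python
def convert_iq(score, from_sd, to_sd, mean=100.0):
    """Re-express an IQ score reported on one SD convention in another.

    Assumes both scales share the same mean and are linear in the z-score.
    """
    z = (score - mean) / from_sd  # recover the position in the distribution
    return mean + to_sd * z       # re-express it on the target scale


print(convert_iq(124, from_sd=24, to_sd=15))  # 115.0
print(convert_iq(132, from_sd=16, to_sd=15))  # 130.0
```

The catch is the `from_sd` argument: a bare "x IQ" doesn't tell you what to pass in.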

I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.

(*Wikipedia says there are actually three common IQ scale conventions; two psychologists had some sort of feud and one of them picked SD=16 to piss the SD=15 guy off.)

> My argument is that there are two* different units for IQ points, like inches and centimeters, which are both called "IQ points" by psychologists because psychologists suck at units.

Yes, I'm not doubting that this is so, only that it shouldn't be so in a scientific endeavor. If IQ testing were purely scientific (as opposed to being partly political), all those involved in IQ testing would allow a large set of test scores in a standardized test to produce the mean and sigma values on which everyone would need to agree. In other words, an empirical outcome.

> I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.

Apparently so. My point is that IQ test scores must be collected on an absolute scale based on testing results, before any of the adjustments you're describing. If this weren't the case, if test outcomes depended on something other than the direct performance of the subjects measured in a uniform, reliable way, the testing procedure would be fatally undermined.

Bottom line: I doubt that changes in mean and sigma can produce two different IQ scores in a standardized test as you're claiming. For this to be true, the relationship between the population statistics and the analysis result (mean, sigma) would have to be reversed -- it would put the cart before the horse.

Imagine this conversation:

Q. How do the statistical results derive from the test scores?

A. By a straightforward procedure -- the test scores are subjected to a classical statistical analysis, resulting in a mean and standard deviation.

Q. How are the original test scores arrived at?

A. They're derived from (a) the test results, but (b) adjusted by the mean and standard deviation values of the population created above.

Q. (after a long pause) But ... but ... doesn't that create an example of circular reasoning, in which the scores rely on the stats and the stats rely on the scores?