
The Correlation Coefficient: Its Values Range Between Plus/Minus 1, or Do They?

Bruce Ratner, Ph.D.

The term “correlation coefficient” was coined by Karl Pearson in 1896. The statistic is thus over a century old and still going strong: it is one of the most used statistics today, second only to the mean. Its weaknesses, and the warnings against its misuse, are well documented. As a consulting statistician with fifteen years of practice, who also teaches continuing and professional studies courses to statisticians in the Database Marketing/Data Mining industry, I see too often that the weaknesses and warnings are not heeded. Among the weaknesses, there is one that is rarely mentioned: the correlation coefficient interval [-1, +1] is restricted by the individual distributions of the two variables being correlated. The purpose of this article is: 1) to introduce the effects that the distributions of the two individual variables have on the correlation coefficient interval; and 2) to provide a procedure for calculating an adjusted correlation coefficient, whose realized correlation coefficient interval is often shorter than the original one.
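To make the restriction concrete, here is a minimal Python sketch. It is not taken from the article, whose procedure is not reproduced on this page. It rests on two assumptions: that the realized interval is found by re-pairing the observed values (both variables sorted the same way for the upper bound, in opposite order for the lower bound), and that the adjusted coefficient is the observed coefficient divided by the bound on its own side. The function names and the sample data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def realized_interval(x, y):
    """Most extreme correlations reachable by re-pairing the observed values.

    Re-pairing leaves each variable's own distribution untouched, so these
    bounds depend only on the two individual distributions.
    """
    xs, ys = sorted(x), sorted(y)
    r_max = pearson(xs, ys)        # similarly ordered pairs
    r_min = pearson(xs, ys[::-1])  # oppositely ordered pairs
    return r_min, r_max

def adjusted_r(x, y):
    """Assumed adjustment: rescale r by the realized bound on its own side."""
    r = pearson(x, y)
    r_min, r_max = realized_interval(x, y)
    return r / (r_max if r >= 0 else abs(r_min))

# Hypothetical data: x is strongly skewed, y is evenly spread.
x = [1.0, 2.0, 3.0, 4.0, 100.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson(x, y)
r_min, r_max = realized_interval(x, y)
r_adj = adjusted_r(x, y)
print(f"r = {r:.3f}, realized interval = [{r_min:.3f}, {r_max:.3f}], adjusted r = {r_adj:.3f}")
```

Because the two sample variables have differently shaped distributions, even the best possible pairing falls short of +1, so the observed r is being judged against an interval narrower than [-1, +1].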

The full article is in my new book.

Related Articles:
1. Calculating the Average Correlation Coefficient (sorry, this article is in my new book).
2. Different Data, Identical Regression Models: Which Model is Better?
3. Genetic Data Mining Method for the Proper Use of the Correlation Coefficient (sorry, this article is in my new book).
4. Variable Selection Methods in Regression: Many Statisticians Know Them, But Few Know They Produce Poorly Performing Models (sorry, this article is in my new book).
5. Genetic vs. Statistic Regression - A Comparison (sorry, this article is in my new book).

For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1; or e-mail at br@dmstat1.com.