Spearman's rank correlation coefficient

In mathematics and statistics, Spearman's rank correlation coefficient is a measure of correlation, named after its maker, Charles Spearman. It is written in short as the Greek letter rho ($\rho$) or sometimes as $r_s$. It is a number that shows how closely two sets of data are linked. It can only be used for data which can be put in order, such as highest to lowest.

The general formula for $r_s$ is $\rho = 1 - \frac{6\sum d^{2}}{n(n^{2}-1)}$, where $d$ is the difference between the two ranks of each item and $n$ is the number of items.
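The formula can be sketched as a short Python function. This is only an illustration: the function name `spearman_from_differences` and the example rank differences are made up for this sketch, and it assumes the differences between the ranks have already been worked out and that there are no tied ranks.

```python
def spearman_from_differences(differences):
    """Apply r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) to a list of rank differences."""
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two pairs of ranks")
    # Square each difference and add the squares together.
    sum_d_squared = sum(d * d for d in differences)
    return 1 - (6 * sum_d_squared) / (n * (n * n - 1))


# Hypothetical rank differences for five items (invented for illustration).
example_differences = [0, -1, 1, -1, 1]
r_s = spearman_from_differences(example_differences)
print(round(r_s, 3))  # sum of d^2 is 4, so r_s = 1 - 24/120 = 0.8
```

Here $n$ is taken from the length of the list, so each item must contribute exactly one difference.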

For example, if you have data for how expensive different computers are, and data for how fast the computers are, you could see if they are linked, and how closely they are linked, using $r_s$.

To work out $r_s$, each of the two sets of data is first ranked, for example from highest to lowest. Next, we have to find the difference between the two ranks of each item. Then, we multiply the difference by itself, which is called squaring. The difference is called $d$, and the number we get when we square $d$ is called $d^{2}$.[1]
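The ranking, difference and squaring steps might look like this in Python. The prices and speeds are hypothetical numbers chosen for illustration, and the helper `rank_descending` is an assumption of this sketch that only works when no two values are the same.

```python
def rank_descending(values):
    """Rank values from highest (rank 1) to lowest; assumes no two values are equal."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


# Hypothetical data for five computers (not real measurements).
prices = [1200, 800, 1500, 600, 1000]
speeds = [3.2, 3.0, 3.6, 2.4, 2.8]

price_ranks = rank_descending(prices)  # [2, 4, 1, 5, 3]
speed_ranks = rank_descending(speeds)  # [2, 3, 1, 5, 4]

# d is the difference between the two ranks of each computer.
d = [p - s for p, s in zip(price_ranks, speed_ranks)]
# d squared is each difference multiplied by itself.
d_squared = [x * x for x in d]

print(d)          # [0, 1, 0, 0, -1]
print(d_squared)  # [0, 1, 0, 0, 1]
```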

This scatter graph has positive correlation. The $r_s$ value would be close to 1, for example about 0.9. The red line is a line of best fit.

$r_s$ always gives an answer between −1 and 1. The numbers between are like a scale, where −1 is a very strong link, 0 is no link, and 1 is also a very strong link. The difference between 1 and −1 is that 1 is a positive correlation, and −1 is a negative correlation. A graph of data with an $r_s$ value of −1 would look like the graph shown, except the line and points would be going from top left to bottom right.
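As a quick check of the two ends of the scale, take an illustrative case (not data from this article) of five items whose ranks in the second set are the exact reverse of the first, so the pairs of ranks are (1, 5), (2, 4), (3, 3), (4, 2) and (5, 1). Putting these into the formula gives:

```latex
\begin{align*}
d &= -4,\ -2,\ 0,\ 2,\ 4 \\
\sum d^{2} &= 16 + 4 + 0 + 4 + 16 = 40 \\
r_{s} &= 1 - \frac{6 \times 40}{5\,(5^{2} - 1)} = 1 - \frac{240}{120} = -1
\end{align*}
```

If instead the two sets of ranks are exactly the same, every $d$ is 0, so $\sum d^{2} = 0$ and $r_s = 1 - 0 = 1$.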

For example, for the data used above, $r_s$ was 0.8. This means that there is a positive correlation. Because it is close to 1, the link between the two sets of data is strong. So, we can say that those two sets of data are linked, and go up together. If it were −0.8, we could say that they were linked and that as one goes up, the other goes down.

Sometimes, when ranking data, there are two or more numbers that are the same. When this happens in $r_s$, we take the mean or average of the ranks that are the same. These are called tied ranks. To do this, we rank the tied numbers as if they were not tied. Then, we add up all the ranks that they would have, and divide the total by how many tied numbers there are.[2] For example, say we were ranking how well different people did in a spelling test.
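The averaging rule for tied ranks can be sketched in Python as follows. The spelling test scores are invented for illustration, and the function name `average_ranks` is an assumption of this sketch rather than a standard name.

```python
def average_ranks(scores):
    """Rank scores from highest to lowest, giving tied scores the mean of their ranks."""
    ordered = sorted(scores, reverse=True)
    ranks = []
    for s in scores:
        first = ordered.index(s) + 1  # rank the first of the tied scores would get
        count = ordered.count(s)      # how many scores share this value
        # The tied scores would take ranks first, first + 1, ..., first + count - 1.
        # The mean of those ranks is first + (count - 1) / 2.
        ranks.append(first + (count - 1) / 2)
    return ranks


# Hypothetical spelling test scores for five people.
scores = [9, 7, 7, 5, 3]
ranks = average_ranks(scores)
print(ranks)  # [1.0, 2.5, 2.5, 4.0, 5.0]
```

The two people who scored 7 would have had ranks 2 and 3, so each gets (2 + 3) / 2 = 2.5, and the next person still gets rank 4.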