Correlation: Does Order Matter?

Occasionally I may include affiliate links, which means I may get a commision if you purchase something via that link. Check out my privacy policy for more info.

One of the more interesting and important parts of studying two different sets of data is to see if they are correlated. It might make one wonder if the order of the data matters. In this blog post, I show with three different methods - by an empirical example, by looking at the correlation function, and visually - why the order doesn’t matter so long as each data point is matched to the same datapoint each time.

Empirical Examples of Correlation Order

I took the following table and checked the correlation of the first column to each of the other columns.

x

2x

x^2

rand

randx

2randx

rand2x

0

0

0

0.929583627

0

0

0

1

2

2

0.106796718

0.446658011

0.528395381

0.619822838

2

4

8

0.965680581

0.558912516

2.54093125

1.828189598

3

6

18

0.414864737

0.687537491

5.138625559

4.390020628

4

8

32

0.281896948

1.435247716

12.22423024

9.742716442

5

10

50

0.991283219

1.946204896

24.84904045

19.60735842

6

12

72

0.209942889

2.753579488

50.48208289

39.39509331

7

14

98

0.069575289

3.300250238

102.0477535

79.71388555

8

16

128

0.21872816

3.659157949

205.9436615

160.0289991

9

18

162

0.371072841

4.449775914

412.9537101

320.2597064

10

20

200

0.644162181

5.011163213

827.1657248

640.7804415

I got the correlation values of 1, 0.963142661, -0.289999765, 0.988187371, 0.774789519, and 0.775376284, respectively. I then sorted the entire table by the various columns, but the correlation never changed. However, when I sorted each column independently, the correlation did in fact change.

Let’s dig a little deeper as to why by looking into the correlation function.

The Correlation Function

The correlation function of x and y is defined as the covariance of x and y divided by the product of the standard deviations of x and y:

correlation(x,y) = covariance(x,y)/std(x)std(y)

This means, as described on the Math Is Fun webpage, we only need to know a few things: the sum of x, y, x^2, y^2, and xy. Intuitively, and perhaps by experience, we know that the order by which we sum numbers does not affect the result. This is also a basic mathematical concept known as the associative property of addition.

But why does changing the order of one column matter?

Because we also need to know the sum of xy - that is, it’s not so much the order of the column, but the pairings that matter. If we take this very basic example:

x

y

xy

0

0

0

1

2

2

2

4

8

3

6

18

4

8

32

5

10

50

6

12

72

7

14

98

8

16

128

9

18

162

10

20

200

sum

770

We see that reversing the order of y messes with the (x,y) pairings, making the results different:

x

y

xy

0

20

0

1

18

18

2

16

32

3

14

42

4

12

48

5

10

50

6

8

48

7

6

42

8

4

32

9

2

18

10

0

0

sum

330

Visually and Conceptually Comparing Correlation Order

Finally, we look into what correlation is trying to tell us why order doesn’t matter so long as the variables remained paired with one another.

A correlation of 1 means that there is a linear relationship between the variables, whereas a correlation of -1 means that there is an inverse linear relationship between the variables. Visually, this means if you plotted each pair of points in either case and connected the dots, then you would have a straight line.

The closer to the correlation is to zero, the less of a line is formed.

Here are the above examples plotted:

Plotted Correlation Examples

You can imagine if that if the x was sorted without regard to y, or vice versa, the graphs would look very different. However, it doesn’t matter which dot you drew first. The same is true for their correlations.

I am Joe. I'm one of many Joes, but I am uniquely me. I'm the messiest organized guy you'll ever meet. My glasses are always bent and my hair always a mess. I've never had a role model and as such am my own person. Scientist, programmer, Christian, libertarian, and life long learner.