June 03, 2010

Ever wondered what the difference between yellow-green and yellowish green is? Or yellow-green and green-yellow? Maybe the XKCD color survey data can tell us something.

Here are some combinations of green, blue, and purple with other colors (click to view full size):

The hyphenated forms aren't present in the XKCD color names, even though they're often more common in the raw data than any of the other forms. I'm guessing they were combined with one of the others, in the same way that 'grey' and 'gray' were combined.

The patterns in this image are ... subtle. Yellow + green combinations, for example, lie along the border between the yellow region and the green region, and combinations of colors that don't border each other don't exist, so there's no blue-yellow. Order does appear to matter: if c1 and c2 are color names, 'c1 c2' and 'c1ish c2' are slightly more c2 than c1, and 'c2 c1' and 'c2ish c1' are slightly more c1 than c2. Forms separated by '/' seem to vary a bit more, to the point that it's unclear which side is dominant. Beyond that, it isn't clear how the forms differ from one another: 'c1 c2', 'c1ish c2', 'c1y c2', and 'c1c2' don't appear to be significantly different.
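One rough way to check the "order matters" observation numerically is to ask which base color a compound name's RGB value is nearer to in hue. The sketch below is only an illustration of that idea, not how the image above was produced: comparing by hue alone is my assumption, the function names are made up, and the RGB triples are pure yellow and pure green rather than values from the survey.

```python
import colorsys

def hue(rgb):
    """Hue in [0, 1) of an (R, G, B) triple with 0-255 channels."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[0]

def hue_distance(rgb_a, rgb_b):
    """Shortest distance around the hue circle, in [0, 0.5]."""
    d = abs(hue(rgb_a) - hue(rgb_b))
    return min(d, 1.0 - d)

def dominant(compound_rgb, c1_rgb, c2_rgb):
    """Return 'c1' or 'c2', whichever base color is nearer in hue."""
    if hue_distance(compound_rgb, c1_rgb) < hue_distance(compound_rgb, c2_rgb):
        return "c1"
    return "c2"

# Illustrative anchors only: pure yellow and pure green, not survey results.
YELLOW = (255, 255, 0)
GREEN = (0, 255, 0)

# A hypothetical 'yellow green' sample, with c1 = yellow and c2 = green.
print(dominant((100, 255, 0), YELLOW, GREEN))
```

Hue ignores saturation and lightness, so this is a crude measure, and it is meaningless for greys, which have no well-defined hue.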

There are a few anomalous color names, with the patterns 'c1ish' and 'c1y'. After examining the raw data, it's clear that these name the edge of the main color region, which causes the algorithm for finding the point of highest density to return a somewhat arbitrary value somewhere along the edge.

From the raw data, 'greenish' and 'orangish':

These types of names should probably be filtered out of the name/RGB pair list, since they have a density ridge rather than a density peak.
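A minimal sketch of such a filter, assuming the list is available as (name, RGB) tuples. The base color list, the spelling rules, and the function names are my own guesses for illustration, not part of the XKCD data or tooling.

```python
# Assumed set of base color words; extend as needed.
BASE_COLORS = ["red", "orange", "yellow", "green", "blue",
               "purple", "pink", "brown", "grey", "gray"]

def modifier_forms(base):
    """All 'c1ish' / 'c1y' spellings this sketch recognises for one base color."""
    # Plain stem, plus a doubled final letter (red -> reddish). The doubled
    # form is nonsense for most colors (greennish) but harmless to include.
    stems = {base, base + base[-1]}
    if base.endswith("e"):
        stems.add(base[:-1])  # orange -> orangish, blue -> bluish
    return {stem + suffix for stem in stems for suffix in ("ish", "y")}

MODIFIER_ONLY = set().union(*(modifier_forms(b) for b in BASE_COLORS))

def drop_modifier_only(pairs):
    """Keep (name, rgb) pairs whose name is not a bare 'c1ish' or 'c1y' form."""
    return [(name, rgb) for name, rgb in pairs
            if name.strip().lower() not in MODIFIER_ONLY]

# Placeholder RGB values, not survey results.
sample = [
    ("greenish", (1, 2, 3)),
    ("greenish yellow", (4, 5, 6)),
    ("orangish", (7, 8, 9)),
    ("green", (10, 11, 12)),
]
print(drop_modifier_only(sample))
```

Only single-word names are matched, so 'greenish yellow' survives while 'greenish' does not. The rule is purely about spelling; it would also drop a name like 'pinky' whether or not its data actually forms a ridge.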