I have some code that uses a multidimensional look-up table (LUT) to do interpolation. The typical application is a color space conversion, where 3D inputs (RGB) are converted to 4D (CMYK), but the code is rather general. The look-up table is a numpy array, which will generally have a shape like (4, 17, 17, 17). Here, 4 is the number of output dimensions, o, 17 is the number of grid points along each axis, g, which must be the same for all, and 3 (the number of 17s in the shape tuple) is the number of input dimensions, i. There is also lut_axis, 0 in this case, that indicates the position of o in the shape tuple. There is an additional requirement for g to be either a power of two (1, 2, 4, 8, 16...) or a power of two plus one (2, 3, 5, 9, 17...).
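For illustration, the grid-size requirement could be checked with something like the sketch below. The helper names are hypothetical (the second matches the name used in the answer's code, whose definition is not shown), and g is assumed to be a plain Python int, as found in a numpy shape tuple.

```python
def is_power_of_two(n):
    # A positive integer is a power of two when exactly one bit is set.
    return n > 0 and (n & (n - 1)) == 0


def is_almost_power_of_two(g):
    # Accept 2**k (1, 2, 4, 8, 16...) or 2**k + 1 (2, 3, 5, 9, 17...).
    return is_power_of_two(g) or is_power_of_two(g - 1)


valid_sizes = [g for g in range(1, 20) if is_almost_power_of_two(g)]
```

Under these assumptions, 17 and 16 are accepted, while sizes such as 6, 7 or 10 are rejected.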

I am currently checking that the lut and lut_axis I receive as inputs are compatible with the above requirements. But in most cases there is only one value of lut_axis that would yield a correct result, so I think I could improve the code by using lut_axis only when the shape of the LUT doesn't give away what it should be, and ignoring it otherwise.

My first attempt was this logical spaghetti code, which gets things done, but which I will be unable to figure out a month from today:

There's one call too many to check_lut_shape to fetch the values of i, o and g, but since the code is already not an example of efficiency, the focus should be on readability, and I find it is easier to understand like this.

I would feel more comfortable if I could put together a more elegant version of option 1, but since I don't see how to go about that, I am more and more leaning to something along the lines of option 2. Any comments on which way to go, or improvements over the code above are more than welcome.

from collections import Counter

# NoneLutShapeError and is_almost_power_of_two are assumed to be defined elsewhere.

def process_lut_shape(lut_shape, default_axis=0):
    if len(lut_shape) < 2:
        # anything less than two long cannot be a shape
        raise NoneLutShapeError(lut_shape)
    elif len(lut_shape) == 2:
        # given two, either could be the lut_axis
        # the default can work only if the *other* axis is a valid grid size
        # otherwise try the other one
        if is_almost_power_of_two(lut_shape[1 - default_axis]):
            lut_axis = default_axis
        else:
            lut_axis = 1 - default_axis
    else:  # len(lut_shape) >= 3
        counts = Counter(lut_shape)
        if len(counts) == 1:
            # since all axes are the same, we can't distinguish between them
            lut_axis = default_axis
        elif len(counts) > 2:
            # can't have more than 2 distinct values in the shape
            raise NoneLutShapeError(lut_shape)
        else:
            # with three or more elements, the least common value should be the one
            lut_value, count = counts.most_common()[-1]
            if count != 1:
                raise NoneLutShapeError(lut_shape)
            lut_axis = lut_shape.index(lut_value)
    # any axis other than lut_axis gives the grid size
    g = lut_shape[(lut_axis + 1) % len(lut_shape)]
    if not is_almost_power_of_two(g):
        raise NoneLutShapeError(g)
    return lut_axis, len(lut_shape) - 1, lut_shape[lut_axis], g
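To make the Counter step concrete, here is a small standalone sketch of how the odd value out is located. The example shape is made up for illustration.

```python
from collections import Counter

shape = (17, 17, 4, 17)  # hypothetical LUT shape with the output axis at position 2

counts = Counter(shape)
# most_common() sorts by descending count, so the last entry is the rarest value.
ordered = counts.most_common()
lut_value, count = ordered[-1]

# The rarest value must occur exactly once to identify a single axis.
lut_axis = shape.index(lut_value)
```

Here the rarest value is 4, it occurs once, and its position in the tuple gives the output axis.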

However, I'd really recommend you not write this function at all. This function is guessing. As the Zen of Python says:

In the face of ambiguity, refuse the temptation to guess.

Given (16, 16, 16) you have no way of knowing which axis was intended to be the lut_axis. So I really, really recommend not trying to guess. You'll just do the wrong thing in certain odd circumstances. Even trying to figure it out only when it's not ambiguous is problematic, because some inputs will still be ambiguous and your code will behave differently for them. As the Zen of Python also says:

Explicit is better than implicit.

Make your users explicitly pass the lut_axis. You can't do the correct thing for all inputs, so don't create false expectations by doing the correct thing for some inputs.
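A minimal sketch of what the explicit route could look like follows. The question mentions a check_lut_shape function but does not show it, so the signature, the return order (i, o, g) and the use of ValueError here are all assumptions.

```python
def _is_almost_power_of_two(n):
    # Hypothetical helper: True for 2**k or 2**k + 1, with k >= 0.
    def is_pow2(m):
        return m > 0 and (m & (m - 1)) == 0
    return is_pow2(n) or is_pow2(n - 1)


def check_lut_shape(lut_shape, lut_axis):
    """Validate a shape against an explicit lut_axis and return (i, o, g)."""
    lut_shape = tuple(lut_shape)
    ndim = len(lut_shape)
    if ndim < 2:
        raise ValueError(f"a LUT needs at least two axes, got shape {lut_shape}")
    if not -ndim <= lut_axis < ndim:
        raise ValueError(f"lut_axis {lut_axis} is out of range for shape {lut_shape}")
    lut_axis %= ndim  # normalise negative axes
    # Every axis except lut_axis is a grid axis and must have the same length.
    grid = lut_shape[:lut_axis] + lut_shape[lut_axis + 1:]
    if len(set(grid)) != 1:
        raise ValueError(f"grid axes differ in length: {grid}")
    g = grid[0]
    if not _is_almost_power_of_two(g):
        raise ValueError(f"grid size {g} is not 2**k or 2**k + 1")
    return ndim - 1, lut_shape[lut_axis], g
```

With this, check_lut_shape((4, 17, 17, 17), 0) yields i=3, o=4, g=17, and nothing is ever inferred on the caller's behalf.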

Nice use of Counter, I need to get more acquainted with collections. And lose my fear of calling len more than once on the same sequence, it is an O(1) operation in Python. Good point also on the Zen of Python, but I'm leaning towards still writing this function, but call it only if lut_axis is not explicitly provided, and raise an error if the result cannot be unambiguously determined. Ambiguous shapes are a rare corner case, and most users don't know (and don't have to know) what lut_axis is.
– Jaime, Feb 6 '13 at 7:42
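The compromise described in this comment could be sketched roughly as follows: try every axis as the output axis, and accept the inference only when exactly one candidate survives. All function names and the choice of ValueError are hypothetical, not taken from the original code.

```python
def _is_almost_power_of_two(n):
    # Hypothetical helper: True for 2**k or 2**k + 1, with k >= 0.
    def is_pow2(m):
        return m > 0 and (m & (m - 1)) == 0
    return is_pow2(n) or is_pow2(n - 1)


def candidate_lut_axes(lut_shape):
    """Return every axis that could be the output axis for this shape."""
    lut_shape = tuple(lut_shape)
    candidates = []
    for axis in range(len(lut_shape)):
        grid = lut_shape[:axis] + lut_shape[axis + 1:]
        # The remaining axes must be non-empty, equal, and a valid grid size.
        if grid and len(set(grid)) == 1 and _is_almost_power_of_two(grid[0]):
            candidates.append(axis)
    return candidates


def resolve_lut_axis(lut_shape, lut_axis=None):
    """Use an explicit lut_axis if given; otherwise infer it or raise."""
    if lut_axis is not None:
        return lut_axis  # explicit wins; validating it is a separate step
    candidates = candidate_lut_axes(lut_shape)
    if len(candidates) != 1:
        raise ValueError(
            f"cannot infer lut_axis for shape {tuple(lut_shape)}: "
            f"candidates are {candidates}; pass lut_axis explicitly"
        )
    return candidates[0]
```

Note that under this scheme a two-axis shape such as (4, 17) is ambiguous as well, since both 4 and 17 are valid grid sizes, so it would require an explicit lut_axis just like (16, 16, 16).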