The Python implementation eagerly allocates intermediate arrays, each of which is a trivial combination of a small number of elements from existing arrays.

There are several ways to translate this into OCaml. Starting from scratch with the task of computing this wavelet transform, we would most likely write it in a C style using convolution:
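A minimal sketch of what such a C-style convolution might look like. It assumes a Daubechies D4 filter, periodic wraparound at the boundary and a power-of-two input length; the filter choice and the names `d4_step` and `d4` are illustrative assumptions rather than details taken from the original code.

```ocaml
(* Daubechies D4 low-pass coefficients (assumed: the text does not name the wavelet). *)
let h0, h1, h2, h3 =
  let s3 = sqrt 3.0 and d = 4.0 *. sqrt 2.0 in
  (1.0 +. s3) /. d, (3.0 +. s3) /. d, (3.0 -. s3) /. d, (1.0 -. s3) /. d

(* One level of the transform on the first [n] elements of [a] ([n] even).
   Afterwards a.(0 .. n/2-1) holds the smooth coefficients and
   a.(n/2 .. n-1) holds the detail coefficients. *)
let d4_step (a : float array) (n : int) : unit =
  let half = n / 2 in
  let tmp = Array.make n 0.0 in
  for i = 0 to half - 1 do
    let j = 2 * i in
    let x0 = a.(j) and x1 = a.(j + 1) in
    (* Periodic boundary: indices past the end wrap around to the start. *)
    let x2 = a.((j + 2) mod n) and x3 = a.((j + 3) mod n) in
    tmp.(i) <- h0 *. x0 +. h1 *. x1 +. h2 *. x2 +. h3 *. x3;
    tmp.(half + i) <- h3 *. x0 -. h2 *. x1 +. h1 *. x2 -. h0 *. x3
  done;
  Array.blit tmp 0 a 0 n

(* Full transform: repeat the step on the shrinking smooth half.
   Assumes the length of [a] is a power of two. *)
let d4 (a : float array) : unit =
  let n = ref (Array.length a) in
  while !n >= 4 do
    d4_step a !n;
    n := !n / 2
  done
```

Each level touches every input element a fixed number of times and allocates a single scratch array, which is the property that distinguishes this style from the slicing formulation.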

This convolution-based approach is 5x faster than the Python version. However, some people argued that the comparison is unfair because the implementations in the two languages use different algorithms.

We can also mimic the slicing approach used by the Python code. A naive translation generates many intermediate arrays (7 array allocations per iteration), but this is easily reduced by composing closures rather than allocating new arrays (a deforestation optimisation):
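One way the closure-composition idea could be sketched is shown here. A "view" is a length paired with an index function, so slicing, scaling and adding build new closures instead of new arrays, and only the final `to_array` call allocates. The `view` type, the helper names, the D4 coefficients and the periodic boundary are all assumptions made for illustration, not the original listing.

```ocaml
(* Daubechies D4 low-pass coefficients (assumed, as the text does not name the wavelet). *)
let h0, h1, h2, h3 =
  let s3 = sqrt 3.0 and d = 4.0 *. sqrt 2.0 in
  (1.0 +. s3) /. d, (3.0 +. s3) /. d, (3.0 -. s3) /. d, (1.0 -. s3) /. d

(* A lazy "array": a length plus a function from index to element. *)
type view = { len : int; get : int -> float }

let of_array (a : float array) : view =
  { len = Array.length a; get = (fun i -> a.(i)) }

(* Analogue of a Python slice a[off::2], with periodic wraparound. *)
let stride2 (off : int) (v : view) : view =
  { len = v.len / 2; get = (fun i -> v.get ((2 * i + off) mod v.len)) }

(* Elementwise operations compose closures; nothing is allocated per element. *)
let scale (c : float) (v : view) : view =
  { v with get = (fun i -> c *. v.get i) }

let add (u : view) (v : view) : view =
  { len = min u.len v.len; get = (fun i -> u.get i +. v.get i) }

(* The only place where an array is actually allocated. *)
let to_array (v : view) : float array = Array.init v.len v.get

(* One transform level: two allocations (smooth and detail) in total. *)
let d4_level (a : float array) : float array * float array =
  let v = of_array a in
  let s k = stride2 k v in
  let smooth =
    add (add (scale h0 (s 0)) (scale h1 (s 1)))
        (add (scale h2 (s 2)) (scale h3 (s 3))) in
  let detail =
    add (add (scale h3 (s 0)) (scale (-. h2) (s 1)))
        (add (scale h1 (s 2)) (scale (-. h0) (s 3))) in
  (to_array smooth, to_array detail)
```

The expression for `smooth` reads much like the slicing formulation, but each intermediate "array" is just a closure, so the cost of the abstraction is a chain of function calls per element rather than a fresh array per operation.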