On that topic, D's arrays would play nicer with both
refcounting *and* modern garbage collectors if they were
structured as base, offset, length instead of start, length.

That might be slower in some cases, since a slice would then no
longer fit in two registers.
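
To make the trade-off concrete, here is a rough sketch of the two layouts. `FatSlice`, `toSlice` and `sub` are made-up names for illustration, not anything in druntime. The size checks hold for the built-in slice (pointer + length); whether a three-word struct really gets passed in memory rather than registers depends on the target ABI, so treat that part as an assumption.

```d
// Hypothetical "base, offset, length" slice; NOT a druntime type.
struct FatSlice(T)
{
    T* base;        // start of the underlying allocation
    size_t offset;  // index of the first visible element, from base
    size_t length;  // number of visible elements

    // View as an ordinary D slice (pointer to first element + length).
    T[] toSlice()
    {
        return (base + offset)[0 .. length];
    }

    // Sub-slicing adjusts offset and length but never loses base.
    FatSlice sub(size_t lo, size_t hi)
    {
        assert(lo <= hi && hi <= length);
        return FatSlice(base, offset + lo, hi - lo);
    }
}

unittest
{
    int[] a = [10, 20, 30, 40, 50];
    auto f = FatSlice!int(a.ptr, 0, a.length);
    auto g = f.sub(1, 4);                 // elements 20, 30, 40

    assert(g.toSlice() == [20, 30, 40]);
    assert(g.base is a.ptr);              // base survives sub-slicing
    assert(a[1 .. 4].ptr !is a.ptr);      // built-in slice keeps only the interior pointer

    // Two machine words versus three.
    static assert((int[]).sizeof == 2 * size_t.sizeof);
    static assert(FatSlice!int.sizeof == 3 * size_t.sizeof);
}
```

With the base always at hand, a refcount or GC header can be reached without any lookup, at the cost of one extra word per slice.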

You could put metadata just before the start of the array,
including the reference count.
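
A minimal sketch of that idea, assuming a malloc-backed block with a header placed directly in front of the elements. `RcHeader`, `rcAlloc`, `headerOf`, `retain` and `release` are hypothetical names, and the sketch assumes the element type needs no stricter alignment than malloc and the header provide.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical header stored immediately before element 0.
struct RcHeader
{
    size_t refCount;
    size_t capacity; // number of elements allocated
}

// Allocate header + n elements and return a slice over the elements.
// The elements are left uninitialized.
T[] rcAlloc(T)(size_t n)
{
    auto raw = cast(ubyte*) malloc(RcHeader.sizeof + n * T.sizeof);
    assert(raw !is null, "out of memory");
    *(cast(RcHeader*) raw) = RcHeader(1, n);
    return (cast(T*)(raw + RcHeader.sizeof))[0 .. n];
}

// Only valid when `first` points at element 0, not at an interior element.
RcHeader* headerOf(T)(T* first)
{
    return cast(RcHeader*)(cast(ubyte*) first - RcHeader.sizeof);
}

void retain(T)(T* first)
{
    ++headerOf(first).refCount;
}

// Returns true if this call dropped the last reference and freed the block.
bool release(T)(T* first)
{
    auto h = headerOf(first);
    if (--h.refCount == 0)
    {
        free(h);
        return true;
    }
    return false;
}

unittest
{
    int[] arr = rcAlloc!int(4);
    arr[] = 7;
    assert(headerOf(arr.ptr).refCount == 1);
    assert(headerOf(arr.ptr).capacity == 4);

    retain(arr.ptr);
    assert(headerOf(arr.ptr).refCount == 2);
    assert(!release(arr.ptr));
    assert(release(arr.ptr)); // last reference gone, block freed
}
```

The catch is visible in `headerOf`: for a sub-slice like `arr[1 .. 3]`, `.ptr` is an interior pointer, so subtracting the header size lands in the wrong place. That is the case where a stored base (or some way to look it up) is needed.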

Yes, but GC arrays already do that with GC metadata (alloc size)
without storing an offset in the slice, so in theory the same
technique could be used with RC too. It's a bit mysterious how the
base address is found, though. It would be nice to have some clear
docs on this to point to.
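
For what it's worth, the lookup is at least exposed through `core.memory`: `GC.addrOf` returns the base address of the GC block containing a pointer, interior or not. My understanding is that the collector works this out from which pool and size class the address falls in, but take that as an assumption rather than documentation. A small sketch using only `GC.addrOf` and `GC.sizeOf`:

```d
import core.memory : GC;

unittest
{
    auto a = new int[](100);
    int[] s = a[10 .. 20];           // s.ptr is an interior pointer

    // Ask the GC which block the interior pointer belongs to.
    void* base = GC.addrOf(s.ptr);
    assert(base !is null);
    assert(base is GC.addrOf(a.ptr)); // same block as the original array
    assert(base <= cast(void*) a.ptr);

    // The block is at least big enough for the elements; the GC may
    // round up and keep its own array bookkeeping in the same block.
    assert(GC.sizeOf(base) >= 100 * int.sizeof);

    // Memory the GC does not own yields null.
    int local;
    assert(GC.addrOf(&local) is null);
}
```

Note that this is a query into the allocator, not a pointer subtraction, so an RC scheme that wants the same trick for non-GC memory would need an allocator offering a comparable lookup.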