The Data::BitStream and Data::BitStream::XS modules may be of some help (shameless plug since I'm the author). They're made for supporting the sort of compression you're looking at.

Using Adaptive Rice coding yields about 6.8:1, and Exponential Golomb with the best parameters about 6.9:1. Running xz/bzip2/gzip over either result doesn't help much, yielding only about 7:1. So sadly that's not a "significant" improvement vs. 6.5:1.
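To give a feel for what a Rice code does with signed deltas, here is a minimal Python sketch. It is not the Data::BitStream API and it is not the adaptive variant: it uses a fixed parameter k, a zigzag mapping for the sign (my assumption, there are other ways to handle negatives), and a string of '0'/'1' characters instead of a real bit stream. All function names are made up for illustration.

```python
def zigzag(n):
    """Map signed to unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u):
    """Inverse of zigzag()."""
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(values, k):
    """Encode signed integers with a fixed Rice parameter k.

    Each value becomes: quotient in unary (q ones and a zero),
    then the low k bits of the remainder in binary.
    """
    pieces = []
    for v in values:
        u = zigzag(v)
        q = u >> k
        r = u & ((1 << k) - 1)
        pieces.append("1" * q + "0")
        if k:
            pieces.append(format(r, "0{}b".format(k)))
    return "".join(pieces)


def rice_decode(bits, k, count):
    """Decode `count` values written by rice_encode() with the same k."""
    out = []
    pos = 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the 0 that ends the unary part
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | r))
    return out


deltas = [0, -1, 3, 2, -4, 1]
encoded = rice_encode(deltas, 1)
decoded = rice_decode(encoded, 1, len(deltas))
```

Small deltas cost only a few bits each, which is where the gain over fixed-width packing comes from. An adaptive version would adjust k as it goes based on the values seen so far, with the decoder making the same adjustments.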

One advantage of this is that, like your packing, it writes the values already compressed, so no second stage of running xz is needed.

There are lots of ways you could tweak the variable length output. Adaptive Rice works pretty well without a lot of thought about the parameters (the initial values used don't matter much as they'll adjust quickly). You could get complicated with Comma / Taboo, StartStop / StartStepStop, etc. codes if you wanted. It's also easy to do lossy coding by shifting the deltas before encoding and back again after decoding (taking care to keep symmetry in compressor/decompressor).
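As a rough illustration of the lossy idea, here is a Python sketch of one way to read "keeping symmetry": the compressor computes each delta against the value the decompressor will reconstruct, not against the original sample, so the error can't accumulate. The function names and the choice of a plain right shift are my assumptions for the example, not something from the modules.

```python
def lossy_deltas(samples, shift):
    """Return shifted deltas, dropping the low `shift` bits of each one."""
    prev = 0  # what the decompressor will have reconstructed so far
    out = []
    for s in samples:
        d = (s - prev) >> shift  # Python's >> floors, also for negatives
        out.append(d)
        prev += d << shift       # mirror the decompressor exactly
    return out


def reconstruct(deltas, shift):
    """Rebuild approximate samples from the shifted deltas."""
    prev = 0
    out = []
    for d in deltas:
        prev += d << shift
        out.append(prev)
    return out


samples = [100, 103, 109, 108, 120, 95]
shift = 2
small = lossy_deltas(samples, shift)
approx = reconstruct(small, shift)
errors = [s - a for s, a in zip(samples, approx)]
```

With this arrangement every reconstructed value stays within 2**shift - 1 of the original. If the compressor instead took deltas between original samples and shifted those, the dropped bits would add up over the stream. The shifted deltas would then go through whatever variable length code you picked.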