Welcome to the Encode's Forum!

Scores on my testset for paq8pxd_v73: very nice improvements overall, especially for the K.WAD file. In total, 25 KB of gain.
Option -x (second table) gives an additional 16 KB of gain over the -s test; the gains are visible for almost every file.
The time penalty for the -x option compared to -s is about 21%.

Overall, improved compression, and slight reduction in compression time for v73!

In single mode there is a file overhead of about 50 bytes per file vs the px version, like when you compress a 1-byte file.
For this single-mode test I think it's about 9000 bytes total. Not sure how much overhead there is on the tarred file, probably 100 bytes total for the input data (data that can't be compressed).
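A rough sketch of how that overhead adds up. The 50 bytes per file and the ~9000 bytes total are the approximate figures from the post above; the file count is only what those two numbers imply, not something stated in the thread, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate of single-mode container overhead.
# Both constants are approximate figures quoted in the post.
PER_FILE_OVERHEAD = 50   # bytes of overhead per file vs the px version
TOTAL_OVERHEAD = 9000    # bytes of overhead over the whole single-mode test


def implied_file_count(total: int, per_file: int) -> int:
    """Number of files implied by a total overhead and a per-file overhead."""
    return round(total / per_file)


def total_overhead(n_files: int, per_file: int = PER_FILE_OVERHEAD) -> int:
    """Total overhead in bytes for n_files compressed one by one."""
    return n_files * per_file


if __name__ == "__main__":
    n = implied_file_count(TOTAL_OVERHEAD, PER_FILE_OVERHEAD)
    print(f"implied number of files: {n}")
    print(f"overhead for {n} files: {total_overhead(n)} bytes")
```

So when comparing single-mode totals between px and pxd, a difference of this size could be overhead alone rather than a modelling difference.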

@kaitz - I have a question: does the -x option work at level 15 (-x15)? I've tested the enwik8 files, and the scores for -s15 and -x15 are identical. Timings are also similar.
On my testset, for textual files there are differences between the -s9 and -x9 options.

It's used in the stream where all the default data goes. Also for text, as humans tend to produce a lot of it. :)
Some images have headers, and it may give somewhat better compression on those, but that's not very useful. It all depends (how many files, etc.).

This needs time-consuming testing to be actually useful on other types of data.
My test version shows which contexts are mostly bad for the given data over time.

I think there have been good enough improvements from someone like me. But I still wonder.

Maybe it's because the tests take longer, and additionally compressing the tarred file doubles the test time.
For paq8px/pxd the time is still quite reasonable, but for cmix it's an additional 3-day test in one go, which can sometimes be hard to handle.

On the other hand, the tarred file test doesn't give you any information about improvements on particular files.

According to my estimate for the MaximumCompression corpus, 1'000'000 is a very ambitious challenge... right now the best scores over all files sum to 5'872'598 bytes. Bringing all techniques from all compressors (especially paq8px and paq8pxd for FlashMx.pdf and vcfiu.hlp) into the currently best cmix, there is a chance to get 5'800'000, maybe 5'700'000 bytes. More parsers and better NN compression might give an additional 100-300 KB of gain, which lands us at about 5'400'000-5'500'000 bytes. Going lower would need a completely new technique or specialized parsers for all files, but then it could be 4'000'000 bytes. Hmmmm... 1'000'000 looks impossible to me right now.

The attached table has the best scores on the MaximumCompression corpus for most of the best compressors.
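To put those targets in perspective, here is a small sketch of the reduction each one would require relative to the current best total. The byte figures are the ones quoted in the post; the function name is just illustrative.

```python
# How far each target is from the current best MaximumCompression total.
BEST_TOTAL = 5_872_598  # bytes, sum of the best per-file scores quoted above

# Targets mentioned in the post, in bytes.
TARGETS = [5_800_000, 5_700_000, 5_500_000, 5_400_000, 4_000_000, 1_000_000]


def reduction_needed(current: int, target: int) -> float:
    """Percentage by which `current` must shrink to reach `target`."""
    return (1 - target / current) * 100


if __name__ == "__main__":
    for target in TARGETS:
        pct = reduction_needed(BEST_TOTAL, target)
        print(f"{target:>9} bytes: {pct:5.1f}% smaller than the current best")
```

The 5'400'000 to 5'800'000 estimates are single-digit percentage gains, while 1'000'000 would mean shrinking the current best by roughly a factor of 5.9.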

The .hlp file has some LZ-compressed data; recompress it. And if you target only this set you can gain a lot. On JPEG a 1-2 KB gain is possible. On im24, 4 KB is possible. On dict, 3 KB is possible. But why? And pxd splits data, and sometimes that is bad for compression. Silesia -> samba. Do you care about compressor/decompressor size, memory usage, time...?
The px version has gains in different places, but consider time/mem/etc...
In pxd it's clear that sometimes adding more models makes it worse: x vs s option. Like adding dmc back to the jpeg stream makes it worse... (one present in pxd).
It's hard.

126'211'491 - enwik9_1423 -s15 by Paq8pxd_v74_AVX2, time 67'980,66s.
126'599'124 - enwik9_1423.drt -s15 by Paq8pxd_v74_AVX2, time 97604,05s - I don't know why the DRT file takes 44% more time to compress...
125'752'479 - enwik9_1423 -x15 by Paq8pxd_v74_AVX2, time 91787,59s - a record for the whole paq8pxd series!
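For anyone who wants to check the relative numbers, a minimal sketch that parses the figures as written above (apostrophe as thousands separator, comma as decimal mark) and compares each run against the plain -s15 run. The run labels and helper names are made up for illustration; only the sizes and times come from the post.

```python
# Compare the three enwik9 runs of Paq8pxd_v74_AVX2 quoted above.

def parse_number(text: str) -> float:
    """Parse figures like 126'211'491 or 67'980,66 into a float."""
    return float(text.replace("'", "").replace(",", "."))


def percent_change(new: float, base: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100


# label -> (compressed size in bytes, compression time in seconds)
RUNS = {
    "enwik9 -s15": (parse_number("126'211'491"), parse_number("67'980,66")),
    "enwik9.drt -s15": (parse_number("126'599'124"), parse_number("97604,05")),
    "enwik9 -x15": (parse_number("125'752'479"), parse_number("91787,59")),
}

if __name__ == "__main__":
    base_size, base_time = RUNS["enwik9 -s15"]
    for label, (size, seconds) in RUNS.items():
        print(
            f"{label:16} size {percent_change(size, base_size):+.2f}%  "
            f"time {percent_change(seconds, base_time):+.1f}%"
        )
```

This reproduces the 44% extra time for the DRT file, and puts the -x15 run at about 35% more time than -s15 for 459'012 bytes saved.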

16'260'265 - enwik8 -x8 by Paq8pxd_v75_AVX2 - 0.12% improvement
15'912'509 - enwik8 -x15 by Paq8pxd_v75_AVX2 - 0.10% improvement -> this score could translate to about 125'58x'xxx bytes for enwik9_1423
15'859'187 - enwik8.drt -x15 by Paq8pxd_v75_AVX2 - 0.13% improvement, best score for the paq8pxd series

As for time, the changes vary: from 5-7% up to 18% quicker (for enwik8.drt -x15)!