Talk:TransOgg Page

Checksum

I would recommend changing the checksum polynomial.

For the range of page sizes possible in transogg the CRC-32C (Castagnoli) polynomial offers superior error detection ability over the 802.3 CRC32 that we currently have specified. This polynomial is also used by SCTP and iSCSI, so I could wave my arms and suggest that it might be more likely to get hardware assistance (ethernet CRC is usually done on the adaptor, but the SCTP and iSCSI crc can't be easily offloaded) ... but no armwaving is required: i7 has an instruction for the Castagnoli polynomial. There are superior polys to the iSCSI one, at least for some sizes relevant to us[1], but they lack obvious prospects of hardware assistance.

Upsides to switching to Castagnoli CRC:

Faster CRC on some hardware

Increased error detection

Upsides to switching to another improved CRC:

(Possibly) Increased error detection over Castagnoli

Upsides to staying with current CRC:

Decreased implementation size and complexity for something also supporting the ogg format

Arrange checksum order for fast page edits of essential data

IP allows fast editing of packets by using a checksum that isn't order-sensitive, so you can subtract the old value and add the new value of any given field directly to the checksum without having to re-sum the whole packet. CRC allows the same thing using EOR, but the modification to the checksum depends both the data and its distance to the end of the checked block. If data which is likely to be edited in a repetitive way over many pages is stored at a fixed distance to the end of the checksum block then the checksum delta can be pre-computed and applied directly without recalculation.

checksum different components separately and EOR them together in an order-insensitive way, or EOR them with length-agnostic perturbations which preserve the useful error-detection features of a CRC.

Note that some CRC libraries and accelerators will have overheads that restrict their ability to accumulate data in a random order. They're likely to prefer well-aligned data and long runs of conventionally-ordered data. An ideal solution wouldn't interfere with their operation unnecessarily. --Gumboot 12:14, 4 June 2010 (UTC)

I guess we should figure out what fields people would want to change cheaply. The lacing, for example, isn't going to get changed without changing the data. I think the obvious cases include the DTS (among other things) and that's currently a variable length field. --Gmaxwell 15:06, 4 June 2010 (UTC)