Introduction

CRC (Cyclic Redundancy Check) is commonly used to confirm that a file was not corrupted during download. While convenient, the check takes time, because the data has to be read back off the disk after the download completes. It would be better if applications computed the CRC on the fly during the download, using otherwise idle CPU time and avoiding the extra disk read.

Downloading is done at a relatively leisurely pace (typically anywhere between 5-300kb/s) and over a long period of time, so it makes for a great opportunity to process data without impeding performance. Although ugly and impractical for most applications (it'd be safe to assume that most users think they've "broken the intarweb" when they see a hex number), displaying the CRC to the user immediately as a download finishes can often be a well-appreciated bonus.

This class passively calculates CRCs as data passes through it, ready to be used at any time.

Using the code

To calculate the CRC of a file as it is read to the end, create a new CrcStream passing the FileStream as an argument, and use the ReadCrc property to retrieve the CRC. Be sure to use the new CrcStream instead of the file stream to read from the file; otherwise the checksum will not be calculated.

    ''' <summary>
    ''' Gets the CRC checksum of the data that was read by the stream thus far.
    ''' </summary>
    Public ReadOnly Property ReadCrc() As UInteger
        Get
            Return m_readCrc 'Xor &HFFFFFFFF
        End Get
    End Property

    ''' <summary>
    ''' Gets the CRC checksum of the data that was written to the stream thus far.
    ''' </summary>
    Public ReadOnly Property WriteCrc() As UInteger
        Get
            Return m_writeCrc ' Xor &HFFFFFFFF
        End Get
    End Property
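A minimal usage sketch of reading a file through the wrapper. It assumes the article's CrcStream class is already part of the project and that it exposes the standard Stream.Read method. The file name "download.bin" is only a placeholder.

```vbnet
Imports System
Imports System.IO

Module CrcStreamUsage
    Sub Main()
        ' "download.bin" is a placeholder path for this example.
        Using source As New FileStream("download.bin", FileMode.Open, FileAccess.Read)
            ' CrcStream is the class from this article; it wraps the file stream.
            Dim crc As New CrcStream(source)
            Dim chunk(4095) As Byte

            ' Read through the CrcStream, not the FileStream, so every byte is counted.
            While crc.Read(chunk, 0, chunk.Length) > 0
                ' Process the data here, for example by copying it elsewhere.
            End While

            ' Once the end of the file is reached, ReadCrc covers the whole file.
            Console.WriteLine("CRC: {0:X8}", crc.ReadCrc)
        End Using
    End Sub
End Module
```

The same pattern should apply to writing: write through the CrcStream and read WriteCrc when done.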

It works really well with big files, especially if you're already reading or writing them for other purposes. The main idea of this class is that everything is done on-the-fly, thus getting rid of any significant overhead and wait times.

I wanted to be able to read sections of a file and calculate the CRC for only that part, so I added a ReadLine method. I thought I'd post it here in case anyone else finds it useful. It's a bit of a hack but it works!

The file I am checking contains blocks of text, separated with blank lines. I want a CRC of each block so I use the ReadLine method to read up to the next blank line, get the checksum, reset the CRC by calling ResetChecksum() and continue reading the file to the next blank line.
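The loop described above might look roughly like this. The post does not show the ReadLine signature, so this sketch assumes a hypothetical ReadLine() that returns the next line as a String and Nothing at the end of the file. ResetChecksum() is used as named in the post, and "blocks.txt" is a placeholder path.

```vbnet
Imports System
Imports System.IO

Module BlockCrcExample
    Sub Main()
        ' "blocks.txt" is a placeholder; CrcStream is the article's class with the
        ' ReadLine/ResetChecksum additions described in the post.
        Using source As New FileStream("blocks.txt", FileMode.Open, FileAccess.Read)
            Dim crc As New CrcStream(source)
            Dim blockHasText As Boolean = False

            ' Assumption: ReadLine() returns Nothing once the end of the file is reached.
            Dim line As String = crc.ReadLine()
            While line IsNot Nothing
                If line.Length > 0 Then
                    blockHasText = True
                Else
                    ' A blank line closes the current block, if there is one.
                    If blockHasText Then
                        Console.WriteLine("Block CRC: {0:X8}", crc.ReadCrc)
                    End If
                    crc.ResetChecksum()
                    blockHasText = False
                End If
                line = crc.ReadLine()
            End While

            ' The last block may not be followed by a blank line.
            If blockHasText Then
                Console.WriteLine("Block CRC: {0:X8}", crc.ReadCrc)
            End If
        End Using
    End Sub
End Module
```

Whether the line terminators and the closing blank line count toward each block's CRC depends on how ReadLine consumes bytes from the stream, which the post does not specify.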

Anthony


The code works fine as far as I can tell. I just gave it a quick run through reading a file. Definitely a lot more convenient if you know you're going to be dealing with a file rather than some other type of stream.

A couple of reasons it won't affect performance:
1. When you override a member of a class, internally it does the same thing as calling two methods one after another and passing the parameters along; the runtime just does that bookkeeping and method calling for you. Even if it does make a slight difference, it would only be a couple dozen CPU cycles per call at most. Each call probably takes at least a few million cycles, so the difference is immeasurable.
2. On an even broader scale, the time it takes to load a file off the disk generally dwarfs the time it takes to calculate the CRC (modern hard drives are that slow). The CRC cost only becomes a bit more significant on later reads of the same file, because Windows caches the file in memory after the first read.
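To make the first point concrete, here is a minimal sketch of a pass-through Read override. It is not the article's CrcStream: the class name CrcReadSketch is hypothetical, it is read-only, and it assumes the standard reflected CRC-32 with a final inversion, which may differ from how the article's class finalizes its value.

```vbnet
Imports System
Imports System.IO

' Hypothetical read-only sketch; not the article's CrcStream.
Public Class CrcReadSketch
    Inherits Stream

    Private Shared ReadOnly Table As UInteger() = BuildTable()
    Private ReadOnly m_inner As Stream
    Private m_crc As UInteger = &HFFFFFFFFUI

    Public Sub New(inner As Stream)
        m_inner = inner
    End Sub

    ' Lookup table for the standard reflected CRC-32 polynomial.
    Private Shared Function BuildTable() As UInteger()
        Dim t(255) As UInteger
        For i As Integer = 0 To 255
            Dim c As UInteger = CUInt(i)
            For k As Integer = 1 To 8
                c = If((c And 1UI) <> 0UI, &HEDB88320UI Xor (c >> 1), c >> 1)
            Next
            t(i) = c
        Next
        Return t
    End Function

    ' CRC-32 of everything read so far, with the usual final inversion.
    Public ReadOnly Property ReadCrc As UInteger
        Get
            Return m_crc Xor &HFFFFFFFFUI
        End Get
    End Property

    ' The override: one forwarded call, then a table lookup per byte.
    Public Overrides Function Read(buffer As Byte(), offset As Integer, count As Integer) As Integer
        Dim n As Integer = m_inner.Read(buffer, offset, count)
        For i As Integer = offset To offset + n - 1
            m_crc = Table(CInt((m_crc Xor buffer(i)) And &HFFUI)) Xor (m_crc >> 8)
        Next
        Return n
    End Function

    ' The remaining members only satisfy the abstract Stream contract.
    Public Overrides ReadOnly Property CanRead As Boolean
        Get
            Return True
        End Get
    End Property
    Public Overrides ReadOnly Property CanSeek As Boolean
        Get
            Return False
        End Get
    End Property
    Public Overrides ReadOnly Property CanWrite As Boolean
        Get
            Return False
        End Get
    End Property
    Public Overrides ReadOnly Property Length As Long
        Get
            Return m_inner.Length
        End Get
    End Property
    Public Overrides Property Position As Long
        Get
            Return m_inner.Position
        End Get
        Set(value As Long)
            Throw New NotSupportedException()
        End Set
    End Property
    Public Overrides Sub Flush()
    End Sub
    Public Overrides Function Seek(offset As Long, origin As SeekOrigin) As Long
        Throw New NotSupportedException()
    End Function
    Public Overrides Sub SetLength(value As Long)
        Throw New NotSupportedException()
    End Sub
    Public Overrides Sub Write(buffer As Byte(), offset As Integer, count As Integer)
        Throw New NotSupportedException()
    End Sub
End Class

Module Demo
    Sub Main()
        Dim data As Byte() = System.Text.Encoding.ASCII.GetBytes("123456789")
        Using crc As New CrcReadSketch(New MemoryStream(data))
            Dim chunk(3) As Byte
            ' Drain the stream in small chunks; the CRC accumulates as a side effect.
            While crc.Read(chunk, 0, chunk.Length) > 0
            End While
            Console.WriteLine("{0:X8}", crc.ReadCrc) ' CBF43926, the standard CRC-32 check value
        End Using
    End Sub
End Module
```

The only work added per Read call is the forwarded call and one table lookup per byte, which is small next to the cost of the underlying disk or network read.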

Thanks for your contribution! Just a detail: strictly speaking, CRC is not a checksum algorithm. While checksums (used in internet protocols like IPv4) are one class of error detection codes, CRC is another one (used in LAN protocols like Ethernet). Parity checks form yet another such class...

"A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum, which is a small number of bits, from a large block of data, such as a packet of network traffic or a block of a computer file, in order to detect errors in transmission or storage."

Thinking about it, I don't remember a better name for the generated bits. I checked some technical papers: those bits are often called "CRC bits" or just the "CRC". Like you mentioned, "checksum" is very popular, too, maybe due to the lack of a better name. Again Wikipedia, under Checksum: "This article is about checksums calculated using addition. The term 'checksum' is sometimes used in a more general sense to refer to any kind of redundancy check."

Since everyone understands what you mean by the term "checksum", it doesn't matter; ignore my post above (I don't want to be captious).