I use Backblaze to back up my computer. You restore files from your backups by selecting files to restore, which are then packed into large zip files. Of course, it's fairly rare to be able to download a 500GB zip file without a connection interruption, so a sane developer would implement support for the HTTP Range header to allow users to resume downloads.

They have not done so. Instead, they have a boutique download utility that specifies the requested byte ranges by emulating a POSTed HTML form. This utility does all the stuff you'd expect a normal download manager to do, like downloading with multiple connections at a time and resuming partially completed downloads, but due to some dodgy design issues (like opening a fully-fledged process, not a thread, for each 40MB block) it is rather inefficient on fast (>100 Mbps) connections. It also is Windows-exclusive.
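For illustration, the block-splitting idea can be sketched with a thread pool instead of one process per block. This is only a rough sketch under stated assumptions: "40MB" is taken to mean 40 MiB, and `fetch_block` is a hypothetical callable standing in for whatever request actually retrieves a byte range; nothing here reflects the real utility's internals.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: "40MB" is interpreted as 40 MiB.
BLOCK_SIZE = 40 * 1024 * 1024


def block_ranges(total_size, block_size=BLOCK_SIZE):
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    return [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]


def download_all(total_size, fetch_block, workers=4):
    """Fetch every block on a thread pool and return results in block order.

    fetch_block(start, end) is a hypothetical callable supplied by the caller.
    """
    ranges = block_ranges(total_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: fetch_block(*r), ranges))
```

Because each block is identified only by its byte range, a resumed download would just need to skip the ranges that are already on disk.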

I'm trying to write an open source replacement in Node.js that removes some of the suck, but I'm up against a roadblock: one of the fields the utility sends in its POST requests is called "bzsanity" and is a 16-bit checksum over the account email address. Unfortunately, I can't figure out what the algorithm is. Maybe I'm just dumb, but I'm hoping you guys can help me out.

Here are some checksum values:

test@test.com: 028a

Test@test.com: 4152

test2@test.com: 3d0f

test: 494c

aa: acf2

ab: aaad

ac: 8e4d

ad: 0436

"" (empty string): a93e

a: ce7f

b: 1a1e

c: 1540

d: 6c57

If you want more test vectors, I can probably deliver. I've tried adding the bytes in an accumulator and a few variants of CRC-16, and those approaches don't work.
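For anyone who wants to repeat those attempts, here is a minimal sketch of a harness that checks candidate functions against the vectors above. The candidate list is my own choice of common 16-bit checksums, and the inputs are assumed to be hashed as UTF-8 bytes; neither is confirmed to match what the utility does.

```python
import binascii

# Test vectors copied from the list above (input -> observed bzsanity value).
VECTORS = {
    "test@test.com": 0x028A,
    "Test@test.com": 0x4152,
    "test2@test.com": 0x3D0F,
    "test": 0x494C,
    "aa": 0xACF2,
    "ab": 0xAAAD,
    "ac": 0x8E4D,
    "ad": 0x0436,
    "": 0xA93E,
    "a": 0xCE7F,
    "b": 0x1A1E,
    "c": 0x1540,
    "d": 0x6C57,
}


def byte_sum(data):
    """Add all bytes into a 16-bit accumulator."""
    return sum(data) & 0xFFFF


def crc16_arc(data):
    """CRC-16/ARC: reflected polynomial 0xA001, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


# Candidates are illustrative; binascii.crc_hqx uses polynomial 0x1021.
CANDIDATES = {
    "byte sum mod 2^16": byte_sum,
    "CRC-16/XMODEM": lambda d: binascii.crc_hqx(d, 0x0000),
    "CRC-16/CCITT-FALSE": lambda d: binascii.crc_hqx(d, 0xFFFF),
    "CRC-16/ARC": crc16_arc,
}


def matching_candidates(vectors=VECTORS, candidates=CANDIDATES):
    """Return the names of candidates that reproduce every test vector."""
    return [
        name
        for name, fn in candidates.items()
        if all(fn(text.encode("utf-8")) == want for text, want in vectors.items())
    ]


print(matching_candidates())
```

All four candidates already fail on the empty string: they return 0x0000 or 0xFFFF, while the observed value is a93e. Any simple sum or CRC would therefore need an unusual initial value or final XOR to fit.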

Maybe a few years too late, but did you ever finish making a replacement in Node.js? I've got several 300-500GB zip files I need to download, and even with gigabit internet the download speeds are slow. (I'm using aria2c with 1 thread because, as you said, there's no HTTP Range header support. I'm on Linux, so I can't use their download apps.)
– Mint Aug 28 at 9:13

@Mint I did, and then I promptly forgot about it for three years. Thanks for making sure I followed through, if a bit belatedly.
– Reid Rankin Aug 28 at 18:47

Legend!!! I had little hope that I'd ever hear back, let alone get a fully working Node.js app! Gave it a try just now and it does indeed work: "(513.61 Mbps instantaneous, 491.52 Mbps total, ETA: an hour)". An hour sure beats the ~10h I was getting. Thank you so much for sharing.
– Mint Aug 29 at 0:11

IDA Pro + OllyDbg. Because of the compiler they used, it was nearly impossible to figure this out without dynamic analysis, so don't kick yourself too hard ;)
– Jason Geffner Mar 9 '16 at 22:55

@JasonGeffner Awesome work, although IMO it is better if this site does not become a "do my reversing for me" site, but rather provides advice on how to tackle difficulties with reversing.
– Vitaly Osipov Mar 11 '16 at 20:25

@VitalyOsipov: My hope is to show people that for all these "figure out the checksum by looking at sample inputs and outputs" questions, you almost always need to reverse engineer the code itself. Expect me to point people to this answer (and this specific comment) going forward ;)
– Jason Geffner Mar 11 '16 at 20:27