I find it hard to believe that I haven't written this up somewhere on the web already. Oh well, here goes.

Each raw record is a single line, one record per file.

The first field is the sha1 sum of the file;
The second field is the volume ID (in this case: "Kindle");
This is followed by one or more fields delimited by "|";
The last field is the path and filename of the file that was checksummed in the first field.

The (variable number of) fields between the volume ID and the last field are "containers" (usually archives).

Here is a (short) example, with the single line broken at each "|" for posting:

The "poor man's query tool" for this database, grep:

Code:

knoppix:cat$ grep 'udev/rules.d/60' Amazon_2012.02.18_sha1.cat | sort

This returns a sorted list of the matching udev rules in all Amazon source code releases.

Code:

7748689817fdd946e73e31b703a40f421249646a Kindle|

sha1sum and volume id

Code:

/kdx/Kindle_src_2.1.1_351050064.tar.gz|/gplrelease/udev-112.tar.bz2|

volume path and file name (a compressed archive)
which in turn contains another compressed archive
which in turn contains this file (the one that was sha1sum'd):

Code:

/udev-112/etc/udev/rules.d/60-persistent-input.rules
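
For anyone who wants to take a record apart in a script rather than by eye, here is a minimal sketch using plain shell parameter expansion. It assumes a single space between the checksum and the volume ID, as in the example above. The function name split_record is made up for this post and is not part of the catalog script.

```shell
# Sketch only: split one catalog record into its parts.
# Assumes a single space between the checksum and the volume ID,
# as in the example record above.
split_record() {
    record=$1
    sha1=${record%% *}        # text before the first space
    rest=${record#* }         # volume ID plus the "|"-delimited fields
    volume=${rest%%|*}        # text before the first "|"
    fields=${rest#*|}         # containers (if any) plus the final path
    path=${fields##*|}        # text after the last "|"
    case $fields in
        *"|"*) containers=${fields%|*} ;;   # everything before the final path
        *)     containers= ;;               # file sits directly on the volume
    esac
    printf 'sha1: %s\nvolume: %s\ncontainers: %s\nfile: %s\n' \
        "$sha1" "$volume" "$containers" "$path"
}

split_record '7748689817fdd946e73e31b703a40f421249646a Kindle|/kdx/Kindle_src_2.1.1_351050064.tar.gz|/gplrelease/udev-112.tar.bz2|/udev-112/etc/udev/rules.d/60-persistent-input.rules'
```

The container chain is kept as one "|"-joined string because the number of containers varies from record to record.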

This serves the needs of a developer who is wondering: "which models/releases use this identical framebuffer driver?"

All that person needs to do is invent a grep expression and ask. ;-)
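
As a rough sketch of such a question: the function below prints one "checksum, outermost container" pair per matching record, so releases that ship the identical file land on adjacent lines with the same checksum. The name same_file_report and the pattern in the usage comment are placeholders made up for this post, and the sketch assumes the outermost container is the release bundle, as in the example record.

```shell
# Sketch only: list "checksum outermost-container" pairs for every record
# matching a pattern. Identical files share a checksum, so they sort together.
same_file_report() {
    pattern=$1
    catalog=$2
    grep -- "$pattern" "$catalog" |
        awk -F'|' '{ split($1, head, " "); print head[1], $2 }' |
        sort -u
}

# Usage (the pattern is a placeholder, adjust to the file you care about):
#   same_file_report 'udev/rules.d/60' Amazon_2012.02.18_sha1.cat
```

If two releases show the same checksum, they shipped byte-identical copies of that file. Different checksums mean the file changed between them.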

Note:
My dream was to import these catalogs into a MySQL database, but I have
been running this script for nearly ten years now and have not 'gotten around to it' yet.

The record format is such that it can be imported into OpenOffice and searched there as a spreadsheet-based database
(and from there, OpenOffice could populate a for-real database).
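
Because the number of "|" fields varies per record, a fixed-column import is easier if the records are first flattened to four columns. The sketch below is one way to do that, not what the catalog script does: it writes tab-separated checksum, volume ID, container chain (still joined by "|") and file path. The name catalog_to_tsv is made up, and it assumes no tabs appear in the paths.

```shell
# Sketch only: flatten catalog records to four tab-separated columns:
# sha1, volume ID, container chain (joined by "|"), file path.
catalog_to_tsv() {
    awk -F'|' '{
        sp = index($1, " ")                 # space between sha1 and volume ID
        sha1 = substr($1, 1, sp - 1)
        volume = substr($1, sp + 1)
        chain = ""
        for (i = 2; i < NF; i++)            # fields between volume and path
            chain = chain (i > 2 ? "|" : "") $i
        printf "%s\t%s\t%s\t%s\n", sha1, volume, chain, $NF
    }' "$@"
}

# Usage:
#   catalog_to_tsv Amazon_2012.02.18_sha1.cat > catalog.tsv
```

A spreadsheet or a database bulk loader can then take the four columns as they are.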

As I write, the script is approaching the 1 1/2 million record mark, still running.

expand_tar_bz2 /mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults /taglib-1.5.tar.bz2 /taglib-1.5.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/taglib-1.5.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults/taglib-1.5.tar.bz2
bzip2: Compressed file ends unexpectedly; perhaps it is corrupted? *Possible* reason follows. bzip2: Inappropriate ioctl for device Input file = (stdin), output file = (stdout) It is possible that the compressed file(s) have become corrupted. You can use the -tvv option to test integrity of such files. You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files. /bin/tar: Unexpected EOF in archive /bin/tar: Unexpected EOF in archive /bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults/taglib-1.5.tar.bz2: bzip2 compressed data, block size = 900k
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly; perhaps it is corrupted? *Possible* reason follows. bzip2: Inappropriate ioctl for device Input file = (stdin), output file = (stdout) It is possible that the compressed file(s) have become corrupted. You can use the -tvv option to test integrity of such files. You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files. /bin/tar: Child returned status 2 /bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults /DirectFB-1.2.0.tar.bz2 /DirectFB-1.2.0.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/DirectFB-1.2.0.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults/DirectFB-1.2.0.tar.bz2
bzip2: Compressed file ends unexpectedly; perhaps it is corrupted? *Possible* reason follows. bzip2: Inappropriate ioctl for device Input file = (stdin), output file = (stdout) It is possible that the compressed file(s) have become corrupted. You can use the -tvv option to test integrity of such files. You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files. /bin/tar: Unexpected EOF in archive /bin/tar: Unexpected EOF in archive /bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults/DirectFB-1.2.0.tar.bz2: bzip2 compressed data, block size = 900k
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.2_572340009.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly; perhaps it is corrupted? *Possible* reason follows. bzip2: Inappropriate ioctl for device Input file = (stdin), output file = (stdout) It is possible that the compressed file(s) have become corrupted. You can use the -tvv option to test integrity of such files. You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files. /bin/tar: Child returned status 2 /bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.2.1_576290015.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly; perhaps it is corrupted? *Possible* reason follows. bzip2: Inappropriate ioctl for device Input file = (stdin), output file = (stdout) It is possible that the compressed file(s) have become corrupted. You can use the -tvv option to test integrity of such files. You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files. /bin/tar: Child returned status 2 /bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty

A common cause of those sorts of failure messages is that the file is not actually a bz2 compressed file, regardless of its name extension.
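
A quick way to check for that is to look at the first three bytes, which are "BZh" in a bzip2 stream. The sketch below does only that. The function name looks_like_bzip2 is made up, and the check will not catch a truncated file that starts out correctly; for that, "bzip2 -t" on the file is the real test.

```shell
# Sketch only: does the file start with the bzip2 magic bytes "BZh"?
# This catches empty files and misnamed files, not truncated archives.
looks_like_bzip2() {
    # "BZh" is 42 5a 68 in hex; an empty or too-short file yields less.
    magic=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    [ "$magic" = "425a68" ]
}

# Usage:
#   looks_like_bzip2 libltdl.tar.bz2 || echo "not bzip2 data"
```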

Five errors in 2 1/2 million files - I can live with that.
At least it qualifies as better than a WAFG as to what was released.

I provided the sha1 catalog file in OpenOffice format for each of the source bundles I mirrored here: http://mirrors.minimodding.com
Pick any entry on that index page for an example of the mirrored source bundles.
Click any *.ods entry and it should just open in your spreadsheet application.

Most of those bundles have both a short, summary catalog and the full catalog.
All of them are a lot smaller than the 2 1/2 million record Amazon catalog.

I will post the actual catalog file(s) as soon as the script finishes fondling the Amazon gigabytes.

May even describe the record format - not that anyone ever RTFM.

first non-knc1 poster here!
Can you please explain what this is all about, and what its purpose and use are (like a very hands-on intro)? All I perceive is that it seems interesting, but in all honesty I have close-to-0 clue what it is at all! Thanks, dude.

Yes, I spend a lot of time, talking to myself in the corner.

You ask a good question - the "what is it good for" is not exactly obvious.
I will continue to post links to worked examples so people can get a feel for what/when/why to use it.

When trying to develop something that works across all models/system-versions,
one of the first things a developer needs to know is whether support for that feature was present in the original source code used by the vendor.

(Which does not mean it was included in any particular build, but if it isn't there to start with...)

The example uses currently linked in these posts show:

There were two versions of udev (the hotplug notification system used in the builds) used in the system.
Now the developer knows (or can check) whether what they plan to do was supported by both versions.

When working with the display (across models/system-versions) it can become important to know if the same eink driver was used in all the machines.
(It wasn't)
Now the developer knows what to check into to see if whatever they are doing will work with all versions of the driver.
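
The "same driver everywhere?" question boils down to counting distinct checksums for one path. Here is a sketch of that count: a result of 1 means every release shipped the identical file, anything higher means there are variants to look into. The name count_variants is made up, and the path in the usage comment is just the udev rule from the earlier example.

```shell
# Sketch only: count the distinct checksums recorded for files whose path
# ends with a given suffix.
count_variants() {
    path_suffix=$1
    catalog=$2
    awk -F'|' -v want="$path_suffix" '
        # Keep records whose final field ends with the wanted suffix.
        length($NF) >= length(want) &&
        substr($NF, length($NF) - length(want) + 1) == want {
            split($1, head, " ")
            seen[head[1]] = 1
        }
        END { n = 0; for (s in seen) n++; print n }
    ' "$catalog"
}

# Usage:
#   count_variants '/etc/udev/rules.d/60-persistent-input.rules' Amazon_2012.02.18_sha1.cat
```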

It has a lot of other uses, but like a baby, it is hard to see what it might be good for in the long run.

Oh, I missed the chance for a brag -

Ever want to do "Declarative Programming"?
Ever want to do it in Bash?

Look at the script - it has a "Declarative Programming" engine in it, and this utility uses that to "solve" the problem of opening any combination of compressed files and archives.
Like that baby, it "learns as it goes" (and also remembers to record how to clean up after itself).

Don't be side-tracked by all the supporting functions - the entire program is three (3!) lines - just ignore the first 1200 lines or so of supporting functions, look at the bottom of the file.
To which I added a couple of lines to give a nice dump of all the problems encountered.

That is how it got into the ABS Guide back in 2003 (possibly dropped in newer versions).
It was a "Declarative Programming" extension to my chapter on using Bash arrays.
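
To give a feel for the idea without reading 1200 lines, here is a toy sketch of the "rules as data" style. It is not the engine in the script: it only looks up which handler would apply to each layer of a container chain, and it extracts nothing. expand_tar_bz2 is a name that shows up in the script's log output; the other handler names, RULES and pick_handler are made up for illustration.

```shell
# Rule table: "suffix handler" pairs, one per line. Only expand_tar_bz2 is a
# name seen in the log output; the other handler names are made up.
RULES='.tar.bz2 expand_tar_bz2
.tar.gz expand_tar_gz
.tgz expand_tar_gz
.zip expand_zip'

# Print the handler named by the first rule whose suffix matches the file
# name; return non-zero when no rule applies (nothing left to unpack).
pick_handler() {
    while read -r suffix handler; do
        case $1 in
            *"$suffix") printf '%s\n' "$handler"; return 0 ;;
        esac
    done <<EOF
$RULES
EOF
    return 1
}

# Walk a container chain from a catalog record, outermost archive first.
for layer in /kdx/Kindle_src_2.1.1_351050064.tar.gz /gplrelease/udev-112.tar.bz2; do
    printf '%s -> %s\n' "$layer" "$(pick_handler "$layer")"
done
```

Supporting another archive type means adding a line to the table, not another branch to the code.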

After:
There were five archives that did not get included in the catalog:

Three distinct bad archive files - the Q.A. department at Amazon released the same broken archive (libltdl.tar.bz2) in three different source bundles.