2016-12-20

IEC18004 QR Codes

I said I had my mojo back :-) Yesterday afternoon I decided to have a bash at writing a QR code encoding library, from scratch.

Yes, this is re-inventing the wheel as there are QR encoding libraries out there. It was fun, and it is always nice to have source code that is ours, especially if we may put it in the FireBrick (I am looking at making the TOTP logic in the FireBrick a lot easier to use).

Thankfully Cliff had already written a Reed-Solomon ECC generation function for me, and has made me a very simple BCH coding function. Whilst I understand what error correction codes do, the maths behind them really is beyond me.
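To give an idea of how small the BCH part can be, here is a minimal sketch in C of the 15-bit format information coding that QR codes use. It is not Cliff's function, and the name qr_format_bits is made up for illustration. It assumes the usual 2-bit ECC level indicator (L=1, M=0, Q=3, H=2), a 3-bit mask number, the generator polynomial 0x537 and the final XOR mask 0x5412.

```c
#include <stdint.h>

/* Hypothetical helper: build the 15-bit QR format information word.
 * ecc  = 2-bit ECC level indicator (L=1, M=0, Q=3, H=2)
 * mask = 3-bit mask pattern number (0-7)
 * The 5 data bits are followed by a 10-bit BCH remainder, then the
 * whole word is XORed with 0x5412. */
uint16_t qr_format_bits(unsigned ecc, unsigned mask)
{
    unsigned data = ((ecc & 3u) << 3) | (mask & 7u);
    unsigned rem = data;

    /* Polynomial division by 0x537, one bit per step. */
    for (int i = 0; i < 10; i++)
        rem = (rem << 1) ^ ((rem >> 9) * 0x537u);

    return (uint16_t)(((data << 10) | rem) ^ 0x5412u);
}
```

For level M with mask 0 the data bits and remainder are all zero, so the result is simply the XOR mask, 0x5412.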

I found a copy of IEC18004 on-line. You normally have to pay for a spec, and I may do so at some point, but the court ruling on reading stuff on-line using your browser makes clear that I am not breaking copyright simply by reading it in my browser - whoever is hosting it is. It is 118 pages long!

What really annoys me about this whole specification is the tables of numbers. Instead of saying that the alignment marks will be evenly spaced with spacing between 16 and 20 units starting on unit 6, or something like that, they have a table that states the positioning for each. I played around and worked out a simple algorithm to work out the table and so did not have to use the table - yay. I double checked my calculations only to find one barcode size does not follow the same logic and is a special case for no apparent reason. Why not just make it a simple algorithm?
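A minimal sketch in C of this kind of calculation follows. It is one commonly used formulation and not necessarily the algorithm in the library described here. The function name qr_alignment_positions is made up for illustration. In this formulation the odd one out is version 32, where the step has to be forced to 26.

```c
/* Hypothetical helper: fill out[] with the alignment pattern centre
 * coordinates for a QR symbol of the given version (1-40) and return
 * how many there are. The same list is used for rows and columns.
 * Version 1 has no alignment patterns. */
int qr_alignment_positions(int version, int out[7])
{
    if (version < 2 || version > 40)
        return 0;

    int count = version / 7 + 2;   /* positions per axis */
    int size = version * 4 + 17;   /* symbol width in modules */

    /* Even step between positions; version 32 is the special case. */
    int step = (version == 32)
        ? 26
        : (version * 4 + count * 2 + 1) / (count * 2 - 2) * 2;

    /* Work back from the last position; the first is always 6. */
    int pos = size - 7;
    for (int i = count - 1; i >= 1; i--) {
        out[i] = pos;
        pos -= step;
    }
    out[0] = 6;
    return count;
}
```

For version 7 this gives 6, 22, 38, which matches the table in the specification. Only the last gap down to 6 is allowed to differ from the regular step.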

You then have the same for the level of ECC coding - rather than say "medium ECC uses X% of the space for ECC words" and work that out for each size, there is a table, for four different ECC levels for 40 different sizes of barcode. Then the number of blocks used for ECC is not something simple like "use more blocks when data encoding size is 32 bytes or more", no, again a table, for all four ECC levels and all 40 sizes. It drives me round the bend. It could be one line of C rather than typing, double checking and testing hundreds of numbers into a table.

Anyway, in the end, I have myself a nice little library that codes in 8 bit, Alphanumeric, or Numeric (not Kanji, but I could add that I guess). It codes the input all in one format only - I may, later, make something to work out optimal coding of the string changing coding in the middle as needed, like I did for the IEC16022 barcode library I wrote years ago, but I suspect there is no point.
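Choosing a single format for the whole input can be sketched in a few lines of C. This is an illustration under assumptions, not the code from the library: the names qr_mode_t and qr_pick_mode are made up, and it simply picks the most compact of the three modes that can represent every character.

```c
#include <string.h>

typedef enum { QR_MODE_NUMERIC, QR_MODE_ALPHANUMERIC, QR_MODE_8BIT } qr_mode_t;

/* Hypothetical helper: pick one mode for the whole string.
 * Numeric covers 0-9; alphanumeric covers the 45 characters listed
 * below; anything else forces 8 bit mode. */
qr_mode_t qr_pick_mode(const char *s)
{
    static const char alnum[] =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";
    int numeric = 1, alphanumeric = 1;

    for (; *s; s++) {
        if (*s < '0' || *s > '9')
            numeric = 0;
        if (!strchr(alnum, *s))
            alphanumeric = 0;
    }
    if (numeric)
        return QR_MODE_NUMERIC;
    if (alphanumeric)
        return QR_MODE_ALPHANUMERIC;
    return QR_MODE_8BIT;
}
```

So "HELLO WORLD" would go out as alphanumeric, but "Hello" needs 8 bit because of the lower case letters.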

It is very useful having QR readers on my phone to test it, and the reference coding in the specification was really useful too. I like specs that do a worked example like that.

Well, for a start, I said it was fun. However, you really do have to do as much work to work out what is actually in someone else's code as to write your own. We are very careful what we put in the FireBrick. In this case the qrencode library on Linux is quite good, but often libraries are either not good enough or bloated beyond reason, and that is a reason to make your own.

I think that the tendency to construct large tables of constants loosely based on some progression (but with sufficient deviation to ensure that there exists no simple formula for deriving the values) comes from the flawed belief that creating a large volume of copyrightable material that must be used somewhat verbatim helps to ensure that early implementers are beholden to the licence under which they receive the application specification.

By the time the work has progressed to being an International Standard this mindset is clearly less relevant, since ISO/IEC have strict procedures requiring the declaration and release of intellectual property for the sake of promoting unencumbered use of the standard. However, this doesn't appear to stop sponsors from starting out by hedging their bets and retaining what they perceive to be an option to abort the standard-setting process and collect royalties if they so choose.

Many years ago I worked for a company that was more interested in suing other companies for infringing patents than in selling products. But they told us (in a training course on how to go about this) that they were only interested in patents where it was quite simple to detect that the patent had been infringed, because courts need solid evidence. Data tables that have to be embedded in code fill that goal well because they can be searched for. A deliberate infringer may think to obfuscate the table, but accidental infringers are much more common.
