This is different from those coding schemes in that it is designed to lose information more transparently than AAC (and especially MP3). Obviously there is very little hope of any lossy codec ever becoming nearly as popular as those two, but I see absolutely nothing wrong with continued innovation in both quality and coding efficiency. After all, there are still killer samples around for 320 kbps MP3 which make it ABXable from the lossless source, so a lossy codec able to lose quality imperceptibly on a very consistent basis while achieving low bitrates is always in demand for me, and I'd imagine for many others.

NB: the Float16 audio is normalised to the range ±65504.0 rather than the standard ±1.0 used for 32-bit and 64-bit floating-point audio.
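For anyone wondering what that looks like in practice, here is a minimal sketch. The linear scale factor of 65504.0 and the function names are my assumptions for illustration; the rounding itself uses Python's real IEEE 754 binary16 packing (`struct` format `'e'`):

```python
import struct

FLOAT16_MAX = 65504.0  # largest finite binary16 value

def to_float16_fullscale(sample):
    """Map a conventional ±1.0 float sample onto the ±65504.0 range
    described above, then round-trip through an actual half-precision
    encoding ('e' = IEEE 754 binary16) to apply its quantisation."""
    scaled = sample * FLOAT16_MAX
    return struct.unpack('<e', struct.pack('<e', scaled))[0]

def from_float16_fullscale(value):
    """Inverse mapping back to the familiar ±1.0 convention."""
    return value / FLOAT16_MAX

# 0.5 maps to 32752.0, which binary16 represents exactly
x = from_float16_fullscale(to_float16_fullscale(0.5))
```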

That's going to be a huge surprise for any software that naively implements "normal" floating-point behaviour on 16-bit data. And doesn't that eliminate all of floating point's clipping headroom?

Anyway, what I was going to say was that if you take all the negative values and EOR them with 0x7fff, the result should look fairly linear to FLAC, and it may have a reasonable shot at compressing it. Except that the normalisation probably leaves a huge discontinuity across zero, because denormal LSBs won't line up with 24-bit LSBs.

It might have some chance of compressing better than 24-bit native, though. Have you tried it?
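For what it's worth, the EOR trick above can be sanity-checked in a few lines. This is only a sketch, assuming IEEE 754 binary16 bit patterns; the point is that after flipping the magnitude bits of negative samples, the 16-bit words read as signed integers come out monotonic in the original float value, which is what a linear predictor wants to see:

```python
import struct

def f16_bits(x):
    """IEEE 754 binary16 bit pattern of x, as an unsigned 16-bit int."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def fold(bits):
    """EOR the magnitude bits of negative samples with 0x7fff, then
    reinterpret the word as a signed 16-bit integer."""
    if bits & 0x8000:          # sign bit set: invert the magnitude bits
        bits ^= 0x7FFF
    return bits - 0x10000 if bits & 0x8000 else bits

samples = [-65504.0, -1.0, -6e-8, 0.0, 6e-8, 1.0, 65504.0]
mapped = [fold(f16_bits(s)) for s in samples]
# mapped is strictly increasing, so ordering (not linearity) is preserved
```

Note this only fixes the ordering; the step sizes are still exponentially spaced, which is the discontinuity/non-linearity concern raised above.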

... a surprise, yes; however, as the standard has not become widespread in terms of implementation, I think that Float16 being different is understandable / acceptable, in the same way that 8-bit integer is unsigned with a 128 offset compared to 16/24/32-bit signed integers with no offset.

[edit] In terms of headroom, due to the added noise associated with mantissa bit reduction, I would expect Float16 to be a final format rather than an interim step for further processing. Use of the full range available in Float16 was a very conscious decision to maximise range (minimum sample value to maximum sample value). Also, there is nothing stopping someone reducing the scale of the audio prior to converting to Float16. [/edit]

In my testing, Float16 in a 24-bit sample container compressed better than the same number of 24-bit integer samples upon which the Float16 samples were based. Unfortunately, now that HalfPrecision stores the FMT.wFormatTag properly (0x0003 for Float), FLAC will not compress Float16 output.

Well, by "huge surprise" I mean "destroy your speakers", and by "any software" I mean that I have to contact my previous employer and have them check to see if I committed that code before one of these files gets into a mobile phone and is played at high volume through a set of earbuds.

QUOTE

In terms of headroom, due to the added noise associated mantissa bit reduction, I would expect Float16 to be a final format rather than an interim step for further processing.

This raises the question of why you need so much low-level resolution, then.

It would surprise me if, when using any type of float where the expected range is ±1.0 but the available range is massive (i.e. up to 10^38 or 10^308), the decoding software made no checks on the inbound data to ensure that those limits were adhered to.

As this potential audio type is in its infancy, common practice is not yet defined (although it could be *assumed* that the treatment would be the same as for Float32).

Why maximise resolution? Simply because it's there to be used. Why limit the type to a range of ±1.0 down to 5.96×10^-8 (i.e. 2^24:1) instead of using ±65504 down to 5.96×10^-8 (i.e. almost 2^40:1)? One of the complaints commonly made against 16-bit integer is its lack of resolution compared to the relatively commonly available 24-bit integer. The adoption of ±65504 takes the range of the Float16 type beyond that of 32-bit integer.
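The arithmetic behind those ratios, for anyone who wants to check it (65504 is the largest finite binary16 value, 2^-24 the smallest positive subnormal):

```python
import math

f16_max = 65504.0      # (2 - 2**-10) * 2**15, largest finite binary16
f16_min = 2.0 ** -24   # smallest positive subnormal, ~5.96e-8

# ±1.0 down to the smallest subnormal: 1 / 2**-24 -> exactly 24 bits.
# ±65504 down to the same subnormal: just shy of 40 bits (~240 dB).
ratio_bits = math.log2(f16_max / f16_min)
```

For comparison, 32-bit integer spans 2^32:1, so ~2^40:1 does indeed exceed it.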

Well, it's not really in its infancy. I included 16-bit float (part of the logical extrapolation of power-of-two bit depths) in a file format proposal in 2005, and a couple of years ago the same thing came about as an inevitable consequence of passing the bit depth (whatever it may be) to a floating-point conversion routine which supported 16-bit as well as 32- and 64-bit precision.

What checks do you propose to perform, and what is the proper action to take on different results?

I don't mean for anything to be changed on my account, anyway. I've already asked for my old code to be reviewed and deleted as appropriate.

I accept that the Float16 proposal must have existed for some time before it was adopted in the 2008 revision of IEEE 754, but it is fair to say that very little Float16 (as defined in that revision) encoded audio has been promulgated.

I would expect that any player would perform a bounds check on any input format for which not all possible values are permissible, i.e. no check required for 8-, 16-, 24- or 32-bit integer; limit check required for 32- and 64-bit float (permissible values ±1.0; Float32 max/min ±3.4028235×10^38; Float64 max/min ±1.7976931348623157×10^308), along with ±INF and NaN checks. Allowing Float16 to use ±6.5504×10^4 increases range and reduces checking to ±INF and NaN.
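A sketch of what such a decoder-side check might look like. The function name and the clamp-versus-reject policy are my own assumptions, not anything a real player is known to do; under the ±65504 Float16 convention, `limit` would become 65504.0 and only the NaN/infinity branches remain meaningful:

```python
import math

def sanitize(sample, limit=1.0):
    """Hypothetical inbound-data check for float PCM: zero out NaN,
    clamp infinities and out-of-range values to ±limit."""
    if math.isnan(sample):
        return 0.0            # or drop / flag the frame instead
    if sample > limit:
        return limit
    if sample < -limit:
        return -limit
    return sample

cleaned = [sanitize(s) for s in [0.25, 2.5, float('inf'), float('nan')]]
# -> [0.25, 1.0, 1.0, 0.0]
```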

Sorry to have proposed a handling of the type contrary to the handling of 32 and 64-bit float (and causing you a retrospective code change!), but, as stated earlier, it maximises the potential of the type.

I fear that I may need a little more clarification as to the benefits of this over, say, lossyWAV, in terms of both resolution and efficiency in reducing information from 24-bit samples. Any more comparisons would be most welcome (and level-matching would be appreciated too, as the two files in the OP are a couple of dB apart).

Not being able to recompress to FLAC is unfortunate. Is there any kind of workaround possible for this?

Using HalfPrecision to convert from 24-bit integer PCM to 16-bit floating point will add a small amount of noise to the output while at the same time reducing the uncompressed file-size by approximately one-third.
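To put a rough number on that "small amount of noise", here is a sketch. The 2^23 → 65504 full-scale mapping is my assumption about how HalfPrecision scales, not a documented detail; the key point is that binary16 keeps an 11-bit significand, so the relative quantisation error stays below 2^-11 (and the uncompressed size drops from 3 bytes per sample to 2, i.e. one-third):

```python
import struct

def int24_to_f16(i24):
    """Map a signed 24-bit sample onto ±65504 (assumed scaling:
    2**23 -> 65504.0) and quantise to IEEE 754 binary16."""
    x = (i24 / 8388608.0) * 65504.0
    return struct.unpack('<e', struct.pack('<e', x))[0]

def f16_to_int24(x):
    """Inverse mapping back to a signed 24-bit integer sample."""
    return round((x / 65504.0) * 8388608.0)

# Relative round-trip error stays below half a binary16 ulp (2**-11),
# though the absolute error in 24-bit LSBs grows with sample level.
vals = [1000, 123456, 4000000, 8388607]
rel = max(abs(i - f16_to_int24(int24_to_f16(i))) / i for i in vals)
```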

The lossless file, when downloaded, has no ReplayGain information. The lossy version has ReplayGain information appended (using the latest R128Gain in foobar2000). If new ReplayGain values are calculated for both files then you will find that they are within approximately 0.01dB of each other (+2.97dB and +2.98dB respectively).

I agree about the lack of lossless compression - I am quietly hoping that one of the lossless codec developers takes up the challenge. As I said previously, when an incorrect wFormatTag is used ($0001 for integer rather than $0003 for floating point), a filesize saving of over 20% is made by compressing 16-bit FP PCM instead of 24-bit int PCM.

This is not really a codec - rather an alternative representation of the audio data - (practically) no decoding is required by the playback software.

Looking at the ReplayGain through foobar2000, you're quite right. I really should've checked that before I posted what I did, since we should all be quite aware of the frailty of hearing and perception on this forum (I had perceived a volume difference after repeatedly flipping back and forth between the files).

Optional fingerprinting of Float16 output using the -t, --tracer parameter. Adds some (actually very little) noise to the output;

Precision parameter removed (reduced mantissa bits).

The trace / fingerprint is intended to allow identification of Float16 output if written as CDDA. The fingerprint is "[Fp]" and is stored one bit every eight samples, channel hopping every bit. Repeat is every 256 samples. If interpreted as 16-bit integer, the Float16 audio is pretty nasty, but not totally unrecognisable in comparison to the original.
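The layout described above works out neatly: 4 characters × 8 bits, one bit every eight samples, equals a 256-sample period. A sketch of the addressing (bit order, starting channel, and where within each sample the bit lands are my assumptions, not the actual HalfPrecision implementation):

```python
# The 32 bits of "[Fp]", one bit per eight samples, hopping channel
# each bit, so the whole pattern repeats every 32 * 8 = 256 samples.
FINGERPRINT = "[Fp]"
bits = [(ord(c) >> (7 - i)) & 1 for c in FINGERPRINT for i in range(8)]

def tracer_positions(n_channels=2):
    """Yield (sample_index, channel, bit) for one 256-sample period."""
    for k, b in enumerate(bits):
        yield k * 8, k % n_channels, b

positions = list(tracer_positions())
# 32 placements; the last bit sits at sample 248, then the cycle repeats
```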