There is a known problem when running an executable from the cmd.exe command line (within foobar2000, although it is more likely an issue with cmd.exe itself) if the executable's path includes spaces. The suggested fix is to enclose the element of the path which contains the spaces in double quotation marks ("), e.g. c:\"program files"\directory_where_executable_is\executable_name

Change log 1.1.4q: 02/11/09: Reversion to use of the previous pre-calculated noise constant; shaping is now OFF by default. To enable shaping use -s or --shaping, either without a parameter for automatic shaping or with a value 0<=n<=1 for user-specified shaping.

Change log 1.1.4g: 20/08/09: --maxsnr removed. -p or --postanalyse parameter implemented: this checks the noise level of the correction data and compares it to the low-level value derived from the associated source audio. If the correction noise (i.e. that of the difference signal) is greater than the source audio low-level value, then the bits-to-remove value is reduced for the codec-block until the added noise is lower. Code further tidied. -F or --fftw parameter removed, as the FFTW DLL is now automatically used if found (the slight speed-up makes this the fastest way to go). Fixed a stack error which occurs when libfftw3-3.dll v3.2.2 (newly released) is used.
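For anyone who finds the --postanalyse description easier to follow as code, here is a minimal Python sketch of the idea as described above. All names are mine, not lossyWAV's, and quantise / noise_level simply stand in for lossyWAV's own routines.

def post_analyse(block, bits_to_remove, source_low_level, quantise, noise_level):
    # Sketch of the --postanalyse idea: if the noise of the difference
    # (correction) signal exceeds the low-level value derived from the source
    # audio, reduce bits_to_remove for this codec-block until it no longer does.
    while bits_to_remove > 0:
        difference = [a - b for a, b in zip(block, quantise(block, bits_to_remove))]
        if noise_level(difference) <= source_low_level:
            break
        bits_to_remove -= 1
    return bits_to_remove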

Change log 1.1.4d: 07/06/09: Fixed a bug whereby lossyWAV would crash if 'libfftw3-3.dll' could not be initialised. If the --fftw parameter is used and the DLL cannot be found, lossyWAV now reverts to the existing FFT routines and outputs a warning. Link to FFTW Windows DLL download page.

Change log 1.1.4c: 05/06/09: FFTW can now optionally be used for the FFT analyses in lossyWAV. Use of FFTW requires the presence of "libfftw3-3.dll" somewhere on the path of the host computer, and the addition of -F or --fftw to the lossyWAV command line. FFT (Delphi and assembler) further optimised. General code tidy-up. Link to FFTW Windows DLL download page.

Change log 1.1.3j: 15/04/09: --sortspread parameter modified (again); it now takes a value between 0 and 7, where 2 is equivalent to beta 1.1.3i. --centre parameter removed. Reference_threshold tables removed in favour of direct calculation of the level of noise added by bit-depth reduction, using a derived formula.

Change log 1.1.3: 22/02/09: Integration of the data structures used in the new and old spreading functions. Source release.

Change log 1.1.2j: 18/02/09: Implementation of the -O or --oldspread parameter to enable use of the spreading function from v1.1.0b instead of the revised version currently under development. This gives very slightly different results to v1.1.0b, as is to be expected due to the revision of the reference-threshold constants at beta v1.1.1d.

Change log 1.1.2i: 12/02/09: Addition of -N or --nasty (-q -2.0) and -A or --awful (-q -4.0) parameters to allow extremely low quality levels to be explored.

Change log 1.1.2h: 12/02/09: Addition of a -N or --nasty (-q -2.0) parameter to allow extremely low quality levels to be explored. Removed: bug to be fixed.

Change log 1.1.2g: 10/02/09: Addition of a -r or --randombits parameter to randomise the zeroed LSBs.

Change log 1.1.1d: 10/09/08: Further revision to the simplified spreading function - slightly higher bitrates than 1.1.1c, but I'm happier with the method. Reference-threshold constants re-calculated using more iterations (2^(32-fft-bit-length) iterations, i.e. 512K iterations for an 8192-sample FFT and 128M iterations for a 32-sample FFT) and, for the first time, taking into account FFT-result values less than 1. This only really affects bits-to-remove values between 1 and 7, which is in line with my expectation when I made the change to the noise-calculation method.

Change log 1.1.1b: 26/08/08: Revision to the simplified spreading function. All bin "averages" are now calculated taking into account a variable proportion of the bins to either side, i.e. "average" = (fft_result[i] + (fft_result[i-1] + fft_result[i+1]) * factor) / (1 + 2 * factor), where factor = 0.0 at 20Hz and 1.0 at 16kHz, with linear interpolation for intermediate values.
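As a rough illustration of the formula above, a minimal Python sketch (not the lossyWAV Delphi code) of the weighted three-bin average with the linearly interpolated factor might look like this. The behaviour outside 20 Hz..16 kHz (clamping) and at the edge bins is my assumption; the function and variable names are mine.

def spread_average(fft_result, sample_rate, fft_length):
    # Weighted three-bin average as in the changelog entry above:
    #   average = (x[i] + (x[i-1] + x[i+1]) * factor) / (1 + 2 * factor)
    # with factor = 0.0 at 20 Hz rising linearly to 1.0 at 16 kHz.
    bin_width = sample_rate / fft_length          # Hz per FFT bin
    out = list(fft_result)
    for i in range(1, len(fft_result) - 1):       # edge bins left untouched
        freq = i * bin_width
        factor = min(max((freq - 20.0) / (16000.0 - 20.0), 0.0), 1.0)
        out[i] = (fft_result[i]
                  + (fft_result[i - 1] + fft_result[i + 1]) * factor
                  ) / (1.0 + 2.0 * factor)
    return out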

Change log 1.1.1a: 25/08/08: Fundamental simplification of the spreading function methodology, put forward for comment. All bin "averages" are now calculated taking into account a fixed proportion of the bins to either side, i.e. "average" = (fft_result[i] + (fft_result[i-1] + fft_result[i+1]) * factor) / (1 + 2 * factor), where factor = 0.26 in this case. FFT result overall averaging is now carried out prior to the spreading function rather than at the same time. Reference_threshold constants revised slightly.

Change log 1.1.0b: 03/08/08: FFT lengths now increase for higher sample-rate audio, i.e. 88.2/96kHz, 176.4/192kHz and 352.8/384kHz. Improved logfile output and --detail output. Reference-threshold constants for rectangular dither and triangular dither have been calculated so that added noise should be the same for dither off and for any dither level between 0 and 1 - the number of bits-to-remove will, however, reduce with "increasing" dither.
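A minimal Python sketch of the "FFT lengths increase with sample rate" idea, purely for illustration: the base lengths and the 48 kHz reference point are placeholders, not lossyWAV's actual analysis lengths.

def analysis_fft_lengths(sample_rate, base_lengths=(64, 1024)):
    # Scale the analysis FFT lengths up with the sample rate so each analysis
    # spans roughly the same length of time as it does at 44.1/48 kHz.
    factor = 1
    while sample_rate > 48000 * factor:
        factor *= 2
    return tuple(n * factor for n in base_lengths)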

I ask as these all add to the time taken to process files (even if the options themselves are not selected).

Comments / criticisms / brickbats welcomed as before.

I will acknowledge the usefulness of the correction file as a quick and automatic way of generating the difference signal between the lossless original and processed output (for scaling=1 only).

My opinion on the questions:

a) Dithering is not needed.
b) I'd like to see further support for 24-bit-depth input files (I ReplayGain with foobar using 24-bit WAV output files). I cannot imagine anybody needs a bit depth of 32 bits, in case that's what your question is about.
c) I like to be able to listen to the error signal. I don't need the correction file for reconstructing the original signal.

Great that you're still working so hard to improve lossyWAV. I'm just a bit sceptical about the new spreading approach. It changes the machinery in a quite significant way at the low-frequency end, and I think we can be very content with the current machinery. Changing the machinery would mean we throw away the experience we have so far with lossyWAV's quality (and though the experience situation isn't optimal, we do have some experience) and start building up experience of its quality again from zero, more or less. Doesn't spreading just mean averaging over a certain number of bins? With this in mind I wouldn't care whether or not the virtual center of the bins involved is identical with one of the real bins. At least this is how I understand the new spreading idea. Sure, there are numerous ways of doing the averaging, but are there expectations of a real benefit from going the new way?

I wish I could answer your questions, but:
1: I don't know what lossyWAV can gain from dithering (the same as MP3? supposedly a more "natural" background noise? I always thought dithering was meant to soften frequency-destruction effects, so I don't get its use with lossyWAV).
2: I don't get what you meant, but my CPU is 32-bit and my audio is 16/24-bit.
3: I already said in the other thread that I don't use correction files.

I disagree with halb27 on the spreading function: if you must refrain from experimentation because you fear breaking the machine, lossyWAV will never progress; you just need to be sure it's worth it before making it a full release. Also, I don't need a GUI personally - even if one existed, I would use foobar2000. I agree it would be better than foobar2000 for noobs allergic to the command line, but it should be a lossyWAV/FLAC GUI, or the noob will end up with a big WAV file, asking himself the purpose of such a useless codec - so maybe a fork of Speek's FLAC frontend... but not a lossyWAV GUI alone.

Note: I will most likely convert my whole lossless collection to lossywav after the 1.2.0 release, so I hope it will be VERY good

One possible variant is to use 1 as a spreading value where 1.1.0b did, and for all the values which exceed 1 use something else (1 < value < 2), i.e.

((2,2,2,2,2,2,2,2),
 (2,2,2,2,2,2,2,2),
 (2,2,2,2,2,2,2,2),
 (1,2,2,2,2,2,2,3),
 (1,1,2,2,2,2,2,3),
 (1,1,2,2,2,2,3,4))

goes to

((SC,SC,SC,SC,SC,SC,SC,SC),
 (SC,SC,SC,SC,SC,SC,SC,SC),
 (SC,SC,SC,SC,SC,SC,SC,SC),
 (1,SC,SC,SC,SC,SC,SC,SC),
 (1,SC,SC,SC,SC,SC,SC,SC),
 (1,1,SC,SC,SC,SC,SC,SC))

This should go some way to alleviate any concerns with respect to reducing quality, as less averaging = lower minima.
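To make the substitution rule above concrete, here is a small Python sketch: keep the entries that were 1 in the 1.1.0b table and replace every entry greater than 1 with a single spreading constant SC (1 < SC < 2). The value of SC used here is only an example - the post leaves it open.

OLD_TABLE = (
    (2, 2, 2, 2, 2, 2, 2, 2),
    (2, 2, 2, 2, 2, 2, 2, 2),
    (2, 2, 2, 2, 2, 2, 2, 2),
    (1, 2, 2, 2, 2, 2, 2, 3),
    (1, 1, 2, 2, 2, 2, 2, 3),
    (1, 1, 2, 2, 2, 2, 3, 4),
)

def substitute(table, sc=1.5):
    # Replace every spreading value greater than 1 with the constant SC,
    # leaving the 1s from the 1.1.0b table untouched.
    return tuple(tuple(v if v <= 1 else sc for v in row) for row in table)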

If I understand it correctly, you want to stay conservative when including more bins in the averaging, compared to what we have now, by applying a considerably smaller weight to the bins which are furthest off-centre (so a weight of >0.5 for the centre bin of the 3-bin spreading replacing the current 2-bin spreading?). Sounds good, though I still can't see the potential advantage, or why you aren't content with the current spreading.

I am re-examining each major processing component in turn - it's the turn of the spreading function....

I've modified the spreading function so that at the bin corresponding to 20Hz the range is 1.0 and at 16kHz it is 2.0, with linear interpolation for intermediate bins.

lossyWAV beta 1.1.1b attached to post #1 in this thread.

[edit] The consensus (and what David and SebastianG said earlier) seems to be that dither is not required within lossyWAV.

On the processing of 32-bit integer samples, I'll leave it in for the moment, but I don't think that there are many packages that would output them in preference to 32-bit float samples. I don't know whether the method would work on 32-bit float samples - I have a feeling it would be difficult to determine how many bits of precision to remove from a float value - unless it were a simple "reduce a 32-bit float value (23-bit mantissa) to a 24-bit float value (15-bit mantissa) by brute force...." process.
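Purely to illustrate the "brute force" idea mentioned above (this is not lossyWAV code), masking off low-order mantissa bits of an IEEE-754 single could look like this in Python; keeping 15 of the 23 mantissa bits matches the 15-bit-mantissa figure in the post.

import struct

def truncate_mantissa(x, keep_bits=15):
    # Zero the low-order mantissa bits of a 32-bit float, keeping only
    # 'keep_bits' of the 23-bit mantissa (brute-force precision reduction).
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    drop = 23 - keep_bits
    bits &= ~((1 << drop) - 1)          # clear the low 'drop' mantissa bits
    return struct.unpack('>f', struct.pack('>I', bits))[0]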

It seems that some people like the correction file for analysis rather than reversion to lossless - maybe the --merge parameter can go? [/edit]

What do FLAC and other lossless encoders do wrt floating point data and "wasted bits"?

Depending on that, the appropriate lossyWAV processing could be tricky but useful, or pointless.

I only ever use 32-bit float files as intermediate files. Sometimes I archive them as-is, so I can re-work the project later. lossyWAV might be useful here, though TBH I haven't even bothered FLACing them because it's so rare that I do this. Other people might do this on a daily basis!

I have no experience of 32-bit integer audio files. 48-bit integer is common in audio processing (DSP IIR filtering etc), but never as an output.

IMO dither can go, if having the option available is slowing down processing even when it's not used.

If "Implementation of SG's new noise shaping method" means dynamic noise shaping, then depending on how aggressively you do this, it might be worth changing from rectangular spreading functions to something else entirely. I'm pointing this out because you might spend a long time playing with the current spreading function, only to dump it soon after. What you have is a narrow (fractional) version of something vaguely related to the ERB (equivalent rectangular bandwidth) scale - I reckon one day you'll end up with something which is a narrow (fractional) version of something vaguely related to overlapping critical band filters.

I can't help feeling that there's no more or less reason to have reconstruction with lossyWAV than with wavpack lossy, apart from the currently inevitable clunkiness of it. However, if the concept is there, it's another "tick" in the format comparison table, and someone can always come along and implement a more graceful re-uniting of the lossy and correction files later, if they feel the need. If you drop this ability entirely, this possibility is removed. Whether you support the merging in lossyWAV itself is up to you - having that available can't slow down encoding though, can it?

The noise floor already "floats" in 32-bit float. What do FLAC and other lossless encoders do wrt floating point data and "wasted bits"? Depending on that, the appropriate lossyWAV processing could be tricky but useful, or pointless.

FLAC doesn't even support floating point IIRC.

One could go both ways with that, and say that either lossyWAV has no need for floating point support, or that it provides a very nice way to gracefully encode the floating-point data. I'm leaning towards the latter.

The only issue I'd see otherwise is how to handle +0dBFS samples. I have no suggestions on how to handle that, except perhaps to optionally right-shift the output by a few bits and scribble the gain down in the tags.
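A minimal Python sketch of that suggestion, purely for illustration: attenuate ("right-shift") the float samples by a few bits so peaks above 0 dBFS fit within full scale, and return the amount of gain removed so it could be written to a tag. The number of bits and the tagging mechanism are left open here.

import math

def attenuate_over_full_scale(samples, shift_bits=2):
    # Each bit of right-shift removes 20*log10(2) ~= 6.02 dB of gain.
    gain_removed_db = 20.0 * math.log10(2.0 ** shift_bits)
    attenuated = [s / (2.0 ** shift_bits) for s in samples]
    return attenuated, gain_removed_db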

QUOTE

I only ever use 32-bit float files as intermediate files. Sometimes I archive them as-is, so I can re-work the project later. lossyWAV might be useful here, though TBH I haven't even bothered FLACing them because it's so rare that I do this. Other people might do this on a daily basis!

I would prefer to record my vinyl transcriptions in floating point, on principle alone.

QUOTE

I have no experience of 32-bit integer audio files. 48-bit integer is common in audio processing (DSP IIR filtering etc), but never as an output.

Heh, if 32-bit float is going to be supported, why not 64-bit floating point too? It's a negligible code change.

In principle, the binning process used to establish critical band responses could be circumvented through clever frequency mapping. For instance, a frequency shift from 10kHz down to, say, 100Hz would mean that quantization noise that originally fitted inside one bin could now fall across several. This could kick something into audibility.

Right?

What I'm getting at here is that perhaps more work should be put into tuning lossyWAV so that virtually all DSP effects/manipulations could not possibly cause an audible difference, rather than merely ensuring that straight listening will not tease out a difference.

It would be possible to read 32-bit float values and write 32-bit integer values (having suitably scaled the output) - this would not change the file-size but only some of the fmt chunk information.

I'll look into it....

To fit in the range -2,147,483,648..2,147,483,647, the 32-bit float value would require to be scaled by a factor of 2^-97 (the largest finite single-precision value is of the order of 2^128, and 2^128 x 2^-97 = 2^31).
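Note that the 2^-97 figure refers to the largest value single precision can represent; typical float WAV data is normalised to roughly +/-1.0 (my assumption here), in which case the scaling factor is simply 2^31 with clipping. A minimal Python sketch of that conversion:

def float_to_int32(samples, scale=2.0 ** 31):
    # Scale normalised float samples into the 32-bit integer range,
    # clipping anything that overshoots full scale.
    out = []
    for s in samples:
        v = int(s * scale)
        out.append(max(-2 ** 31, min(2 ** 31 - 1, v)))
    return out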

[edit] I've just been reading about the draft IEEE-754r standard: there will be a 16-bit float format in the range +/-1.## x 2^-15 to +/-1.## x 2^14, with a mantissa of 10 bits. This seems to open up the possibility of 11-bit precision over a 2^30 range - or, taking what we know about lossyWAV into account, effectively storing a 32-bit integer in a 16-bit float (albeit with reduced precision - but reduced precision is not proving to be a problem).

A complete mapping of the floating-point domain is unnecessary unless HDR techniques start creeping in from the video realm to the audio realm (which is rather unlikely). All I'd anticipate being desired would be a bit shift of 0-4 bits, if that.

... Not like I have any kind of valid use for that feature, so feel free to ignore it.

Can you shed some more light on what's currently happening in this regard? What exactly is fft_result[k], why is there an averaging and what happens after the averaging?

The FFT_result array is created by taking the magnitude of the raw results of the complex FFT analysis and multiplying by the corresponding skewing value.

These results have always been averaged over a number of bins to remove zero or very low single bins. The most recent method takes into account only a proportion of the bins on either side of the target bin, rather than whole bins one or two away from the target bin. I feel that this will still remove single low bins but will possibly be better than the former method.
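To put the first step described above into code form, a minimal Python sketch (not the lossyWAV source) of forming the FFT_result array might look like this; the shape of the skewing curve is not reproduced here, it is simply passed in as one value per bin.

import numpy as np

def fft_result_array(codec_block, skew):
    # Magnitude of the complex FFT of a codec-block, multiplied bin-by-bin
    # by the skewing values, as described above. 'skew' must have one entry
    # per rfft bin. The averaging/spreading step follows this.
    spectrum = np.fft.rfft(codec_block)
    return np.abs(spectrum) * np.asarray(skew)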

btw, thanks very much for the Matlab method - I can read Matlab, not C!

I see what you mean about 32-bit floats. However the easiest way would be to convert to 32-bit integers (in the first instance) - maybe 24-bit integers later.

I just spent the last 15 minutes testing lossyWAV v1.1.0b vs. lossyWAV v1.1.1c beta at -q 1 (Ginnungagap), in order to see if there was any regression before the big noise-shaping jump. Personally I couldn't hear any major regression/progression, so I guess the serious work for 1.2 can start now

Edit1: Even if I reached 5/6 at the beginning, I couldn't really tell what I was listening to (I mean inside the area of the usual artefact)... so I think it was lucky guessing...

Edit2: To be 100% sure it was lucky guessing, I spent 10 minutes this morning redoing the test at -q 0 and failed even more clearly, so for me there is no regression/improvement except the very small but welcome kbps gain.