What do you get for your test set resampled to 32kHz, processed with -2?

Does 32k resampling followed by ReplayGain (only negative values applied) help even more?

It makes sense to have a -3 along the lines you're proposing, but I suspect the above will be dramatically more efficient, and still artefact-free (though with a 16k LPF and, with RG, loud tracks becoming quieter).

[edit] I tried a couple of albums and the results were a bit of a surprise: FLAC: 773MB, 914kbps; -3 @ 44.1kHz: 321MB, 381kbps; -3 @ 32.0kHz: 313MB, 371kbps. The size difference is welcome, but the resampling has a time overhead and the 16kHz LPF. [/edit]

QUOTE (GeSomeone @ Nov 13 2007, 17:41)

QUOTE (halb27 @ Nov 12 2007, 15:13)

Target b) for -3: OK, so we should think about the details.

Just following your dialog here... This seems the right basic choice; there has to be a benefit in return for giving up a (little) bit of quality. IMO that means a significantly lower bit rate for -3 (compared with -2).

(Would -skew of -12 -18 -24 (for -3 -2 -1) be too aggressive?)

QUOTE (Nick.C @ Nov 12 2007, 23:25)

I am wondering about the clipping reduction method - at the moment, if it finds 1 or more samples which clip after rounding, it reduces bits_to_remove by one and tries again; once bits_to_remove reaches 0 it just stores the original values. Is 0 permissible clipping samples a bit too harsh? At the time that the iterative clipping was introduced, I put in an "allowable" variable, implying that a number of clipping (but rounded) samples may be permitted.

I suppose you mean consecutive samples of the maximum (or minimum) value? To me in this case 0, 1 or 2 would make sense, only already badly clipping music would be affected by other values.

And yes, the dither function is obsolete as you no longer opt to lower the amplitude.

QUOTE (Nick.C @ Nov 13 2007, 00:22)

I also tried -3 [..] which yielded 420kbps [..] [edit]Maybe 400kbps for "real music" should be the target rather than approaching that for my problem set. [/edit]

The problem with this is that from the outset this method aims for constant quality (I like that BTW), so the bit rate will vary. I found, for example, that music which already compresses well losslessly (say into the 600s kbps) will not be halved in bit rate by lossyWav, but rather still ends up around 420.

I've settled on a set of settings which are in some ways similar to -2 but using different fft lengths, -nts and -spf, see below. 434kbps is a reasonable bitrate at a reasonable quality (I can't hear anything wrong, but my ears are 39 years old......). The -allowable parameter only counts individual clips, it doesn't look for multiples (although it could, at a slight speed penalty). The -window parameter hasn't made it into this revision as I have to check the bit reduction noise calculations for each new spreading function to ensure that I'm not adding the "wrong" amount of noise per bit removed.

Feedback, as always, is requested and valued.

lossyWAV alpha v0.4.5 attached: Superseded.

-3 settings tweaked;

-allowable parameter implemented to allow a number of clips per codec block (total per block per channel).

I tried with/without -allowable 1 on a track that hits full scale. It doesn't make a lot of difference here.

BTW shouldn't the logging say 512 samples?

erm, yes, you would be correct in that assertion!

-allowable 1 will only allow 1 sample per channel per codec_block to clip - try with -clipping instead to see what the maximum bits to remove for the track in question would be (this will also give you a count of samples which clip over or under) and then play about with -allowable. The parameter will take up to 64 permitted clips per channel per codec_block.
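The interplay between bits_to_remove and -allowable described above can be sketched like this. This is a hypothetical reconstruction of the loop, not lossyWAV's actual (Pascal) source; the names and the rounding step are assumptions:

```python
def reduce_bits(samples, bits_to_remove, allowable=0, full_scale=32767):
    """Find the largest bits_to_remove whose rounded output clips
    no more than `allowable` samples (per channel per codec block)."""
    while bits_to_remove > 0:
        step = 1 << bits_to_remove
        # Round each sample to the reduced precision.
        rounded = [int(round(s / step)) * step for s in samples]
        # Count samples pushed beyond 16-bit full scale by the rounding.
        clips = sum(1 for r in rounded if r > full_scale or r < -full_scale - 1)
        if clips <= allowable:
            return rounded, bits_to_remove
        bits_to_remove -= 1  # too much clipping: try one fewer bit
    return list(samples), 0  # bits_to_remove == 0: store original values
```

With allowable=0 a single near-full-scale sample can force a whole block down a bit; raising allowable trades a few clipped samples for more bits removed, which is the tuning question raised above.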

I was thinking today that it would be nice to be able to just drop a wav (or multiple wavs) onto a small app and get lossyFLAC files in return.

So after about 8 hours of opcode hexing, and batch file scripting... lFLCDrop is born. See attached.

You will have to download and/or copy flac.exe and lossyWAV.exe into the folder you will use lFLCDrop in, because I'm not sure what the licenses are for redistributing that stuff yet. I'll have to check that out, but if anyone knows off hand, you could let me know to save me the hassle.

I should note that lFLC.bat for lFLCDrop v1.0.0.0 forces 576 sample blocks for lossyWAV and FLAC, due to Winamp's in_flac plugin not showing the spectrum in the classic visualization when 512 sample blocks are used (I don't have modern skins installed to test). If this fix is not OK with you, feel free to change it in the batch file. For quick reference, here are the command lines used for both:

lossyWAV [input] [quality] -o [output] -nowarn -cbs 576
flac -8 -o [output] --delete-input-file -f -b 576 [input]

And you should note that FLAC is deleting a temp file, not your source file. If you want to delete your source files, the option is available if you right-click on the lFLCDrop GUI.

The next thing I plan to do is create an lFLC.bat for use with EAC, including passing in variables for use in tagging. It might take a bit longer to test due to the possibility of it being impossible to get around certain characters being passed in. Mainly double-quotes & percent signs, but it will need some testing for sure.

at any rate, enjoy

p.s. thanks to all of the people involved with lossyWAV and of course FLAC, and to Layer3Maniac for making the original FlacDrop. Without all of you, this would not have been possible and I take no credit for anything I've done which belongs with you all. This is mentioned in the readme file, but I thought it would be considerate to have here as well.

So after about 8 hours of opcode hexing, and batch file scripting... lFLCDrop is born. See attached.

Nice to hear that you think enough of the processor to create a method of using it! lossyWAV is LGPL (although exactly what that means, I still need to get my head round.....), by the way.

Possible bug report:

I am in the process of batch converting circa 1500 tracks in Foobar2000 v0.9.5 beta1 using FLAC v1.2.1 and lossyWAV v0.4.5. I got a bit concerned when after a while I noticed that the total time of the output files is less than that of the input files. Narrowing it down, I find that some tracks are exactly 8 codec blocks (4096 samples / 16kB) shorter than they should be. I am at a loss as to why this is occurring.

[edit] I've looked at the throughput as one album with 2 affected tracks processes: the input and processed WAV files are the same length..... [/edit]

[edit2] As an aside, I'm 623 tracks in and the processing (-3) has brought the bitrate down from 854kbps to 392kbps (8.27GB / 18.0GB). [/edit2]

[edit3] Really quite palatable to listen to. I think the interplay between -nts +6 (take the minimum value found and add 6dB) and -snr 21 (take the average of all relevant bins and subtract 21dB), then taking the lower of the modified minimum and the modified average, produces quite a robust check against added noise. I am listening to a lot of the output (4d17h27m27.333s) trying to find the artifacts I *really* expect to be there at that bitrate. None yet. Quite pleased. [/edit3]
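The combined check described above can be written out numerically. A minimal sketch, assuming the relevant bin values are already in dB and borrowing the names nts and snr from the command-line options:

```python
def noise_threshold(bins_db, nts=6.0, snr=21.0):
    """Combine the two limits described above: the minimum relevant
    bin plus nts dB, and the average of all relevant bins minus snr dB.
    The lower of the two wins, giving a robust cap on added noise."""
    modified_min = min(bins_db) + nts
    modified_avg = sum(bins_db) / len(bins_db) - snr
    return min(modified_min, modified_avg)
```

The averaging term means one quiet outlier bin can't single-handedly raise the permitted noise floor, even with a generous positive nts.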

Thanks to everyone for bettering lossyWAV!! I don't know exactly what is happening here, but when I try to run version 0.4.6 it just outputs a WAV header and no data. I attached a screenshot of the command line, and it appears that lossyWAV doesn't even try to render any audio for me. I'm running an Intel Celeron processor @ 2.4GHz (the P4-based style) and I'm wondering if something SSE-wise just isn't meshing with my processor. If anybody has any answers they will be greatly appreciated, but for now I'm going to hope that newer versions will work for me again.

Thanks! -808

Did v0.4.5 work properly with the same settings? I thought that I had got rid of all SSE instructions in v0.4.5 and don't think that I've added any into v0.4.6 (although I'll check anyway). I'll look for bugs and revert.

[edit]There's a bug in the -detail parameter which seems to prematurely end the process. I'll amend and include in the next revision.[/edit]

Using lossyWAV -3 -nts 6 -skew 36 -snr 21, my (small) test set achieved an average 344kbps with FLAC, compared to ~400 with -3 alone. Some files were smaller using FLAC, while others were smaller using WMALSL, and the difference between the two codecs over the whole set was negligible.

To me these are very attractive bitrate variations for the various quality levels, and the average bitrate differences between regular and problem samples show, at least in a statistical sense, that lossyWav can differentiate well what to do according to the different situations.

Your new -3 candidate looks extremely attractive judging from the statistics, Nick. Statistics, however, don't really tell about quality, so I tried -3 -nts 6 -skew 36 -snr 21 on my problem samples as well as on some tracks of regular music. The surprise: the only issue I found was with badvilbel at ~sec. 19.0, where I could ABX the added hiss 8/10. This added hiss is so negligible to me that it is well within the excellent quality I'd like to see with -3. I have never thought before that lossyWav was this good at an average bitrate of ~340 kbps with regular music. Great work, Nick.

So this is the way to go for -3 IMO as long as we don't get bad news. Maybe even for -2 in an adapted and more cautious way.


Well, I'm very glad to hear that you like the new -3 proposal. I will implement this in v0.4.8. I was pretty astonished when I got to the end of the 1496 track processing and the output was 16GB from 40.8GB input.

This gives a lossyFLAC output of 35.95MB / 405.5kbps with a fairly significant reduction in bits_to_remove for badvilbel, and also reverts to the original two FFT lengths in David's script. This is in contrast to 34.62MB / 390.5kbps for alpha v0.4.8 -3 settings. Slightly more conservative, but if it reduces noticeable hiss, then I'm all for it (however, I haven't heard any added hiss on my iPAQ at existing -3 settings - but the noise floor for audio output is not great on it).

I intend to implement these settings for the next revision, unless of course anyone feels strongly that I shouldn't (alternative settings welcomed).

If it's only about the (to me) negligible added hiss above hiss that is already there in the original badvilbel, I personally wouldn't care about it. I've grown to love your current -3 setting. I've been listening to a lot of music with current -3 trying to abx problems on suspicious spots, and I'm very happy with it. To me it's a very good solution for people who want great quality on a FLAC enabled DAP. Sure, it's all within the usual restriction of experience so far. But remember, it's about -3 here. Everybody can increase -3 quality to his liking by lowering -nts.

Anyway I'll try your new -3 proposal tomorrow.

I've tried a lot of settings for -2 with your -3 idea in mind: using a rather high -skew and -snr value, a rather high -nts value, and being very restrictive with using spreading_length = 1, and I ended up with

It yields an average of 405/549 kbps on my regular/problem sample set, which compares favorably with the 430/539 kbps of the current -2 setting. Moreover, -nts 0 should be the default for security IMO, but I guess using a positive user-chosen -nts value is fine. Trading -nts 2.5 for -nts 0, for instance, yields 388/540 kbps for my regular/problem sample set.

I will do a listening test with it (using -nts 2.5) tomorrow.

The idea behind the -spf setting is (apart from merging the current setting with your -3 setting):

a) Make the 64 and 128 sample FFTs the primary decision basis for the 2 highest frequency ranges. Give the 64 sample FFT a minor influence on decision making for the 3 lower frequency ranges.

b) Make the 512 and 1024 sample FFTs the primary decision basis for the 2 lowest frequency ranges. Give the 1024 sample FFT a negligible influence on decision making for the 3 higher frequency ranges. Same for the 512 sample FFT with respect to the 2 highest frequency ranges.

c) Make the 128 and 512 sample FFTs the primary decision basis for the 3rd of your 5 frequency ranges.

d) Details are chosen on a cost consideration. For instance, the 2s in the 128 sample FFT setting cost next to nothing (at first I wanted to have them as a 3, as with the 64 sample FFT setting).
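As a rough illustration of what a spreading length means in these -spf settings (this is a sketch of the general idea, not lossyWAV's exact code): each FFT bin is averaged with its following neighbours before the minimum is taken, so a longer spreading length dilutes narrow spectral dips:

```python
def spread(bins, spreading_length):
    """Average each bin with the following (spreading_length - 1)
    bins, clamping the window at the end of the array."""
    out = []
    for i in range(len(bins)):
        window = bins[i:i + spreading_length]
        out.append(sum(window) / len(window))
    return out

def spread_minimum(bins, spreading_length):
    # A narrow dip is diluted by its neighbours when spreading_length > 1,
    # so the minimum (which caps bits_to_remove) rises.
    return min(spread(bins, spreading_length))
```

This is why a spreading length of 1 is the most cautious choice: a single quiet bin then controls the whole decision for that frequency range.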

I will report on the listening test.

BTW I've found a little bug: -nts 1 doesn't do what it should: -nts 0.99 is fine, as is -nts 1.01, but with -nts 1.00 the bits removed are far too low (fewer than with -nts 0.0).

The statistics say 343/473 kbps on average for my regular/problem set, which is very close to the 338/468 kbps of the current -3 setting. I also tried the 'hiss spot' of badvilbel, and I can't ABX the difference.

Looking more closely at the new setting, it is a bit of what I have in mind for -2: let the short FFT do the main decision job for the high frequencies (the short FFT is good at that), and let the long FFT do the main decision job on the low frequencies (the short FFT isn't good at that).

Sorry for having been pretty negative about the new -3 setting. I guess I was a bit upset because I've put a lot of listening effort into the current -3 setting. But I think that effort isn't wasted when switching to the new setting: the major principle is the same, and it is a little bit more defensive. Sure, I'll try the new setting with my usual problem samples tonight. To me this is sufficient, and I won't go through part of my regular collection again.

What's more relevant IMO: why is this -nts x setting, with x > 0 to a rather high degree, so good? Can we trust it enough to use a positive -nts also for the higher quality settings?

A high -skew value is a good thing for differentiating good and bad spots (with respect to the number of bits to remove) in the music. But -skew is effective only at rather low frequencies. Together with a high -skew value, -snr also does a good job of differentiating. But because of this interconnection, I'm afraid -snr is also effective only in the low to lower-medium frequency range below ~3 kHz. If this is correct, using a positive -nts value leaves the high frequency range under reduced noise control. However, from what we have experienced so far, this doesn't seem to have a practical negative impact. Maybe dropping the same number of LSBs in an entire block usually gives a noise floor with frequencies below 3 kHz, which is caught well by the skew/snr machinery even with a rather high positive -nts value? Or maybe the ATH curve is relevant here, which gives reduced sensitivity to the 3+ kHz range for low level signals?
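The low-frequency weighting that -skew applies can be illustrated like this. The ramp shape (linear in log-frequency, full effect near 20 Hz, fading out at a ~3 kHz corner) is a hypothetical choice for illustration; the real curve inside lossyWAV may differ:

```python
import math

def apply_skew(bins_db, bin_freqs_hz, skew_db=36.0, corner_hz=3000.0):
    """Lower the effective level of bins below corner_hz by up to
    skew_db, so quiet low-frequency content dominates the minimum."""
    out = []
    for level, f in zip(bins_db, bin_freqs_hz):
        if f < corner_hz:
            # full skew at/below ~20 Hz, fading to zero at corner_hz
            frac = (math.log10(corner_hz) - math.log10(max(f, 20.0))) / \
                   (math.log10(corner_hz) - math.log10(20.0))
            out.append(level - skew_db * min(max(frac, 0.0), 1.0))
        else:
            out.append(level)
    return out
```

Since the minimum is taken over the skewed bins, low-frequency bins win that minimum far more often, which is exactly why the high-frequency range ends up with less direct control, as worried about above.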

In either case it would be very welcome if younger members could contribute listening. If, for instance, everything's fine in the high frequency range to my old ears, that doesn't say a lot.


I'm glad that the badvilbel hiss has disappeared - I tried quite a few permutations before arriving at this latest proposal - I have also done quite a bit of listening at current -3 .

I think that the high skew value, given that it weights in favour of the lower frequencies, will quite often produce minimum values at low frequencies. As these are artificially weighted, adding 6dB to them has no major detrimental effect on the output.

-snr is currently the average of the skewed & spread fft results. I had thought about making it the plain average of the relevant bins (pre-skewing) to see what effect that has, but put it off as I feel that this will effectively weight the higher frequencies. Another option would be to take the average of the skewed results, pre-spreading.

I'll take a look to see what's wrong with -nts 1.0.

Ditto your request for younger ears to test the output - it would be very much appreciated.

I've been thinking about listening tests. In order to make listening experience transferable across our quality levels, and with regard to the very good quality of these -3 settings, I think we should make -2 a more defensive version of -3 in every detail (and -1 a more defensive version of -2 in every detail). This way everybody can try -3, where problems can be heard most easily in case they exist. The resulting improvements to -3 can then be carried over analogously to -2 and -1. It would be different if a certain, say, -2 detail weren't necessarily more defensive than the corresponding -3 detail. Moreover, meanwhile I think we can use a slightly positive -nts value with -2 too when using a high -skew and -snr value. I also feel that 3 analyses should be enough for -2, so speed can be improved compared to the current 4 analyses used. So I have to change the -2 suggestion I wanted to listen to tonight.

Well, while we are talking about default settings... these days I've been working, just for the fun of it, on a very simple algorithm which, using the official defaults as a base, applies some morphing between them and slowly goes to pure lossless, so that you can input a floating point value in the range [0.00 .. 4.00] instead of (-1; -2; -3) as a quality setting.

Here are some examples (please note that 1.0, 2.0 and 3.0 are official defaults). Though the numbers look fine, it is also possible that many of these combinations are worth nothing, as they are obtained by pure morphing. All this is just to show a possible feature.
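The morphing idea can be sketched as plain linear interpolation between the preset parameter tables. The parameter values below are placeholders, not the real -1/-2/-3 defaults, and the sketch only covers the [1.0 .. 3.0] part of the proposed range (the blend towards pure lossless above 3.0 is omitted):

```python
# Hypothetical preset tables; real lossyWAV defaults differ.
PRESETS = {
    1.0: {"nts": -2.0, "skew": 12.0, "snr": 27.0},
    2.0: {"nts":  0.0, "skew": 24.0, "snr": 24.0},
    3.0: {"nts":  6.0, "skew": 36.0, "snr": 21.0},
}

def morph(quality):
    """Linearly interpolate every parameter between the two
    neighbouring integer presets for a fractional quality value."""
    quality = min(max(quality, 1.0), 3.0)
    lo = float(int(quality))
    hi = float(min(int(quality) + 1, 3))
    if lo == hi:
        return dict(PRESETS[lo])
    t = (quality - lo) / (hi - lo)
    return {k: PRESETS[lo][k] + t * (PRESETS[hi][k] - PRESETS[lo][k])
            for k in PRESETS[lo]}
```

As noted above, an interpolated combination isn't guaranteed to be perceptually sensible just because each endpoint is; each blended setting would still need listening tests.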

My statistics for the 0.4.9 -3 setting: 345/474 kbps on average for my regular/problem sample set, which is very fine to me: pretty low bitrate for the regular samples and probably sufficiently high bitrate for the problems, which is confirmed also by the listening experience so far.

As for your new proposals for -2 and -3: honestly speaking, I don't like them very much. Your -2 proposal yields 420/543 kbps on average for my regular/problem samples, and this is not a lot better in comparison to the 430/539 kbps on average of the current -2 setting, which doesn't 'suffer' from the somewhat questionable positive -nts setting. I favor a positive -nts value for -2 as much as you do, but when doing so I would expect a lower bitrate for regular tracks and/or a higher bitrate for problematic tracks. With -1 it's 523/601 kbps for regular/problematic tracks, and this too isn't real progress from the 512/585 kbps of the current -1 setting.

I did a lot of variations trying to find a hopefully improved -2 and -1 setting. As you do, I also favor a small positive -nts value together with -skew 36 -snr 24 when it comes to -2. I decided on -nts 2, but I really don't care whether it's 1.5 or 2.0. With the FFT setting, however, my approach is different. I want to let the longer FFTs decide on the low frequencies, because only they have good resolution there. This also improves the differentiation between good and problematic spots, which is enhanced by the high skew/snr setting. A spreading length of 1 with the short FFTs, on the contrary, has a tendency to be rather counterproductive in this sense. So in principle a 64 and a 1024 sample FFT should do the job, but I'm still a bit worried about the 1024 sample FFT stretching so far beyond the block borders. So I decided to use 64, 512, and 1024 sample FFTs. I tuned the details and ended up with

With -1 I also wanted to use a negative -nts value like you do (more exactly: 0 as the utmost limit). I found that differentiation between good and bad still improves a bit when going to -skew 40, but there's no real improvement in good/bad spot differentiation when using a higher -snr value. Going from -snr 21 to -snr 24 to -snr 27 put the bitrate up by the same amount for the regular as well as the problematic set. Going to -snr 30 was already counterproductive. So I used -snr 21 and decided on -nts -1 (with a larger -snr value I would have preferred -nts 0). I added a 128 sample FFT because even for the higher mid frequency range the resolution of the 64 sample FFT is a bit restricted. So I ended up with

Your -2 proposal yields 420/543 kbps on average for my regular/problem samples, and this is not a lot better in comparison to the 430/539 kbps on average of the current -2 setting.....With -1 it's 523/601 kbps for regular/problematic tracks, and this too isn't a real progress from the 512/585 kbps for the current -1 setting......