So, I would like to have mp3 files close to 192 kbps. I listen mostly to metal (various types: heavy, power, black, ...) and --alt-preset standard seems to produce bitrates of mostly 200-250 kbps (tested with several albums). I consider this too high. How should I create the files to get lower rates? Should I use ABR with --alt-preset [rate], or maybe some VBR command line (for example, an edited --r3mix), using -V[n] to adjust the bitrates? Any suggestions?

I would recommend only using --lowpass (try 17.5) to modify --alt-preset standard to get lower bitrates. Other parameters are likely to interfere with some of the internal workings of --alt-preset standard.

Something riskier would be to use --ath-lower with a small negative value, maybe -2 or -3. Be warned though, this is not tested and I doubt Dibrom will recommend it.

Another alternative would be to use --alt-preset 192

Or --r3mix if you really must use VBR

If you can wait a few months, Dibrom will be working on tuning for lower bitrates....

Hi. I've read that Dibrom listens to and encodes quite a lot of metal music, and that he has made some modifications to help with that type of music, so I assume --alt-preset standard really does need that many bits to encode it.
Like tangent says, your best bet is to try --alt-preset 192 if you want to keep the bitrate down.

I don't recall ever reading Dibrom say that, with his ultra-well-tuned presets, lowering the lowpass was not an acceptable way to lower the bitrate while maintaining the high quality of the presets. In fact, I think in the days of the normal preset, that was his recommended way of lowering the bitrate.

Since I know that I personally cannot hear the high frequencies lost with a slightly lower lowpass, I prefer to use that to bring the bitrate down. I personally think the resulting music is of a higher quality (for my ears) than if I lowered the bitrate by switching from standard to one of the ABR targeted presets.

Lowering the lowpass is by far the most graceful way of lowering the bitrate. If you can't hear beyond 16 kHz in real music (not on a test tone; maybe you should try this to know for yourself), then the best solution of all would be to add the -Y switch. If you do this, frequencies over 16 kHz will not be encoded if they require a significant jump in bitrate, meaning that probably 80% of the stuff over 16 kHz won't get encoded, and you'll save a very large amount of bits, probably on the order of 40 kbps or maybe more, depending on the file.

The other solution, which is much less graceful and much more dangerous, increasing the possibility of ugly noise-based artifacts creeping in, is to lower the ath. This is part of the approach that --r3mix uses, and is one of the reasons that I don't recommend it (--r3mix, or this approach) at all. However, even with a lower ath, you'd still get some of the benefits of alt-preset standard, but it would be very non-ideal. Most of the other settings in this preset are based around a very finely tuned threshold, so lowering the ath significantly could very well "break" most of that.

Lowering the quality with the -V switch (that is, using a higher -V number) puts you in nearly the same boat as the ath (everything being tuned right to the threshold). You might be able to get away with -V3, but I wouldn't go much further than that.

The lowpass really is your best bet.
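For anyone scripting their encodes, here is a minimal Python sketch of the "safe" modifications discussed above: --alt-preset standard plus an optional --lowpass and an optional -Y. The function name and the file names are made up for illustration, and it assumes the usual `lame [options] infile outfile` argument order. It only builds the argument list; it does not run the encoder.

```python
def build_lame_args(infile, outfile, lowpass_khz=None, skip_sfb21=False):
    """Build a LAME argument list using only switches named in this thread."""
    args = ["lame", "--alt-preset", "standard"]
    if lowpass_khz is not None:
        # --lowpass takes the cutoff in kHz, e.g. 17.5
        args += ["--lowpass", str(lowpass_khz)]
    if skip_sfb21:
        # -Y: do not let content above 16 kHz (sfb21) drive the VBR bitrate up
        args.append("-Y")
    return args + [infile, outfile]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_lame_args("track.wav", "track.mp3",
                                   lowpass_khz=17.5, skip_sfb21=True)))
```

The list can be handed to `subprocess.run` as-is, which also sidesteps the question of whether a frontend is really passing every switch through.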

Slightly OT:

In between the experiments I've been running on and off lately in trying to modify the abr stuff for higher quality, I've also gotten some possible ideas (no promises) for ways to maybe reduce the bitrate some more, either in aps or in another new vbr preset. Chances are that if this happens, there will be something sacrificed elsewhere, like employing an adaptive lowpass with a lower nominal cutoff, but increasing it beyond 19 kHz in critical moments. A more aggressive adaptive ath might help here also. Neither of these things would increase quality, but they might allow for a tighter quality/size ratio.

like employing an adaptive lowpass with a lower nominal cutoff, but increasing it beyond 19 kHz in critical moments. A more aggressive adaptive ath might help here also.

These are ideas that I've heard before a long time ago, but none of the developers were able to put in the time to make it happen. IMO, the "adaptive" approach is going to put the bitrate very close to where people want it.

--------------------------

On a side note, I tried this commandline based on your recommendations for lowering the bitrate:

--alt-preset standard -V3 -Y --lowpass 17.5

This resulted in a much lower file size and the quality was good IMO. If you were to try code-level tunings around this approach, it might be able to deliver a fantastic bitrate/quality ratio for those who aren't concerned about ultra-high frequencies. It looks like 150-195 kbps right now, depending on the complexity of the material, and has room for improvement through tweaks.
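When comparing settings by file size, it helps to convert the size into an average bitrate. A small Python sketch of the arithmetic, with a made-up function name and made-up example numbers; it ignores tags and any other non-audio bytes in the file, so treat the result as approximate:

```python
def average_kbps(size_bytes, duration_seconds):
    """Average bitrate in kbps (1 kbps = 1000 bits per second)."""
    return size_bytes * 8 / duration_seconds / 1000


if __name__ == "__main__":
    # Hypothetical example: a 4-minute track that encodes to 5,760,000 bytes.
    print(average_kbps(5_760_000, 240))
```

Run the other way, the same arithmetic says a 40 kbps saving on a 4-minute track is 40,000 x 240 / 8 = 1,200,000 bytes, or about 1.2 MB.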

Perhaps this could result in a preset that can satisfy high quality sound without the ultra high frequencies? Some people might consider this statement an oxymoron (i.e. "high quality/no ultra high freqs").

What I mean by "high quality without ultra high freqs" is keeping the joint-stereo improvements, low artifacts, low ringing, etc. The idea here being "the best sound possible in mp3 under 17 kHz" (or under 16 kHz, whichever is necessary within the spec to achieve marvelous bitrate/quality results).

Currently --ath-lower x results in an unrecognizable error while using razorlame. Anyone know why this is?

I still don't see why you don't use --alt-preset 192... That's what I would do if I needed my files at that bitrate... Works just fine and it's also around 30-60% faster than --alt-preset standard depending on the music...

Originally posted by mp3fan: Currently --ath-lower x results in an unrecognizable error while using razorlame. Anyone know why this is?

It's --athlower without the '-' in the middle

QUOTE

Originally posted by kxy: When I use --alt-preset standard -b96 -Y --lowpass 17.5, I get the following warning message:

*** WARNING *** the meaning of the experimental -Y has changed! now it tells LAME to ignore sfb21 noise shaping (VBR)

I am not really sure how sfb21 noise shaping works, could someone please explain that to me?

sfb21 = the band containing all frequencies above 16 kHz in mp3. By using -Y, and disabling noise shaping in sfb21, it has the effect I specified... where it will not "go out of its way" to encode this material if it requires a significant jump in bitrate.

QUOTE

Also, by using the lowpass, would it have any effect on the determination of bits in joint stereo and nspsytune?

Probably not. The chances that the frequencies beyond 16khz could have a significant effect here is extremely slim I think.

QUOTE

What about the -V3 switch? Would it have any effect on the determination of bits in joint stereo and nspsytune, or conflict with the internal switches of alt-preset?

Yes. The -V switches control the global adjustments to masking in LAME, so it would have an effect here. It would also break some of the tunings, but only in the sense that the thresholds would now be "out of whack".

So despite the sfb21 warning, if someone wants a lower bitrate than standard and doesn't want to go with abr, -Y --lowpass 17.5 is the safest bet? I thought sfb21 was one of the important factors of VBR...

Originally posted by kxy: To recap, -Y or --lowpass 16 will disable sfb21, in effect not encoding anything above 16 kHz, and --lowpass 17.5 is just a lowpass at 17.5 kHz. Correct?

Right

QUOTE

Now the question is should one raise ns-bass value in relation to the lowpass?

Not really. If the point is to save bits by using -Y or a lower lowpass, then using a higher ns-bass value (which shouldn't be necessary anyway) will kind of defeat that purpose by raising the bitrate again.

By using -Y, and disabling noise shaping in sfb21, it has the effect I specified... where it will not "go out of its way" to encode this material if it requires a significant jump in bitrate.

So by using --lowpass 17.5 and -Y, I'm telling the encoder that when it does "bother to encode" freqs above 16k, it won't go beyond 17.5, right? Your former description seems to indicate that it won't encode ANYTHING beyond 16k, while your prior statement indicates that some material beyond 16k will get encoded. Please explain.

Also, when I use an --athlower number up to 8, --athlower 8, there is no change in bitrate and I expected to see a filesize increase with this setting. Also, what is the effect of using -Y with ABR or CBR? Will it still work the same way in those modes?

You are forgetting about stereo settings!
Right now I use --nsmsfix 5.0 to make LAME use far fewer stereo frames and reduce the bitrate.
I use 5.0 because 5.5 sometimes makes LAME use more joint frames than the "normal" settings.

Originally posted by mp3fan: So by using --lowpass 17.5 and -Y, I'm telling the encoder that when it does "bother to encode" freqs above 16k, it won't go beyond 17.5, right?

Yes, though it's kind of pointless to do this, because very rarely does anything over this point get encoded, and when it does, it doesn't really increase the bitrate.

QUOTE

Your former description seems to indicate that it won't encode ANYTHING beyond 16k, while your prior statement indicates that some material beyond 16k will get encoded. Please explain.

Note the clarification as "very rarely does it encode over 16khz, and when it does, it doesn't increase the bitrate".

QUOTE

Also, when I use an --athlower number up to 8, --athlower 8, there is no change in bitrate and I expected to see a filesize increase with this setting.

I'm not sure what the problem is there, it should work. What's the full command line you are using? And did you verify that for sure this switch is being passed to the encoder by trying it on the command line instead of through some sort of frontend?

QUOTE

Also, what is the effect of using -Y with ABR or CBR? Will it still work the same way in those modes?

No, because CBR and ABR do not increase the global gain (quantization stepsize accuracy) to encode these frequencies in the manner VBR does. So -Y doesn't do anything there, and CBR/ABR don't have bitrate issues with sfb21 either.

Also, a small note which may be relevant. Although I've stated it many times before, it seems people keep forgetting this. When content over 16 kHz is encoded, and it causes such a big jump in bitrate, the majority of the extra bits aren't actually being spent on the frequencies over 16 kHz. Instead, they are being spent on all the frequencies below this point.

The reason for this is as follows. sfb21 is the scalefactor band assigned to the frequencies beyond 16 kHz. However, the problem is that it doesn't actually have a scalefactor, or basically a way to efficiently compress these frequencies, while all other sfbs do. Apparently this was an oversight in the design of the format, because either the developers thought frequencies beyond 16 kHz were inaudible (or irrelevant), or, as Robert suggested, maybe the format was designed for 32 kHz PCM as used on DAT, which only goes up to this frequency. In any case, to actually encode these frequencies, you have to increase the resolution of all of the sfbs, so that you can "in effect" also control the resolution of sfb21 (which, because of the lack of a scalefactor, you cannot control independently). Make sense?

So what's happening when VBR spends massive bits when lots of content over 16 kHz is present is that the bits are actually mostly being spent below these frequencies... so there's not really the problem (which seems to be a common misconception) of too many bits being spent on 16 kHz+ frequencies at the expense of the lower ones.

As for why I don't recommend using -k then, it's not actually because of too many bits being spent on the wrong frequencies, but more that encoding irrelevant content could lead to other problems, like, for example, using more bits from the bit reservoir than is necessary on a particular frame. There is also the possibility that encoding more content, but not accurately enough, could lead to more errors, which would not be noticed if the content wasn't present at all. As an example, it'd be better to cut off at 16 kHz in a particular situation than to have a file that goes up to 17 kHz but where everything beyond 16 kHz may be affected by ringing (this is a hypothetical situation).

Originally posted by jkeating: You are forgetting about stereo settings!
Right now I use --nsmsfix 5.0 to make LAME use far fewer stereo frames and reduce the bitrate.
I use 5.0 because 5.5 sometimes makes LAME use more joint frames than the "normal" settings.

I wouldn't recommend doing this at all. This really is a bad way to decrease the bitrate; you're almost sure to introduce quite a few artifacts by doing this. The default LAME setting, which already uses too many joint stereo frames and is not tuned as well as the --alt-preset settings, is 3.5. At 3.5 LAME already has problems on some samples because the setting is too high, so 5.0 or 5.5 is even further in the wrong direction.

The reason for this is as follows. sfb21 is the scalefactor band assigned to the frequencies beyond 16 kHz. However, the problem is that it doesn't actually have a scalefactor, or basically a way to efficiently compress these frequencies, while all other sfbs do. Apparently this was an oversight in the design of the format, because either the developers thought frequencies beyond 16 kHz were inaudible (or irrelevant), or, as Robert suggested, maybe the format was designed for 32 kHz PCM as used on DAT, which only goes up to this frequency. In any case, to actually encode these frequencies, you have to increase the resolution of all of the sfbs, so that you can "in effect" also control the resolution of sfb21 (which, because of the lack of a scalefactor, you cannot control independently). Make sense?

No. Sorry for being a bonehead, but I don't understand. Why isn't it possible to lower the other scalefactors (<21) when you add to the global scalefactor, if the only purpose is to increase resolution in sfb21? OK, I admit that I haven't read a single line of code. I'm sure there is a reason why it's not done, but it would be interesting for me to know.

Originally posted by ErikS: No. Sorry for being a bonehead, but I don't understand. Why isn't it possible to lower the other scalefactors (<21) when you add to the global scalefactor, if the only purpose is to increase resolution in sfb21? OK, I admit that I haven't read a single line of code. I'm sure there is a reason why it's not done, but it would be interesting for me to know. /Erik

*Cough*, my try to explain the problem with VBR coding and high-frequency content: there is a certain masking threshold created by psychoacoustics across the full frequency band (21 scalefactor bands with long blocks), and the encoder (the quantization-noiseshaping loop) must try to obey this threshold by using enough bits (resolution) for every SFB that there is no audible quantization noise. The quantization-noiseshaping loop does this by adjusting the scalefactors, which are used to reduce the stepsize of every SFB, except SFB21, because there's no scalefactor for it. A smaller stepsize for a given SFB means higher resolution in that SFB (more bits used).

Now, suddenly there is lots of loud high-frequency content, and the q-ns loop notices for SFB21: damn, I need higher resolution for scalefactor band number 21, what to do? The q-ns loop can't adjust a scalefactor to decrease the q-stepsize (increase the resolution) of SFB21 in order to comply with the masking threshold, simply because there's no scalefactor for SFB21. So how can the q-ns loop adjust the q-stepsize of SFB21 to comply with an SFB21 masking threshold that demands higher resolution? There's still one way: the global gain value defines the largest q-stepsize used. So a suitable global gain value must be chosen (it is now effectively the stepsize for SFB21) so that it complies with the resolution demanded by SFB21 masking. (Of course, this is not the same global gain that is used in loudness/amplitude tweaking by mp3gain.) The catch is that the smaller the stepsize of an SFB, the more resolution (bits) will be used in that SFB. In order to comply with the resolution demanded by the SFB21 masking threshold, the global gain (the largest stepsize value, which in this case is used for SFB21) must be quite low. That means the bands below SFB21 can't use larger stepsizes (lower resolution), only smaller ones. So higher resolution than may be necessary or wanted must be used for scalefactor bands 0-20 (under 16 kHz).
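The explanation above can be put into a toy Python sketch. This is not LAME's code and the numbers are invented; it only models the one rule described here: bands 0-20 can use a stepsize at or below the global one, while SFB21 has no scalefactor and is stuck with the global stepsize itself. `required[i]` is the largest stepsize band i tolerates before noise would become audible, with index 21 standing in for SFB21. The `honor_sfb21=False` case is a rough stand-in for the effect of -Y, not a description of how -Y is implemented.

```python
def choose_steps(required, honor_sfb21=True):
    """Return (global_step, per_band_steps) for a toy frame of 22 bands."""
    lower = required[:21]
    # SFB21 has no scalefactor, so it is quantized with the global step itself.
    global_step = required[21] if honor_sfb21 else max(lower)
    # Scalefactors can only make a band's step smaller than the global step.
    steps = [min(global_step, r) for r in lower] + [global_step]
    return global_step, steps


def overcoded_bands(required, steps):
    """Bands 0-20 forced to a finer step (more bits) than their masking needs."""
    return [i for i in range(21) if steps[i] < required[i]]


if __name__ == "__main__":
    # Invented numbers: the lower bands tolerate a coarse step of 8.0,
    # but loud high-frequency content makes SFB21 demand a step of 1.0.
    required = [8.0] * 21 + [1.0]
    for honor in (True, False):
        global_step, steps = choose_steps(required, honor_sfb21=honor)
        print(honor, global_step, len(overcoded_bands(required, steps)))
```

With SFB21 honored, every one of the 21 lower bands ends up at the fine stepsize, which is where the extra bits go. With SFB21 ignored, the lower bands keep their own stepsizes and SFB21 is simply coded more coarsely than its masking threshold asks for.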

However, there's one practical way to decrease this effect and still maintain somewhat decent high-frequency reproduction. You use the --ns-sfb21 switch to raise the masking threshold for SFB21, so that it doesn't need such high resolution. Of course, high-frequency reproduction/accuracy is not as good anymore then.

JohnV and Dibrom -- thanks for the explanations, I've wondered about this. =] It definitely seems like whoever designed the mp3 format didn't intend to have frequencies over 16 kHz encoded...

On a side note, have there been any listening tests as to whether this sort of high-frequency data is usually audible in music? Will -Y audibly degrade sound quality in any significant number of samples?

Considering that MP3's primary function is to reduce bitrate while maintaining a fairly high degree of quality (but not audiophile/archival), it is understandable why the creators of the spec chose 16 kHz as a cutoff. I've encountered a variety of musical tracks where I could not ABX a file with a 16 kHz cutoff against a full-bandwidth one. I'm not saying this occurs in the majority of files, but this phenomenon does exist. This is because many CD tracks don't even store sonic data up to 22.05 kHz... some experience significant roll-off above 18 kHz or even lower.
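One rough way to check how much content a track actually has up there is to measure the share of spectral energy above a cutoff. Below is a minimal pure-Python sketch using a naive DFT on a short block of samples. The function name and the synthetic test signal are made up for illustration, and a real check would read blocks from a decoded WAV file rather than generate tones. It says nothing about audibility, which is what ABX testing is for.

```python
import cmath
import math


def energy_fraction_above(samples, sample_rate, cutoff_hz):
    """Share of spectral energy at or above cutoff_hz (naive DFT, O(n^2))."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2):  # skip DC, stay below Nyquist
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power = abs(acc) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total else 0.0


if __name__ == "__main__":
    rate = 44100
    n = 441  # 10 ms block, so DFT bins fall every 100 Hz
    # Synthetic signal: a strong 1 kHz tone plus a weak 18 kHz tone.
    block = [math.sin(2 * math.pi * 1000 * t / rate)
             + 0.1 * math.sin(2 * math.pi * 18000 * t / rate)
             for t in range(n)]
    print(energy_fraction_above(block, rate, 16000))
```

For this synthetic block the 18 kHz tone has one tenth of the amplitude of the 1 kHz tone, so it carries about 1% of the total energy.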

Also, when I use an --athlower number up to 8, --athlower 8, there is no change in bitrate and I expected to see a filesize increase with this setting.

I'm not sure what the problem is there, it should work. What's the full command line you are using? And did you verify that for sure this switch is being passed to the encoder by trying it on the command line instead of through some sort of frontend?

--alt-preset standard -V3 -Y --athlower -8 --lowpass 17.5

When I use that command line vs. a command line without --athlower, there is absolutely no change in filesize or bitrate. I'm using the RazorLame frontend in Windows XP Professional.