Don't ask me why, but the fact that an extra byte order flip changes the success rate at 96000Hz gives me a gut feeling that sox.exe is somehow hitting an internal timing issue or running out of CPU capacity (e.g. a thread timing issue?).

To go on, we’ll need to know at least where standard input is coming from – is it something processing a file, or directly from an audio input? Also, more about the kind of “distortion” might be useful: Are chunks of the audio missing? Is the pitch wrong? Unexpected noise? Etc.

See prior answer. I am working with four different source types: 2-ch, 24-bit, linear signed integer, big or little endian, 192000 or 96000Hz. The different sets of command line switches in my OP reflect those four source types.

QUOTE (chi @ Nov 19 2012, 02:04)

To go on, we’ll need to know at least where standard input is coming from – is it something processing a file, or directly from an audio input?

The incoming audio is being delivered from remote audio servers through a TCP socket connection via HTTP; my Whitebear application fetches those streams and squirts them into sox.exe for transcoding through its Std_In pipe.

QUOTE (chi @ Nov 19 2012, 02:04)

Also, more about the kind of “distortion” might be useful: Are chunks of the audio missing? Is the pitch wrong? Unexpected noise? Etc.

Unlike in the other three cases, it is not white noise in this case. I can actually recognise the music. But the sound is all munged up. It is not a question of wrong pitch, or swapped channels or whatever. It sounds more like sox.exe is throwing audio samples in a box, jumbling them around, and taking them out again in an order that is close to, but not identical to, the correct order; or something like that... (I could send you one of them if you need it...)

I should perhaps add that I have no problems when not trying to down-sample the audio. If I take one of my raw streams (any of the above-mentioned four formats) and convert it to flac at the same rate, then the resulting flac plays fine. However, if I take the exact same raw stream and (try to) convert it to flac down-sampled to 48000Hz (or 44100Hz) then, as described in my OP, the resulting flac plays either white noise or munged audio. In other words, the problem is set off by adding the "rate 48000" effects switch at the end of the command line...

A user must pass enough information to SoX on the command line so that it knows what type of data it contains. Audio data can usually be totally described by four characteristics:

rate: The sample rate is in samples per second. For example, CD sample rates are at 44100.

So if you pass in 96k audio and then give sox -r 48000 I expect it to get quite screwed up. Perhaps you want to use the resample command instead?

I just tried this out for myself and it worked fine in all four cases. I started with 24/96 and 24/192 FLAC files and tested it two ways, each with the same results. First by using flac.exe to decode the FLACs to individual .pcm files with the desired endianness, and feeding these files into SoX, and then by just piping the output of flac.exe directly into SoX.

Here are the command lines for the latter. The SoX parts are exactly like yours:

No, that’s OK as it is. It is not the -r or --rate option, but “-” (standard input/output) and “rate”, the resampling effect, with a space between them.
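To make the distinction concrete, here is a rough Python sketch that assembles such a command line as an argument list. The exact switches from the original post are not reproduced in this thread, so the ones below are only an assumption based on the formats described (raw, 2-ch, 24-bit signed integer, big or little endian); build_sox_args is a hypothetical helper name.

```python
def build_sox_args(in_rate, big_endian, out_rate=None):
    """Return an argv list for a raw 24-bit stereo stdin -> FLAC stdout SoX run.

    Assumed switches, not copied from the original post.
    """
    args = [
        "sox",
        "-t", "raw",                    # headerless input, so the format is spelled out
        "-r", str(in_rate),             # -r describes the INPUT sample rate
        "-e", "signed-integer",
        "-b", "24",
        "-c", "2",
        "-B" if big_endian else "-L",   # byte order of the input
        "-",                            # input "file": standard input
        "-t", "flac",
        "-",                            # output "file": standard output
    ]
    if out_rate is not None:
        # "rate" is an effect name; it comes after the output file as its own token
        args += ["rate", str(out_rate)]
    return args


print(" ".join(build_sox_args(96000, big_endian=False, out_rate=48000)))
```

The point is only that "-" and "rate" end up as two separate arguments, so SoX should not read them as the -r option.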

QUOTE (andrewfg @ Nov 19 2012, 20:45)

I am working with four different source types: 2-ch, 24-bit, linear signed integer, big or little endian, 192000 or 96000Hz. The different sets of command line switches in my OP reflect those four source types.

Are you sure about those parameters? Did you, e.g., try to read what you expect to be big-endian as little-endian?

QUOTE (andrewfg @ Nov 19 2012, 20:45)

The incoming audio is being delivered from remote audio servers through a TCP socket connection via HTTP; my Whitebear application fetches those streams and squirts them into sox.exe for transcoding through its Std_In pipe.

Is it possible that processing is too slow for 192 kHz input, and that buffers are dropped by the framework?

QUOTE (andrewfg @ Nov 19 2012, 20:45)

It sounds more like sox.exe is throwing audio samples in a box, jumbling them around, and taking them out again in an order that is close to, but not identical to, the correct order; or something like that... (I could send you one of them if you need it...)

I should perhaps add that I have no problems when not trying to down-sample the audio. If I take one of my raw streams (any of the above-mentioned four formats) and convert it to flac at the same rate, then the resulting flac plays fine. However, if I take the exact same raw stream and (try to) convert it to flac down-sampled to 48000Hz (or 44100Hz) then, as described in my OP, the resulting flac plays either white noise or munged audio. In other words, the problem is set off by adding the "rate 48000" effects switch at the end of the command line...

This would fit with it being a load problem (the resampling needing more processor resources). You could try if the 96k case stops working when you add some computationally expensive effect(s), like “sinc”. Otherwise, yes, a sample of the affected audio might be useful.

I just tried this out for myself and it worked fine in all four cases. I started with 24/96 and 24/192 FLAC files and tested it ... by just piping the output of flac.exe directly into SoX.

I confirm that I tried the same pipeline test as you [input file => flac (decode) => sox (resample) => output file] and got the same (positive) results as you. I even tried a longer pipeline [input file => flac (decode) => sox (resample) => sox (resample again) => output file] to check whether the issue was related to the endpoints of the pipeline. However, it basically always seems to work in such cases.

QUOTE (JJZolx @ Nov 20 2012, 02:43)

You're not actually starting with PCM files, are you? I'd guess that whatever you're piping into SoX isn't the format that you expect.

Yes, I am using a pcm source, and yes, it is a good pcm stream. I know this because if I use the same pcm with the same sox command line but without the resample rate switch, then the output is properly audible; the process only fails on high bit rates and where there is a resample rate switch in the command line...

Is it possible that processing is too slow for 192 kHz input, and that buffers are dropped by the framework?

QUOTE (chi @ Nov 21 2012, 02:38)

This would fit with it being a load problem (the resampling needing more processor resources). You could try if the 96k case stops working when you add some computationally expensive effect(s), like “sinc”. Otherwise, yes, a sample of the affected audio might be useful.

I think yes; to both points above...

When I execute the command pipeline from a DOS prompt [input file => sox (resample and transcode) => output file] then it all works fine. But when I execute the same command pipeline as a sub process from my Windows application [input socket => sub process input pipeline => sox (resample and transcode) => sub process output pipeline => output socket] then it fails, and only when using high bit rate files, and only when there is a resample rate switch in the command line.

I have even discovered that if I repeat the same command pipeline sub process several times, then sometimes the output audio is good, and sometimes it is not...

So I think there is indeed something odd in the way that sox is sucking in the data from the StdIn buffer, and/or the way it is blowing out the data from the StdOut buffer. And yes, it could indeed be a high computational load issue that causes bytes to get lost from the input or output buffers.
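For comparison, here is a minimal Python sketch of a sub process pipeline that should not be able to lose bytes, whatever the load: the writer blocks until the child accepts each chunk, and the output is drained at the same time so neither pipe fills up. This is not how Whitebear is implemented (that code is not shown in this thread); pipe_through is a hypothetical helper, and CAT is a stand-in child that simply copies stdin to stdout in place of sox.exe.

```python
import subprocess
import sys
import threading


def pipe_through(cmd, chunks):
    """Feed chunks to cmd's stdin while draining its stdout; return the output bytes."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        for chunk in chunks:
            proc.stdin.write(chunk)  # blocks when the pipe is full; nothing is discarded
        proc.stdin.close()           # signal end of input to the child

    writer = threading.Thread(target=feed)
    writer.start()
    out = proc.stdout.read()         # read concurrently so the child never stalls on output
    writer.join()
    proc.wait()
    return out


# Stand-in child (assumption): copies stdin to stdout unchanged, in place of sox.exe
CAT = [sys.executable, "-c",
       "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"]

chunks = [bytes(range(256)) * 16 for _ in range(64)]
out = pipe_through(CAT, chunks)
print(len(out) == sum(len(c) for c in chunks))  # True when no bytes were lost
```

Counting bytes in and out like this, with a pass-through child, might be one way to tell whether the loss happens in the application's own pipe handling or inside sox.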

Note: if the number of dropped bytes happened to be an exact multiple of the audio frame size, then the audio would still sound Ok, whereas if the number of dropped bytes were not a multiple of the frame size, the audio would be munged. So this probably explains my experience with several repeats of the same command pipeline sub process, where sometimes the output sounds Ok and sometimes not...
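The frame alignment argument can be illustrated with a small Python sketch. It assumes 2-ch, 24-bit, little-endian signed samples (so 6 bytes per frame) and uses made-up sample values; encode and decode are hypothetical helpers, not anything from SoX.

```python
FRAME = 2 * 3  # 2 channels x 3 bytes (24-bit) = 6 bytes per frame


def encode(frames):
    """Pack (left, right) signed 24-bit samples as little-endian raw PCM."""
    return b"".join(s.to_bytes(3, "little", signed=True) for fr in frames for s in fr)


def decode(raw):
    """Unpack raw PCM into (left, right) tuples, ignoring a trailing partial frame."""
    usable = len(raw) - len(raw) % FRAME
    vals = [int.from_bytes(raw[i:i + 3], "little", signed=True)
            for i in range(0, usable, 3)]
    return list(zip(vals[0::2], vals[1::2]))


frames = [(n, -n) for n in range(1, 9)]  # made-up test samples
raw = encode(frames)

# Lose 12 bytes (exactly 2 frames) after the first frame: the rest still lines up
aligned = decode(raw[:FRAME] + raw[3 * FRAME:])
# Lose 4 bytes (not a multiple of 6) after the first frame: the rest is misaligned
skewed = decode(raw[:FRAME] + raw[FRAME + 4:])

print(aligned == [frames[0]] + frames[3:])
print(all(fr not in frames for fr in skewed[1:]))
```

In the aligned case two frames go missing but every remaining frame is intact; in the skewed case every sample after the gap is assembled from bytes belonging to two different samples.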