Audio glitches seem to be a fact of life with our radios. They can and do occur with both transmit and receive audio, although the receive glitches seem to be much more apparent. There are three sources of audio glitches:

1) Dropped UDP packets from the hardware to the PC. These will be counted by the OOOPS counter in the VAC1 setup tab. You will almost certainly see a non-zero count on the OOOPS counter because UDP, unlike TCP, does not guarantee delivery (and why the designers of the firmware did not use TCP I could not tell you). However, if you are seeing a lot of OOOPS counts then there is a networking problem of some sort. Check all of the obvious things, certainly, i.e. cables, connectors, etc. However, people have seen network issues because of obscure problems such as malfunctioning circuit breakers, so leave no stone unturned.

2) Blocked process threads in PowerSDR. The most likely culprit according to Warren, NR0V, is the “JanusAudio thread”. Per Warren,

"The JanusAudio thread retrieves ALL data streams from the network interface, runs them ALL through quite a variety of operations, and then takes ALL the DSP output back to the network interface. As PowerSDR has gotten more complex, I see more and more times that the Windows scheduling issues you mention impact this thread. In my opinion, it is overloaded. However, we do not propose to fix this in PowerSDR. In Thetis, we have separate threads FOR EACH DATA STREAM to perform most of these functions. No matter which thread is problematic, the result can be the same, if something doesn’t get scheduled for execution in a timely fashion, data is dropped on the floor."

Windows is not a real-time operating system. There is only so much that buffering can do to smooth things out. If a thread is blocked for longer than the amount of audio a buffer can hold, you get a dropout or a glitch. You can get an idea of whether this is a problem by using two tools, LatencyMon and DPC Latency Checker. Sometimes these tools help you fix problems, sometimes they say there are no problems to fix, but it's always worth trying them to at least see if they report problems.
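To put rough numbers on the buffering point, here is a minimal Python sketch of the arithmetic: how many milliseconds of audio a buffer holds, and whether that covers a stall of a given length. The function names and the buffer sizes are my own illustrative assumptions, not PowerSDR or Thetis settings.

```python
def buffer_duration_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Milliseconds of audio that a buffer of this many samples can hold."""
    return 1000.0 * buffer_samples / sample_rate_hz


def survives_stall(buffer_samples: int, sample_rate_hz: int, stall_ms: float) -> bool:
    """True if the buffer holds at least as much audio as the stall lasts."""
    return buffer_duration_ms(buffer_samples, sample_rate_hz) >= stall_ms


if __name__ == "__main__":
    # Hypothetical buffer size, chosen only to illustrate the calculation.
    for rate in (48_000, 96_000, 192_000):
        ms = buffer_duration_ms(2048, rate)
        print(f"2048 samples at {rate} Hz covers {ms:.2f} ms")
```

If LatencyMon reports stalls longer than the figure this gives for your own buffer settings, a dropout during such a stall would be expected.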

3) VAC buffer under and overruns caused by audio sample rate mismatch between the radio hardware and the PC. The resampler function, if enabled with proper VAC buffer adjustments, can correct this almost perfectly, and you can watch the under and overrun counters basically stop counting.
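As a rough illustration of why a small rate mismatch eventually runs a buffer dry (or overfills it) when nothing corrects for it, here is a Python sketch. It assumes a constant clock offset expressed in parts per million and no resampler; the 50 ppm figure and the headroom value are hypothetical, not measurements from any radio or sound device.

```python
def drift_samples_per_second(nominal_rate_hz: float, mismatch_ppm: float) -> float:
    """Samples gained or lost each second for a given clock mismatch in ppm."""
    return nominal_rate_hz * mismatch_ppm * 1e-6


def seconds_until_xrun(headroom_samples: int, nominal_rate_hz: float,
                       mismatch_ppm: float) -> float:
    """Time until the drift uses up the buffer headroom (inf if no mismatch)."""
    drift = abs(drift_samples_per_second(nominal_rate_hz, mismatch_ppm))
    if drift == 0:
        return float("inf")
    return headroom_samples / drift


if __name__ == "__main__":
    # Hypothetical values: 48 kHz VAC rate, 50 ppm offset, 1024 samples of headroom.
    t = seconds_until_xrun(1024, 48_000, 50)
    print(f"Under/overrun roughly every {t:.0f} s without correction")
```

The point is only that even a tiny offset produces a steady, periodic under or overrun, which is the pattern the resampler is there to remove.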

Even with everything in a relatively optimum state with respect to 1, 2 and 3 above, it is still not unusual to see a couple of glitches per minute. I'm convinced that these mostly come from category (2) because there are good metrics on (1) and (3) and if they are not indicating problems then there are none in those categories.

Just as an FYI and pertinent to the mention of the resampler to 'correct' things. I have tried ad nauseam to get the resampler going ... also my TX waveform was often very poor due to dropouts (clicks etc). I found setting the affinity to only 2 cores improves this noticeably but it did not help with the resampler. I was running 96K at the time, my preferred width...

Then I tried 192K and BINGO! Lo-n-behold the resampler works and works wonderfully! AND my TX is cleaned up a good bit. I can run with the affinity set to all cores although occasionally I can see, using Task Manager/Performance, that there is a LOT going on on most all cores, lots of spikes and my TX will suffer during this time - I find I can once again go back to 2 or 3 cores and that cleans it up again. [what the heck Windows/applications are doing to screw with ALL cores at these times is far beyond me ... ]

One man's experience that directly relates to audio issues in digital modes. Bottom line for me was 192K was a noticeable difference/improvement from 96K.

Interesting observation about the "golden" sampling rate of 192K Gary. I rarely stray from 192K and have not observed any audio glitches since the resampler was implemented.

That being said, I experienced them frequently when my ANAN was connected to my computer through the router. I have a couple of bandwidth hogs on my LAN with my network printer being a regular problem child that would bring the ANAN to its knees whenever my wife spooled up a print job from her computer. I also run a Plex Media Server and a Teamspeak server on the same computer that the ANAN is connected to and both of them had the potential to cause problems when they were serving up bandwidth across the LAN or to the WAN. I could probably have handled all those issues by setting up the QOS manager in my router, but I went with a much quicker and simpler solution:

I directly connected the ANAN on a separate subnet to the second NIC port on my computer which resolved all the issues completely. That took about 10 minutes to set up and was a lot simpler than endlessly tweaking the problematic QOS settings of my router.
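If you try the two-NIC approach, one thing to get right is that the radio-facing port and the LAN-facing port really are on different subnets. A small Python sketch of that check follows, using the standard `ipaddress` module. The addresses in the example are made up for illustration and are not the ones used in the setup described above.

```python
import ipaddress


def same_subnet(iface_a: str, iface_b: str) -> bool:
    """Return True if two 'address/prefix' interface settings share a subnet.

    Example input form: '192.168.1.10/24'.
    """
    net_a = ipaddress.ip_interface(iface_a).network
    net_b = ipaddress.ip_interface(iface_b).network
    return net_a.overlaps(net_b)


if __name__ == "__main__":
    # Hypothetical addresses: first NIC on the home LAN, second NIC for the radio.
    lan_nic = "192.168.1.10/24"
    radio_nic = "10.10.30.1/24"
    print("Same subnet:", same_subnet(lan_nic, radio_nic))
```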

I also experienced TX dropouts whenever I configured my external audio chain to run VAC through my UMC-202HD interface. I had no TX dropout problem at all when using a hardware connection through the LINE IN port or the MIC jack. The VAC resampler in OpenHPSDR and Thetis resolved the VAC interface issue completely. However, it should be noted that I don't have much interest in running the main sampling rate at any setting other than 192K, so I don't know how it behaves at higher or lower sampling rates. I wonder how many other people are running into RX/TX dropout issues when they interface with VAC for their TX voice audio or when running the digital modes?

73,

Rob W1AEX

"One thing I am certain of is that there is too much certainty in the world."

Just to fill in - I'm using Eugene's VAC for FT8 and RTTY only. Voice is direct to the radio. I had tried running Teamviewer to watch for new ones on FT8 while working but that caused glitches when it was running and I was actually working people. So I removed the program. This is a reasonably fast (new last year) 12 core i7 I think 4770 with 8G and an SSD drive.

As for connections - I have a ... I think it's called a switch, that has the incoming line for internet going to it - it has the radio and the computer going to it ... it is 'rated' at 1G switch. Could that be an issue? I don't know how I'd connect it up otherwise since there is of course only one connection at the computer itself.

All of your gear is connected to an Ethernet switch. This switch might be part of your router, but whether it is or not, it is still a "switch".

BREAK

All,

My PC is connected to the radio hardware and the internet via a switch. I have also noticed the phenomenon of audio glitches when using Chrome alongside PowerSDR or Thetis. But I see no counts on the OOOPs counter. If the glitches were caused by UDP packet loss I would see that as counts on the OOOPs counter. But again I don't see any counts on the counter. To me that says the glitches are caused by some other resource being blocked for too long a time by Chrome.

I only get this behavior with Chrome. I can rock out in LibreOffice, or live stream my PowerSDR display using OBS (Open Broadcaster Software--very Ethernet intensive), run Teamviewer (another Ethernet intensive app), remote my station using RDP (Ethernet intensive), or do any number of things without getting glitches. But open a complex new web page in Chrome and it's glitch-city until the page is fully rendered. Again, since any number of network intensive apps run in a benign fashion, but Chrome, which uses very little network bandwidth, causes glitches, this implies that it's Chrome hogging something.

Now the question is: what is Chrome hogging? I tried using CPU affinity to force Chrome and PowerSDR onto separate cores. Very surprisingly, this made absolutely no difference! I actually have two NICs in the PC. I have not tried running a separate connection to the radio hardware because I much prefer having everything on a common subnet for a variety of reasons. However, if I did try that and it fixed the problem with Chrome that would still leave me bewildered with respect to what the problem was. Because with an OOOPs count of 0 there should be no glitches that are related to networking, at least not on receive.

Scott, thanks for the detail on the switch. It's not a router - it is just a switch (there is a router on/from the incoming Spectrum cable line though). At least I think this is correct... but I believe a moot point.

As for Chrome. I found that Chrome produced much less 'activity' re glitches on the displayed transmit signal on FT8 as compared to Microsoft Edge (supplied with Windows 10). I only use Chrome now for that reason. I don't see any issues when running things on Chrome while transmitting as long as I am set to a main sample rate of 192 kHz. I gave up on trying to use affinity on one app vs. another ... I just use it on PowerSDR, setting it to 3 (of 12) cores on power up. Occasionally I'll forget to do that now that I'm using 192K, and most of the time that hasn't been an issue - still clean TX ... but if I see it jumping around I will go to TM and (find that I forgot to set it) switch to 3 cores, which will invariably clean it up. Note that during those times when the TX signal is clearly jumping around, I can go to TM/Performance and see glitches on all cores!

That stated, I do get a good bit of OOOPS counts. At one point it would just go to 5 immediately upon selecting a bandstack that is FT8/DIGU/VAC1 and stay there... but something changed somewhere/somehow as now it will accumulate over time and at times jump a good bit past 100. So more sleuthing to be done.

I have tried to see if there was any correlation between increasing OOOPS count and TX quality but have never seen a connection directly. I've tried to change the MANUAL DELAY a bit here and there (have it at 30 from experimentation) but not sure that accomplishes anything.

Interestingly, I never see this problem on transmit. I have intentionally tried to induce glitches in my transmitted audio by heavily exercising Chrome during transmit and have never had any reports of glitches in my transmit audio. This problem affects only my receive audio. Again, this is why I believe it is a Windows resource issue of some sort and not a LAN or NIC issue.

Also, the OOOPs counter is only looking at UDP packet sequence numbers that are arriving from the radio hardware. If a packet is missing it will cause the counter to increment. So it cannot diagnose packet loss in the PC to radio direction.
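To make the idea concrete, here is a minimal Python sketch of that kind of check: walk the sequence numbers as they arrive and count every time one is not exactly the previous number plus one. This is only an illustration of the principle. How the real counter handles wrap-around, out-of-order packets, or multiple missing packets in a row is not something I know, so the 32-bit wrap and the "one count per gap event" rule below are assumptions.

```python
from typing import Iterable, Optional


def count_sequence_gaps(seq_numbers: Iterable[int], modulus: int = 2**32) -> int:
    """Count arrivals whose sequence number is not (previous + 1) mod modulus.

    Assumption: a gap of several missing packets counts as one event.
    """
    gaps = 0
    previous: Optional[int] = None
    for seq in seq_numbers:
        if previous is not None and seq != (previous + 1) % modulus:
            gaps += 1
        previous = seq
    return gaps


if __name__ == "__main__":
    # Packet 2 never arrives, so one gap is counted.
    print(count_sequence_gaps([0, 1, 3, 4]))
```

Note that a check like this only ever sees the radio-to-PC stream, which is why it says nothing about the other direction.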

Adding to the above...I can clearly see CPU spikes caused by Chrome activity. Huge spikes on all CPUs. This is generally true on complex web pages like CNN, etc. where there is a lot of advertising and whatnot. Indeed, when I close a tab it can be worse than opening a tab on such a page. I can only guess that such tabs are writing cookies like crazy at that time.

Also, I think I figured out that I need to write a script of some sort to set affinity properly on Chrome. You have to do every Chrome sub-task separately and that's a huge pain, because Chrome will spawn 30+ sub-tasks!

edited to apply to the correct processes, of course, to set the affinity of ALL Chrome processes to my first two CPU cores, and then Thetis to the remaining six. This has almost completely cured my receiver audio glitch problem with Chrome. It does make Chrome slower! And it confirms that it is a thread blocking problem and not anything to do with NICs or LAN.

I need to write a script to do this automagically. I also need to experiment with distribution of CPU cores/threads to optimize both programs.

yes indeed I've seen the MANY threads for Chrome ... cool on the automation. That sounds promising. It would be nice if Windows provided a way to contain a program to X number of threads - it only ... wouldn't think this would be onerous for them but obviously don't know the knock-on effects.

I wrote what I hope is a handy little Windows Powershell script to look at and set CPU affinity. For processes with a lot of sub-processes, like Chrome, it will set each one automatically. You do have to know what you are doing a little bit, i.e. the script does not encode the affinity setting for you, but only accepts a pre-encoded decimal value corresponding to the CPU mask you want. For example, on an 8 core machine, 255 will turn them all on, but 3 will only turn on CPU 0 and CPU 1.
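For anyone unsure how to work out that decimal value, here is a small Python sketch of the encoding (this is not the PowerShell script itself, just the arithmetic). Each CPU is one bit, with CPU 0 as the lowest bit. The function names are my own.

```python
from typing import Iterable, List


def cores_to_mask(cores: Iterable[int]) -> int:
    """Encode a list of CPU numbers (0-based) as a decimal affinity mask."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << core  # set the bit for this CPU
    return mask


def mask_to_cores(mask: int) -> List[int]:
    """Decode an affinity mask back into the list of CPU numbers it enables."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cores_to_mask([0, 1]))    # CPU 0 and CPU 1
    print(cores_to_mask(range(8)))  # all eight CPUs
    print(mask_to_cores(3))
```

Running it gives 3 for CPU 0 and CPU 1, and 255 for all eight, matching the values quoted above.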

First, here is the contents of a Windows batch file you can use to kick off the script:

Using my fun little script, I've been playing around with CPU affinity between Chrome and Thetis.

Thetis definitely does not like to run with only two cores on my system. It'll be OK for a while, but then it gets to a place where I can't get the resampler to settle down. It is perfectly stable with four cores (again, on MY system, YMMV). I haven't tried three.

Putting Chrome on the second four cores and Thetis on the first four has dramatically reduced glitches when Chrome renders or closes a page, but surprisingly it has not eliminated them. I can't say I understand this at all. I guess the next step is to bite the bullet and see what two NICs does, but I really don't want to run that way.
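For reference, a split like that works out to two non-overlapping masks. The Python sketch below computes the pair for a machine with a given number of logical CPUs. It assumes 8 CPUs as in the earlier example, and the function name is hypothetical.

```python
from typing import Tuple


def split_mask(total_cores: int, first_group: int) -> Tuple[int, int]:
    """Split CPUs 0..total_cores-1 into a low group and a high group.

    Returns (mask for the first `first_group` CPUs, mask for the rest).
    """
    if not 0 < first_group < total_cores:
        raise ValueError("first_group must leave at least one CPU in each group")
    low = (1 << first_group) - 1          # e.g. 4 CPUs -> 0b00001111
    all_cores = (1 << total_cores) - 1    # e.g. 8 CPUs -> 0b11111111
    high = all_cores & ~low               # everything not in the low group
    return low, high


if __name__ == "__main__":
    thetis_mask, chrome_mask = split_mask(8, 4)
    print("Thetis:", thetis_mask, "Chrome:", chrome_mask)
```

On 8 CPUs a 4/4 split gives 15 and 240, and a 2/6 split gives 3 and 252.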

Scott, have you tried Firefox as a test to see if it exhibits the same behavior? I use Firefox here and everything seems very stable. I have the ANAN on the motherboard Ethernet port and a Wi-Fi adapter for internet; perhaps that is the difference maker.