
01/08/14

Today this message appeared, and I knew that I needed to find a socket with a QLIM smaller than QLEN=8, but couldn't remember what the formula was.

But the topic had come up on the bind-users list back on November 14th, 2013, where the message was about '16 already in queue'.

For months before this I had been getting messages for '10 already in queue', and the only tcp socket I found with a QLIM of 10 was the submission port on sendmail, which didn't make sense...and bumping it up didn't help.

And searching my system for the pcb was a bust (using lsof -i -Tfs | grep LISTEN or netstat -LAan).

Trimming digits off the end of the address until I got matches only produced matches that didn't seem to fit.

So, I tried to ignore it....

Then it popped up on the bind-users list. The discussion pointed out that the tcp-listen-queue default is 10. But that didn't seem to apply in my case, until later when I did see some messages for "5 already in queue", because the base bind in FreeBSD 9.2 is 9.8.4-P2, where the default tcp-listen-queue is 3. It was changed to 10 in bind-9.9.

Anyways, when the thread came up on bind-users list, I decided that I needed to really dig for the answer. Searching through the kernel source, I eventually found my answer.

I couldn't figure out how to change the listen queue in ss5 through its configuration file, so I stopped using it. And the messages stopped. I had filled out the proxy settings in chromium with squid for http & https and ss5 for Socks5....and evidently some update came around the same time as when I upgraded to FreeBSD 9.2 (or perhaps FreeBSD 9.2 made the message show up in dmesg?). Switching to using squid for all protocols fixed it.

Meanwhile...while I was looking for that old message, which I had posted back on November 20th, 2013, I stumbled upon some older threads on freebsd-stable.

I was searching on my home computer, where I'm subscribed to the list, while my work email isn't subscribed to the list...and all my old freebsd list emails have since been purged. Still trying to get my email back under control after switching providers...both personally and at work. I plan to let an old personal domain expire once the migration is fully done, but it's going so slowly that I let it auto-renew last year...and perhaps forgetting to change the default 2 year auto renew to 1 year was intentional? The new expiration date is November 20th, 2015.

It was an early domain that I had registered before I knew that '-'s in domains are considered bad. There were a number of different blogs where I would try to leave comments, and the comments would claim to go to moderation but actually get discarded. The owner of one site eventually responded, saying the system automatically does that to domains with '-'s in them, since most of them are spam, but that he would whitelist my domain for the future. (IIRC, it was about a different antispam patch he had written for our blogging platform, functionality that never made it into newer releases and hadn't gotten updated. Wishing something like it was back again.)

That made me wonder if another site, running under my employer's domain...with a '-' in it, was rejecting my comments under my work email account, because it has a '-' in it. After switching to the form without the '-', the comments would appear. I suggested to the site owner that he should remove that filter or at least whitelist our employer's domain.

The threads were older, and associated with upgrading to FreeBSD 9.2. The first thread was started on August 1st, 2013. It was for "8 already in queue", and the poster later indicated that the system was for backups and did outgoing rsyncs and also did NFS and Samba. The discussion talked of the strangeness of having a queue limit that small, when the default limit (128) is like 20 times that. The last reply to the thread was October 7th, 2013. Another thread started on September 30th, 2013 for "193 already in queue", with the last reply on November 12th, 2013.

The main sticking point again was that the pcb couldn't be found...and the suspicion is that it's down to how daemons fork processes to listen to sockets and/or to handle requests, plus that they might create all these things and then use fork to detach to run in the background. The last thread was about using dtrace to maybe see if the process could be found that way.

I've been meaning to play around with that, but when I had last tried, I found that it's a module, and kldload dtrace wasn't the right way to load it....it's kldload dtraceall. Guess I've rebooted since then, so it should be right (and done automatically in /boot/loader.conf). Guess when I have time....
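For reference, a sketch of the /boot/loader.conf entry that would do that at boot. This assumes a kernel built with DTrace support and the usual module_load="YES" convention for loader.conf; worth checking against your own release before relying on it.

```shell
# /boot/loader.conf
# Load the full set of DTrace modules at boot. The by-hand equivalent,
# run as root on an already-booted system, is: kldload dtraceall
dtraceall_load="YES"
```

Afterwards, kldstat should list the dtrace modules if the load worked.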

So, I wonder if I should reply to one or both of the threads....but first, it's been a while since I blogged....so here I am.

As for today's message?

QLEN = 8 => QLIM = 5

At first I looked for the full address:


# netstat -LaAn | grep fffffe006acd9310

nothing

trimming, I eventually got:


# netstat -LaAn | grep fffffe006acd

fffffe006acdb7a0 tcp4 0/0/5 *.5666

fffffe006acdb3d0 tcp4 0/0/128 *.587

fffffe006acdcb70 tcp4 0/0/50 *.445

fffffe006acdc3d0 tcp4 0/0/128 *.621
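The trimming-by-hand above can be sketched as a small loop. This is only an illustration: find_by_prefix is a made-up helper name, and it works on saved netstat -LaAn output in a file rather than calling netstat itself. A shared prefix only means the sockets sit near each other in memory, so the hits are candidates, not an answer.

```shell
#!/bin/sh
# find_by_prefix ADDRESS FILE   (hypothetical helper, not a system tool)
# Drops one trailing hex digit at a time from ADDRESS until at least one
# line in FILE (saved `netstat -LaAn` output) starts with what is left.
find_by_prefix() {
    prefix=$1
    while [ -n "$prefix" ]; do
        # grep prints the matching lines and succeeds only if there were any
        if grep "^$prefix" "$2"; then
            echo "matched on prefix: $prefix" >&2
            return 0
        fi
        prefix=${prefix%?}   # strip the last character and try again
    done
    return 1
}
```

Usage would be something like: netstat -LaAn > /tmp/listen.txt, then find_by_prefix fffffe006acd9310 /tmp/listen.txt.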

nrpe? Hmmm, did that one new disk check push me over?

What else is 5?


# netstat -LaAn | grep '/5 '

fffffe012b909b70 tcp4 0/0/5 *.10143

fffffe006acdb7a0 tcp4 0/0/5 *.5666

fffffe006aab2b70 tcp6 0/0/5 *.5666

fffffe006abd17a0 tcp4 0/0/5 *.9032

fffffe019a1503d0 tcp4 0/0/5 *.873

fffffe012b9093d0 tcp6 0/0/5 *.873

fffffe006abd0000 tcp6 0/0/5 *.2049

fffffe006abd03d0 tcp4 0/0/5 *.2049

....
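An alternative to grepping for '/5 ' is to compare the limit as a field. A minimal sketch, assuming the column layout shown above (address, proto, qlen/incqlen/maxqlen, local address); listeners_with_qlim is a made-up name for illustration.

```shell
#!/bin/sh
# listeners_with_qlim QLIM [FILE...]   (hypothetical helper)
# Prints the lines whose third column, qlen/incqlen/maxqlen, has a
# maxqlen equal to QLIM. Reads stdin when no FILE is given.
listeners_with_qlim() {
    lim=$1
    shift
    awk -v lim="$lim" '{
        n = split($3, q, "/")          # q[3] is the queue limit
        if (n == 3 && q[3] == lim) print
    }' "$@"
}
```

Usage: netstat -LaAn | listeners_with_qlim 5. Unlike the grep, this doesn't depend on a trailing space and can't be fooled by a '/5 ' inside a unix socket path.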

10143, imapproxyd - wasn't accessing roundcube

9032, there shouldn't be anything accessing pyTiVo

2049, NFS hmmm....well, my MacBook Air might be doing a PowerNap and doing its TimeMachine backup to the NFS share on my FreeBSD server.

873, rsyncd - BackupPC is constrained against running more than 3 jobs at once, and at most 3 against this server. (I break up my [bigger] systems so it's not all backed up at once, using lockfile in DumpPreUserCmd, though I have exceptions on this server so that certain rsync shares aren't blocked if a really long backup is running.) I recently had an incremental take 1 day and 11 hours - at least on my FreeBSD/ZFS system I have a command in DumpPreShareCmd to take a snapshot....and a couple of weeks earlier, I had an incremental take 1 day and 15.5 hours.

Tweaking some sysctls and deleting some old snapshots seems to have sped things back up.

Probably NRPE

So some of the messages convert to:

QLEN => QLIM
====    ====
 193     128
  16      10
  10       6
   8       5
   5       3
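
Every pair in that table fits a check of the form QLEN > 3 * QLIM / 2 with integer division, so the first QLEN reported would be 3 * QLIM / 2 + 1. A minimal sketch under that assumption (verify it against sonewconn() in your own kernel source; the function names here are made up for illustration):

```shell
#!/bin/sh
# Assumption: the message fires once qlen > 3 * qlim / 2 (integer division),
# so the first QLEN logged for a given QLIM is 3 * qlim / 2 + 1.

# qlen_for_qlim QLIM -> the QLEN that would appear in the message
qlen_for_qlim() {
    echo $(( 3 * $1 / 2 + 1 ))
}

# qlim_for_qlen QLEN -> the QLIM whose message would report that QLEN
# (prints nothing and returns 1 if no QLIM maps to it)
qlim_for_qlen() {
    q=0
    while [ "$q" -le "$1" ]; do
        if [ "$(qlen_for_qlim "$q")" -eq "$1" ]; then
            echo "$q"
            return 0
        fi
        q=$(( q + 1 ))
    done
    return 1
}

# Reproduce the QLEN/QLIM pairs from the table
for qlim in 128 10 6 5 3; do
    echo "$(qlen_for_qlim "$qlim") $qlim"
done
```

So for today's message, qlim_for_qlen 8 gives 5, which is the limit to look for in the netstat output.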

OTOH, "8 already in queue" is what the first thread in August had, and the poster had added that it was a backup server that does outgoing rsync and had also mentioned NFS (and Samba).

Additionally, in the output looking for QLIM == 5, were these lines


unix 0/0/5 /tmp/.org.chromium.Chromium.wpVy4H/SingletonSocket

unix 0/0/5 /tmp/ksocket-beastie/klaunchere40501.slave-socket

unix 0/0/5 /home/beastie/.pulse/zen.lhaven.net-runtime/native

unix 0/0/5 /tmp/.esd-1000/socket

unix 0/0/5 /tmp/seahorse-QY4SIO/S.gpg-agent

unix 0/0/5 /var/run/samba/nmbd/unexpected

When I was previously looking for QLIM == 6, there were only the two tcp sockets, so it was only 50-50 on picking the culprit. And since the other was minidlna, which I haven't done more than build/install so far, it was really only the one socket that could explain it, and it did clear up immediately once I stopped using it.

As for NRPE, there doesn't seem to be a way to change it easily....so I'll just see if the problem continues to happen, before investigating other solutions.

Now, instead of subjecting some poor random forum to a long rambling thought, I will try to consolidate those things into this blog, where they can be more easily ignored...or rather, profess to be collected thoughts from my mind.