Re: problem while restarting auditd

From: Steve Grubb <sgrubb redhat com>

To: linux-audit redhat com, Eric Paris <eparis redhat com>

Subject: Re: problem while restarting auditd

Date: Thu, 15 Sep 2011 21:30:34 -0400

On Thursday, September 15, 2011 02:32:59 AM Vipin Rathor wrote:
> One strange thing I'm seeing in /var/log/messages w.r.t. auditd restart.
>
> 2011-09-14T11:49:14.541661-07:00 audisp-remote: audisp-remote is exiting on stop request
> 2011-09-14T11:49:18.741166-07:00 kernel: audit: *NO* daemon at audit_pid=1652525
> 2011-09-14T11:49:18.741190-07:00 kernel: __ratelimit: 366 callbacks suppressed
> 2011-09-14T11:49:18.745558-07:00 auditd[1654362]: Started dispatcher: /sbin/audispd pid: 1654364
> 2011-09-14T11:49:18.746081-07:00 audispd: max_restarts_parser called with: 10
> 2011-09-14T11:49:18.746099-07:00 audispd: priority_boost_parser called with: 10
> 2011-09-14T11:49:18.746666-07:00 audispd: audispd initialized with q_depth=90000 and 1 active plugins
> 2011-09-14T11:49:18.747047-07:00 audisp-remote: Connected to <remote_audit_logging_server_IP>
> 2011-09-14T11:49:18.750761-07:00 kernel: audit: audit_lost=3823 audit_rate_limit=0 audit_backlog_limit=20480
> 2011-09-14T11:49:18.750773-07:00 kernel: audit: auditd dissapeared <========= why this message?
> 2011-09-14T11:49:18.750777-07:00 kernel:
This comes from the following code:
http://lxr.linux.no/#linux+v3.0.4/kernel/audit.c#L401
It sort of follows this:
  446     if (audit_pid)
  447             kauditd_send_skb(skb);

Then

  401     err = netlink_unicast(audit_sock, skb, audit_nlk_pid, 0);
  402     if (err < 0) {
  404             printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", audit_pid);
  405             audit_log_lost("auditd disappeared\n");
So what it looks like happened is that you have a busy system and an event was
queued to be sent to user space. audit_pid was still set, so the kernel started
the send, but auditd exited in the meantime. By the time netlink_unicast() was
called, the netlink layer couldn't find the pid and the send failed.
Eric, is there anything that can be done about this race?
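To make the suspected window concrete, here is a minimal user-space sketch of the check-then-send pattern. It is not kernel code: fake_audit_pid, fake_peer_alive, fake_lost, fake_unicast(), fake_send() and simulate_restart_race() are all hypothetical names invented for illustration, and the interleaving is replayed in a fixed order rather than with real concurrency.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for illustration; not kernel APIs. */
static int fake_audit_pid;   /* plays the role of audit_pid */
static int fake_peer_alive;  /* is a receiver still bound to the socket? */
static int fake_lost;        /* plays the role of the audit_lost counter */

/* Stand-in for netlink_unicast(): fails when no receiver is left. */
static int fake_unicast(int pid)
{
    assert(pid != 0);        /* callers only send when a pid is registered */
    return fake_peer_alive ? 0 : -ECONNREFUSED;
}

/* Stand-in for the send step: report and count the event if delivery fails. */
static int fake_send(void)
{
    int err = fake_unicast(fake_audit_pid);
    if (err < 0) {
        fprintf(stderr, "audit: *NO* daemon at audit_pid=%d\n", fake_audit_pid);
        fake_lost++;
    }
    return err;
}

/* Replays the suspected interleaving in a fixed order; returns lost events. */
int simulate_restart_race(void)
{
    fake_audit_pid = 1652525;        /* old daemon registered (pid from the log) */
    fake_peer_alive = 1;
    fake_lost = 0;

    int pid_seen = fake_audit_pid;   /* step 1: the "if (audit_pid)" check passes */
    fake_peer_alive = 0;             /* step 2: the daemon exits before the send */
    if (pid_seen)
        fake_send();                 /* step 3: the send fails, event is counted lost */
    return fake_lost;
}
```

Calling simulate_restart_race() walks through the three steps once. The point is only that the pid check and the send are separate steps, so the daemon can go away between them.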
> Whenever I'm restarting the auditd using 'service auditd restart'
> command, the auditd gets restarted. But the very next moment, I get
> "kernel: audit: auditd dissapeared " message & auditing stops
> (actually it falls back to syslog). I've to again run 'service auditd
> restart' to get the auditing back. So it is taking two restart
> operation to do the job. This behavior is consistent & I can recreate
> at will.
This is strange too, but it sounds like perhaps another race of some kind.
-Steve