Bug report for Sys::Syslog 0.17.
Disclaimer: I'm merely piggy-in-the-middle, trying to get a bug
recognised and fixed. I'm not actually subject to the bug myself.
The MailScanner project (www.mailscanner.info) has for many years used
Sys::Syslog extensively. In early July, a few MailScanner folk noticed
problems with 0.16, which apparently went away if they reverted to 0.15.
See the thread "Sys::Syslog 0.16 problems" at:
http://lists.mailscanner.info/pipermail/mailscanner-beta/2006-July/
Then later in July 0.17 appeared, and apparently seemed OK for some folk.
(So I guess that the new 0.16 problems were quickly recognised and fixed:
thank you!) See same URL, thread "Sys::Syslog 0.17".
But note the reports about "make test" failing in that thread.
Since then (during August) there have apparently been more failures (in
"make test", I understand).
Are you aware of any problems in 0.17, particularly in "make test"?
Are there any plans to resolve this?
If you need more detail, the best authority is Julian Field, the developer
of MailScanner, to whom this is Cc:d. (I could attempt to handle queries
about this issue if necessary if you wish.)
--
: David Lee I.T. Service :
: Senior Systems Programmer Computer Centre :
: Durham University :
: http://www.dur.ac.uk/t.d.lee/ South Road :
: Durham DH1 3LE :
: Phone: +44 191 334 2752 U.K. :

> Disclaimer: I'm merely piggy-in-the-middle, trying to get a bug
> recognised and fixed. I'm not actually subject to the bug myself.
>
> The MailScanner project (www.mailscanner.info) has for many years used
> Sys::Syslog extensively. In early July, a few MailScanner folk noticed
> problems with 0.16, which apparently went away if they reverted to 0.15.

Apologies for creating this problem.

> Then later in July 0.17 appeared, and apparently seemed OK for some folk.
> (So I guess that the new 0.16 problems were quickly recognised and fixed:
> thank you!) See same URL, thread "Sys::Syslog 0.17".

A bug was reported, but I admit I didn't see its urgency until someone
opened another ticket specifying that the bug was breaking SpamAssassin
(and after I released 0.17, someone also reported problems in RT). Had I
been told that it broke SpamAssassin, I would certainly have tried to fix
it earlier.

> Since then (during August) there have apparently been more failures (in
> "make test", I understand).
>
> Are you aware of any problems in 0.17, particularly in "make test"?
> Are there any plans to resolve this?

If you or other people have failure cases, don't hesitate to file a bug
with the full output, and also tell me what software is broken by the
bug. Currently, as reported by CPAN Testers, Sys::Syslog 0.17 seems to
pass all its tests on a few systems:
» http://cpantesters.perl.org/show/Sys-Syslog.html#Sys-Syslog-0.17
And there is no recent failure caused by Sys::Syslog reported by Perl
smokers, which cover a larger set of operating systems:
» http://www.nntp.perl.org/group/perl.daily-build.reports/
Someone has also confirmed the existence of a heisenbug related to the
UDP mechanism on Darwin. I'll look at it this weekend.
There's also ticket #20635 about test failures on Cygwin:
» http://rt.cpan.org/Ticket/Display.html?id=20635
Of course, one of the problems is that the test suite only checks the
API, and currently has no way to check that the data is correctly sent
or received. I've begun working on a data validation test script, but
have not yet finished it.

> If you need more detail, the best authority is Julian Field, the developer
> of MailScanner, to whom this is Cc:d. (I could attempt to handle queries
> about this issue if necessary if you wish.)

I've added Julian to the requestors of this ticket, so he should
automatically receive updates to it.
I haven't looked at the source of MailScanner yet, but one of the problems
with Sys::Syslog is that it didn't provide the "native" mechanism until
very recently (version 0.15) and unfortunately more or less recommended the
"unix" mechanism even though it is quite fragile. The native mechanism,
as it uses the real C functions, is what the module should have provided
since day one, as it makes Perl behave like any other program.
Therefore, I'd now suggest not changing the mechanism with setlogsock()
unless necessary. I'll modify the documentation to make that statement
clearer.
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.

Many thanks for your swift reply! Much appreciated.
Sadly, I'm just the middle-man, with no first-hand knowledge of the
problem.
I had spotted a criticism of the recent module development on the main
MailScanner list, because of claimed failures in recent revisions,
including (apparently) a residual "make test" hanging problem in 0.17.
I simply wanted to check that the developer (you, Sebastian) was either
aware of the problem or willing to accept fault reports. (It seems that
the problem is new to you (fair enough), and that you are open to reports
(excellent!).)

>

> > Since then (during August) there have apparently been more failures (in
> > "make test", I understand).
> >
> > Are you aware of any problems in 0.17, particularly in "make test"?
> > Are there any plans to resolve this?

>
> If you or other people have failure cases, don't hesitate to file a bug
> with the full output, and also tell me what software is broken by the
> bug. [...]

Thanks.
Earlier this afternoon (UK time) I asked (on the main MailScanner email
list) for a more detailed report of the alleged problem.
It apparently claims a hang (presumably very, very long wait) during the
"make test" on some platforms.
No report forthcoming as yet.
I've also searched the archives of the main list, but was unable to track
down a detailed report. Sorry.

> I've added Julian in the requestors of this ticket so he should automatically
> receive updates to this ticket.

Thanks. Julian actively reads his email! The chances are that he can dig
out the exact report of the alleged "make test" hang and send it to you.
The responsibility now fairly and squarely lies with us (the MailScanner
community) to provide you (Sebastian) with decent information.
Best wishes! Have a good weekend. (Time for me to go home...)
--
: David Lee I.T. Service :
: Senior Systems Programmer Computer Centre :
: Durham University :
: http://www.dur.ac.uk/t.d.lee/ South Road :
: Durham DH1 3LE :
: Phone: +44 191 334 2752 U.K. :

> Sadly, I'm just the middle-man, with no first-hand knowledge of the
> problem.

No problem. People can contribute to this ticket from the RT interface.

> I had spotted a criticism of the recent module development on the main
> MailScanner list, because of claimed failures in recent revisions,
> including (apparently) a residual "make test" hanging problem in 0.17.
>
> I simply wanted to check that the developer (you, Sebastian) was either
> aware of the problem or willing to accept fault reports. (It seems that
> the problem is new to you (fair enough), and that you are open to reports
> (excellent!).)

I'm more the maintainer than the developer. Sys::Syslog is a piece of
old code, with roots dating back to Perl 4, and has evolved in a strange
and organic way. I became the maintainer last year when it was decided to
release it on CPAN, independently of the main Perl distribution, in
order to address CVE-2005-3962 (the sprintf() buffer overflow problem).
Since then, I've tried to address the tickets that were opened and accept
the patches that were waiting, also trying to tidy up the code and add
new features.

> Earlier this afternoon (UK time) I asked (on the main MailScanner email
> list) for a more detailed report of the alleged problem.
>
> It apparently claims a hang (presumably very, very long wait) during the
> "make test" on some platforms.

I've seen this occur on a Debian system, but have not yet investigated
this case.

> No report forthcoming as yet.
>
> I've also searched the archives of the main list, but was unable to track
> down a detailed report. Sorry.

No problem, I can wait.

> Thanks. Julian actively reads his email! The chances are that he can dig
> out the exact report of the alleged "make test" hang and send it to you.
>
> The responsibility now fairly and squarely lies with us (the MailScanner
> community) to provide you (Sebastian) with decent information.

I took a quick glance at MailScanner::Log, and I think the code can be
simplified by using the features from the recent Sys::Syslog versions.
Namely, the eval{} can be avoided by using the "nofatal" option of
openlog().
» http://search.cpan.org/dist/Sys-Syslog/Syslog.pm#openlog
Also, I really think that you should avoid calling setlogsock() to change
the mechanism and just stay with the native one, especially on systems
like Solaris: Alan Burlison, a Solaris developer and the main Perl
contact for that system, has made it clear that he prefers the native
mechanism over the others.
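
As a minimal sketch of the two suggestions above (this is not MailScanner's
actual code): the identifier 'example-daemon', the 'mail' facility and the
helper name log_started() are assumptions made purely for illustration. The
"nofatal" option goes in the openlog() option string, and setlogsock() is
simply never called, so the module keeps whichever mechanism it selects by
default.

```perl
use strict;
use warnings;
use Sys::Syslog;    # exports openlog(), syslog() and closelog() by default

# Hypothetical helper: log one startup message without an eval {} wrapper.
sub log_started {
    my ($ident) = @_;

    # "nofatal": warn instead of dying if no connection to syslog can be made.
    # setlogsock() is deliberately not called, so the default mechanism is kept.
    openlog($ident, 'pid,nofatal', 'mail');
    my $sent = syslog('info', 'started with uid %d', $<);
    closelog();

    return $sent ? 1 : 0;
}

log_started('example-daemon');
```

The helper returns 1 or 0 rather than throwing, so a caller that wants to
react to a logging failure can test the return value instead of trapping an
exception.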

> Best wishes! Have a good weekend. (Time for me to go home...)

Thanks. Although I'll have to spend part of it on the Sys::Syslog problems.
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.

>> Sadly, I'm just the middle-man, with no first-hand knowledge of the
>> problem.

>
> No problem. People can contribute to this ticket from the RT interface.
>

>> I had spotted a criticism of the recent module development on the main
>> MailScanner list, because of claimed failures in recent revisions,
>> including (apparently) a residual "make test" hanging problem in 0.17.
>>
>> I simply wanted to check that the developer (you, Sebastian) was either
>> aware of the problem or willing to accept fault reports. (It seems that
>> the problem is new to you (fair enough), and that you are open to reports
>> (excellent!).)

>
> I'm more the maintainer than the developer. Sys::Syslog is a piece of
> old code, with roots dating back to Perl 4, and has evolved in a strange
> and organic way. I became the maintainer last year when it was decided to
> release it on CPAN, independently of the main Perl distribution, in
> order to address CVE-2005-3962 (the sprintf() buffer overflow problem).
>
> Since then, I've tried to address the tickets that were opened and accept
> the patches that were waiting, also trying to tidy up the code and add
> new features.
>

>> Earlier this afternoon (UK time) I asked (on the main MailScanner email
>> list) for a more detailed report of the alleged problem.
>>
>> It apparently claims a hang (presumably very, very long wait) during the
>> "make test" on some platforms.

>
> I've seen this occur on a Debian system, but have not yet investigated
> this case.

The hang that was reported to me was on a CentOS 4 system. But another
CentOS 4 system worked fine, so it may be a kernel revision problem or
something subtle like that.
Is there a better Syslog module that I could use instead? The problems
with 0.16 and 0.17 have caused me quite a lot of grief, forcing me to
issue more than 1 revision after I published my latest stable release,
which damages my (and MailScanner's) reputation as a reliable package.

>

>> No report forthcoming as yet.
>>
>> I've also searched the archives of the main list, but was unable to track
>> down a detailed report. Sorry.

>
> No problem, I can wait.
>

>> Thanks. Julian actively reads his email! The chances are that he can dig
>> out the exact report of the alleged "make test" hang and send it to you.
>>
>> The responsibility now fairly and squarely lies with us (the MailScanner
>> community) to provide you (Sebastian) with decent information.

>
> I took a quick glance at MailScanner::Log, and I think the code can be
> simplified by using the features from the recent Sys::Syslog versions.
> Namely, the eval{} can be avoided by using the "nofatal" option of
> openlog().

I will look into that, many thanks.

>
> » http://search.cpan.org/dist/Sys-Syslog/Syslog.pm#openlog
>
> Also, I really think that you should avoid calling setlogsock() to change
> the mechanism and just stay with the native one, especially on systems
> like Solaris: Alan Burlison, a Solaris developer and the main Perl
> contact for that system, has made it clear that he prefers the native
> mechanism over the others.

I will check that out too. I have used Sys::Syslog with exactly the
same code for several years now; it has all just been broken by upgrading
to a newer release. I don't usually keep absolutely up to date with the
latest module releases, as I prefer to stick with the same set that I
know work; but the Sys::Syslog module version I was using was getting
very old so I upgraded it. Slightly wish I hadn't now :-)

>

>> Best wishes! Have a good weekend. (Time for me to go home...)

>
> Thanks. Although I'll have to spend part of it on the Sys::Syslog problems.

What would we all do at the weekend without a good pile of work to do!
--
Julian Field
www.MailScanner.info
Buy the MailScanner book at www.MailScanner.info/store
MailScanner customisation, or any advanced system administration help?
Contact me at Jules@MailScanner.biz
PGP footprint: EE81 D763 3DB0 0BFD E1DC 7222 11F6 5947 1415 B654
Get your PCs and servers from Transtec.de, very well built and reliable!

> The hang that was reported to me was on a CentOS 4 system. But another
> CentOS 4 system worked fine, so it may be a kernel revision problem or
> something subtle like that.

Once again, I am unable to reproduce the bug on my systems. This is
totally frustrating. I'll try tomorrow on the Debian system where I
once saw the problem.

> Is there a better Syslog module that I could use instead? The problems
> with 0.16 and 0.17 have caused me quite a lot of grief, forcing me to
> issue more than 1 revision after I published my latest stable release,
> which damages my (and MailScanner's) reputation as a reliable package.

AFAICT, Sys::Syslog is the only maintained Perl module for sending
syslog messages. There are old modules like Net::Syslog or Unix::Syslog,
but they are unmaintained and have fewer features than Sys::Syslog.

> I will check that out too. I have used Sys::Syslog with exactly the
> same code for several years now; it has all just been broken by upgrading
> to a newer release. I don't usually keep absolutely up to date with the
> latest module releases, as I prefer to stick with the same set that I
> know work; but the Sys::Syslog module version I was using was getting
> very old so I upgraded it. Slightly wish I hadn't now :-)

Sorry for the problems caused by Sys::Syslog. OTOH, the positive side
of the upgrade is that it uncovers this sort of regression, which can't
be detected by the current test suite (which I'm also working to
improve). On this matter, I'd say that if the test suite hangs or takes
longer to run, but the module works correctly in normal use, people
should consider the module operational, because the test suite is not
perfect and may try some mechanisms that will never work on their
systems.
Here, I think that my error was to actually connect to the syslog
facility of the system. But I'm afraid that several of the functions
used by Sys::Syslog can't be replaced or mocked, making data validation
tests (which could catch more regression problems) more difficult.
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.

>> The hang that was reported to me was on a CentOS 4 system. But another
>> CentOS 4 system worked fine, so it may be a kernel revision problem or
>> something subtle like that.

>
> Once again, I am unable to reproduce the bug on my systems. This is
> totally frustrating. I'll try tomorrow on the Debian system where I
> once saw the problem.

Ok, I gave it a try on a Debian system I have access to, and I can reproduce
the problem there:
grive:/tmp/Sys-Syslog-0.17# prove -bv t/syslog.t
...
ok 135 - setlogsock() called with ['console']
ok 136 - setlogsock() should return true: '1'
ok 137 - setlogsock() called with 'console'
ok 138 - setlogsock() should return true: '1'
ok 139 - openlog() called with facility 'local0' and without option 'ndelay'
ok 140 - openlog() should return true: 'Sys::Syslog::SYSLOG'
ok 141 - openlog() called with facility 'local0' with option 'ndelay'
ok 142 - openlog() should return true: 'Sys::Syslog::SYSLOG'
ok 143 - syslog() called with level -1
ok 144 - syslog() should return false: '0'
ok 145 - syslog() called with level 'info,notice'
ok 146 - syslog() should return false: '0'
ok 147 - syslog() called with level 'info,notice'
ok 148 - syslog() should return false: '0'
And indeed, it hangs at this point, before test 149, at line 140 of
t/syslog.t, while trying to write the message to the console:
grive:/tmp/Sys-Syslog-0.17# strace -p 15891
Process 15891 attached - interrupt to quit
write(4, "<134>Aug 14 18:07:24 perl: uid 0"..., 124 <unfinished ...>
Process 15891 detached
grive:/tmp/Sys-Syslog-0.17# ls -lF /proc/15891/fd/4
l-wx------ 1 root root 64 Aug 14 18:14 /proc/15891/fd/4 -> /dev/console
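
For readers of the strace output above: the leading "<134>" is the syslog
priority value, which encodes facility * 8 + severity. Since 134 = 16 * 8 + 6,
it decodes to facility local0 (the facility passed to openlog() in the tests
listed earlier) at severity info. The sketch below shows that arithmetic;
decode_pri() is a hypothetical helper, not part of Sys::Syslog, and the labels
used for facility codes 12 to 15 are informal.

```perl
use strict;
use warnings;

# Facility and severity names indexed by numeric code (RFC 3164 numbering).
# The labels for facility codes 12..15 are informal placeholders.
my @FACILITY = qw(kern user mail daemon auth syslog lpr news uucp cron
                  authpriv ftp ntp audit alert clock
                  local0 local1 local2 local3 local4 local5 local6 local7);
my @SEVERITY = qw(emerg alert crit err warning notice info debug);

# Hypothetical helper: split the "<PRI>" prefix of a raw syslog line into
# (facility name, severity name). Returns an empty list if there is no
# valid prefix.
sub decode_pri {
    my ($line) = @_;
    my ($pri) = $line =~ /^<(\d{1,3})>/ or return;
    return if $pri > 191;    # highest valid value: 23 * 8 + 7
    return ($FACILITY[$pri >> 3], $SEVERITY[$pri & 7]);
}

my ($facility, $severity) = decode_pri('<134>Aug 14 18:07:24 perl: uid 0');
print "$facility.$severity\n";    # prints "local0.info"
```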
Adding the "nowait" option to the openlog() call avoids this hang, but then
two processes are left trying to write to the console. So the best solution
is probably to just drop this test for the console.
As these tests pass on both Mandriva Linux systems I have access to
(x86-32 and amd64), I guess it's a packager's choice. I note that it also
works without hanging on my Mac OS X 10.4 system.
Time to write a rant about OSes that offer a writeable /dev/console
but indefinitely hang the processes that ever try to write there.
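
One defensive pattern a test script could use against such a hang, offered
only as a sketch and not as what Sys::Syslog's test suite does, is to wrap
the risky call in an alarm()-based timeout. The helper name
run_with_timeout() and the one-second limit are assumptions, and a select()
call stands in for the blocking console write. A process stuck in an
uninterruptible kernel wait would not be rescued by this.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but give up after $seconds.
# Returns 1 if $code finished, 0 if the timeout fired first.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    alarm 0;    # make sure no alarm is left pending
    return 1 if $finished;
    die $@ if $@ ne "timeout\n";    # re-throw unrelated errors
    return 0;
}

# Stand-in for a blocking console write: wait for up to 5 seconds.
my $ok = run_with_timeout(1, sub { select(undef, undef, undef, 5) });
print $ok ? "finished\n" : "timed out, skipping console test\n";
```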
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.

Thanks for hunting for it, well found. I couldn't reproduce it myself.

>
> Time to write a rant about OSes that offer a writeable /dev/console
> but indefinitely hang the processes that ever try to write there.

Indeed!
Does anyone explicitly only write to the console any more these days?
We're a bit beyond the days of attaching line-printers to the console
serial port. I still have a VT100 for connecting to Sun's serial ports
though :-)

>> And indeed, it hangs at this point, before test 149, at line 140 of
>> t/syslog.t, while trying to write the message to the console: [...]
>>
>> Adding the "nowait" option to the openlog() call avoids this hang, but then
>> two processes are left trying to write to the console. So the best solution
>> is probably to just drop this test for the console.

>
> Thanks for hunting for it, well found. I couldn't reproduce it myself.

It took only ten minutes to track down once I had access to a system where
the problem occurred.

>> Time to write a rant about OSes that offer a writeable /dev/console
>> but indefinitely hang the processes that ever try to write there.

>
> Does anyone explicitly only write to the console any more these days?
> We're a bit beyond the days of attaching line-printers to the console
> serial port. I still have a VT100 for connecting to Sun's serial ports
> though :-)

Well, if a system offers a writable /dev/console, I'd say one can expect
to write to it. Anyway, I now wonder whether the people who added this
mechanism really tested it because I'm not totally sure it works as they
expected it to work.
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.

Here is a beta of the upcoming Sys::Syslog 0.18.
» http://www.maddingue.net/Sys-Syslog-0.18_01.tar.gz
It passes all tests on my systems:
- Perl 5.6.2, 5.8.5-threads, 5.8.8 / Mandrake Linux / x86-32
- Perl 5.8.6-thread / Mac OS X.4 / PowerPC
- Perl 5.8.8 / FreeBSD 6.0 / x86
Can you and other people on the MailScanner list test it on the
systems where previous versions hung and confirm that this version
doesn't present the same problem?
(I can't send mail to the mailing list myself because I have been unable
to connect to the MailScanner site for a few hours now.)
Thanks in advance
--
Sébastien Aperghis-Tramoni
Close the world, txEn eht nepO.