Simon 3 hangs

Sun, 2011-02-20 15:57 — jgonder

Greetings -

Having trouble with the new v3 Simon, similar to that previously reported in 2.5.7. This began on my third day of testing and setting up. I've been using a PING test, changing the IP from the right one to a bad one, so that the test alternates between down and up, to verify sending emails, and trying to get twitter and sms to work.

That's a separate issue: although the report is now going through to the FTP site, email has been inconsistent and twitter and SMS have not yet worked. *There are no network problems here; I've been capturing and looking at all packets w/ Wireshark to try to diagnose the problems Simon has been having in sending out messages.

Now, in the last hour or so - after about 20 sec of pinging a nonexistent IP the beachball of death comes on, Simon hangs, and eventually, if I don't force quit, will hang the whole machine. After 20 seconds Simon has stopped sending out packets; the test shows it's still going on, but Simon's timer no longer updates. This is what went on previously with 2.5.7.

I uninstalled Simon and trashed com.dejal.plist, but the re-install of 3 still picks up my tests and SN - so - evidently there's info somewhere else as well, which, if it's munged, may be the problem. AFAIK there's no 'uninstall' - too bad. It would be v. nice to be able to go back to square one when there's a problem without needing to do a treasure hunt to find the rest of Simon's files.

Per David's previous request, I took screen shots, console and process samples, and I'll send the zip in - attached here is the hang dump sent to Apple.

Any thoughts appreciated -

* The need to make a test go up and down to test notifiers is not v. convenient. For my USB thermometer software, the email setup window has a "test" button to send a test message without having to cause a test failure. A similar test button for SMS, twitter, email in Simon setup windows to verify the settings right then and there would be nice. Or perhaps I've overlooked something?

* Also - even when the report function is failing to upload to FTP, it happily reports done and available - even when I could see by the packets that it got an error because the path was typoed. A test button here, that understood when an error is generated, would keep me from having to catch the data stream and look at all the packets to diagnose things.
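As a rough illustration of the kind of upload check described above, here is a minimal Python sketch using the standard ftplib module. It is not how Simon works internally; the function name upload_report and its parameters are made up for this example. The point is only that a failed connection, login, or STOR command surfaces as an error rather than being reported as done.

```python
import ftplib
import io


def upload_report(host, port, user, password, remote_path, data, timeout=10.0):
    """Upload data over FTP and return (ok, message).

    Hypothetical helper: ok is False on any FTP or socket error,
    so a mistyped path is reported instead of silently ignored.
    """
    try:
        with ftplib.FTP(timeout=timeout) as ftp:
            ftp.connect(host, port)
            ftp.login(user, password)
            # storbinary raises ftplib.error_perm on a 5xx reply,
            # which is what a server sends for a bad or unwritable path.
            ftp.storbinary(f"STOR {remote_path}", io.BytesIO(data))
        return True, "uploaded"
    except ftplib.all_errors as exc:
        return False, f"upload failed: {exc}"
```

A "test" button would only need to call something like this with a tiny placeholder file and show the returned message.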

I'm sorry Simon is having issues again. Thanks for the logs and other info via email.

Firstly, regarding your "test button" requests, Simon does have those already. Show the Notifiers window, and you'll see a Notify Now toolbar button (and menu command), which will run the selected notifier with placeholder variables. Similarly, there's a Report Now item in the Reports window.

Looking at the hang log, it seems to be blocking when connecting to the SMTP server when sending the SMS notification. Email delivery has long been the most error-prone area of Simon, and I plan to replace the email delivery engine in an upcoming release — either version 3.1 or 3.2. With your issue adding weight to the other examples, I'm leaning towards 3.1, which I hope to have in beta within a few weeks.
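To illustrate the general idea of a connection attempt that cannot block indefinitely, here is a minimal Python sketch. It is not Simon's code (Simon is a Mac application with its own mail engine); the function name can_reach_smtp and the 5 second default are assumptions chosen for the example.

```python
import socket


def can_reach_smtp(host, port=25, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds.

    Hypothetical helper: any failure (refused, unreachable, DNS error,
    timeout) is reported as False instead of blocking the caller.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.timeout and socket.gaierror are both OSError subclasses.
        return False
```

With a bound like this, a dead mail server costs at most the timeout per attempt rather than stalling the whole application.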

As previously mentioned in the forum, I also want to split the app into multiple processes in an upcoming release (probably 3.2 or 3.3). That should also help with such issues, which are probably related to running out of network sockets in the Simon process.

In the meantime, try pausing the SMS and email notifiers, and see if that fixes the hanging. If it doesn't, please send another hang log, so we can trace further.

Right - "report now" I'm familiar with - and great to hear I'm the victim of "pilot error" on the other -

> Looking at the hang log, it seems to be blocking when connecting to the SMTP server when sending the SMS notification. SNIP I plan to replace the email delivery e SNIP within a few weeks.
> SNIP related to running out of network sockets in the Simon process.

Probably not running out in my case, as I'm only running one thing at a time here -

> In the meantime, try pausing the SMS and email notifiers ,SNIP

I will go to one notifier (properly tested) at a time and report -

Thanks -

Once again, world class developer and world class support for a world class product that has an occasional problem, sometimes user induced, but still promptly addressed.
There's a template for Redmond …

Bottom line - Simon seems not to like failure in a notification process.
Simon doesn't seem to do IMAP or SSL, at least with my dotMac account.
It would only login successfully using POP3.

If a service, like dotMac, has a round robin server group, so that the DNS returns a list of IP numbers in order of precedence, Simon takes 10-15 minutes to completely time out when it can't log on (that's the short story on this - it's going to make a great video for my class on TCP/IP analysis with all the screenshots and traces of each test for each service).
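One common way to keep a round robin address list from adding up to a very long wait is to give each address a short timeout and cap the total time spent. The Python sketch below shows that pattern only; it is not what Simon does, and the names resolve and connect_first_available, along with the 5 and 30 second defaults, are assumptions for illustration.

```python
import socket
import time


def resolve(host, port):
    """Return the (address, port) pairs DNS gives for host, in the order returned."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [info[4][:2] for info in infos]


def connect_first_available(addresses, per_address_timeout=5.0, overall_deadline=30.0):
    """Try each address in order and return the first connected socket.

    Hypothetical helper: each attempt is limited to per_address_timeout
    seconds, and the whole loop gives up after overall_deadline seconds.
    """
    start = time.monotonic()
    last_error = None
    for address in addresses:
        remaining = overall_deadline - (time.monotonic() - start)
        if remaining <= 0:
            break
        try:
            return socket.create_connection(
                address, timeout=min(per_address_timeout, remaining))
        except OSError as exc:
            last_error = exc  # remember why this address failed, try the next
    raise ConnectionError("no address accepted a connection in time") from last_error
```

A caller would pass in resolve(host, port) as the address list. With these defaults the worst case is about 30 seconds, however many addresses the DNS hands back.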

Since Apple dotmac authentication craps out several times a week - I anticipate that being a problem - I may need to go to a gmail account for Simon.

Same with Twitter - I didn't get to testing SMS yet - Since I had made a separate twitter account for Simon, it was not allowed to send a direct message to "me" yet, and that was a failure Simon didn't like.

Failing from the notification window "check" button did not crash Simon, as failure while running a check did. Don't know more about this. It may be that having two login failures running at the same time was the problem. In this set of tests I only tested one thing at a time, and not while running a monitor test.

Now that, or as long as, logins are working, monitoring a downed IP proceeds as expected, and I'm getting tweets and emails for up and down.

So - bottom bottom line - David, you're probably right about a more tolerant email / notification system. So long as life is good it works. So - I'll definitely /not/ point it at our Groupwise system, and I'll hope it is not trying to send while dotMac auth is down …

Thanks -

Hope this helps -

If I see something interesting w/ SMS or more on this, then I'll let you know on email.

Yes, DotMac aka MobileMe isn't the most reliable... I've experienced trouble connecting to it via Simon (and elsewhere) on a number of occasions. It would probably be best to avoid it for reliable emails.

Though I do still suspect my mail delivery code doesn't cope as well as it should. I actually have two mail delivery systems currently: the Automatic transport uses direct socket communication, talking to the SMTP server, and the non-automatic transport uses the third-party Pantomime framework. So if you're using Automatic, you could try setting up a mail transport, and might have better luck.

The SMS notifier only uses the Automatic transport, though (unless you use the third-party Clickatell service). But you can also send SMS via email, if you get that working.

Your theory about two simultaneous mail server logins causing the difficulty could be feasible. Hard to know how to solve that, other than to re-engineer the notifier system to only perform one notification at a time, at least for each notifier plugin. Might be worth doing. (As I recall, the Speech one already does this, since of course talking over itself wouldn't be helpful.)
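The "one notification at a time, at least for each notifier plugin" idea can be sketched with one lock per plugin. This is a minimal Python illustration of the pattern, not Simon's design; the class name PerPluginSerializer and its run method are hypothetical.

```python
import threading
from collections import defaultdict


class PerPluginSerializer:
    """Run at most one notification at a time for each plugin name.

    Hypothetical sketch: different plugins may still run concurrently,
    but two sends through the same plugin are forced to take turns.
    """

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # one lock per plugin name

    def run(self, plugin_name, notify, *args):
        with self._guard:
            lock = self._locks[plugin_name]
        with lock:
            # Only one caller per plugin gets here at a time.
            return notify(*args)
```

The trade-off is that a slow or hung send delays later sends through the same plugin, so this would presumably need to be paired with a timeout on the send itself.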