I have the Tricklestar 300ZW and see these from time to time. It seems once I start getting this error it continues till I reboot the computer. While this error is occurring LinuxMCE is unable to control any ZWave devices.

Just curious if anyone else has seen this. Is it limited to the Tricklestar dongle? Is there any way to prevent it?

Going through the zipped log files for the last 6 days, each one has that line about rotating the logs somewhere between about 7:30 and 8:30. After that there are 15 to 25 lines of "No callback received" spread across the day at what seem to be random intervals. I can't think of a Z-Wave device that we would be using at those times.

They do not stop my lamppost from turning on at sunset and off at 10:00.

Hari will have to confirm, but if I've understood correctly, a missing callback means that the ZWave driver didn't receive a confirmation for the message it sent. This might be caused by communication problems, such as radio interference or too much distance between nodes.

I suspected communications problems at first too. But my ZWave dongle has nearly direct line-of-sight to the closest device, and the two are less than 20ft away from each other. The weird thing is that the errors go away immediately after a reboot.

The errors are always in a set of four: three await_callback and one Dropping command. They will continue this way for hours, sometimes going back to normal on their own. When everything is working fine I don't get any errors except for an occasional single await_callback.

I'm happy to try resetting my devices and moving the ZWave dongle to one of my Media Directors (will this work?) just to see if being in a different part of the house has any effect.

I should also mention, the last time I got these errors I issued a "#788 StatusReport" to the ZWave plugin. I stopped getting these errors after sending that command. I think that is because the implementation of the StatusReport command soft-resets the dongle. I've only tried this one time however, so it might have just been a coincidence that things started working again.

Maybe sending the Soft-Reset does work. I just noticed I was getting these error messages again. I sent the StatusReport command to the ZWave device and, after the 30sec device polling was complete, the system went back to normal.

I'll enable debugging on the pl2303 module and also crank up the logging on the ZWave device. Maybe between the two logs I'll be able to catch something out of the ordinary going on.

Failing that, Hari, would you mind if I submit a patch to do a Soft-Reset if all devices fail to respond to the poll that takes place every 30 seconds?

Yes, the soft reset is triggered by that DCE command. If I remember correctly I implemented that for domodude to test. I've never seen that device hang myself, but I'm using a Seluxit dongle as primary device.

I'd appreciate such a patch, as I never found the time to implement it myself. Domodude was working around it with a script that parsed the logs and used messagesend to reset the device when it saw a bunch of those messages in the logs.
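For illustration only, the log-watching workaround could look roughly like the sketch below. It is not domodude's script: the class name, the matched substring, the window length and the threshold are all assumptions, and the actual reset (done with messagesend in the original workaround) is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper: counts "No callback received" lines seen within a
// sliding time window and reports when a burst reaches a threshold.
class CallbackBurstDetector {
public:
    CallbackBurstDetector(std::size_t threshold, long windowSeconds)
        : threshold_(threshold), windowSeconds_(windowSeconds) {
        assert(threshold > 0 && windowSeconds > 0);
    }

    // Feed one log line with its timestamp in seconds. Returns true when the
    // caller should reset the dongle; the history is cleared at that point.
    bool feed(long timestamp, const std::string& line) {
        if (line.find("No callback received") == std::string::npos)
            return false;  // unrelated log line
        hits_.push_back(timestamp);
        // Forget matches that are older than the window.
        while (!hits_.empty() && timestamp - hits_.front() > windowSeconds_)
            hits_.pop_front();
        if (hits_.size() < threshold_)
            return false;
        hits_.clear();
        return true;
    }

private:
    std::size_t threshold_;
    long windowSeconds_;
    std::deque<long> hits_;  // timestamps of recent matching lines
};
```

A caller would feed each new log line to `feed()` and trigger the reset whenever it returns true. The window keeps the occasional single missed callback from causing a reset.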

The poll is probably the wrong place, as it can be deactivated with a DCE command. Detecting an excessive number of missed callbacks (maybe with a counter) would probably work.

Thanks for the reply. With full debugging turned on, I'm seeing some weird things. For example, I see the ZWave dongle send a message to the driver at the same time the ZWave driver sends a message to the dongle. This causes a problem because, at line 95 of Serial.cpp, all the data in the input buffer is flushed whenever a message is transmitted to the dongle. As a result the message from the dongle is missed by the driver, which sometimes ends in an "ERROR! Out of frame flow!!" message.

Is there a reason for flushing the input buffer like that? I'm hesitant to touch that code because it might be there to handle a special case with another ZWave dongle.
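To make the concern concrete, here is a small in-memory sketch of the two behaviours. It is not the LinuxMCE Serial.cpp code: `FakeSerialPort` and its members are hypothetical stand-ins, and the byte values used with it are arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for a serial port. `osInput` models bytes
// the dongle has already sent but the driver has not read yet.
struct FakeSerialPort {
    std::vector<std::uint8_t> osInput;  // unread bytes from the dongle
    std::vector<std::uint8_t> pending;  // bytes saved for the frame parser
    std::vector<std::uint8_t> wire;     // bytes written to the dongle

    // Behaviour described in the post: unread input is discarded on send,
    // so a frame that arrived at the same moment is lost.
    void sendFlushing(const std::vector<std::uint8_t>& msg) {
        assert(!msg.empty());
        osInput.clear();
        wire.insert(wire.end(), msg.begin(), msg.end());
    }

    // Alternative: hand unread input to the parser before sending.
    void sendPreserving(const std::vector<std::uint8_t>& msg) {
        assert(!msg.empty());
        pending.insert(pending.end(), osInput.begin(), osInput.end());
        osInput.clear();
        wire.insert(wire.end(), msg.begin(), msg.end());
    }
};
```

With two unread bytes in `osInput`, `sendFlushing` leaves nothing for the parser, while `sendPreserving` moves both bytes into `pending`. Whether the real driver can safely skip the flush is exactly the open question.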

I haven't gotten into the infinite retry state yet with full debugging turned on. Still waiting...

I finally got the error to occur again. However I cannot find anything telling in the logs that would indicate why it is failing. Again, performing the soft-reset cleared things up. If you'd like to take a look at the logs for yourself, you can download them from: http://esev.com/zwave.tar.bz2 Both the ZWave and kernel debugging logs are included.

One thing that did arise from the logs is that when this issue occurs, the ZWave controller sends back notifications that the message cannot be delivered to the ZWave stack, i.e. "ERROR: ZW_SEND could not be delivered to Z-Wave stack" shows up in the logs.

Since I cannot see any other obvious way to fix this, I'm using your suggestion of a counter to trigger a soft-reset after receiving two consecutive ZW_SEND delivery errors. I chose two because the retry limit of a job is currently set to 3. My thought was that if the reset was successful, the third retry would still have a chance to work. To do this, however, I needed to insert the soft-reset job at the beginning of the ZWSendQueue.
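The idea can be sketched as follows. This is not the attached patch: `Job`, `SendQueueSketch` and the callback names are hypothetical, and the real ZWSendQueue element type is not shown in this thread.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical job record standing in for whatever ZWSendQueue holds.
struct Job {
    std::string name;
};

class SendQueueSketch {
public:
    explicit SendQueueSketch(int resetThreshold)
        : resetThreshold_(resetThreshold) {
        assert(resetThreshold > 0);
    }

    void enqueue(const Job& job) { queue_.push_back(job); }

    // Call when the dongle reports that ZW_SEND could not be delivered.
    // After `resetThreshold_` consecutive failures a soft-reset job is put
    // at the front of the queue so it runs before the failed job is retried.
    void onSendDeliveryError() {
        if (++consecutiveErrors_ >= resetThreshold_) {
            queue_.push_front(Job{"soft-reset"});
            consecutiveErrors_ = 0;
        }
    }

    // Call on any successful delivery; it breaks the run of failures.
    void onSendDelivered() { consecutiveErrors_ = 0; }

    const std::deque<Job>& jobs() const { return queue_; }

private:
    int resetThreshold_;
    int consecutiveErrors_ = 0;
    std::deque<Job> queue_;
};
```

With a threshold of two and a retry limit of three, the reset job lands ahead of the failing job's third attempt, which is the ordering described above.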

Let me know if this will be acceptable. I've attached a preliminary patch. I'll wait to confirm this works before posting it to a ticket on Trac. I can add the "Out of frame flow" fix from my previous post to this patch too, if you'd like.

I'm having a somewhat similar situation here with the Seluxit dongle. Everything works perfectly, except that when using timed events to control lights, the command is sometimes broadcast over and over, about once every millisecond. Reloading the router gets things back to normal. The system may work fine for several days, yet sometimes this malfunction occurs within a day.

Thanks for the reply. With full debugging turned on, I'm seeing some weird things. For example, I see the ZWave dongle send a message to the driver at the same time the ZWave driver sends a message to the dongle. This causes a problem because, at line 95 of Serial.cpp, all the data in the input buffer is flushed whenever a message is transmitted to the dongle. As a result the message from the dongle is missed by the driver, which sometimes ends in an "ERROR! Out of frame flow!!" message.

I'm not sure if I can follow..

Quote

Is there a reason for flushing the input buffer like that? I'm hesitant to touch that code because it might be there to handle a special case with another ZWave dongle.

No, afaict there is no special case for other dongles. I was just using the existing serial routines as I was too lazy to write my own...

Unplugging my other USB devices consistently fixes the problem. The better fix I've found is to unplug the Aeon, delete the ZWave devices from webadmin, reboot, replug the Aeon, and then go through the setup wizard from scratch. I don't claim to understand why either of these procedures works.

It hasn't died on me in a month or so which is the longest it's gone without problems.