I'm trying to get OpenBSD 5.4-release amd64 working on a MacbookAir5,1. So far it runs pretty well...that is, until I use Gnome. After 1-3 hours of Gnome use my machine hangs entirely. I cannot switch to a different virtual console, move the mouse cursor, or do anything besides power cycle the machine. I get the same behavior using Xfce instead of Gnome.

I've searched for answers but I haven't turned up much. I'd like to narrow down the problem and figure out whether this is user error or a bug. If it's a bug, I want to gather enough information to file a report that the devs will find useful. So I've been looking around for documentation on how to troubleshoot hangs in OpenBSD, and that search hasn't turned up much either.

I thought I'd ask y'all--do you know how to go about troubleshooting a hang in OpenBSD? Is there any logging I can turn on or clues left behind when the system freezes up? Or should I just experiment with turning various subsystems or hardware on and off to see if I can figure out where the trouble lies that way?

Last edited by quisquous; 23rd December 2013 at 02:44 AM.
Reason: removing most caps in title to be more like the other post titles

"Hang" when using X can be difficult to diagnose. A kernel panic would give you the same symptoms: the default on panic is to enter the ddb(4) kernel debugger, and when running X this cannot be seen. If the keyboard is active in ddb (and it may not be), you can blindly type ddb commands, such as "boot crash" at the ddb> prompt, to force a kernel core dump and reboot. Besides a working keyboard connection to a ddb you cannot see, you'll need sufficient default swap space to hold a kernel core dump, which means it should be larger than your physical RAM. You won't know if your boot crash command is working unless you can monitor disk I/O. Writing a kernel core dump to swap takes time.

You can disable ddb so that you can avoid blind typing, and automatically obtain a kernel dump. See crash(8) as well.

If the MacbookAir has a 9-pin serial port, with a null-modem cable you could set up another computer as a serial console. This has two advantages:

If there is a kernel panic, you can actually see it.

If there is a real hang, you can force the kernel to enter ddb. See ddb(4).

Thanks jggimi! My swap is twice as big as my RAM, so I should be good there. I'll try typing 'boot crash' and waiting 10 minutes next time I get a proper hang.

That said, last night when I was turning off various services in an attempt to see if one of them was the culprit, I tried running plain startx with the default desktop, no gnome, then I played a video for 20m or so using Totem. Screen froze as before, but the audio from the video kept going through what I believe was the end of the video. Pressing keys on the keyboard and mousing didn't do anything, at least, that I could see.

So...perhaps I have a video issue and not a hang. I'm going to try this again in Gnome, get the video to play in a loop, and see if it hangs or just looks like it's hung, then try to SSH into the box if the video keeps playing. Maybe there's a way I can reboot the video driver while SSHed in, that would be something.

I installed -current but it still hangs every couple hours or so. It varies between a hang where I can still SSH into the box, and ones where I cannot SSH or ping the box. I have a new appreciation for the package maintainers--building Gnome takes a long time.
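One way to tell the two kinds of hang apart, and to get a timestamp for each, is to probe the laptop from a second machine. Below is a minimal sketch in Python, assuming sshd listens on the default port 22; the function names and the address in the usage note are placeholders, not anything from this thread.

```python
import socket
import time
from datetime import datetime


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def watch(host, interval=30, port=22, rounds=None):
    """Print a timestamped line each time reachability changes.

    rounds=None means loop until interrupted.
    """
    last = None
    done = 0
    while rounds is None or done < rounds:
        up = ssh_port_open(host, port)
        if up != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            state = "reachable" if up else "UNREACHABLE"
            print(f"{stamp} {host}:{port} {state}", flush=True)
            last = up
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Running `watch("192.0.2.10")` (substitute the laptop's real address) on the other machine leaves a line at the moment the port stops answering. If the screen is frozen but no UNREACHABLE line appears, that points toward a display-only freeze rather than a full hang.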

In those times when you are able to reach the workstation via SSH, what have you been able to discover? (e.g.: top(1), systat(1), vmstat(8), etc.)

In those times when you are not able to use SSH, what have you been able to discover? (e.g.: setting ddb.panic=0 to avoid dropping into ddb, or setting ddb.console=1 and invoking ddb through one of the console methods and blindly typing boot crash, etc.)
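Since a full hang leaves no chance to run top or vmstat afterwards, it may help to record their output periodically so the last snapshot before the freeze survives on disk. This is only a sketch: the helper names, the log path, and the command list in the usage note are assumptions, and the exact flags should be checked against the man pages on your system.

```python
import os
import subprocess
import time
from datetime import datetime


def snapshot(commands, log_path):
    """Append one timestamped snapshot of each command's output, then fsync."""
    with open(log_path, "a") as log:
        for cmd in commands:
            stamp = datetime.now().isoformat(timespec="seconds")
            log.write(f"=== {stamp} {' '.join(cmd)}\n")
            try:
                result = subprocess.run(
                    cmd, capture_output=True, text=True, timeout=20
                )
                log.write(result.stdout)
            except (OSError, subprocess.TimeoutExpired) as exc:
                # A missing or stuck command should not stop the logger.
                log.write(f"(failed: {exc})\n")
        log.flush()
        # Ask the OS to push the data to disk before the next hang.
        os.fsync(log.fileno())


def log_loop(commands, log_path, interval=60, rounds=None):
    """Take a snapshot every `interval` seconds; rounds=None loops forever."""
    done = 0
    while rounds is None or done < rounds:
        snapshot(commands, log_path)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

For example, `log_loop([["top", "-b", "-d", "1"], ["vmstat"]], "/var/log/hangwatch.log")` would capture one batch-mode top display plus vmstat each minute. After a power cycle, the tail of that file shows what the system looked like shortly before it froze.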

Gnome3 adds many additional layers (gstreamer, pulseaudio, and video compositing) to the base OS. Some of the layers are not BSD friendly. If you are using totem in fvwm, it likely utilizes pulseaudio and gstreamer. The Parole video player in XFCE4, which is based on Totem, also uses gstreamer but not pulseaudio. I believe that in 5.4 and -current, two versions of gstreamer are in use, as gnome3 requires the newer version. Also, if you are starting fvwm from gdm, some of these services may be running in the background. Running top in the fvwm xterm will let you know.

Another option is to try a different video player. Both VLC and mplayer utilize sndio directly and do not depend on gstreamer. A lack of hangs in either of these two media players would help focus debugging efforts and also assist you in choosing a desktop.

Last edited by shep; 31st December 2013 at 04:15 PM.
Reason: added comment on composting

Thanks shep. I'm able to reproduce the problem when gdm is not running: I run startx using the default fvwm, then run Firefox and leave it open for a couple of hours. So it's not specific to playing video. Basically, running X for a couple of hours, regardless of what I'm doing within X, leads to the screen locking up.

The screen does not freeze when I'm outside of X, i.e. using a virtual console. I note that it does not freeze on the virtual console even if gdm *is* running. It seems like something that's happening while X controls the screen is leading to the video freeze. When the screen is frozen, CTRL-ALT-F1 does not unfreeze the screen and take me to a virtual console. BUT, notably, when the screen is frozen and I ssh in and issue the reboot command, the screen unfreezes right before the machine reboots and the last lines of shutdown text appear. So it would seem there may be some way short of a power cycle to get the screen unfrozen.

There is only one thing that stands out for me in your status reports, and that is from top(1): Xorg is in an ACPI lock state. This has been previously reported to the misc@ mailing list by someone with similar hardware. The discussion continued on tech@. I did not see a resolution in either thread.

I noted that your swap size and RAM appear to be 1:1, not 2:1, and at 1:1 it is possible that swap is not large enough to store a core dump in the event of a kernel panic or by your forcing one through ddb. Note that kernel core dumps will only be stored on the default swap device, should you have more than one.
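The sizing rule can be checked with a few lines of Python. This is a sketch under assumptions: it takes physical memory in bytes (as printed by `sysctl -n hw.physmem`) and parses text shaped like `swapctl -l -k` output, where I am assuming the first column is the device and the second is its size in 1K blocks. The helper names are made up for illustration.

```python
def parse_swapctl(text):
    """Map each /dev/... swap device to its size in 1K blocks.

    Assumes the first column is the device name and the second its size,
    as in `swapctl -l -k` output; header and other lines are skipped.
    """
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if (
            len(fields) >= 2
            and fields[0].startswith("/dev/")
            and fields[1].isdigit()
        ):
            sizes[fields[0]] = int(fields[1])
    return sizes


def dump_fits(physmem_bytes, swap_kib):
    """Rule of thumb from this thread: swap must be larger than physical RAM."""
    return swap_kib * 1024 > physmem_bytes


# Illustrative numbers only: 8 GiB of RAM against an 8 GiB swap partition.
sample = (
    "Device      1K-blocks     Used    Avail Capacity  Priority\n"
    "/dev/sd0b     8388608        0  8388608     0%    0\n"
)
for device, kib in parse_swapctl(sample).items():
    verdict = "large enough" if dump_fits(8 * 1024**3, kib) else "too small"
    print(f"{device}: {verdict} for a core dump")
```

With a 1:1 ratio the check comes out "too small", which matches the concern above. Only the size of the default swap device matters here, so check that one rather than the total.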

Last edited by jggimi; 31st December 2013 at 07:33 PM.
Reason: clarity

Hmm...I neglected to create a big enough swap when I switched to -current. I'll repartition and reinstall so I can capture core dumps. I posted to the tech list in reply to the existing thread, though I munged a couple things on my post (didn't reply to the thread properly and didn't wrap the lines properly) so, lower chance of it being useful.