Change History (22)

When this can be reproduced, olpc.fth should be modified to add the olpc_keyboard.debug=1 kernel parameter and the machine rebooted. Assuming the problem reproduces again, we should then have some useful diagnostics in the kernel logs.

The parameter can also be accessed with:

echo 1 > /sys/module/olpc_keyboard/debug

but in this particular case, running the above after boot will be too late.
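Since the parameter has to be present from boot, it may be worth confirming that the olpc.fth edit actually took effect before trying to reproduce. A minimal sketch, assuming the running kernel's command line is readable from /proc/cmdline; the helper names are made up for illustration:

```python
def cmdline_has(cmdline, param):
    """True if `param` is one of the whitespace-separated tokens of `cmdline`.

    Matching whole tokens means 'olpc_keyboard.debug=1' does not match
    'olpc_keyboard.debug=10'.
    """
    return param in cmdline.split()


def booted_with(param, path="/proc/cmdline"):
    """Check the command line the running kernel was booted with."""
    with open(path) as f:
        return cmdline_has(f.read(), param)
```

Calling booted_with("olpc_keyboard.debug=1") after the reboot should return True if the parameter made it onto the command line.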

i have two C1 machines which exhibit this issue repeatedly. it can be reliably reproduced with repeated "modprobe -r psmouse; modprobe psmouse" operations, which cause a high volume of bidirectional mouse traffic. when the problem hits, the mouse effectively locks up, and will apparently not respond to any more commands. both machines have AVC touchpads.
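The reproduction loop described above could be scripted roughly as follows. This is only a sketch: the cycle count and delay are arbitrary choices, it must run as root, and it does not detect the lockup itself (check dmesg and the touchpad afterwards). The run and sleep arguments exist only so the loop can be dry-run without touching the real module.

```python
import subprocess
import time


def reload_psmouse(run=subprocess.run):
    """One unload/load cycle of psmouse; raises if modprobe exits non-zero."""
    run(["modprobe", "-r", "psmouse"], check=True)
    run(["modprobe", "psmouse"], check=True)


def stress(cycles, delay=1.0, run=subprocess.run, sleep=time.sleep):
    """Repeat the unload/load cycle `cycles` times, pausing `delay` seconds
    after each cycle so the mouse traffic from the reload can finish."""
    for _ in range(cycles):
        reload_psmouse(run)
        sleep(delay)
    return cycles
```

For example, stress(50) performs fifty reload cycles, about one per second plus the time modprobe itself takes.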

working with one of these machines, it seems that running q4c04 (and ec 0.3.04) has a _much_ lower occurrence rate for the problem than running q4c05/0.3.05. nothing jumps out at me from the changes in either the OFW/CForth changelogs, or in the EC changelog. unfortunately, the earlier firmware isn't completely free of the bad behavior -- it's just much harder to provoke (though i'd love corroboration of this.)

another interesting symptom, presumably related to bad communication with the mouse, is that the touchpad is sometimes misidentified. correct identification looks like:

I just got a string of reproducible conditions and this is what I found. Each time I soft-rebooted I would run test /mouse in OFW, and the touchpad worked fine there. Booting to Linux would then lead to this output in dmesg

log-wo-rcd.txt is another failure case I was running into. It turns out that the SoC didn't respond to PS/2 commands. For example, the 0xe9 command at 8.107605 never got a response, whereas the one at 8.030017 received a reply within 10 ms.

The same failure recurred from 208.106654 onward (when fsp_onpad_vscr() was invoked). Even worse, the subsequent 0xf3 command issued by psmouse_set_rate() (808.163444) caused another timeout in olpc_keyboard.c.

OTOH, there's a stray 0xfa at 812.308753, which is probably the response to the timed-out 0xf4 sent at 811.256076; however, psmouse.c had already reported an 'enable mouse failure' at 812.246668.
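To put numbers on those three timestamps, the intervals can be worked out as below. This assumes the stray 0xfa really is the reply to that 0xf4, which is only a guess from the ordering in the log. Decimal is used so the microsecond digits are not disturbed by floating-point rounding.

```python
from decimal import Decimal

# Timestamps (seconds) quoted from the log discussion above.
sent_f4 = Decimal("811.256076")    # 0xf4 sent
failure = Decimal("812.246668")    # psmouse.c reports 'enable mouse failure'
stray_fa = Decimal("812.308753")   # stray 0xfa arrives

gave_up_after = failure - sent_f4   # how long the driver waited before failing
reply_latency = stray_fa - sent_f4  # how long the presumed reply actually took
late_by = stray_fa - failure        # how far the reply missed the deadline

print(gave_up_after, reply_latency, late_by)
# prints: 0.990592 1.052677 0.062085
```

If the guess holds, the reply took about 1.05 s and arrived roughly 62 ms after the driver had given up, compared with under 10 ms for the healthy 0xe9 exchange at 8.030017.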

According to the various logs I've read, it looks to me as though the SoC sometimes delayed its reply to a command or never replied at all (heavy load, perhaps?). I suspect this could be related to the loss of control described in ticket/11357: the FSP was either not initialised correctly or was taken down by the 0xf5 command, since certain command packets such as 0xf4 never reached the FSP.