On 2/19/07, Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> I think something else is going on here. I think you're getting
> an interrupt for the UART, and another interrupt is also pending.

Correct. An interrupt for the other UART on the same IRQ.

> When the UART interrupt is handled, it is masked at the interrupt
> controller, and the CPU mask is dropped.

Correct.

> The second interrupt comes in, and when you go to disable that
> source, you inadvertently re-enable the UART interrupt, despite it
> still being serviced.

Incorrect. An attempt has been made to service the interrupt using
the only ISR currently in the chain for that IRQ -- the ISR for the
first UART. That attempt was not successful, and when __do_irq
unmasks the interrupt source preparatory to exiting interrupt context,
__irq_svc is dispatched anew.
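
For anyone who wants to poke at that sequence without the hardware,
here is a toy user-space model of it. It is a sketch only, not kernel
code: every name in it (sim_uart, sim_dispatch, sim_two_uarts) is made
up for illustration, and it assumes a level-triggered line that is
re-sampled as soon as it is unmasked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one UART on a shared IRQ line; not kernel code. */
struct sim_uart {
    bool asserting;    /* device is currently driving the shared line */
    bool has_handler;  /* an ISR for this device is in the IRQ chain */
};

/* A level-triggered line is active while any device on it asserts. */
static bool line_active(const struct sim_uart *u, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (u[i].asserting)
            return true;
    return false;
}

/*
 * Each pass models one dispatch: the line is masked on entry, every
 * registered handler quiets its own device, and the line is unmasked
 * on exit. Returns the number of dispatches, capped at max_passes.
 */
static int sim_dispatch(struct sim_uart *u, size_t n, int max_passes)
{
    int passes = 0;

    assert(max_passes > 0);
    while (line_active(u, n) && passes < max_passes) {
        passes++;                          /* IRQ entry: line masked */
        for (size_t i = 0; i < n; i++)     /* walk the ISR chain */
            if (u[i].has_handler)
                u[i].asserting = false;    /* handler services its UART */
        /* exit: line unmasked; the loop condition re-samples the level */
    }
    return passes;
}

/* The second UART asserts; both_handlers says whether its ISR is installed. */
static int sim_two_uarts(bool both_handlers, int max_passes)
{
    struct sim_uart u[2] = {
        { .asserting = false, .has_handler = true },
        { .asserting = true,  .has_handler = both_handlers },
    };
    return sim_dispatch(u, 2, max_passes);
}
```

With only the first UART's ISR in the chain, sim_two_uarts(false, 1000)
runs into the cap, since nothing ever quiets the second UART. With both
ISRs installed, sim_two_uarts(true, 1000) finishes after one dispatch.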

> This leads to the UART interrupt again triggering an IRQ.

Right. The _second_ UART's interrupt. There's another problem with
these UARTs having to do with the implementor's inability to read and
follow a bog-standard twenty-year-old spec without asking software to
fix up corner cases, but that's another backtrace for another day.

Don't have 'em handy; I'll be happy to post them when I do, perhaps
later today. I would hope they're pretty generic, though; it's a
Feroceon core pretending to be an ARM926EJ-S, hooked to the usual
half-assed Marvell imitation of an ARM licensed functional block.
Trust me for the moment, it's the same IRQ line.

> This shows that you don't actually have an understanding of the Linux
> kernel boot, especially in respect of serial devices. At boot, devices
> are detected and initialised to a safe state, where they will not
> spuriously generate interrupts.

Sorry, 'taint so. Not unless the chip support droid has put the right
stuff in arch/arm/mach-foo. LKML is littered with the fall-out of the
decision to trust whoever jumped to main() to have left the hardware
in a sane state. If you don't enjoy this sort of forensics (which I
for one do not, especially not when there is a project deadline
looming and a Heisenbug starts firing 9 times out of 10), you might
consider systematically installing ISRs that know how to shut
everything up before turning on any interrupt sources at all.
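
The ordering I mean can be sketched as follows. This is again a
user-space toy, not a patch: sim_port, sim_line and the functions are
hypothetical names, and the only hardware fact assumed is that an
8250-style UART stops raising interrupts once its Interrupt Enable
Register (offset 1) is cleared.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_UART_IER 1   /* Interrupt Enable Register offset, 8250-style */

/* Hypothetical stand-in for one UART's register file. */
struct sim_port {
    uint8_t regs[8];
};

/* Hypothetical stand-in for a shared IRQ line and the ports wired to it. */
struct sim_line {
    struct sim_port *ports;
    size_t nports;
    bool unmasked;
};

/* Mask every interrupt source at the device by clearing its IER. */
static void sim_quiesce(struct sim_line *line)
{
    for (size_t i = 0; i < line->nports; i++)
        line->ports[i].regs[SIM_UART_IER] = 0;
}

/* True if any port on the line is still able to raise an interrupt. */
static bool sim_any_enabled(const struct sim_line *line)
{
    for (size_t i = 0; i < line->nports; i++)
        if (line->ports[i].regs[SIM_UART_IER] != 0)
            return true;
    return false;
}

/* Bring-up order: quiet every source first, only then unmask the line. */
static void sim_bringup(struct sim_line *line)
{
    sim_quiesce(line);
    line->unmasked = true;
}

/* Demo: the second port was left armed by whoever ran before us. */
static bool sim_demo(void)
{
    struct sim_port ports[2] = { { .regs = { 0 } }, { .regs = { 0 } } };
    struct sim_line line = { ports, 2, false };

    ports[1].regs[SIM_UART_IER] = 0x0f;
    sim_bringup(&line);
    return line.unmasked && !sim_any_enabled(&line);
}
```

The point of the ordering is that the line is never unmasked while a
device nobody has claimed yet can still assert it.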

As I said, this is not going to happen overnight, and is not even
particularly in the economic interest of people who get paid by the
hour to wear bringup wizard hats. That category currently includes
me, but I am intensely bored with this game and aspire to greater
things.

> When a userspace program opens a serial port, which can only happen
> once the kernel boot has completed (ergo, devices have been initialised
> and placed in a safe state) the interrupts are claimed, and enabled
> at the source.

As you can see from the console dump I posted (which begins with
"Freeing init memory: 92K" and ends with do_exit -> init -> sys_open,
which is obviously sys_open("/dev/console")), this happens long before
userspace comes into the picture. Our 8250.c has some nasty hacks in
it but otherwise this call chain is from a very nearly vanilla
2.6.16.recent.

We've already worked around this on our board, and the whole kit and
kaboodle will eventually be posted to linux-arm-kernel in tidy patches
when my client lets me spend billable hours on it (immediately after
the damn thing passes its first functional test, long before it
ships). I'm not asking for anyone's help except in the
let's-all-help-one-another spirit. I'm trying to help with root cause
analysis of Frederik's (Jose's?) fandango on core. If it's not
relevant, my apologies; and although it goes without saying, I salute
you for both the serial driver and the ARM port.

Now please take a second look at the backtrace before toasting me
lightly again. Mmm'kay? Oh, and by the way -- is there an Alt-SysRq
equivalent on an ARM serial console?