Between your mailer and mine (Thunderbird 3.1 on Ubuntu), the quoting
has become something of a dog's breakfast, so let me just lay things out
here as best I can.
I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
diff.
I am no longer associated with MIPS Technologies and no longer have
access to my email archives from that period. If I did, I could tell you
which LMO kernel version(s) had SMTC working "out of the box". There
definitely was at least one, and I commented on it in an email. You
might be able to find it in the LMO email archives, but it's possible that
I only sent it to a MIPS internal mailing list.
There was also a message I wrote that I had *thought* had gone to
the LMO mailing list, but may have only been sent to a group of internal
MIPS and customer engineers, in which I described the recommended
procedure for debugging exactly this canonical problem with porting
SMTC.
The recommended procedure was, and remains, to isolate clock
propagation problems by using command line options "maxtcs="
and "maxvpes=".
First, boot your SMTC kernel with maxtcs=1 and maxvpes=1, which gives
you a virtual uniprocessor. If that doesn't run, you've got some fundamental
problem with support for your platform, or someone has badly
broken the SMTC build somewhere. Next, try booting with maxtcs=2
and maxvpes=1, then with maxvpes=1 and no constraint on maxtcs.
If those fail, your problem is probably in the interrupt mask
management algorithms I described.
On the other hand, if you boot with maxtcs=2 and maxvpes=2,
there will be only one TC per VPE and far less vulnerability to interrupt
mask lockup, but you need to have cross-VPE IPI interrupts working.
The preferred method of doing cross-VPE IPIs would be to use a physical
interrupt input that's instantiated per-VPE and manipulable by software.
Malta didn't have one, so there's the historical hack of using
MIPS MT instructions to freeze the other VPE and set up a
software interrupt using MTTR to the remote Cause register.
The PMC-Sierra platforms did, if I recall correctly, have some kind
of register that one could write to cause a real cross-VPE hardware
interrupt, but I don't recall whether it got used in the SMTC port.
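For what it's worth, the boot sequence above can be written down as a
small lookup table. The C below is only a sketch: the enum, struct and
function names are made up for illustration and do not exist in any
kernel tree, and the suspect attached to the maxtcs=2 maxvpes=2 step is
one reading of the text above (cross-VPE IPIs), not an established rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the likely culprit at each triage step. */
enum smtc_suspect_area {
	SUSPECT_PLATFORM_OR_BUILD,	/* basic platform support or SMTC build */
	SUSPECT_INT_MASK,		/* SMTC interrupt mask management */
	SUSPECT_CROSS_VPE_IPI		/* assumption: cross-VPE IPI delivery */
};

struct smtc_boot_step {
	const char *cmdline;		/* options appended to the kernel command line */
	enum smtc_suspect_area suspect;	/* what to look at if this boot fails */
};

/* Steps in the order suggested above. */
static const struct smtc_boot_step smtc_triage[] = {
	{ "maxtcs=1 maxvpes=1", SUSPECT_PLATFORM_OR_BUILD },
	{ "maxtcs=2 maxvpes=1", SUSPECT_INT_MASK },
	{ "maxvpes=1",          SUSPECT_INT_MASK },
	{ "maxtcs=2 maxvpes=2", SUSPECT_CROSS_VPE_IPI },
};

#define SMTC_TRIAGE_STEPS (sizeof(smtc_triage) / sizeof(smtc_triage[0]))

/* Map the index of the first step that fails to boot to a suspect area. */
static enum smtc_suspect_area smtc_suspect(size_t first_failing_step)
{
	assert(first_failing_step < SMTC_TRIAGE_STEPS);
	return smtc_triage[first_failing_step].suspect;
}
```

The point of the ordering is that each step adds one source of trouble at
a time: first nothing but the platform, then multiple TCs on one VPE, and
only then a second VPE.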
Your dump below looks as if it comes from 2 TCs running on
2 VPEs, in which case the interrupt mask issues I alluded to earlier
are neither relevant nor manifest. It looks instead as if the
initialization of "CPU 1" (VPE1/TC1) may not have been done
properly. Under normal operation, it would be pretty rare to
catch TC 1 in the exception vector dispatch code, so the first
hypothesis that comes to mind is that something isn't right in
the vector/handler setup, and TC 1 is stuck in an infinite exception
loop, unable to handshake with TC 0 and thus locking up the
system. But that's just my best guess based on limited data.
Regards,
Kevin K.
On 12/14/10 07:25, Anoop P.A. wrote:

it ended up being cleaner and more efficient to have *some* hooks in
platform specific timer code. It was there for Malta in the kernel.org
mainline once upon a time, and I *thought* we'd propagated working code
for the initial PMC-Sierra 34K-based SoC's at least as far as

[Anoop P.A.]
I was able to boot 2.6.24-7 git sources with a change in cevt-r4k.c
(c0_compare_int_pending changed to the following:
"return (read_c0_cause() >> cp0_compare_irq_shift) & (1ul << CAUSEB_IP)")

linux-mips.org, but the source tree has been considerably reorganized -
there was a time when some of the hooks were under
arch/mips/mips-boards/generic, which no longer exists - and I'm not sure
where to point you. Git and grep are your friends.

[Anoop P.A.]
The Malta code has been moved to arch/mips/mti-malta/.
Can you recall which version of the l-m-o kernel had known working SMTC
support?

The first order of business is to break into that hung timer calibration
loop and dump the CP0 registers for the VPE and the TCs, in particular
checking the interrupt enable mask in Status against the pending
interrupts in the Cause register. If you're seeing the timer
interrupt's bit set in Cause, but clear in Status, you need to fix the
SMTC interrupt mask hook for your platform timer.
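As a rough illustration of that Status-versus-Cause comparison, here is
a small C sketch that works on register values already captured in a
dump. The helper names are invented for this example and are not kernel
functions. It relies only on the MIPS32 layout, in which Status.IM7..IM0
and Cause.IP7..IP0 both sit in bits 15..8, and it deliberately ignores
any other gating such as Status.IE.

```c
#include <assert.h>
#include <stdint.h>

/* Status.IM7..IM0 and Cause.IP7..IP0 both occupy bits 15..8 on MIPS32. */
#define INT_FIELD_SHIFT	8
#define INT_FIELD_MASK	0xffu

/*
 * Hypothetical helper: bit n of the result is set when interrupt line n
 * is pending in Cause but its enable bit is clear in Status.
 */
static uint8_t pending_but_masked(uint32_t status, uint32_t cause)
{
	uint8_t im = (status >> INT_FIELD_SHIFT) & INT_FIELD_MASK;
	uint8_t ip = (cause >> INT_FIELD_SHIFT) & INT_FIELD_MASK;

	return ip & (uint8_t)~im;
}

/* Hypothetical helper: nonzero if one given line (0..7) is pending but masked. */
static int line_is_starved(uint32_t status, uint32_t cause, unsigned int line)
{
	assert(line < 8);
	return (pending_but_masked(status, cause) >> line) & 1u;
}
```

Feeding it the Status and Cause words from the dump, with the line number
of the platform timer interrupt, gives a quick yes/no on whether the mask
hook is the thing to chase.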

[Anoop P.A.]
I tried dumping the registers from inside the calibration while loop.
It looks like the timer interrupt bit stays high in both the Cause and
Status registers (in my case the timer interrupt is connected to the
cascaded CIC interrupt, which is connected to irq -6 (C_IRQ4)). Detailed
log pasted below.

check to see if you're building for "tickless" operation. Tickless ends
up being really important for SMTC, and I did get it working properly
back in 2008, but the SMTC-specific cevt-smtc.c code uses common
functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c going
by that I rather doubt were ever tested against an SMTC build/platform.
There might have been breakage there, and configuring to use a fixed
interval timer (say, 100Hz) would be a way to test that hypothesis.