For the record, this happens on both the 9S08QE8 and the 9S08SH8, but not in the full-chip simulator.

This has rather caught me by surprise, as I've previously always just ignored UART errors. It leaves me with a lot of potentially vulnerable old code to fix up across various Freescale devices, and possibly on MCUs from other manufacturers too.

So what is the intended way of reliably reading data?

The following is a simple repro case using the loopback mode, which always gets stuck after the overrun.

Yes, it would be possible for an overrun condition to occasionally occur after reading the SCIS1 register but before reading the SCID register, due to the arrival of the next character. I would expect that this should not occur very frequently, with only a few cycles between the two reads (I assume this is actually the case). Here, I would assume that the RDRF flag would be cleared, but the OR flag would remain set.

Possible solution 1:

Immediately after reading SCID, read SCIS1 register again. If any error flag is set, do another dummy read of SCID.
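A minimal sketch of what solution 1 might look like in C. The register variables below are host-side stand-ins so the fragment compiles anywhere; on the target they would come from the derivative header instead. The function name `sci_read_byte` and the `error_seen` parameter are made up for illustration, and the bit masks assume the usual 9S08 SCIS1 layout (RDRF = bit 5, OR/NF/FE/PF = bits 3..0), so please check them against the reference manual for your part.

```c
#include <stdint.h>

/* Stand-ins for the memory-mapped SCI registers (assumption: on the target,
   remove these and use SCIS1/SCID from the derivative header). */
static volatile uint8_t SCIS1;
static volatile uint8_t SCID;

/* Assumed SCIS1 bit positions; verify against the reference manual. */
#define SCIS1_RDRF 0x20u  /* receive data register full */
#define SCIS1_OR   0x08u  /* receiver overrun */
#define SCIS1_NF   0x04u  /* noise flag */
#define SCIS1_FE   0x02u  /* framing error */
#define SCIS1_PF   0x01u  /* parity error */
#define SCI_RX_ERRORS (SCIS1_OR | SCIS1_NF | SCIS1_FE | SCIS1_PF)

/* Hypothetical helper: returns 1 and stores the received byte in *out if a
   character was waiting, otherwise returns 0. Sets *error_seen to 1 when the
   follow-up status read shows an error flag. */
static int sci_read_byte(uint8_t *out, int *error_seen)
{
    uint8_t status = SCIS1;          /* first half of the flag-clearing sequence */

    if (!(status & SCIS1_RDRF)) {
        return 0;                    /* nothing received yet */
    }

    *out = SCID;                     /* second half: fetch the data */

    /* Solution 1: look at the status again straight away. If an error such
       as OR was raised between the two reads above, a dummy read of SCID
       completes the clearing sequence so the receiver is not left stuck. */
    status = SCIS1;
    if (status & SCI_RX_ERRORS) {
        (void)SCID;                  /* dummy read; check the listing to confirm a load is emitted */
        *error_seen = 1;
    }
    return 1;
}
```

The caller can use `error_seen` to request a re-send, since the character that caused the overrun is lost either way.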

It is likely that the overrun error has already occurred by the time that the ISR code executes. An overrun occurs when a new character has been received by the SCI, but the previous character within the receive buffer has not yet been read, with the new data being lost.

The RDRF flag does not actually clear until the SCIS1 register has been read and the SCID register has then been read as well.

One possible cause of the overrun is that the start of the SCI receive ISR is delayed by the execution of another ISR, possibly one of the TPM interrupts, which have higher priority when several interrupts are pending. There may be a number of reasons for this:

The SCI baud rate is too high for the bus clock frequency in use.

The SCI receive ISR takes too many cycles to execute. Any lengthy manipulation of the received data should be done outside the ISR.

One or more of the other enabled ISRs takes too many cycles to execute.

To ensure that overrun does not occur, 10 bit periods (one character time for a 10-bit frame) need to be longer than the execution time of the SCI receive ISR itself plus the worst-case execution time of the longest "other" ISR. Assuming this is the problem, it demonstrates the need to always keep all ISR code as short as possible.
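To put numbers on that rule of thumb, here is a small design-time check, meant to be run on a host rather than on the MCU. It assumes a 10-bit frame (1 start, 8 data, 1 stop); the function names and all the cycle counts used with it are hypothetical, so substitute figures measured from your own build.

```c
#include <stdint.h>

/* Bus cycles available per received character, assuming a 10-bit frame.
   Integer division rounds down, which errs on the safe side. */
static uint32_t char_time_cycles(uint32_t bus_hz, uint32_t baud)
{
    return (uint32_t)(((uint64_t)bus_hz * 10u) / baud);
}

/* Returns 1 if the SCI receive ISR plus the longest other ISR fit within
   one character time, 0 otherwise. Interrupt entry/exit overhead should be
   included in the cycle counts passed in. */
static int rx_budget_ok(uint32_t bus_hz, uint32_t baud,
                        uint32_t sci_isr_cycles,
                        uint32_t worst_other_isr_cycles)
{
    return (sci_isr_cycles + worst_other_isr_cycles)
           < char_time_cycles(bus_hz, baud);
}
```

For example, with a hypothetical 8 MHz bus at 115200 baud, `char_time_cycles(8000000, 115200)` gives 694 cycles, so a 60-cycle receive ISR leaves a little over 600 cycles for the longest competing ISR or any stretch with interrupts disabled.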

The state of the OR flag, as well as the RDRF flag, may be tested within the ISR. Whenever an overrun error is detected, you will probably need to flag that an error response needs to be returned, so that the original data can be re-sent by the remote end.

I do not know why the overrun does not appear in FCS, assuming the bus frequency and baud rate are the same in both instances.

Hello Mac, and thank you for taking a look at my ramblings. However, I think you misunderstood my problem. An occasional overrun during communication is quite tolerable; we use checksumming and a reliable delivery scheme to ensure that data will eventually get through intact.

My real issue is that an overrun which happens to arrive between the polling of SCIS1 and the reading of SCID appears to leave the receive buffer full without RDRF set (and with OR raised).

The effect is that no further incoming data will *ever* be processed, since RDRF cannot be set until the buffer gets flushed.

I can work around the issue easily enough by implementing an overrun interrupt which manually flushes the buffer, but it seems to me that this behaviour leaves the receiver vulnerable to a very subtle and insidious race condition.
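For anyone wanting to try the same workaround, a rough sketch of such an overrun handler follows. It assumes the SCI error interrupt has been enabled for overruns (on these parts I believe that is the ORIE bit in SCIC3, but check the reference manual) and that `sci_error_handler` is hooked to the SCI error vector; the vector wiring is left out because it differs between derivatives. The register variables are host-side stand-ins, and the handler name and `sci_overrun_count` are made up for illustration.

```c
#include <stdint.h>

/* Stand-ins for the memory-mapped SCI registers (assumption: on the target,
   remove these and use SCIS1/SCID from the derivative header). */
static volatile uint8_t SCIS1;
static volatile uint8_t SCID;

#define SCIS1_OR 0x08u  /* assumed position of the receiver overrun flag */

/* Hypothetical counter so the main loop can notice lost characters. */
static volatile uint8_t sci_overrun_count;

/* Body of the SCI error interrupt: read the status, and if OR is set,
   read the data register to flush the receive buffer. Together the two
   reads complete the flag-clearing sequence, so RDRF can be raised again
   by the next incoming character. */
static void sci_error_handler(void)
{
    uint8_t status = SCIS1;

    if (status & SCIS1_OR) {
        (void)SCID;              /* discard the stale byte */
        sci_overrun_count++;
    }
}
```

The flushed byte is thrown away, so this only makes sense when a higher-level protocol (checksums plus retransmission, as here) recovers the lost data.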

This leaves me wondering whether I am reading data in the wrong way, whether this might be a silicon bug in the 9S08 SCI, or whether I am labouring under some other misapprehension.

As for the transmit routine, I'm deliberately not reading the status register in that example so that the issue can be demonstrated through the loopback interface. In the actual application the data is sent by a different host.

Indeed the window of opportunity is very short and the problem occurs exceedingly seldom. In this particular application it would occasionally lock up a node every few days when the receiver interrupt happened to be sufficiently delayed and the stars were in alignment.

After that the operator would need to manually reset the system to recover.

I believe the actual cause of the delays was a FLASH write temporarily disabling interrupts, though it may have been something else. Obviously frequent overruns are not acceptable and I'll certainly strive to avoid them, but guaranteeing minimum interrupt latencies in all possible paths of the application would be an awful lot of work.

Oh well, time to go back and add overrun handling to various old applications. Thanks for confirming this behaviour!

Please note that (void)SCIS1; doesn't necessarily produce a read of the register, even if SCIS1 has been declared as volatile. Whether a compiler is obliged to generate such code has been debated. No matter what the standard says, there are plenty of compilers that will not generate any code when encountering that line. I would disassemble the code to ensure that an actual read takes place.

I've double-checked the compiler output and CodeWarrior does indeed generate loads for these statements. Unfortunately storing the result to a volatile dummy variable generates an additional write of the result to the stack, increasing the window of opportunity for this particular issue by a couple of cycles. I don't suppose there's a more efficient way of reliably forcing the desired behaviour?

To be honest I had rather thought this was guaranteed by the standard and universally supported but after my adventures with Microchip's MCC18 I should have learned never to take anything for granted.

I have never had a problem with the CW compiler providing a read for a volatile register. However, an alternative method would be to provide macros that use HLI assembler to generate the read, without the overhead of creating and writing to a variable.