EZBL over RS-485

I don't have a specific problem, but I was able to get EZBL to work over RS-485 with a few quick fixes. First, I copied _UxTXInterrupt from uart1_fifo.c into my hardware initializer file, dropped the weak attribute, and made the changes below. RB13 is the transmit enable line, which is tied to both the transmit and receive enable inputs. Next, I changed the following line in main to give the boot task a much longer interval: NOW_SetNextTaskTime(&EZBL_bootTask, NOW_ms*128u). Yes, this does slow things down a bit, but I can live with 19 seconds of total programming time. This was done with a dsPIC33EP64GS504. I have achieved similar results with a PIC24FJ256GA106 and a PIC24FJ256GA108.

I would point out that this is still experimental and does fail if I attempt to program it when the application is running. However, another attempt will work. If anyone has a better method, I'd be happy to adopt it :-). For now, though, I hope that this helps somebody out.

// Transmit a byte, if possible, if pending
// NOTE: The FIFO internal data structures are being accessed directly
// here rather than calling the TX_FIFO_Read*() functions because
// any function call in an ISR will trigger a whole lot of compiler
// context saving overhead. The compiler has no way of knowing what
// registers any given function will clobber, so it has to save them
// all. For efficiency, the needed read-one-byte code is duplicated here.
while(!U1STAbits.UTXBF)
{
    if(UART1_TxFifo.dataCount == 0u)
        break;

For half-duplex mediums, you might want to change the EZBL_FLOW_THRESHOLD value. For example, at global/file scope, add:
EZBL_SetSYM(EZBL_FLOW_THRESHOLD, 96);
The default was 18 bytes in EZBL v2.04 and is 32 bytes in EZBL v2.10.

The net effect of making this change is that the bootloading logic will generate far fewer outbound software flow control messages, so better throughput can be achieved. Each message is a 16-bit integer advertising how many bytes of free space are currently available in the software RX FIFO, less any free space previously advertised but not yet delivered to the RX FIFO (or not yet recognized by software, even though the data may actually be already sitting in the FIFO due to background ISR processing).
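The bookkeeping described above can be modeled on a host PC. This is only a minimal sketch, not EZBL's implementation: the struct, the function names, and the rule "send an advertisement only once the unadvertised free space reaches the threshold" are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical receiver-side bookkeeping; names are illustrative, not EZBL's. */
typedef struct {
    uint16_t fifo_size;    /* total software RX FIFO capacity, in bytes */
    uint16_t data_count;   /* bytes currently sitting in the RX FIFO */
    uint16_t outstanding;  /* bytes advertised to the host but not yet received */
    uint16_t threshold;    /* smallest advertisement worth sending */
} flow_state_t;

/* Returns the byte count to advertise now, or 0 if no message should be sent. */
static uint16_t flow_advertise(flow_state_t *s)
{
    uint16_t unadvertised;

    assert((uint32_t)s->data_count + s->outstanding <= s->fifo_size);
    unadvertised = (uint16_t)(s->fifo_size - s->data_count - s->outstanding);
    if (unadvertised == 0u || unadvertised < s->threshold)
        return 0u;
    s->outstanding = (uint16_t)(s->outstanding + unadvertised);
    return unadvertised;
}

/* n previously advertised bytes arrived from the host. */
static void flow_on_receive(flow_state_t *s, uint16_t n)
{
    assert(n <= s->outstanding);
    s->outstanding = (uint16_t)(s->outstanding - n);
    s->data_count = (uint16_t)(s->data_count + n);
}

/* The bootloader consumed n bytes from the RX FIFO. */
static void flow_on_consume(flow_state_t *s, uint16_t n)
{
    assert(n <= s->data_count);
    s->data_count = (uint16_t)(s->data_count - n);
}
```

In this model, a 96-byte FIFO with a threshold of 96 produces one advertisement per full 96-byte burst, while a threshold of 32 produces a message as soon as 32 bytes are free. Fewer messages means fewer line turnarounds on a half-duplex bus.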

For the best performance and a solution that doesn't depend on the bootloader task execution frequency, I recommend implementing a custom, blocking TX_FIFO_OnWrite() callback function and deleting or disabling the _U1TXInterrupt() ISR. In your UART1_TX_FIFO_OnWrite() callback, you would:
1. Wait for all RX activity or potential RX activity to cease
2. Set your TX Enable line
3. Copy reqWriteLen bytes of data from *writeSrc to U1TXREG
4. Purge the existing data from destFIFO by calling EZBL_FIFORead((void*)0, destFIFO, bytesPushed);
5. Poll for hardware transmit completion
6. Clear the TX Enable and revert to RX mode
7. Return reqWriteLen
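The ordering of those seven steps can be sketched as host-testable C. Everything here is a stand-in: hd_hooks_t, hd_blocking_write() and the sim_* functions are hypothetical names, and the hooks replace the real register accesses and the EZBL_FIFORead() call so that the sequence can be checked on a PC. It is not the code that was run on hardware in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hardware hooks; on a real target these would touch the UART
 * registers and the transceiver enable pin. */
typedef struct {
    void (*wait_rx_idle)(void);       /* step 1 */
    void (*set_tx_enable)(bool on);   /* steps 2 and 6 */
    void (*put_byte)(uint8_t b);      /* step 3: must wait while the HW TX buffer is full */
    void (*purge_fifo)(unsigned n);   /* step 4: discard n bytes from the software TX FIFO */
    void (*wait_tx_complete)(void);   /* step 5: shift register empty */
} hd_hooks_t;

/* Blocking half-duplex write following the seven steps in order. */
static unsigned hd_blocking_write(const hd_hooks_t *hw, unsigned bytesPushed,
                                  const void *writeSrc, unsigned reqWriteLen)
{
    const uint8_t *src = (const uint8_t *)writeSrc;
    unsigned i;

    assert(hw != NULL && (src != NULL || reqWriteLen == 0u));
    hw->wait_rx_idle();
    hw->set_tx_enable(true);
    for (i = 0u; i < reqWriteLen; i++)
        hw->put_byte(src[i]);
    hw->purge_fifo(bytesPushed);
    hw->wait_tx_complete();
    hw->set_tx_enable(false);
    return reqWriteLen;               /* step 7 */
}

/* Host-side stand-ins that only record the call order. */
static char trace[64];
static void t_add(char c)
{
    size_t n = strlen(trace);
    if (n + 1u < sizeof trace) { trace[n] = c; trace[n + 1u] = '\0'; }
}
static void sim_wait_rx_idle(void)     { t_add('I'); }
static void sim_set_tx_enable(bool on) { t_add(on ? 'E' : 'e'); }
static void sim_put_byte(uint8_t b)    { (void)b; t_add('B'); }
static void sim_purge(unsigned n)      { (void)n; t_add('P'); }
static void sim_wait_tx_complete(void) { t_add('C'); }

static const hd_hooks_t sim_hooks = {
    sim_wait_rx_idle, sim_set_tx_enable, sim_put_byte, sim_purge, sim_wait_tx_complete
};
```

The important detail is that the enable line is released only after the hardware reports transmit completion. Releasing it as soon as the last byte is written to the transmit register would cut off the final character on the wire.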

Implementing step 1 might entail polling until:
U1STAHbits.RIDLE && ((UART1_RxFifo.dataCount >= EZBL_bootCtx.bytesRequested) || (NOW_32() - startTime > NOW_ms*2u))
where startTime is set to NOW_32() before the test and continuously during the polling loop whenever (!U1STAHbits.RIDLE). The idea is to time out after the greater of several character idle times or a couple of milliseconds, to ensure the medium is idle and the remote host node won't transmit anything else. A fixed couple of milliseconds is helpful since things like USB to UART transceivers operate at the mercy of the USB bus, which typically implements a 1 ms polling interval for Full Speed USB devices and adds a relatively fixed round trip communications latency.
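The polling condition can be written as a pure function and tried on a PC. In this sketch the snapshot struct and medium_is_idle() are hypothetical names; on a target the fields would be read from the RIDLE flag, UART1_RxFifo.dataCount, EZBL_bootCtx.bytesRequested and NOW_32(), and timeout_ticks would be something like NOW_ms*2u.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the values the polling loop reads on each pass. */
typedef struct {
    bool     rx_idle;          /* UART receiver idle flag (RIDLE) */
    uint16_t rx_fifo_count;    /* bytes in the software RX FIFO */
    uint16_t bytes_requested;  /* bytes the bootloader has asked the host for */
    uint32_t now;              /* current tick count */
} rx_snapshot_t;

/* Returns true once it looks safe to transmit. While the line is busy,
 * *start_time is moved forward so the silence timer restarts. */
static bool medium_is_idle(const rx_snapshot_t *snap, uint32_t *start_time,
                           uint32_t timeout_ticks)
{
    assert(snap != NULL && start_time != NULL);
    if (!snap->rx_idle) {
        *start_time = snap->now;   /* activity seen: restart the silence timer */
        return false;
    }
    if (snap->rx_fifo_count >= snap->bytes_requested)
        return true;               /* host has delivered everything it was allowed to send */
    /* Unsigned subtraction stays correct across tick counter wraparound. */
    return (uint32_t)(snap->now - *start_time) > timeout_ticks;
}
```

A caller would loop on this function, refreshing the snapshot each pass, and proceed to asserting TX Enable once it returns true.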

In normal operation dataCount should match bytesRequested when the PC node stops transmitting, but the timeout is still necessary since the end of file activities involve advertising buffer free space, even though the host node doesn't have any .bl2 file data left to transmit to the Bootloader.

The _U1TXInterrupt() code can be left in place but disabled by setting the interrupt priority to 0. Statically you could use:
EZBL_SetSYM(UART1_TX_ISR_PRIORITY, 0);
At run-time you could write directly to the IPCx registers to control this, or in EZBL v2.10, via the EZBL_FIFOSetIntPri(&UART1_TxFifo, 0); API. Of course, deleting all code that sets the _U1TXIE bit will also work, but finding this code could be error prone since it could theoretically be written to indirectly via calls to EZBL_FIFOIntEnable(), EZBL_FIFOIntEnableSet(), UART1_FIFO_EnableInterrupts(), EZBL_SetIntEn(), EZBL_WrIntEn(), EZBL_InvIntEn(), UART1_TX_FIFO_OnWrite(), etc., even though the current releases of EZBL don't actually call all of these.

I have to admit that working with EZBL has been a humbling experience. Out of the box it worked great for a project utilizing an FTDI USB/UART chip. Digging into the guts of the project has shown me techniques that I've never seen in practice. Hopefully I'll be able to step up my game.

If you were instrumental in writing EZBL, which wouldn't be a surprise as your name was all over the source code for the old MLA TCP/IP stack, then my hat goes off to you. It truly is something to behold.

Yes, I might have had some involvement writing EZBL. However, bootloaders can downgrade expensive hardware into bricks or enable Internet hackers to reprogram your ECU as you drive down the freeway, so I uhh, prefer to think that some "applications engineering team at Microchip" was primarily responsible for EZBL's development.

I tested my prior suggestion on an Explorer 16/32 + PIC24FJ1024GB610 PIM @ 460800 baud and monitored a GPIO (TX_EN) pin + U2RX and U2TX using a logic analyzer. I observed successful bootloading and half-duplex communications that should be compatible with an RS-485 or RS-422 network using the following code. This example can be simply added to ex_boot_uart\main.c under EZBL v2.10 without digging into the UART code in ezbl_lib.a or rewriting the transmit ISR (although the ISR does become a minor amount of dead code by ignoring it).

The only adjustment I needed compared to my prior description was some compensation while reading EZBL_bootCtx.bytesRequested. EZBL_Install2Flash() adds the free-space flow control advertisements to this variable just before transmitting said flow control advertisements. This makes the RX Idle test always end by the slower timeout fallback if the variable is used without un-adding the advertisement first.
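One way to picture the compensation is as a single subtraction applied before the RX idle test. This is a sketch under assumptions: the function name is hypothetical, and it assumes the advertisement about to be transmitted is known inside the TX callback (for example, by reading it back from the data being written).

```c
#include <assert.h>
#include <stdint.h>

/* bytes_requested: running count, with the new advertisement already added.
 * pending_advert:  size of the advertisement that is about to be transmitted.
 * Returns how many bytes the host can still be expected to send before it
 * has seen the new advertisement. */
static uint32_t expected_before_advert(uint32_t bytes_requested,
                                       uint16_t pending_advert)
{
    assert(pending_advert <= bytes_requested);
    return bytes_requested - pending_advert;
}
```

The idle test would then compare the RX FIFO count against this value instead of the raw variable. Without the subtraction, the count can never reach the target before the advertisement goes out, so every turnaround waits for the full timeout.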

Sorry for the late reply, as I have been away from any hardware for the past couple of days. It didn't take long to show just how much better your method is: 2.175s vs. 19s to program. Thank you again for this!

Hello,
First of all, thank you very much for your contributions. They are awesome.

I want to share my own experience with Howard's code: I have an RS-485 setup and I was getting an error until I deleted the compensation applied while reading EZBL_bootCtx.bytesRequested. After that, I successfully loaded a program.

Howard, do you know what could be happening? It would be very kind of you to solve this mystery ;)

The COM read error at time = 0.223375s is a PC-side ezbl_comm.exe fread() call failure against the PC's "\\.\COM3" communications resource path.

The Bootloader only transmits 16-bit flow control messages or the 16-bit 0x0000 termination code, followed by a 16-bit status/error code. Right when the PC COM read error occurred, a normal 16-bit flow control message was expected. The 0x00 RX bytes @ offset 49 and 50 are a 0x0000 termination message, and the 0xEC and 0xFF RX bytes @ offset 51 and 52 are the EZBL_ERROR_COM_READ_TIMEOUT error code. The singular 0x80 RX byte @ 48 is not something your EZBL Bootloader ever generated and transmitted (apparent since you have the default 96 byte RX FIFO defined in your Bootloader, which is always the first flow control advertisement after the bootloader has finished erasing or blank checking the existing App flash regions).
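The byte values quoted from the log can be decoded with a few lines of C. This sketch assumes the 16-bit messages are little-endian, which is consistent with the 0x60 0x00 message later in this thread being answered with 96 bytes. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads one 16-bit little-endian value from a byte stream. */
static uint16_t read_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Interprets a 16-bit value as two's complement without relying on
 * implementation-defined narrowing casts. */
static int32_t as_signed16(uint16_t v)
{
    return (v >= 0x8000u) ? (int32_t)v - 65536 : (int32_t)v;
}
```

Applied to the bytes at offsets 49 to 52, read_le16() gives 0x0000 (the termination message) followed by 0xFFEC, which is -20 when read as a signed 16-bit value. A lone byte such as the 0x80 at offset 48 cannot form a complete message on its own, which is why it stands out as corruption.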

My conclusion is that you have severe errors occurring on the wire of your communications medium. The PC has lost one or more RX bytes and had one or more bits corrupted to end up seeing the orphaned 0x80 byte, and simultaneously, the Bootloader has lost numerous RX bytes from the PC's transmissions @ 64 and @ 160, causing it to time out while expecting more data to arrive. These conditions all point to bus contention on a half-duplex medium. Both nodes are blindly jabbering away, not willing or able to listen to the other node before opening their own mouths, and correspondingly hearing truncated garbage when they finish talking and revert to RX mode again.

Whatever compensation code you deleted needs to be reinstated. It serves an essential function and is required for compatibility with a half-duplex medium. Without it, the bootloader will transmit when it must instead remain listening. RS-485 does not have hardware carrier sense, so software must implement a stateful algorithm to know when the remote node is able to listen.

Since you indicated some kind of failure with the compensation code, you may need to modify it in some way to make it more pessimistic, such as by adding an additional, fixed RX to TX turnaround delay after the expected medium silence. I don't know what kind of RS-485 transceiver you have connected to your PC, but it may require one or two USB polling intervals to elapse before it switches from TX mode back to RX mode. I.e. try adding > 2ms of bootloader pre-TX holdoff in addition to the EZBL_bootCtx.bytesRequested tracking code.

As Howard states, you need to allow enough time for the 485 transceiver to switch from Tx to Rx. I have also found that adding "passive failsafe bias resistors" helps limit the false characters seen when switching from Tx to Rx while the bus is idle, as described in this PDF, for example: http://www.ti.com/lit/an/slyt324/slyt324.pdf

/**
 * EZBL Bootloader "main" file for initializing communications hardware,
 * checking app-installed flags, receiving new application firmware, and
 * dispatching execution to the Application's reset/main() routines.
 *
 * Interrupt Service Routines implemented in this project are:
 * - One 16-bit Timer or CCT Interrupt, selected in the hardware initializer file (ISR implementation defined in ezbl_lib.a, see ezbl_lib -> weak_defaults/_TxInterrupt.s (or _CCTxInterrupt.s))
 * - UART 2 RX (defined in ezbl_lib.a, see ezbl_lib -> weak_defaults/uart2_fifo.c; can be a different UART instance, see UART_Reset() in hardware initializer)
 * - UART 2 TX (defined in ezbl_lib.a, see ezbl_lib -> weak_defaults/uart2_fifo.c; can be a different UART instance, see UART_Reset() in hardware initializer)
 */

// DOM-IGNORE-BEGIN
/*******************************************************************************
  Copyright (C) 2017 Microchip Technology Inc.

  MICROCHIP SOFTWARE NOTICE AND DISCLAIMER:  You may use this software, and any
  derivatives created by any person or entity by or on your behalf, exclusively
  with Microchip's products.  Microchip and its licensors retain all ownership
  and intellectual property rights in the accompanying software and in all
  derivatives hereto.

  This software and any accompanying information is for suggestion only.  It
  does not modify Microchip's standard warranty for its products.  You agree
  that you are solely responsible for testing the software and determining its
  suitability.  Microchip has no obligation to modify, test, certify, or
  support the software.

  THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS".  NO WARRANTIES, WHETHER
  EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, IMPLIED
  WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR
  PURPOSE APPLY TO THIS SOFTWARE, ITS INTERACTION WITH MICROCHIP'S PRODUCTS,
  COMBINATION WITH ANY OTHER PRODUCTS, OR USE IN ANY APPLICATION.

  IN NO EVENT, WILL MICROCHIP BE LIABLE, WHETHER IN CONTRACT, WARRANTY, TORT
  (INCLUDING NEGLIGENCE OR BREACH OF STATUTORY DUTY), STRICT LIABILITY,
  INDEMNITY, CONTRIBUTION, OR OTHERWISE, FOR ANY INDIRECT, SPECIAL, PUNITIVE,
  EXEMPLARY, INCIDENTAL OR CONSEQUENTIAL LOSS, DAMAGE, FOR COST OR EXPENSE OF
  ANY KIND WHATSOEVER RELATED TO THE SOFTWARE, HOWSOEVER CAUSED, EVEN IF
  MICROCHIP HAS BEEN ADVISED OF THE POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.
  TO THE FULLEST EXTENT ALLOWABLE BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL
  CLAIMS IN ANY WAY RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF
  FEES, IF ANY, THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.

  MICROCHIP PROVIDES THIS SOFTWARE CONDITIONALLY UPON YOUR ACCEPTANCE OF THESE
  TERMS.
*******************************************************************************/
// DOM-IGNORE-END

#include <xc.h>
#include <stdlib.h>
#include <libpic30.h>
#include <string.h>
#include <stdarg.h>
#include "ezbl_integration/ezbl.h"

// EZBL ezbl_lib.a link-time options:
//EZBL_SetSYM(EZBL_NO_APP_DOWNGRADE, 1);    // Uncomment to disallow upload of App image with a version/build number that is < existing App version (assuming a valid App exists). Note: This can be circumvented if someone starts uploading a recent App image, they interrupt power or communications, then upload older firmware. With nowhere to store an offered Application but in an erased flash, the act of starting a valid upload results in prior knowledge of which older version(s) are disallowed.
//EZBL_SetSYM(EZBL_FLOW_THRESHOLD, 48);     // Optional water mark to adjust when outbound flow control advertisements are generated. Small values consume more TX bandwidth by generating more messages. See EZBL_Install2Flash() API documentation for additional information.
//EZBL_SetSYM(UART2_TX_ISR_PRIORITY, 1);    // Optionally change the U2TX Interrupt priority. The default is IPL 1 (lowest possible priority).
//EZBL_SetSYM(UART2_RX_ISR_PRIORITY, 2);    // Optionally change the U2RX Interrupt priority. The default is IPL 2 (low priority).

int main(void)
{
    unsigned long appLaunchTimer;   // Timestamp tracking when we last saw COM/Bootloader activity
    unsigned long ledTimer;         // Timestamp tracking when we last toggled an LED
    unsigned long now;              // Cached return value from NOW_32()

    // Main Bootloader LED blink and App dispatch timeout loop
    while(1)
    {
        ClrWdt();

        // Every 62.5ms toggle a heartbeat LED (8 Hz blink rate) indicating this Bootloader is executing
        now = NOW_32();
        if(now - ledTimer > NOW_sec/16u)
        {
            ledTimer += NOW_sec/16u;
            TRISAbits.TRISA5 = 0;
            LATAbits.LATA5 = ~LATAbits.LATA5;
        }

        // Check if it is time to jump into the application (default is 1 second
        // of nothing being received, as defined by BOOTLOADER_TIMEOUT.
        if(EZBL_COMBootIF->activity.any)
        {
            // Activity happened, so reset launch timer to full interval remaining
            EZBL_COMBootIF->activity.any = 0x0000;
            appLaunchTimer = now;
        }
        if(now - appLaunchTimer > EZBL_bootCtx.timeout)
        {
            // If auto-baud is used, automatically reset the UART back to
            // auto-baud mode while idle for longer than the timeout.
            if(EZBL_COMBaud <= 0)
            {
                EZBL_FIFOSetBaud(EZBL_COMBootIF, EZBL_COMBaud);    // This will set EZBL_COMBootIF->activity.other, so appLaunchTimer will also be reset on next loop iteration
            }
            if(EZBL_IsAppPresent())
            {
//                LEDOff(0xFF);   // Executes LEDToggle(0x00FF & LEDToggle(0)); to turn off all LEDs we were blinking

                // Optionally turn off all Bootloader ISRs and forward the
                // interrupts to the Application so we become a passive classic
                // bootloader.
                // NOTE: You are giving up useful timing and communications APIs
                // if you do this. Also, the automatic bootloader wakeup upon
                // .bl2 file presentation won't work. To minimize flash, the
                // App can just reuse the bootloader APIs as is, keeping the
                // NOW Timer interrupt and communications ISRs active in the
                // background (which have minimal run-time execution cost).

                EZBL_RAMSet((void*)&IEC0, 0x00, (unsigned int)&IPC0 - (unsigned int)&IEC0);    // Clear every bit in all IECx Interrupt Enable registers
                EZBL_ForwardBootloaderISR = 0xFFFFFFFF; // Forward all Interrupts to the Application
                NOW_EndAllTasks();                      // Terminates the EZBL_BootloaderTask() from executing via background NOW ISR and when a NOW_32() or NOW_64() call is made which indirectly needs the timer ISR to execute and perform a carry propagation.

                EZBL_StartAppIfPresent();               // Sets EZBL_appIsRunning = 0xFFFF and temporarily disables IPL6 and lower interrupts before launching Application
            }
        }

        // Not much going on in this loop. Let's save power. The NOW Timer ISR
        // will wake us up, as will COM RX events.
        Idle();
    }
}


/**
 * Callback function automatically invoked anytime a write is made against the
 * UART TX FIFO (with or without any data actually getting written). Writes
 * occur when the EZBL_FIFOWrite() function is called (possibly indirectly
 * through a wrapper function).
 *
 * This callback executes after the write has already taken place and is a
 * convenient place to trigger necessary hardware actions, spy on the data
 * passing through, tee it to another communications or storage interface, or
 * implement additional FIFO features, such as a blocking/almost guaranteed
 * write with timeout mechanism to simplify other communications code.
 *
 * If a write is made without enough FIFO space to store all of the data, this
 * callback executes after the FIFO has been completely filled and provides an
 * opportunity to process the residual data.
 *
 * The default callback implementation will block when the FIFO is full and
 * trigger the UART TX ISR as needed to ensure the data doesn't have to be
 * thrown away. However, blocking will abort and throw unbuffered data away
 * after a fixed timeout is reached (ex: UART's 'ON' bit is clear, resulting in
 * clearance of data from the FIFO) The default timeout is set to 250ms (2,880
 * bytes @ 115200 baud, 480 bytes @ 19200), which can be changed in the callback
 * code.
 *
 * @param bytesPushed Number of bytes that actually got written to the software
 *                    FIFO before this callback was invoked.
 * @param *writeSrc   Pointer to the memory originally used when the FIFO write
 *                    call was made. As no other code has executed yet, ordinary
 *                    RAM data pointers can be re-read here to see or peek at all
 *                    of the requested write data.
 * @param reqWriteLen Number of bytes that were requested to be written when the
 *                    FIFO write call was made. Generally this will match the
 *                    bytesPushed value unless the TX FIFO is completely full.
 * @param *destFIFO   Pointer to the EZBL_FIFO that called this callback function.
 *
 * @return Number of bytes that you want to return back to the EZBL_FIFOWrite()
 *         caller as reportedly being successfully written. Generally you should
 *         return bytesPushed unless you've taken some action to transfer more
 *         data (or perhaps stolen some back out of the FIFO).
 */

My apologies for resurrecting this thread, but I was busy with other things recently and couldn't get back to this. As an update, I'm using EZBL 2.11 and all of the advice that Howard gave still applies and works very well. I did add some additional features as they pertain to my project.

In the bootloader and the application, I've removed the "heartbeat" LED blinking. I don't need it at this point, so I got it out of the way. I also removed the debricking interval. This may come across as a poor decision, but we don't want to wait even a small period of time after the device is turned on to get to the application. However, I did add the use of a port pin as a method to force the bootloader to stay there on startup UNTIL a valid application has been programmed. This is what I do with my Harmony projects. If the "switch" is pressed, then I'll set a flag to true prior to the main while loop. From there the only way that the application may be launched is if an application is present AND the flag is false. The only way for the flag to become false again is by programming a valid application and implementing a function that clears the flag through the EZBL_AppPreInstall callback, and then setting that callback to zero again prior to jumping into the application.
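The launch rule described above boils down to a latched flag and one boolean test. Here is a minimal sketch of that logic with hypothetical names. It does not show the real EZBL_AppPreInstall hookup, only the decision itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy state for the "force bootloader" pin. */
typedef struct {
    bool stay_in_bootloader;  /* latched at startup when the pin is asserted */
} boot_policy_t;

/* Sample the pin once, before entering the main loop. */
static void policy_init(boot_policy_t *p, bool pin_asserted)
{
    assert(p != NULL);
    p->stay_in_bootloader = pin_asserted;
}

/* Call from the install callback once a valid application has been programmed. */
static void policy_on_app_installed(boot_policy_t *p)
{
    assert(p != NULL);
    p->stay_in_bootloader = false;
}

/* The application may launch only if it exists AND the flag is clear. */
static bool policy_may_launch(const boot_policy_t *p, bool app_present)
{
    assert(p != NULL);
    return app_present && !p->stay_in_bootloader;
}
```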

On the application side, since we use Modbus RTU over 485, I will forward the UART interrupts to the application. If the same port pin that forces the bootloader is pressed in the application, the interrupts will forward back to the bootloader and suspend any processing relating to Modbus communications. Since the main bootloader task is always running in the background, I can bootload a new application with no fuss. In addition to the pin, I have a special modbus code that will do the same thing so that all of this can be accomplished digitally.

So I do have a question after all of that: is there a good way to lock this down like on a PIC32 processor? Serial bootloaders using Harmony (usually) fit in the boot block, which I promptly write protect. Doing this gives me the peace of mind that it's even more unlikely that an inadvertent write will brick the device. I was hoping that I could do the same with this bootloader, but I'm nowhere near as familiar with 16-bit devices as I am with 32-bit ones. I'm looking into CodeGuard and how it would apply. By default EZBL doesn't seem to write protect anything, which is definitely a good thing. It seems that I can define the size of the boot block and protect it with different security levels. I think this is what I want to do. Does anyone have pointers? For now, I'm enabling the boot segment and setting BSLIM to something like 0x1FF8 to protect the first seven pages. Based on my very uninformed read of the gld files, it appears that the bootloader size is 0x1C00 and the page size is 0x400, which equates to seven pages.
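The page arithmetic in that post can be checked with a short calculation. This sketch assumes the boot segment starts at address 0 and that BSLIM holds the page count inverted within a 13-bit field, which is what makes 0x1FF8 correspond to seven pages. Both assumptions and the helper names are illustrative, so the device data sheet should be treated as the authority.

```c
#include <assert.h>
#include <stdint.h>

/* Number of flash pages needed to cover boot_size program addresses,
 * rounding up to a whole page. */
static uint16_t boot_pages(uint32_t boot_size, uint32_t page_size)
{
    assert(page_size != 0u);
    return (uint16_t)((boot_size + page_size - 1u) / page_size);
}

/* Assumed encoding: BSLIM = bitwise inverse of the page count, 13 bits wide. */
static uint16_t bslim_from_pages(uint16_t pages)
{
    return (uint16_t)(~pages & 0x1FFFu);
}
```

With the figures quoted above, 0x1C00 / 0x400 gives 7 pages, and inverting 7 in 13 bits gives 0x1FF8.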

In implementing the web programmer, I'm having a little trouble with the programming phase. The first 64 bytes are sent just fine, and I'll then get the keep alive messages (0x13 0xff) followed by (0x11 0xff) and then finally (0x60 0x00). I will then send the requisite 96 bytes, but I receive the following response: 0x00 0x00 0xfc 0x8e 0xfc. What is this? I'm assuming that it's an error code of some sort.

Looking at ezbl_comm_log.txt, I've noticed that when the payload messages are sent (like 0x60 0x00 above) the PC responds very quickly: 236 us in the case of the first response (96 bytes). Do I have to send it back this quickly? If my error above is truly a timeout issue, how can I increase the timeout to something more reasonable?

Edit: It almost seems like I have to drastically increase the following statement: NOW_SetNextTaskTime(&EZBL_bootTask, NOW_ms/2u) to something like NOW_SetNextTaskTime(&EZBL_bootTask, NOW_ms * 40u). Throughput would go way down, but I wouldn't have to rework my underlying communications engine, which is based on modbus RTU.

Edit 2: Yeah, that didn't work either. How would I increase the timeout value for a response after the payload message is sent? I.e., if the bootloader sends 0x60 0x00 I would like to be able to wait several ms to respond.

Good grief. I think that I have to redo all of my testing. The way the application and bootloader are linked together I will have to recompile the application any time changes are made to the bootloader. Super frustrating.

Edit: Let me apologize to all forum users as the error is most definitely on my side of things. I dropped back to simply exchanging messages over the serial port with parts of the .bl2 file saved as a constant array. Everything works very well and the timeout built into EZBL is more than enough for my purposes (phew).

I have marked this thread as solved, as I've finished my web loader and all is right with the world. However, I did run into a problem where the bootloader would turn around too quickly for the master: both master and slave would be in transmitter mode at the same time. This would only occur once in a while, though the response from the bootloader was always less than 1 ms after the master was done transmitting. I chalk this up to the underlying full duplex nature of EZBL.

Anyway, I'm posting a change that I made to Howard's code above. It blocks for 5 ms prior to sending the data. Your mileage may vary on this part, but for me it worked much better.
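The posted change itself is not reproduced here, so the following is only a sketch of what a blocking pre-transmit holdoff can look like. The function and type names are hypothetical. On a target, the tick source would presumably be NOW_32 and the delay something like NOW_ms*5u; fake_now() exists only so the wraparound behavior can be exercised on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Any free-running 32-bit tick source. */
typedef uint32_t (*tick_fn)(void);

/* Busy-waits until at least `ticks` ticks have elapsed. The unsigned
 * subtraction keeps the comparison correct across counter wraparound. */
static void holdoff_ticks(tick_fn now, uint32_t ticks)
{
    uint32_t start;

    assert(now != 0);
    start = now();
    while ((uint32_t)(now() - start) < ticks) {
        /* spin; a real bootloader might also service the watchdog here */
    }
}

/* Host-side stand-in: advances by one tick per call. */
static uint32_t fake_ticks;
static uint32_t fake_now(void)
{
    return fake_ticks++;
}
```

The holdoff would run after the RX idle test and before TX Enable is asserted, giving the master's transceiver time to release the bus.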