
So I've been trying to control WS2812/WS2812B LED strips with my Tiva LaunchPad, the TM4C1294XL. First I will explain what I've been doing. Later, when I have clean code for you all to read, I will post it. I use only the WS2812B LED strip.

I wanted to make a big RGB matrix, so I wanted a lot of outputs with the least processor usage, taking advantage of the ARM peripherals.

First I tried using the SSI module. It worked, but it could be better, and it used a lot of RAM. Here it is working:

Then I saw a lot of controllers using DMA transfers. Some change PWM duty values and others just change the state of the GPIO. I went for the second approach of sending data to the GPIO.

It's the same method the Teensy uses. The idea is to send 3 values per bit: 0xFF, the data value, then 0x00. This should explain it better:

Well, this uses 2 timer interrupts and a GPIO interrupt. The guys at TI E2E taught me that the Tiva PWM module has inverting capabilities with 2 comparators. So what do I do? I just use 1 PWM output and 1 GPIO interrupt. The PWM inverts its output state (HIGH or LOW) at 0.4 µs, 0.8 µs and 1.25 µs (the end of the PWM period). The GPIO triggers the DMA on both edges, so it always sends the 3 values needed at the right timing.
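To put numbers on those toggle points, here is a minimal sketch that converts them into PWM clock ticks. The 120 MHz PWM clock and the names `PWM_CLOCK_HZ`, `ns_to_ticks` and `ws_pwm_timing_compute` are assumptions for illustration only. They are not taken from the code in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the PWM module is clocked at 120 MHz. Use your real PWM clock. */
#define PWM_CLOCK_HZ 120000000u

/* Convert a duration in nanoseconds to PWM clock ticks, rounded to nearest. */
static uint32_t ns_to_ticks(uint32_t ns)
{
    return (uint32_t)(((uint64_t)ns * PWM_CLOCK_HZ + 500000000u) / 1000000000u);
}

/* Hypothetical container for the three toggle points of one WS2812B bit. */
typedef struct {
    uint32_t period; /* ticks in one bit period (1.25 us)    */
    uint32_t cmp_a;  /* ticks until the first toggle (0.4 us)  */
    uint32_t cmp_b;  /* ticks until the second toggle (0.8 us) */
} ws_pwm_timing;

static ws_pwm_timing ws_pwm_timing_compute(void)
{
    ws_pwm_timing t;
    t.period = ns_to_ticks(1250);
    t.cmp_a  = ns_to_ticks(400);
    t.cmp_b  = ns_to_ticks(800);
    /* The toggles must occur in order inside one period. */
    assert(t.cmp_a < t.cmp_b && t.cmp_b < t.period);
    return t;
}
```

At the assumed 120 MHz this gives 150, 48 and 96 ticks. How these map onto the generator's load and comparator registers depends on the count direction you configure, so treat them as offsets from the start of the bit rather than as register values.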

With this I can control 8 outputs for the WS2812B. But wait! The TM4C1294 has 15 GPIO ports! Unfortunately, just 4 of them have all 8 pins available on the breakout. So I use the same PWM signal and 3 more GPIO pins for interrupts. With this I control 32 outputs using only 4 GPIO interrupts and 1 PWM module output. So if you use 512 LEDs per output like the Teensy 3.1, you have control over 16384 WS2812Bs.

Well, now the problems:

This method uses 1-byte values, since it sends the 8 bits for the GPIO pins, right? But I need 3 values for each brightness bit (0xFF, 0xXX, 0x00), so I require 24*3 = 72 bytes to control 1 WS2812B on each of the 8 outputs (so a total of 8 WS2812Bs are being controlled). This method uses a lot of RAM.

Second problem: the Tiva DMA can only transfer 1024 items per transfer set. That means it can only control 14 WS2812Bs before the processor needs to set up the transfer again. Since this takes a lot of time (relative to the timing of the WS2812B), I am going to implement DMA ping-pong mode to solve this. (Already solved.)
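The 72-bytes-per-LED layout can be sketched as follows. This is only an illustration of the 0xFF / data / 0x00 scheme described above, not the code running on the board. The function name `ws_encode` and the input layout `grb[strip * leds + led]` are assumptions, and each colour is assumed to be already packed in WS2812B wire order (green, red, blue, most significant bit first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WS_STRIPS         8u   /* one strip per pin of an 8-bit GPIO port */
#define WS_BITS_PER_LED   24u  /* 8 bits each for green, red and blue     */
#define WS_BYTES_PER_BIT  3u   /* 0xFF, data slice, 0x00                  */
#define WS_BYTES_PER_LED  (WS_BITS_PER_LED * WS_BYTES_PER_BIT) /* 72 */

/*
 * Build the byte stream that the DMA copies to the GPIO data register.
 * grb : WS_STRIPS * leds colours, indexed as grb[strip * leds + led]
 * out : at least leds * WS_BYTES_PER_LED bytes
 */
static void ws_encode(const uint32_t *grb, size_t leds, uint8_t *out)
{
    assert(grb != NULL && out != NULL);
    for (size_t led = 0; led < leds; led++) {
        for (unsigned bit = 0; bit < WS_BITS_PER_LED; bit++) {
            unsigned shift = WS_BITS_PER_LED - 1u - bit; /* MSB first */
            uint8_t slice = 0;
            /* Gather the current bit of every strip into one port byte. */
            for (unsigned strip = 0; strip < WS_STRIPS; strip++) {
                uint32_t colour = grb[strip * leds + led];
                slice |= (uint8_t)(((colour >> shift) & 1u) << strip);
            }
            *out++ = 0xFF;  /* start of bit: all pins high        */
            *out++ = slice; /* at 0.4 us: pins sending 0 go low   */
            *out++ = 0x00;  /* at 0.8 us: all pins low            */
        }
    }
}
```

The inner loop is the expensive part: every frame has to be transposed from per-strip colours into per-bit port bytes, and the result is three times the size of the transposed data.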
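The arithmetic behind the 14-LED figure, and the number of ping-pong reloads it implies, can be sketched like this. The helper names are hypothetical, and the sketch assumes each transfer carries only whole LEDs (14 LEDs = 1008 items) rather than filling all 1024 items.

```c
#include <assert.h>
#include <stddef.h>

#define UDMA_MAX_ITEMS        1024u /* item limit per transfer set        */
#define WS_DMA_BYTES_PER_LED  72u   /* 24 bits * 3 bytes, one item each   */

/* Whole LEDs (per output) that fit in a single transfer. */
static size_t ws_leds_per_transfer(void)
{
    return UDMA_MAX_ITEMS / WS_DMA_BYTES_PER_LED;
}

/* Number of transfers needed to send `leds` LEDs per output. */
static size_t ws_transfers_needed(size_t leds)
{
    size_t per = ws_leds_per_transfer();
    return (leds + per - 1u) / per; /* round up */
}

/* Item count for transfer number `index` (0-based); 0 once all data is sent. */
static size_t ws_transfer_items(size_t leds, size_t index)
{
    size_t per = ws_leds_per_transfer();
    size_t done = index * per;
    if (done >= leds) {
        return 0;
    }
    size_t remaining = leds - done;
    size_t chunk = remaining < per ? remaining : per;
    assert(chunk * WS_DMA_BYTES_PER_LED <= UDMA_MAX_ITEMS);
    return chunk * WS_DMA_BYTES_PER_LED;
}
```

For 512 LEDs per output this works out to 37 transfers, so in ping-pong mode the interrupt handler would have to reload a finished half 36 times per frame while the other half is still being sent.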

TODO:

Write the code to receive new data and update, possibly from UART or USB.

Optimize the control with scatter-gather. This would solve both problems I have with the control, but it's really complex and there isn't much information about scatter-gather.

However, I would use Flash instead of RAM to store 2048 0xFF and 0x00s.

Unfortunately, I don't think you can use the uDMA from Flash.

Page 540 of the lm4f120h5qr data sheet, section 9.2 Functional Description: "The μDMA controller can transfer data to and from the on-chip SRAM. However, because the Flash memory and ROM are located on a separate internal bus, it is not possible to transfer data from the Flash memory or ROM with the μDMA controller."

As far as using a separate DMA channel to send the 0xFF and 0x00's:

It does not seem necessary to synchronize the feeding of the 0xFF/0x00 channel with the feeding of the main data channel. All you have to do is keep it spewing out alternating bytes in time. So set it up as a ping-pong (both data sets coming from the same buffer of alternating FFs and 00s). When it interrupts, just reset the data to the same buffer.

As was pointed out, a transfer is limited to at most 1024 items, so that would be at most 512 0xFF's and 512 0x00's, 1024 bytes total.

If 1024 bytes is still too much overhead, use less. It is just a tradeoff of memory vs. processor time: the shorter the buffer, the more interrupts you have to service.

If one is running several ports, then all the channels handling 0xFF/0x00 can use the same buffer. You might want to feed some of them different amounts of data for just the very first cycle, so that the ping-pong interrupts are not all bunched up. E.g. give one of them only 100 bytes to start with, another one 200, etc. Then the interrupts will be spread out, rather than all of them needing service at the same time. On subsequent refreshes they will each get the same amount of data, so they will hopefully remain somewhat spread out. You could do the same with the pixel data (so the pixel data pumps are not as likely to compete for the processor, either with each other or with the 0xFF/0x00 pumps).
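One way to pick those first-cycle amounts is to space them evenly across the full buffer length. The sketch below is only an illustration: `stagger_first_count` is a made-up helper, and rounding down to an even count is my own assumption, made so that every transfer from a buffer of alternating 0xFF/0x00 bytes ends on a 0x00 and the next one starts on a 0xFF.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First-cycle item count for the 0xFF/0x00 channel of port `port`
 * (0-based) out of `ports`, given the normal transfer length `full`.
 * Later cycles would all use `full`.
 */
static uint32_t stagger_first_count(uint32_t port, uint32_t ports, uint32_t full)
{
    /* Require at least one 0xFF/0x00 pair per port. */
    assert(ports > 0u && port < ports && full >= 2u * ports);
    uint32_t n = (uint32_t)(((uint64_t)full * (port + 1u)) / ports);
    return n & ~1u; /* round down to an even count */
}
```

With 4 ports and 1024-item transfers the first cycles would be 256, 512, 768 and 1024 items long, so the four completion interrupts start a quarter of a buffer apart.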

The more DMA channels you have going, the more likely they are to compete for memory access, so using just 2 DMA channels per port, rather than 3, might help reduce timing glitches? (I don't know how tight the timing requirements are on the LEDs.)

The data-handling DMA can still churn out 1024 data items before you have to reset it. As long as it can keep up a continuous flow of bytes, it should not matter where it is in its count relative to the other DMA. You could do ping-pong on the pixel data DMA as well, or else you may have to stop the clock for the DMA channel that is sending the 0xFF and 0x00's (otherwise there might not be enough time to just update the pointers on the fly). If you need them to expire together, keep the 0xFF/0x00 channel byte count a divisor of the pixel data channel byte count, and stop reloading the 0xFF/0x00 channel after the right number of refreshes.


@igor The timing doesn't need really high precision; it can have an error of 150 ns. I rarely get timing glitches. I did some tests with all 4 GPIO ports (so 32 outputs) being fed from 1 buffer.

I don't really know what you mean by: "It does not seem necessary to synchronize the feeding of the 0xFF/0x00 channel with the feeding of the main data channel."

By the way, what I meant by overhead is CPU occupation. Sorry if I mixed up the terms. I want the CPU to be as free as possible.

I'm still considering which way to go with this. Besides scatter-gather, I'm torn between using 3 DMA channels, so I need just 2 bytes for the 0xFF and 0x00, or using 2 DMA channels with 512 bytes of 0xFF and 512 bytes of 0x00. Of course, with the latter I can reduce the number of bytes, but that would mean the CPU gets interrupted more often.


Ooooh... I think I see what I did wrong. I'll study this better next week. RobG to the rescue once again.

By the way, I have it running right now in ping-pong mode with 32 outputs. It's still using 3x the RAM. For practical purposes that will probably not be a problem, since I can't really make a matrix that big. I can stop worrying about it for the project, but I can improve it for the challenge.


Well, here is a small test to show that the brightness control is actually working. I had to give back the strip I had, so now it's just with the 8 LEDs. I'm hoping that tomorrow I'll have 1 strip of 60 LEDs with this pattern: