The Adafruit_ZeroI2S description says it has "DMA / interrupt support". How do interrupts work with the library? I'd like to be able to register a callback to feed the next sample in from a ring buffer rather than use the write() method in the main loop. I thought about using Adafruit_ZeroDMA to shuttle the data, but the example only covers a loop over a pre-filled buffer. I noticed the ZeroDMA library can have a callback registered when it finishes a job. Would it be possible to double-buffer the audio and have it switch buffers when each job completes? Since I2S seems to be very sensitive to variations in timing, I'd rather keep the code for updating it outside of the main loop. Has anyone tried this?

Thanks for any help you can give.

Eric

Update 1: Okay, so I'm planning on implementing a ring buffer for audio data. Already have this working with the .write() method. What I'm thinking is that I set up a non-looping job that transfers 2 integers out of the ring buffer to the I2S register, fires the callback, which modifies the descriptor with the source address being the next point on the ring, and re-runs the job. Does this make sense?
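For reference, a ring buffer of the kind described here can be sketched as below. This is a minimal host-side illustration, not the code from the project: the `SampleRing` name, the power-of-two size restriction, and the one-writer/one-reader assumption (main loop pushes, callback pops) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size ring buffer for audio samples.
// Assumes exactly one writer (main loop) and one reader (ISR or DMA callback).
template <std::size_t N>
class SampleRing {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
  // Number of samples currently queued.
  std::size_t available() const { return head_ - tail_; }

  // Number of free slots.
  std::size_t space() const { return N - available(); }

  // Writer side: returns false when the ring is full.
  bool push(int32_t sample) {
    if (space() == 0) return false;
    data_[head_ & (N - 1)] = sample;
    head_ = head_ + 1;  // publish only after the slot has been written
    return true;
  }

  // Reader side: returns false when the ring is empty.
  bool pop(int32_t &sample) {
    if (available() == 0) return false;
    sample = data_[tail_ & (N - 1)];
    tail_ = tail_ + 1;
    return true;
  }

private:
  int32_t data_[N] = {};
  // Free-running counters; unsigned wrap-around keeps head_ - tail_ correct.
  volatile std::size_t head_ = 0;
  volatile std::size_t tail_ = 0;
};
```

With something like `SampleRing<256> ring;`, the main loop would call `ring.push()` while there is space, and the callback would `pop()` one left and one right sample per transfer.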

Update 2: Got it working. Didn't solve the problem. Communication with I2C or Serial still blocking the transfer to I2S. Also ended up with weird stair-stepping artifacts in the waveform (like it was skipping samples). I'm going to try the Arduino I2S library and see if I fare any better.

Update 3: Nope. No good. Compiles, runs, blocks for the expected amount of time, but produces no sound. I had really hoped to avoid using a timer interrupt to generate the audio, but I've really got to get that part out of the main loop somehow. Getting frustrated. Any guidance on this would be greatly appreciated.

DMA means data is passed from one peripheral to the other without ever touching the microcontroller. In the SAMD21, DMA requests are triggered by 'events', which are similar to interrupts but again, can be completely independent of the microcontroller.

The devices we call '32-bit microcontrollers' are really more like a small, simple microcontroller connected to a table full of external devices: ADCs, DACs, GPIO expanders, and so on. The part that runs code is only a small part of the chip, and almost all of the functionality is handled by standalone circuits at the far end of a fast data bus. Each peripheral has its own block of memory that it uses to do its job, and the microcontroller can read or write parts of that memory to tell the peripheral what to do.

The DMA peripheral knows how to read data from one peripheral's memory and copy it to the memory of another peripheral, independently of the microcontroller. You can build a fairly sophisticated data input and processing system that runs with the microcontroller itself completely shut down.

For the ARM architecture, interrupts are signals that only exist in the microcontroller itself, and are triggered by conditions in the peripherals. Unlike 8-bit microcontrollers, where each GPIO pin triggers a separate interrupt, an ARM microcontroller only gets one kind of interrupt per peripheral. Any interrupt the ADC generates arrives at the microcontroller as 'ADC interrupt', and then the microcontroller has to read the ADC's registers to figure out what happened.

The 'Event' system makes it possible for one peripheral to generate signals that are picked up by another peripheral. The DMA peripheral is a major Event client, since almost every peripheral can send it events saying something specific has happened. The DMA peripheral can then be configured to move data from one peripheral to another when it receives specific events.

The DMA peripheral has the option to send a 'DMA interrupt' to the microcontroller as part of its response to an incoming event, and that's how the callbacks come into play.

The Zero_DMA library keeps an array of function pointers associated with each DMA channel. Any time a DMA interrupt occurs, the Zero_DMA library finds the DMA channel associated with the interrupt, checks the callback array to see if a callback has been assigned to that channel, and if so, executes the callback with a reference to the Zero_DMA object as a parameter.

The callback's job is to query the Zero_DMA object to find out what kind of interrupt happened, and decide what code to execute based on that information.
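The dispatch pattern described above can be modelled roughly like this. It is a simplified host-side sketch and not the library's actual code: `DmaChannel`, `DmaEvent`, `dispatchDmaInterrupt`, and `onAudioDma` are hypothetical stand-ins, and the real library's types and callback signature may differ.

```cpp
#include <cstdint>

// Simplified stand-ins; these are NOT the real library types.
enum class DmaEvent : uint8_t { TransferDone, TransferError, ChannelSuspend };

struct DmaChannel;
using DmaCallback = void (*)(DmaChannel &);

struct DmaChannel {
  uint8_t id = 0;
  DmaEvent lastEvent = DmaEvent::TransferDone;
  DmaCallback callback = nullptr;  // one slot per channel
};

constexpr int kChannels = 4;
DmaChannel g_channels[kChannels];

// What the (simulated) DMA interrupt handler does: find the channel,
// record why the interrupt fired, and run the callback if one is set.
bool dispatchDmaInterrupt(uint8_t channel, DmaEvent event) {
  if (channel >= kChannels) return false;
  DmaChannel &ch = g_channels[channel];
  ch.id = channel;
  ch.lastEvent = event;
  if (ch.callback == nullptr) return false;
  ch.callback(ch);
  return true;
}

int g_buffersFinished = 0;

// Example callback: asks the channel object what happened and
// ignores everything except a completed transfer.
void onAudioDma(DmaChannel &ch) {
  if (ch.lastEvent != DmaEvent::TransferDone) return;
  ++g_buffersFinished;
}
```

The point of the sketch is the check at the top of `onAudioDma`: the callback itself decides which interrupt causes it cares about.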

Okay, so if I'm understanding you correctly, it isn't necessarily a given that the callback is fired when the DMA channel finishes transferring data, and I should be checking for something to make sure the callback is being called for the condition I care about. I guess I'll need to poke around in the ZeroDMA object to find out what I need to do to query for the "job done" state.

I've sort of got it working, but I'm brute-forcing it right now. I've switched from the ring buffer to a double-buffer, and in the callback I change the DMA descriptor to switch buffers, and set the flag for which buffer needs to be written to next. The audio still glitches every time I write to I2C (currently updating an SSD1306 display from the main loop with some profiling counters once per second). Unfortunately, for the audio to be clean, I've had to go with two 2K buffers, which consumes a large amount of RAM (and creates a lot of latency). Anything less and the audio glitches once per second when the display updates. It only seems to happen if I update the descriptor to switch buffers. If I go with a single buffer (that doesn't get modified), and just have "dma->startJob()" in the callback, the audio doesn't glitch. It's as if something is preempting the code in the callback, delaying the job restart just long enough to glitch the audio. I originally thought that maybe I2C was interfering with (or delaying) the callback, but if that were the case I'd expect that a callback with nothing but "dma->startJob()" would be affected as well, but it isn't. Something about I2C must be preempting the code inside the I2S DMA callback.

I've also thought about the possibility of trying to figure out how to drive I2C from DMA as well. Since both I2C and SPI use the SERCOM devices (and the NeoPixel DMA library runs over SPI), it seems like it should be possible, but I haven't found an example of this yet. I'm also concerned that there might not be enough DMA channels for what I want to do, since ultimately I'd want I2S, I2C and NeoPixels all running over DMA. I could try using a Seesaw board to drive the NeoPixels, but there's still the problem with I2C clobbering the I2S audio.

If you think it might be helpful, I can strip down the code I'm working with to essentials and post some code that's small enough to reasonably fit in a post but demonstrates the problem. I'll start working on that tomorrow, but if you have any insight as to what I might be missing, I'd appreciate any advice you can give.

This is running on an ItsyBitsy M0, wired to an SSD1306 over I2C and a UDA1334A audio DAC over I2S, all using the standard example pin assignments. The example code plays two different frequencies of sine wave, one in each channel. By default, it does not update the SSD1306, however connecting pin 12 to ground will enable the display update, triggering the side-effect I'm having problems with. Unfortunately, I do need I2C to work, as I plan to connect the M0 to a NeoTrellis for control over waveforms, frequencies, etc. as well as possibly the SSD1306 so I can see what changes I'm making. Any help you can provide with this would be greatly appreciated, and please let me know if you need more information.

EricLarge wrote: Okay, so if I'm understanding you correctly, it isn't necessarily a given that the callback is fired when the DMA channel finishes transferring data, and I should be checking for something to make sure the callback is being called for the condition I care about.

It's more like the callback will fire multiple times for a single job, as the job progresses through actions that fire interrupts.

Regarding the glitch in audio, it sounds like you might be seeing the effects of another feature in the ARM interrupt system: nested interrupts.

Interrupts are handled by the Nested Vectored Interrupt Controller, or NVIC. The NVIC has resources to suspend not only the main code, but each interrupt handler assigned to a given peripheral. The peripherals are arranged in order of priority (which is programmable), and interrupts from peripherals with lower numbers will suspend interrupt handlers for peripherals with higher numbers.

I2C generates a ton of interrupts because there are lots of state-dependent decisions the higher-level code has to make. It sounds like the SERCOM for I2C is ranked at a higher priority than the DMA peripheral, and the I2C interrupts are making the DMA hang.

There are a couple of ways to approach that:

One would be to rearrange the priorities so the DMA peripheral has higher priority than the SERCOMs. That would reverse the order of suspension, with I2C transactions getting the occasional delay while the DMA refills the buffer. You'll have to dig into the details of NVIC configuration to do that though, since I don't think we've done much with those settings.
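As a rough model of what swapping the priorities changes, here is the preemption rule in a few lines. This is a host-side sketch only, not NVIC configuration code: `Irq`, `preempts`, and the priority numbers are hypothetical. It assumes the usual Cortex-M convention that a smaller number is more urgent and that equal priorities do not preempt each other.

```cpp
#include <cstdint>

// Hypothetical description of one interrupt source.
struct Irq {
  const char *name;
  uint8_t priority;  // smaller number = more urgent
};

// True when `incoming` may suspend the handler that is currently
// running for `running`. Equal priorities wait their turn.
constexpr bool preempts(const Irq &incoming, const Irq &running) {
  return incoming.priority < running.priority;
}
```

With I2C at 0 and DMA at 1, `preempts(i2c, dma)` is true, so the DMA callback can be held up. Swap the two numbers and the DMA callback interrupts the I2C handler instead.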

Another approach would be to use a ring buffer with a different update strategy. Instead of calling for a refill when the head and tail pointers are down to 10% separation, for instance, use a routine that adds 5% any time it sees less than 90% separation.

The 90% of the buffer that normally goes unused will be what covers the delays associated with I2C updates. The small, fixed-size additions to the buffer have a better statistical chance of firing and finishing between I2C interrupts, and they can recover from losses due to longer delays as long as they normally push data into the buffer faster than the output can pull data out of the buffer.
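The refill policy can be sketched with fill levels alone. This is one reading of the description above, with "separation" taken to mean the amount of unplayed data in the ring; `TopUpPolicy` and its members are hypothetical names, and the 5% and 90% figures are simply the ones from the post.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for the refill policy; only fill levels are modelled.
struct TopUpPolicy {
  std::size_t capacity;  // total slots in the ring
  std::size_t fill;      // slots currently holding unplayed samples

  std::size_t chunk() const { return capacity / 20; }          // 5% of the ring
  std::size_t threshold() const { return capacity * 9 / 10; }  // 90% of the ring

  // Call as often as possible from the main loop. Returns how many
  // samples the caller should generate and append right now.
  std::size_t topUp() {
    if (fill >= threshold()) return 0;  // already nearly full
    std::size_t n = chunk();
    fill += n;
    return n;
  }

  // Playback side: removes up to n samples and returns how many were
  // actually available. Fewer than n means an underrun.
  std::size_t drain(std::size_t n) {
    std::size_t taken = n < fill ? n : fill;
    fill -= taken;
    return taken;
  }
};
```

Each `topUp()` call does a small, bounded amount of work, and the ring only underruns if a stall outlasts the data already queued.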

Unfortunately, that did not resolve the glitch in audio. Either I'm doing something wrong, or something else is afoot that's interfering with the buffer switch.

I'm currently using a double buffer for the audio. I tried using a ring buffer for I2S (starting a DMA job for each sample set), but the results were terrible (update 2 in my first post). Going without DMA was better in that scenario, but didn't solve the glitching problem. I think I may have an idea of what might be happening though. I managed to capture one of the glitches with my oscilloscope (image attached). I can see a couple instances in the waveform where it is exactly the same between breaks, suggesting that the buffer isn't switching at all. The DMA job isn't set to loop (I have it explicitly disabled using "myDMA.loop(false)" in setup), instead the job is re-started in the callback.

Given that the audio doesn't freeze, it's reasonable to assume that the job is still being re-started and therefore the callback is indeed running. This brings up what you mentioned about the callback firing multiple times per job. If that's the case, the callback could be switching buffers prematurely, essentially switching back to the buffer it was originally using rather than the new one.

Unfortunately, this might also be a red herring. I just wrapped the buffer switch / job restart code in an if statement ( "if (!dma->isActive())" ) and the glitch is still there.

Hmmm... If I set up the DMA channel as a loop with two descriptors, one for each buffer, would it still use the callback when switching to the next descriptor? If that's the case, I could make the callback code much smaller (reducing the chances of getting interrupted) and still let the main loop know which buffer is safe to fill. Any thoughts on that?

Update 1: I set up a counter in the DMA callback to see how many times it was being called per second. I came back with 345. Given that the buffer size is 256 and contains pairs of samples, it's effectively 128 sample sets per buffer. 44100 / 128 is 344.5, so it's definitely within the margin of error for the callback only being called once per job. I'm going to try setting up the DMA channel with two descriptors and see if the callback gets called once per descriptor or once per job. I'm really hoping it's the former.
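The arithmetic behind that check, written out as a small sketch (the function names are made up for the example; the numbers are the ones from this post):

```cpp
// Expected "job done" callbacks per second for a buffer of interleaved samples.
// bufferWords counts individual samples; channels is 2 for stereo.
constexpr double callbacksPerSecond(double sampleRateHz, int bufferWords, int channels) {
  return sampleRateHz / (static_cast<double>(bufferWords) / channels);
}

// How long one buffer lasts, in milliseconds.
constexpr double bufferDurationMs(double sampleRateHz, int bufferWords, int channels) {
  return 1000.0 / callbacksPerSecond(sampleRateHz, bufferWords, channels);
}
```

`callbacksPerSecond(44100.0, 256, 2)` comes out to about 344.5, and each buffer lasts about 2.9 ms, so that is the deadline for refilling the other one.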

Update 2: No such luck. If it's set to loop, the callback doesn't get called at all. If I set it not to loop, then the callback only gets called when both descriptors have been run through. I'm hoping there's some sort of interrupt I can attach to that can fire off when the DMAC changes descriptors.

Update 3: I tried just flipping back and forth between the two buffers, pre-filled with different frequencies so I could tell which buffer was being used. No glitches, and perfectly fine on the oscilloscope. Buffers alternated fine without any changes to the buffer-switching logic. The glitch only appears when I make modifications to the buffers, and only when I2C is active. I tried looping the addSampleToBuffer() function to completely fill the buffer right after a buffer switch, delaying any I2C communication until the buffer was filled, but it still glitches. I tried moving the buffer switch code out of the callback and into the main loop, so it couldn't possibly be sending DMA at the same time it's filling the buffer. Still glitches, and as a bonus, sometimes swaps the left and right channels. I'm at a complete loss at this point. Do these peripherals simply not coexist?

Hmm.. from that scope trace, I'm not sure if the DMA is missing most of a cycle or backtracking.

Try having the callback toggle an IO pin every time it executes. That will give you a better idea of the actual timing problem. The glitch in the output happens far enough after the memory glitch that the information is kind of ambiguous.

Brilliant suggestion. I'm kicking myself for not thinking of it. I have learned many things.

I set up 3 test scenarios. The first was to toggle pin 10 based on which buffer was selected, and updated the pin from within the callback. This produced a nearly perfect square wave, indicating that the problem is not in the callback routine. Then, I moved the pin toggling to the main loop, and then the waveform on the scope had some gaps. BIG gaps, like several buffer-switches worth. Finally, I set up a loop counter and had it toggle the pin after 300 loops. This produced the same gap as the previous test. Apparently, I2C is stealing so many cycles off the main loop that the buffer could have been easily filled 4 times or more. This is why 4K worth of buffers works. The function that fills the buffers has enough cycles to catch up after I2C hogs all the cycles.

Now, I know we're not supposed to put loops inside of interrupt callbacks, however that might be the easiest fix. If I had it fill a very small buffer, say 32 sample sets or so, it should still return fairly quickly. A better solution would be to find out why I2C (specifically the SSD1306 display update) is stealing so many cycles off the main loop. I'd love any suggestions as to how one can make a display update to SSD1306 more efficient.

Update 1: Audio... is... PERFECT. I set the buffers to 32 sample sets each, and put the loop inside the callback. Absolutely no glitching, and apparently I2C can tolerate it because I don't see any issues with the display.
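For anyone following along, the shape of that fix is roughly as below. This is a host-side sketch and not the actual project code: the names are hypothetical, the sample generator is a placeholder counter, and the hardware steps (repointing the descriptor and restarting the job) are only marked by a comment.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kFrames = 32;          // sample sets per buffer, as in the post
constexpr std::size_t kWords = kFrames * 2;  // left + right interleaved

int32_t g_buffers[2][kWords];
volatile uint8_t g_playing = 0;  // index of the buffer the DMA is reading
uint32_t g_nextFrame = 0;

// Placeholder generator: a counter instead of a real waveform.
void nextFrame(int32_t &left, int32_t &right) {
  left = static_cast<int32_t>(g_nextFrame);
  right = -static_cast<int32_t>(g_nextFrame);
  ++g_nextFrame;
}

void fillBuffer(int32_t *buf) {
  for (std::size_t i = 0; i < kFrames; ++i) nextFrame(buf[2 * i], buf[2 * i + 1]);
}

// Fill both buffers before the first job starts.
void primeBuffers() {
  g_nextFrame = 0;
  g_playing = 0;
  fillBuffer(g_buffers[0]);
  fillBuffer(g_buffers[1]);
}

// Body of the "transfer done" callback. Returns the source for the next job.
const int32_t *onBufferDone() {
  const uint8_t finished = g_playing;
  g_playing = static_cast<uint8_t>(finished ^ 1);  // other buffer is already full
  // On hardware: repoint the descriptor at g_buffers[g_playing] and
  // restart the job here, before doing the slower refill below.
  fillBuffer(g_buffers[finished]);  // short, bounded loop inside the callback
  return g_buffers[g_playing];
}
```

Because the refill happens inside the callback, it no longer depends on how many cycles the main loop gets while the display is updating.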

Attachments

This one is where the pin is toggled every 300 executions of the main loop

toggled-in-loop-by-counter.jpg

This one is where the pin is toggled within the main loop, based on which buffer is currently selected

toggled-in-loop.jpg

This is where the pin is toggled in the callback routine, based on which buffer is currently selected

Heh... that puts me even for the day. I spent the afternoon trying to work out the sequence of operations to make a part on the milling machine that's fragile on one end, and has to be held firmly while it's tapped at the other end. After a few hours I realized that I was already milling faces so I could hold the part with a wrench while installing it.

The first rule of hardware and software is, "if it works, it works." Everything else is subject to negotiation.

ISRs that spend a long time running tend to cause the same kind of problem you started with, but in reverse, and occurring somewhat at random. This time you found a case where spending a little more time in the ISR solved a timing problem created by the sequential code.

EricLarge wrote: A better solution would be to find out why I2C (specifically the SSD1306 display update) is stealing so many cycles off the main loop. I'd love any suggestions as to how one can make a display update to SSD1306 more efficient.

IIRC, that one transmits the whole screen buffer every time you call display(), and the driver doesn't provide any way to do partial updates. Update timing will be a function of the buffer size and the I2C bitrate.
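A back-of-the-envelope estimate of that update time is sketched below. It assumes a 128x64 monochrome panel (the thread doesn't say which size is in use) and counts 9 clock periods per byte (8 data bits plus the ACK), ignoring address and command overhead, so real transfers will take somewhat longer. The function names are made up for the example.

```cpp
// Bytes in a 1-bit-per-pixel framebuffer.
constexpr unsigned framebufferBytes(unsigned width, unsigned height) {
  return width * height / 8;
}

// Rough lower bound on the I2C transfer time in milliseconds:
// 8 data bits + 1 ACK bit per byte, no protocol overhead counted.
constexpr double i2cTransferMs(unsigned bytes, double bitrateHz) {
  return bytes * 9.0 / bitrateHz * 1000.0;
}
```

For 1024 bytes this gives about 92 ms at 100kHz and about 23 ms at 400kHz. Either way it is many times longer than the roughly 2.9 ms a 128-sample-set buffer lasts at 44.1kHz, which fits the gaps seen on the scope.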

You can try boosting the I2C clock from the default 100kHz to 400kHz and see if that helps, but it sounds like you already have the right solution: use interrupts to insert fast operations into a slow one.

Do check the headers for the display you're using though. If you can update small areas of the screen instead of the whole screen, it will pay off extremely well. The pixel count scales with width times height, so a rectangle half as wide and half as tall needs only a quarter of the data.

The only other optimization I can think of for the sequential code would be to do as much prep as you can for the ISR. If you're generating the waveform in code, precalculate as much as possible and shove it into an array. Then let the ISR pull values directly or do linear interpolation over short ranges.
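A precalculated table with linear interpolation might look something like this. It is a sketch under assumptions, not a drop-in for the project: the 256-entry table, 16-bit output, 32-bit phase accumulator, and all the names are choices made for the example.

```cpp
#include <cmath>
#include <cstdint>

constexpr int kTableBits = 8;
constexpr int kTableSize = 1 << kTableBits;  // 256 entries per cycle
constexpr int kFracBits = 32 - kTableBits;   // 24 fractional phase bits

// One extra guard entry so index + 1 never needs to wrap.
int16_t g_sine[kTableSize + 1];

// Run once at startup (floating point is fine here, it is not time-critical).
void buildSineTable() {
  const double twoPi = 6.283185307179586;
  for (int i = 0; i <= kTableSize; ++i) {
    g_sine[i] = static_cast<int16_t>(
        std::lround(32767.0 * std::sin(twoPi * i / kTableSize)));
  }
}

// phase is a 32-bit accumulator; its full range is one cycle.
// Integer-only: suitable for calling from an ISR.
int16_t sineAt(uint32_t phase) {
  const uint32_t index = phase >> kFracBits;
  // Top 8 bits of the fractional part, 0..255.
  const int32_t frac = static_cast<int32_t>((phase >> (kFracBits - 8)) & 0xFF);
  const int32_t a = g_sine[index];
  const int32_t b = g_sine[index + 1];
  return static_cast<int16_t>(a + ((b - a) * frac) / 256);
}

// Phase increment per sample; compute once when the frequency changes.
uint32_t phaseStep(double freqHz, double sampleRateHz) {
  return static_cast<uint32_t>(freqHz / sampleRateHz * 4294967296.0);
}
```

Per sample the ISR then only has to do `phase += step;` and call `sineAt(phase)`, which is a couple of table reads, one multiply, and one divide by a power of two.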

So far, things are working beautifully now. I'm using the DMA callback to update counters, which are then used for timing other things in the project. The rest of the stuff isn't nearly as time-sensitive, so the counters are being picked up in the main loop. I've got DMA-driven NeoPixels running without conflicts. It's been fantastic since I got over the I2S hurdle.

Incidentally, I did precalculate a lot of the stuff for the audio, as well as other info needed for driving the neopixels. I'm using bit-shifting to fake floating-point math in a lot of places, trying to be as miserly with my CPU cycles as possible. I've got a lot of checks to make sure things don't update in the main loop unless there's actually some reason to update. It looks like I still have quite a few CPU cycles to spare, as the loop counter I set up says that the main loop is running about 450K times per second. I'm pretty sure most of those loops it's doing absolutely nothing but updating the loop counter due to my checks.