SAM is relatively straightforward to use, and I don't see any obvious reason why you won't be able to make it work on the ESP32. You create a sam_memory, call SetInput, then SamMain, and you are expected to provide "void SamOutputByte(unsigned int pos, unsigned char b)", which gets passed the synthesised audio data. The micro:bit port then translates that into PWM on the output pin (see audio_play_source, which uses the speech_iterator_t; ultimately audio_ticker in https://github.com/bbcmicrobit/micropyt ... daudio.cpp calls set_gpiote_output_pulses to do the PWM config).
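To make the callback part a bit more concrete, here's a rough sketch of what the glue on your side might look like. Only the SamOutputByte signature comes from the SAM code; the buffer, its size (SAM_BUF_LEN), and the sam_sample_to_duty helper are names I made up for illustration, and I'm assuming the bytes are unsigned 8-bit samples. It deliberately doesn't call SetInput or SamMain, so check the port's headers for their exact arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer size; the real output length depends on the phrase. */
#define SAM_BUF_LEN 4096u

unsigned char sam_buf[SAM_BUF_LEN];
size_t sam_buf_used = 0;   /* highest position written so far, plus one */

/* Callback SAM expects the port to provide (signature as quoted above).
 * This version just stores each synthesised byte at its position. */
void SamOutputByte(unsigned int pos, unsigned char b)
{
    if (pos >= SAM_BUF_LEN) {
        return;            /* drop samples that do not fit in the buffer */
    }
    sam_buf[pos] = b;
    if ((size_t)pos + 1 > sam_buf_used) {
        sam_buf_used = (size_t)pos + 1;
    }
}

/* Hypothetical helper: scale an (assumed) unsigned 8-bit sample to a PWM
 * duty count for a timer whose full period is `period` ticks. */
unsigned int sam_sample_to_duty(unsigned char sample, unsigned int period)
{
    assert(period > 0);
    return (unsigned int)(((unsigned long)sample * period) / 255u);
}
```

Once SamMain has finished, you'd walk sam_buf up to sam_buf_used at the playback rate and feed each value to whatever PWM or DAC peripheral you pick on the ESP32. Buffering the whole phrase is the simple option; the micro:bit port streams it instead because it has far less RAM.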

(I looked into implementing this for a micro:bit simulator I built at my previous job, but the simulator ran on the server while the audio played in the browser-based client, so it wasn't possible to get the PWM timings to work well. Just getting music.play() going was enough of a challenge.)

But yeah, you will have to do this in C. Although... just a crazy random idea from reading the SAM docs -- you may be able to use the same technique they used to reverse engineer the original disassembly, and convert it to (Micro)Python instead of C.

But you might be better off finding a more modern speech synthesis library that wasn't written for the Commodore 64. (At least on the ESP32 you've likely got plenty of RAM/ROM, unlike the micro:bit.)