Kdenlive, Audacity and lessons in audio sync

During the last foss-gbg meeting I tried filming the entire event. The idea is to produce videos of each talk and publish them on YouTube. Since I’m lazy, I simply put up a camera on a tripod and recorded the whole event: some 3 hours, 16 minutes and a few seconds. A few seconds that would cause me quite some pain, it turns out.

It all started with me realizing that I could hear the humming of the AC system in the video. No problem: simply use ffmpeg to separate the audio from the video and apply the noise reduction filter in Audacity. However, when putting it all back together I noticed an audio sync drift (after 6h+ of rendering videos, that is).
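The separation step can be sketched as follows. The filenames are assumptions (a comment below mentions merged.mts as the source file), and extracting to uncompressed WAV sidesteps the duration-estimation issue another comment brings up:

```python
# Hypothetical filenames; a comment below mentions "merged.mts" as the source.
src = "merged.mts"
out = "audio.wav"

# -vn drops the video stream; pcm_s16le writes uncompressed WAV,
# which Audacity imports cleanly for the noise-reduction pass.
cmd = ["ffmpeg", "-i", src, "-vn", "-c:a", "pcm_s16le", out]
print(" ".join(cmd))
# To actually run it, pass cmd to subprocess.run(cmd, check=True).
```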

ffprobe told me that the video is 03:16:07.58 long, while the extracted audio is 03:16:04.03. This means that by the last speaker the audio drifts more than 3s – unwatchable. So, after googling for a solution, I realized I would have to stretch the audio to the same duration as the video. Audacity has a Change Tempo effect for this, but I could not get the UI to accept my very small tempo adjustment (or the insane number of seconds in the clip). Instead, I had to turn to ffmpeg and its atempo filter.

This resulted in an audio clip of the correct length. (By the way, the stretch factor is the ratio of the audio duration to the video duration.)
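Working that factor out from the two ffprobe durations above looks roughly like this (the filenames passed to atempo are assumptions):

```python
def to_seconds(ts: str) -> float:
    """Convert an ffprobe-style H:MM:SS.cc timestamp to seconds."""
    h, m, s = ts.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

video = to_seconds("03:16:07.58")   # duration ffprobe reported for the video
audio = to_seconds("03:16:04.03")   # duration of the extracted audio

drift = video - audio               # about 3.55 s over the full recording
factor = audio / video              # just below 1.0, so atempo slows the audio

# atempo takes the tempo factor directly; filenames here are hypothetical.
cmd = ["ffmpeg", "-i", "denoised.wav",
       "-filter:a", f"atempo={factor:.6f}", "stretched.wav"]
print(f"drift={drift:.2f}s factor={factor:.6f}")
```

atempo accepts factors between 0.5 and 2.0 (newer ffmpeg releases allow a wider range), so a tiny correction like this one is well within bounds.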

Back to Kdenlive – I imported the video clip, put it on the timeline, separated the audio and video (just a right click away), ungrouped them, removed the audio, added the filtered, slowed-down audio, grouped it with the video, and everything seemed fine. In about 1h43 I will know whether the first clip has been properly rendered :-)

This entry was posted in foss-gbg, KDE.

11 Comments

Dammit. I love so many things about Kdenlive and FFmpeg, but I’ve been getting this exact error (audio from a video shrinking) for about 10 years now, to the point that I just stopped using Kdenlive. Now, with all of the recent refactoring, I’m hoping that some of the long-time bugs will finally get sorted (yeah, I was submitting bug reports), but it still drives me crazy to hear that the exact same thing is still happening after all this time.

Anyway, I hope you filed a bug report with all relevant projects. Thanks and good luck.

TBH, this is more of an ffmpeg issue. When I look at the files it seems that the audio is too short (merged.ac3) when extracted from the original (merged.mts). The filter does not seem to affect this either (filtered.ac3).

The issue is that I recorded the whole event, so 3+ hours, then cut it into three separate videos. I could have stretched the audio for the first one, but the other two needed some sort of offset management as well so I decided to take the easy route.

Hello,
It is often worth trying WAV audio rather than compressed codecs; with those, libavcodec used to warn “Estimating duration from bitrate, this may be inaccurate”. Switching to WAV has solved sync/seek problems several times (from the moment you split the audio from the video).
This is a problem in FFmpeg & MLT; it can’t be improved from Kdenlive…

To make my podcast sound better I’m using http://auphonic.com/ – you should try it too. You get 2 hours per month for free, and it works with video as well. It makes the speaker louder and removes humming and the like, using many different machine learning algorithms specialized in human speech.

* Do a test encode with extremely low compression, to check not only for audio desync but also for sections with extremely low/high/noisy audio or badly angled video (all the interesting stuff happening in the corner of the frame).

I have developed a script to support the above called localvideowebencode, available as part of git://source.jones.dk/bin
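A quick, low-quality review encode of the kind suggested above could be sketched like this. All filenames and quality settings are assumptions; the point is simply to trade quality for encode speed on the first pass:

```python
# Hypothetical filenames; ultrafast preset + high CRF + downscaling makes
# the encode fast and the file small, which is fine for a review pass.
cmd = ["ffmpeg", "-i", "merged.mts",
       "-c:v", "libx264", "-preset", "ultrafast", "-crf", "35",
       "-vf", "scale=-2:360",
       "-c:a", "aac", "-b:a", "96k",
       "preview.mp4"]
print(" ".join(cmd))
```

Watching the resulting preview.mp4 end to end before the full render would have caught the 3.55 s drift without spending 6h+ on high-quality encodes first.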