Download, slice and dice podcasts on Linux

I’m trying to replace my Windows applications with Linux applications. On Windows, I use Juice to download podcasts as MP3s. Recently I decided to switch over to Linux for receiving podcasts. After looking around at various podcast catchers (especially ones that run on the command line, so that I could automate them with a cron job), I ran across Podracer. I decided to combine Podracer with a script that splits long MP3s into shorter MP3s so that I could play them more easily in my car. Here’s what I did on my Ubuntu Linux machine:

Step 1: Install and configure Podracer

cp /etc/podracer.conf ~/.podracer/podracer.conf
Edit ~/.podracer/podracer.conf so that you can pick the download directory you want. I changed
#poddir=$HOME/podcasts/$(date +%Y-%m-%d)
to
poddir=$HOME/rawpodcasts
because I want all my podcasts in one directory where I can do a batch process over them afterwards. Go ahead and run “mkdir ~/rawpodcasts” to create the directory that podcasts will be stored in.
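
If you would rather script that edit than do it by hand, here is a minimal sketch. The helper name set_poddir is made up for this example and is not part of Podracer; it assumes the config file has a single poddir= line, commented out or not.

```shell
#!/bin/bash
# Sketch: point Podracer's poddir at one fixed directory.
# set_poddir is a hypothetical helper, not part of Podracer.
set_poddir() {
    local conf="$1" dir="$2" tmp
    [ -f "$conf" ] || return 1
    tmp=$(mktemp) || return 1
    # Rewrite the poddir line whether or not it starts with '#'.
    sed "s|^#\{0,1\}poddir=.*|poddir=$dir|" "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example (not run here):
#   mkdir -p ~/.podracer ~/rawpodcasts
#   cp /etc/podracer.conf ~/.podracer/podracer.conf
#   set_poddir ~/.podracer/podracer.conf '$HOME/rawpodcasts'
```

The single quotes in the example keep $HOME literal, so the config file ends up with the same line shown above.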

sudo vim /usr/bin/podracer
(it’s okay, Podracer is a shell script). Find the line that says
m3u=$(date +%Y-%m-%d)-podcasts.m3u
and comment it out so that podracer won’t automatically create an .m3u playlist as it downloads podcasts.

Run podracer in “catchup” mode to avoid downloading all the old podcasts from your subscriptions with “podracer -c”. podracer will create a file ~/.podracer/podcast.log to keep a record of all the podcasts that have been downloaded (the “-c” catchup mode creates this text file without actually downloading the MP3s). If you want to re-download a file (e.g. while you’re testing your configuration), you can edit the file ~/.podracer/podcast.log and just delete the line for any MP3 you want to re-download.
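
Deleting a line from the log can also be done without opening an editor. This is only a sketch: forget_podcast is a made-up helper name, and it assumes that each downloaded MP3 has its own line in the log and that the text you pass in appears only on that line.

```shell
#!/bin/bash
# Sketch: drop one entry from Podracer's download log so that the
# matching MP3 is fetched again on the next run.
# forget_podcast is a hypothetical helper, not part of Podracer.
forget_podcast() {
    local log="$1" pattern="$2" tmp
    [ -f "$log" ] || return 1
    tmp=$(mktemp) || return 1
    # -F treats the pattern as a fixed string, -v keeps non-matching lines.
    # grep exits with 1 when no lines are left, which is not an error here.
    grep -vF -- "$pattern" "$log" > "$tmp" || [ $? -eq 1 ] || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$log"
    rm -f "$tmp"
}

# Example (not run here; episode42.mp3 is a placeholder name):
#   forget_podcast ~/.podracer/podcast.log episode42.mp3
```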

Step 2: Install and configure mp3splt (optional)

At a terminal window, type “sudo apt-get install mp3splt”. In Step 1, we configured Podracer to download podcasts as MP3s into a “rawpodcasts” directory. In this step, we’re going to take those long MP3s and split them into individual segments in a new “finishedpodcasts” directory. Make the “finishedpodcasts” directory with the command “mkdir ~/finishedpodcasts”.

Make a file /home/username/download-mp3s-and-process.sh that looks like this.

#!/bin/bash

# Run podracer to download any new podcasts
/usr/bin/podracer

# Now split the podcasts into segments
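for i in /home/username/rawpodcasts/*.mp3
do
    # Skip the unexpanded pattern when the directory has no MP3s
    [ -e "$i" ] || continue
    # Quote the variables so that file names with spaces still work
    nicename=$(basename "$i" .mp3)
    # Send both stderr and stdout to /dev/null so that this is a quiet cron job
    mp3splt -eqd /home/username/finishedpodcasts -o "$nicename-@n" "$i" &> /dev/null
done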

This script runs podracer to download any new podcasts. It then loops over all the MP3 files in the rawpodcasts directory and runs mp3splt on each one. If you had a file test.mp3, you would be running the command

mp3splt -eqd /home/username/finishedpodcasts -o test-@n /home/username/rawpodcasts/test.mp3 &> /dev/null

-e means “split on sync errors.” If someone created an mp3 by concatenating multiple mp3s (e.g. with a program such as mp3wrap), that could cause sync errors. mp3splt looks at those sync errors to split the concatenated mp3 back into multiple mp3 files.

-q means quiet mode. It suppresses the interactive question that mp3splt would otherwise ask, which matters when the script runs unattended from cron.

-d is the directory to place the split mp3s.

-o lets you specify the output file name. “@n” stands for the track number after splitting. So if test.mp3 were made out of two mp3 files, the output of the command above would be two files (in the finishedpodcasts directory) named test-001.mp3 and test-002.mp3, because the script strips the .mp3 extension with basename before building the name. It doesn’t hurt to rerun mp3splt on MP3s that have already been split, because it will just overwrite the files it created before.
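
To check what the loop would do before letting it loose on your files, you can print the command instead of running it. This is a sketch for inspection only: split_command is a made-up helper, and the printed line is not quoted for file names with spaces.

```shell
#!/bin/bash
# Sketch: print the mp3splt command the loop would run for one file,
# without running it. split_command is a hypothetical helper.
split_command() {
    local src="$1" outdir="$2" nicename
    # Strip the directory and the .mp3 suffix: /x/test.mp3 -> test
    nicename=$(basename "$src" .mp3)
    printf 'mp3splt -eqd %s -o %s-@n %s\n' "$outdir" "$nicename" "$src"
}

split_command /home/username/rawpodcasts/test.mp3 /home/username/finishedpodcasts
# prints: mp3splt -eqd /home/username/finishedpodcasts -o test-@n /home/username/rawpodcasts/test.mp3
```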

Step 3: Periodically download and process podcasts

To download podcast files periodically and process them, make a crontab entry for podracer or your script. This will make the cron daemon run your script every few hours to download new mp3s.
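
As a sketch of what such an entry could look like: the helper cron_line below is made up for this example, and the four-hour interval is just one reading of “every few hours”, so adjust it to taste.

```shell
#!/bin/bash
# Sketch: build a crontab line that runs a script at minute 0 of every
# Nth hour. cron_line is a hypothetical helper; the 4-hour interval
# used below is an assumption.
cron_line() {
    local hours="$1" script="$2"
    printf '0 */%s * * * %s\n' "$hours" "$script"
}

cron_line 4 /home/username/download-mp3s-and-process.sh
# prints: 0 */4 * * * /home/username/download-mp3s-and-process.sh

# To install it, append the line to your existing crontab (not run here):
#   ( crontab -l 2>/dev/null; cron_line 4 /home/username/download-mp3s-and-process.sh ) | crontab -
```

The script has to be executable for cron to run it this way, so run “chmod +x” on it first.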

Whenever you’re ready to put the podcasts on some type of media (SD Card, iPod, iPhone, whatever), just copy over anything from the finishedpodcasts directory (if you used mp3splt in step 2) or the rawpodcasts directory if you skipped step 2. Then delete anything left over in either directory.
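
The copy-then-clean-up step can be scripted too. A minimal sketch, assuming the device is already mounted: move_to_media is a made-up helper name, and /media/sdcard in the example is an assumed mount point that will differ on your machine.

```shell
#!/bin/bash
# Sketch: copy every MP3 from a podcast directory to a mounted device,
# then delete the originals. move_to_media is a hypothetical helper.
move_to_media() {
    local src="$1" dest="$2" f
    [ -d "$dest" ] || { echo "no such directory: $dest" >&2; return 1; }
    for f in "$src"/*.mp3
    do
        # Skip the unexpanded pattern when there is nothing to copy
        [ -e "$f" ] || continue
        # Only delete a file after its copy succeeded
        cp -- "$f" "$dest"/ && rm -- "$f"
    done
}

# Example (not run here; /media/sdcard is an assumed mount point):
#   move_to_media ~/finishedpodcasts /media/sdcard
```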

23 Responses to Download, slice and dice podcasts on Linux

I guess, I “save” in mind some details of what I read under specific key words. When I need to recall those details, I use GOOG to search them under those specific keywords. One of the few advantages of being a SEO, you may say

Matt discovered that already in May 2007 and wrote : “Harith, you have an amazing memory for details.”

Matt, have you ever thought of doing a post on why you’re not a Mac person? I would be interested in reading that review. I have thought about switching to a Mac at times, but still have not done it. No real reason other than I am used to PCs.

Awesome tutorial. I’ve left my ubuntu box growing dust until just the other week when I upgraded to 7.10 from my old Dapper version. I hope to get my audio working in screencast mode so I can contribute a few tutorials focused at the entry level user/webmaster…eg adding gFTP, skype, some basic GIMP tutorials. Thanks again for this tutorial, it was very clear. I’ll give it a whirl tomorrow. :thumbsup

“Rumor has it that Emmy has left the house last Friday to the plex and started tinkering at the data centers.”

Nope nope. For one thing, she’s a stay-at-home cat, Harith. Emmy also has the ability to shut off a computer merely by walking across a keyboard, so we have to keep her far away from Google data centers.

Added: Harith, you can see where Emmy might want to hide in a backpack to get to a data center though. I have to be careful that I don’t pick up a stowaway hiding in my backpack.

I think stowaway cat takes the cake! I could work on tech issues all day with companionship like that. Mine does the same, likes to type out gibberish as she walks on the keyboard… perhaps IM buddies I don’t know about.

Hi there Matt and other readers. Just wanted to thank you for such a clear, concise, instructional post that does EXACTLY what I needed! Up until now I have been manually downloading my podcasts using Firefox Live Feeds cut and pasted into a terminal ssh session, running aria2c, all during my ISP’s “peak” time. No more! Now first thing in the morning, I can open my ~/Downloads/Podcasts directory in my already running session of mocp and there they all are from the previous night! And all downloaded in my ISP’s off-peak hours! I know there is nothing revolutionary in these instructions, but it just worked for me! So thank you very much and keep up the terrific work!
Bill