If you’re using Ubuntu 14.04, and want to compile a version of the nginx nchan module that works with Redis, this is the guide for you. This is useful if you want to install security updates without recompiling nginx from scratch every time.

This’ll install an old version of nchan, but it will install most of the dependencies that we need.

Test the new nginx version to make sure it works. As we’ve just installed a different version, we may have introduced compatibility issues compared to the standard build (i.e. what’s in the normal repositories).
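The build itself follows the usual out-of-tree module dance. A sketch follows — version numbers and paths are illustrative, a dynamic module must be built against the exact nginx version you run, and dynamic modules need nginx 1.9.11 or newer:

```
nginx -v                                  # note your exact version, e.g. 1.10.3
wget http://nginx.org/download/nginx-1.10.3.tar.gz
tar xzf nginx-1.10.3.tar.gz
git clone https://github.com/slact/nchan.git
cd nginx-1.10.3
# use the same arguments your running binary was built with (nginx -V shows
# them), plus the module:
./configure --add-dynamic-module=../nchan
make modules                              # produces objs/ngx_nchan_module.so
# install the .so, add "load_module modules/ngx_nchan_module.so;" to nginx.conf,
nginx -t                                  # then sanity-check before reloading
```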

So you want to get into visualised radio. Great! That’s what we’re doing, too.

This is a very beefy post. When I have more time, I’ll update it to include more detail and justification.

All the national players have automatic vision mixing – their video output is generated automatically by complex systems that carry huge licence costs. How can we achieve this on the cheap, and with high quality?

The solution is easier than you think. I’ll describe how we did it, using an analogue mixer (Sonifex S2) with no auxiliaries. Sadly I didn’t take any photos before writing the article, so it is just a big wall of text.

Some cameras (we used Marshall CV500MBs) – make sure they can send audio in a recognisable format.

A (rackmount) server

A joystick controller port (USB, in this case)

Solder, some D-sub 9s, XLR connectors, 3.5mm jacks, and lots of wire

I’ll assume you’ve set up all the cameras how you want them. We’ve used three – a wide angle, presenter side view, and guest side view.

Firstly, we need to make some cables. For each microphone, a Y-splitter. We use this to split the return signal from the processor in two – one leg goes back to the desk, the other goes to the cameras. We’re not actually using the sound from the cameras – we’re just going to measure its levels – so the Y-splitter doesn’t actually impact the quality.

The next cable we need to make (one per “focused” camera) is an odd one: female XLR to 3.5mm jack (replace the 3.5mm end with whatever inputs your cameras have). Leave the cold core in this cable completely disconnected – don’t pull it to ground like you normally would when converting balanced to unbalanced. As above, we don’t care about the audio quality going into the camera, and leaving it floating won’t affect the audio return to your mixer. Connect it up neatly, and do a few sanity checks on your mics to make sure the wiring is correct.

In our cameras, we had to adjust the audio setting so that it used the line input [from the 3.5mm jack]. The audio levels should then become visible in the ATEM mixer, pre-fade. Don’t turn the channels on. This being pre-fade doesn’t matter too much. As the 3.5mm jack is stereo, and (hopefully) our camera supports stereo audio, we can wire two microphones up to each camera, avoiding the need for over-the-top mixing circuits or the like.

Note: to actually get audio directly out of the ATEM mix, we provided it with a copy of our PGM from the distribution amplifier to keep it happy – we don’t want to use the audio we’re getting from the mics, as it’s pre-fade (and hence always on), and it also sounds somewhat bad.

That’s great, but how do we monitor events like fader starts, and, most importantly, which microphones are live? The solution: a simple joystick controller.

We created some cables to connect the opto-isolated MIC CUE lights for each channel to the joystick port. This is very simple on the S2, as the cue lights behave exactly like virtual switches. The outputs on the S2 can be connected directly to buttons 1-4 on a joystick port. Make sure you get the polarity the correct way around, otherwise it’ll leave you scratching your head as to why it only sometimes works. Once you’ve made one for each channel, connect it to your joystick port, open up a test application (HTML5 Gamepad Tester is excellent for this), and knock a fader slightly to see if you have a connection. On the S2, we had to fit jumpers to the mic channels (Jumper 1) so that the cue light was latching and not momentary.

Boom! The ATEM can now see the camera audio levels, and our server can see what mics are live. Now, we need some software to tie it all together. Enter libatem.

We wrote libatem to address the alarming lack of ATEM APIs – there are a few existing ones, but they’re all in low-level languages and designed for Arduinos and other embedded devices. It’s used in a simple Ruby script that combines it with RJoystick, a Ruby library for interfacing with joysticks on Linux. RJoystick only runs on Ruby < 2.2, as it hasn’t been updated in six years, so we used 2.0.0 on the server. If you’re using CentOS or RHEL, update your kernel, as only the most recent revision contains the correct drivers.

This is the software we use to tie it all together. It mixes based somewhat on audio levels and die throws. It also responds in real time to faders opening and closing. Of course, season to taste.

And there you go! Automatic, operator-less vision mixing, using regular video kit, and more kit we had lying around gathering dust.

Automation in broadcast is tricky. Computers aren’t always the best at deciding which two songs sound right when played back-to-back, and can leave you with some horrible segues. However, this can now be improved using cool modern tech.

tl;dr / Overview

This project required a few tweaks to the music scheduling software, and a Nerve module to automatically fill in information. In AutoTrack, one can use “Characteristics” to determine which songs can follow each other without sounding rubbish. The information and tech behind this came from Spotify, via their acquisition of the Echo Nest. It ensures that songs with similar tempo, energy, and mood follow each other properly.

Loading the Information

Spotify provide a free web API. Nerve has, for a while, tried to tie every track on the system to a Spotify ID. In automation, the hit rate is around 90%. The other ten percent is made up of tracks with slightly different spellings on Spotify than on other platforms, and artists who have opted out.

It picks either the first result that perfectly matches the title and artist or, failing that, the top result. The search tags used in the query cover most use cases, and not a single track in that 90% is completely the wrong song. A few remixes are incorrectly tagged, but these are updated when we notice them.
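That heuristic is simple enough to sketch. Suppose (hypothetically) the search results have been flattened to a tab-separated file of id, title and artist, best match first; the pick then looks like:

```shell
# Given search results as TSV (id <TAB> title <TAB> artist, best result first),
# return the first exact title+artist match, else fall back to the top result.
pick_track() {
  match=$(awk -F'\t' -v t="$2" -v a="$3" '$2==t && $3==a {print $1; exit}' "$1")
  if [ -n "$match" ]; then echo "$match"; else head -n 1 "$1" | cut -f1; fi
}
```

Usage would be along the lines of `pick_track results.tsv 'Song Title' 'Artist'` — the TSV format and function name are ours for illustration, not anything Nerve actually ships.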

Nerve then makes a request to https://api.spotify.com/v1/audio-features/XXX to load this information.
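The response is JSON. For a quick look you can pull a numeric field out with sed — a real implementation should use a proper JSON parser, and the values below are illustrative, not a real track’s features:

```shell
# Extract one numeric field (e.g. energy, tempo) from an audio-features response.
get_field() { sed -n "s/.*\"$1\":\([0-9.]*\).*/\1/p"; }

# Example against a canned response:
printf '{"energy":0.842,"tempo":118.04,"danceability":0.57}' | get_field tempo
# → 118.04
```

In practice you’d pipe the output of an authenticated request (e.g. `curl -H "Authorization: Bearer $TOKEN" https://api.spotify.com/v1/audio-features/XXX`) into `get_field`.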

This information is then written to playout, in this case through the AutoTrack database.

Bonus: Mass Loading Information

When Nerve was created, it did not use Spotify. This was, in reality, an unfortunate decision, as MusixMatch quite frankly sucks. Hence, we had to bootstrap the existing library to load in this information.

Spotify rate limit requests, but that didn’t stop us writing a simple script that, honestly, just iterated through every track in the library, tying it to a Spotify ID where possible.

The next bit (loading audio information) used the https://api.spotify.com/v1/audio-features?ids={} endpoint, which returns information for up to 100 tracks at a time – significantly faster than before. Interestingly, not all tracks on Spotify have an audio analysis, so after about 24 hours (the 404 responses seemed to enqueue the tracks for analysis) we ran the script again. This time, near enough every track completed.
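The batching itself can be sketched in shell — token handling is omitted, and the IDs below are made-up placeholders, not real Spotify tracks:

```shell
# Three example Spotify track IDs, one per line (illustrative values).
printf 'id_aaa\nid_bbb\nid_ccc\n' > ids.txt

# Split into batches of 100 — the maximum the endpoint accepts per request —
# then comma-join each batch into one request URL.
split -l 100 ids.txt batch_
for f in batch_*; do
  ids=$(paste -s -d, "$f")
  echo "https://api.spotify.com/v1/audio-features?ids=$ids"
done
# → https://api.spotify.com/v1/audio-features?ids=id_aaa,id_bbb,id_ccc
```

Each printed URL would then be fetched with curl and an OAuth bearer token.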

All code is in the Nerve repository on GitHub, so you can use it free of charge for your own station. Migrating your library fully to Nerve is a bit of a pain. We’re investigating ways to make it easier, but sadly it’s not a priority at the moment. Feel free to hack on the code and make pull requests – we’ll accept ’em.

AutoTrack Scheduling

Four characteristics were defined: Energy, Tempo, Danceability and Mood. Values 1 through 6 were defined as Very Low to Very High.

Within AutoTrack, the global rules were edited. These became:

These stayed roughly the same for each characteristic. Energy changed a little from the above.
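The precise rule values are easiest to show in AutoTrack itself, but the underlying idea can be sketched as a simple distance check — the one-step threshold here is made up for illustration, not our actual setting:

```shell
# Each characteristic is scored 1-6 (Very Low .. Very High). A follow is
# allowed when the step between consecutive songs is small.
can_follow() {
  d=$(( $1 - $2 ))
  [ "${d#-}" -le 1 ]   # absolute difference of at most one step (illustrative)
}

can_follow 4 3 && echo "ok to follow"
# → ok to follow
```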

Next, the Clock Rules were updated. The Characteristic Follow rule became a guide, so it can be broken if necessary. It’s rare that this is needed, but initially about 10% of songs weren’t being scheduled (slots left blank), and we don’t want that.

That’s it. Through this, we managed to significantly improve the quality of automation. 😁

This tutorial will run you through how to build a crazy scalable streaming platform for your radio station, using HLS. This is a long one, so grab a cup of coffee or a Coke.

First, you’ll need to sign up to Google’s Project Shield. This is totally free for independent (e.g. community) radio stations. You can do that here. If you want to use another provider, that’s fine also.

You also require a Linux (or other UNIX-y) server. Although nginx builds fine on Windows, we haven’t tested it.

Although this works on most platforms, there are some exceptions – notably Internet Explorer on Windows 7 and below. For that, you can probably use Flash as a polyfill. We’re not using HE-AAC here, as it’s not supported in all major browsers. If you want to use it anyway, add “-profile aac_he_v2” (you’ll break support for Flash/IE, Firefox, and probably Opera – far too many users).

Now, we can start nginx with service nginx start and chkconfig nginx on.

Our HLS stream becomes available at http://host-name/hls/stream/index.m3u8, and DASH at http://host-name/dash/stream/index.mpd.

Adding the Plumbing

We have our origin server configured now. Next, we need to pipe some audio into it.

First, install ffmpeg. We need it built with “libfdk-aac” so we can get a good quality stream. libfdk-aac also supports the modern HE-AACv2 codec, which gives the highest quality at the lowest bitrates. You can follow a guide here for CentOS on how to do so. For Ubuntu/Debian, make sure you have installed autoconf, automake, cmake, and pkg-config (we did this above). You can then follow the CentOS guide, which should be mostly the same.

In /usr/local/run.sh, add the following (replacing the stream.cor bit with your regular stream URL):
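Our exact commands aren’t reproduced here, but a minimal sketch, assuming an Icecast source at stream.cor and an ingest endpoint on localhost (both placeholders – the publish URL depends on how your nginx ingest is configured), looks like this:

```
#!/bin/sh
# Sketch only: adjust the source URL, bitrate, and publish endpoint to taste.
while true; do
  ffmpeg -hide_banner -loglevel warning \
    -i http://stream.cor:8000/live \
    -c:a libfdk_aac -b:a 128k \
    -f mpegts http://127.0.0.1:8080/publish/stream
  sleep 0.1   # brief pause so a misconfiguration can't spin-loop the box
done
```

Add “-profile:a aac_he_v2” for HE-AAC v2, with the browser-support caveats above.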

These commands tell ffmpeg to grab a copy of your Icecast stream, encode it with AAC, and send it to the server so that it can be repackaged for browsers. If ffmpeg crashes, it is automatically restarted after 1/10th of a second. The short delay is intentional: without it, an ill-configured system would see the restart loop choke all available system resources.

If you want to get even better quality, you can use an RTP stream or something else in the ffmpeg command. The reason I’m not using this in the example above is that it requires an overhaul of your streaming architecture to use PCM on ingest. If you have a Barix-based STL, you can configure it to send RTP to your streaming server, saving the need for an extra audio interface.

Append to /etc/rc.local the following line to instruct the system to automatically restart our script on boot:

/usr/local/run.sh&

(You could do this using systemd or SysV init for production, but this works well enough.)

Excellent. When you run /usr/local/run.sh& or reboot, you should be able to access your streams.

Setting up the CDN

CDNs are designed to copy your content so that it can scale. For instance, instead of one server (our origin) serving 100,000 clients, we can push the content out to lots of “edge” servers. CDN providers make this easy, and are more cost-efficient than doing it yourself.

We’re using Project Shield because it’s totally free for indie media and harnesses the power of Google’s infrastructure. Other freebies exist, like CloudFlare, but CloudFlare’s terms of service don’t let you use it for multimedia.

The nginx config we used above should allow your CDN to work properly from the outset. It will cache the manifest/index files (which tell the client which audio segments to fetch) for 10 seconds – after 10 seconds they will have changed, so we want the edges to update. The media files are cached for a day, so once the CDN grabs a segment, it never needs to again.
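For reference, the cache-control side of that config boils down to something like this — a sketch, so match the locations to your own layout:

```
# manifests: clients and edges should re-fetch every 10 seconds
location ~ \.(m3u8|mpd)$ { expires 10s; }

# media segments: immutable once written, so cache for a day
location ~ \.(ts|m4s)$   { expires 1d; }
```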

Usage & Testing

I’ll post some sample player code on GitHub, but you probably want to use the HLS stream with hls.js. You can test it on this page. Why not DASH? nginx-ts-module, at the time of writing, has tiny gaps between audio segments. These are pretty audible to the average listener, so until that is fixed we might as well continue using HLS.

Once again, the streaming URLs will be http://yourcdnhost.com/hls/stream/index.m3u8 (HLS), and http://yourcdnhost.com/dash/stream/index.mpd (DASH).

Do consider setting up SSL. It requires a couple of tweaks to your nginx configuration, and Let’s Encrypt makes it much easier. Google are working on support for Let’s Encrypt in Shield – hopefully by the time you read this it will be much easier to use, instead of having to manually replace SSL/TLS certificates every 60 days.

Going Further

The nginx-rtmp-module, which is similarly easy to install, provides adaptive-bitrate HLS (but not DASH). However, there is a bug in Mobile Safari that adds silent gaps between the segments. The TS module will soon support adaptive HLS; this guide will be updated when that happens. You can also hack it yourself.

It’s no secret that Insanity Radio is working on a visualised radio platform. This post will be the first in a series documenting how we did it.

For context, we’re a very small team of (mostly student) broadcast engineers. We’ve never engineered a telly station before, so this is an entirely new field and we’re practically re-inventing it as we go along.

Oddly, we’ll start at what most will consider the last step: distribution.

Falcon is built upon a Blackmagic ATEM Television Studio (TVS). The 1080p (HD) version was released a few months after we put in the original purchase orders, which kinda sucked as it limited us to 1080i or 720p. At least we have the built in encoder to work with.

Input to such a device usually would come from a station’s distribution amplifier. In this case, our TVS is our distribution amplifier and runs our main playout facility. Cheap and dirty, but reliable.

When working with online distribution, you can summarise the components in a pretty short list:

Capturing the programme (“PGM”) video and audio.

Transcoding this video to a suitable format

Serving this video to users over the Internet

[Logging this captured video]

Capturing PGM

The ATEM Television Studio has a built in USB H.264 encoder. Great. Encoding H.264, even with modern hardware, is computationally expensive. If we could use this, then we wouldn’t have to worry about re-encoding later in the chain (except to downscale – but that’s not important at this point).

Problem: support for this H.264 encoder is shoddy. Live streaming with it is a bit complicated, and the drivers don’t run on Linux (why? Even Blackmagic support don’t know). As Insanity’s backbone is Linux, this left a bad taste in the engineers’ mouths. Pushing on, the first thing we had to do was install Windows Server on a spare rack server, and then the Blackmagic drivers.

Next problem: “Media Express” can’t stream. We looked at several alternatives that could. We purchased an MX Light license, but discovered that it would crash after being left to free-run for over 24 hours. Drat. We needed better reliability.

The solution? A piece of open source software called MXPTiny.

MXPTiny doesn’t have many features, so you have to script it yourself. We did this, but were greeted later in the chain with horrible encoding failures – likely due to a bug in ffmpeg streaming RTMP. This wouldn’t do.

After several days of scratching our heads, we came across another piece of software: Nimble Streamer. Although “freeware”, Nimble works off a cloud configuration platform called WMSPanel, which is very expensive ($30/month – massive for a community station). Fortunately, a Stack Overflow answer noted that, after initial configuration, if you remove the node from WMSPanel it will continue to operate just fine. Great.

MXPTiny

Make sure to lower the video bitrate (<5 Mbps is ideal, otherwise most users will stall). A high bitrate also causes inconsistent chunk sizes in DASH, which can confuse quite a few players.

Nimble Streamer (via WMSPanel)

Set up a UDP server on localhost:30001

Enable RTMP server for the node

Transcoding

We didn’t need to transcode our video. For us, 1080p25, as captured from the H.264 encoder, was ideal – the less overhead, the better. The solution was easy: serve the .TS files from MXPTiny up to Nimble Streamer, which, instead of transcoding, would just transmux them into RTMP format.

If you’re looking for a better answer, sorry: we don’t have one.

Serving Video

If you haven’t already, read up on HLS and MPEG-DASH. Insanity’s platform exclusively uses them – the RTMP server isn’t public.

These files were generated using nginx-rtmp-module (the sergey-dryabzhinsky fork), which contained a complete enough DASH implementation to be playable with DASH.JS. Why this and not Nimble Streamer? The sooner we could get back into familiar open-source territory, the better.

The nginx server was configured to pull video from our Nimble Streamer instance, and to serve Falcon’s video in both HLS and DASH formats.

Making It Scale

Video serving is expensive – you can easily saturate a gigabit link with ten-odd clients. This is why CDNs were created, and CDNs are great. Most big media companies use Akamai; most small companies use Cloudflare.

Project Shield, available exclusively to small, independent media groups, was ideal for us. The best bit is that Shield’s terms don’t prevent you from serving multimedia.

As DASH and HLS both create long (ish) lived segments that are statically served to clients, they are ideal to scale out. The manifest files only update when a new segment is published – every ten seconds.

The edge nodes were configured to cache the manifest files for 10 seconds, and to cache the DASH segments indefinitely. As the presentation delay in DASH is set to 60 seconds, this gives the CDN plenty of time to expire and update manifests if and when necessary.

Initial tests showed that if several people were playing the stream through Shield, most client requests hit its cache. With the power of Google’s infrastructure behind the project, we can hopefully sleep well at night with this platform being able to handle the worst spikes. Wonderful.

Logging

The DASH server is currently configured to store/serve 1 hour of video. We’ll likely up this in the future, but in the meantime it allows us to reconstruct video. As the DASH segments are never transcoded, this is trivial.

A piece of software (Grabby, soon to be on the Insanity Radio GitHub) is able to pull DASH segments and reconstruct them without much overhead. This works by concatenating the initialisation segment with the segments we want. This is done twice: once for video, once for audio. ffmpeg then joins our two temporary files together to create our final rip.
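A sketch of that concatenation step — file names here are illustrative, as real segment names come from the DASH manifest:

```shell
# Rebuild one elementary stream: the DASH init segment followed by the media
# segments, in order.
reconstruct() {
  track=$1; shift
  cat "${track}-init.m4s" "$@" > "${track}.mp4"
}

# usage:
#   reconstruct video video-seg-1.m4s video-seg-2.m4s ...
#   reconstruct audio audio-seg-1.m4s audio-seg-2.m4s ...
# then mux the two: ffmpeg -i video.mp4 -i audio.mp4 -c copy rip.mp4
```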

When used as part of this design chain, there is absolutely no loss in quality from multiple encoding. We’re still using the original H.264 data from the Blackmagic, so video is never re-encoded after capture.

This morning I received an email informing me that the Insanity Radio mail server had been added to two email blacklists.

*Groan* go most system administrators. This usually means one of our users has had their email account compromised, probably by using a weak password. Most of the time, this is probably right. However, the reasons behind this were more alarming than you’d think, and actually point to a vulnerability in most Postfix/SpamAssassin set-ups.

A bit of background: Postfix is a mail server. SpamAssassin is a server which is fed emails and responds with a score on how spammy the email is. Combine the two and you have a pretty good way to filter out spam. However, this is where it gets a bit more tricky.

Postfix has a concept of filters – a way of running email through a bit of code. There are two main types, before-queue and after-queue. before-queue will apply the filter before the email is accepted by the server, and after-queue will apply after the email is accepted.

The problem is that most SpamAssassin set-ups work after-queue. This means that your incoming mail server will accept spam emails before it scans them. If it rejects an email later on, it will send a response to the sender saying it bounced, and why.

You’ve probably seen emails from “MAILER DAEMON” saying stuff like “The email you tried to send your thing to doesn’t exist”. It’s pretty much the same.

Now, there’s actually a surprisingly easy way to exploit this. This could be used by an attacker to quite comfortably destroy the reputation of a mail server. Scary, huh? The attack can be done in a few simple steps.

Connect to your target mail server.

Send the most horrible spammy email you can, one that checks all the boxes on SpamAssassin

Set the envelope address to a honeypot address

Rinse & Repeat
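Spelled out as a raw SMTP session, the steps above look something like this — hostnames and addresses are illustrative, and the forged honeypot address goes in the envelope sender:

```
$ telnet mail.victim.example 25
220 mail.victim.example ESMTP Postfix
HELO anywhere.example
MAIL FROM:<spamtrap@honeypot.example>
RCPT TO:<someone@victim.example>
DATA
Subject: [the spammiest message you can construct]
...
.
QUIT
```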

The mail server will send a bounce email to the honeypot address, and will get added to an email blacklist as a result.

The best solution is to install amavisd, and let that handle interconnecting to SpamAssassin.

Radio is pretty linear. By linear, I mean that your content is only really designed to be played out once; then you move on and it is all but forgotten.

So how do you make this compatible with this crazy new internet-age, where this isn’t the case at all?

Technical Background: The BBC, along with the EBU, are working on a system called ORPHEUS. Its end goal is to build a studio that almost automatically does this for you. However, this is likely going to take decades to trickle down to smaller broadcasters, so there’s no reason we can’t work on our own solution in the meantime. This post hopefully outlines some of the requirements such a system actually needs. Some of this is somewhat plagiarised from a presentation I saw at a conference, but we’ve elaborated a bit more on the ideas to make them a bit more realistic.

Linear radio looks a bit like this:

Create a show plan and work out what you want to discuss

Perform your show on air, probably not sticking completely to the plan.

Evaluate how you think it went, get feedback from others, etc.

Repeat.

Creating content for, say, YouTube, looks rather similar:

Create a plan for your short, and work out what you want to discuss

Record it, probably multiple times

Edit the raw video down, add titles and graphics, etcetera. Make small improvements, perhaps by refilming a segment.

Publish the video

Look at the analytics and possibly comments.

Repeat

(For a podcast, you can probably follow the same steps as above, but without the video)

So, it’s clear that the thought processes are pretty similar. But how can you take a radio show and put it on social media?

Well, first, you probably want video to go with the audio. The ideal solution here is an easy thought: record some video at the same time as audio in your logs.

The next thing is a bore: rights management. If you use beds, the license for them may not cover social media. If you catch the intro or tail of a song in a link, you can’t use it. So, we need a way to remove content that could be infringing – or we could somehow license it for YouTube. The latter is operationally hard, so we’ll go with the first and engineer it into our system.

Next, you need to actually be able to locate the segment you want to share. But how do you find it in the recordings, amidst all the songs? You then need to work out when the clip should start and end. Naturally, you could select a whole link here if it’s not crazy long. Either way, we need a way to work out where in the recording our content sits.

So, three things we need to consider to make internet-ready content:

Video to go with your audio output – how do we make this video good?

Editing the original mix to subtract content – where do we even start doing this?!

Locating where our target links are in this video – how do we find the position of our content?

The next series of posts will look into how to solve these three problems. We will look at problem 2 first, as its solution provides many benefits.