Category Archives: Recording

I apologise in advance for this rather extreme short essay. But it occurred to me that the processes music goes through in order to be reproduced via LP records would be in the realm of sado-masochistic novel writing if they had not already been invented.

This is meant to be comedy. But it’s not far from the truth!

This is Hi-Fi

So you’ve made this lush production, full-range sound, lots of dynamic range, in stereo. Perfectly edited, vivid, convincing audio with no noise or interruptions, and little distortion.

But you’re not going to couple that transducer to another transducer with a rigid material, but instead attach it to a pointy thing and make it scratch its shape on a spinning piece of brittle plastic? And the plastic doesn’t stay at a constant speed, but starts around 50cm/sec but later slows down to below the speed of professional tape, maybe about 20cm/sec? And you have to put a vacuum cleaner behind the pointy sharp thing that carves so the bits it scratches away are sucked up? And the plastic doesn’t stay where it was put? And there’s dust in the room? And the plastic is affected by temperature?

Now you’re going to attach the diamond to another transducer the price of a house, and yet so delicate that the capacitance of the connecting cable affects the signal badly? And you’re going to roll off the top, boost the bottom so it rumbles, and amplify THAT so loudly that you can hear it but not so loudly that it, too, rattles the little diamond?

The other day, a friend asked a Facebook group how developments in domestic audio might shape up in the future.

Nothing, not even Dolby Atmos, beats the revelation I experienced, in the early 1980s, of

being able to leave behind the distortion and interruptions of vinyl by using CD or tape; and

being able to build better loudspeakers.

The story goes back fifty years, at the time of writing. In 1967, my father bolted a rescued Garrard AT6, the autochanger based on the SP25, into a polished brown radiogram cabinet and wired the crystal cartridge to the input of the built-in valve radio/amplifier. The cabinet had a richly toned single speaker underneath, and this system of glowing radio dial and filaments played his classical LPs, the 7-inch singles my mother inherited from her DJ brother, and my children’s records, together with some shellac 78s from second-hand shops and relatives, played by turning the crystal cartridge over in its headshell.

Amid the clunks and whirs of the mechanism, I was hooked on music of all kinds, and speech recordings too from companies like Saga and Delysé. On command, even a child could make dramatic magic happen in that warm-sounding loudspeaker. The amp with its lethal HT anode kicks was replaced in 1969 by a black box into which my father had installed a Sinclair IC10-based 10W amp, which lasted through the replacement of the cartridge with a ceramic item in 1971, then he built (with me holding the circuit-boards steady sometimes) the Practical Electronics Gemini stereo amp set in 1972. In the years of mass-market stereo LPs on “Music for Pleasure” among others, we played this amp into mis-matched speakers rescued from Army surplus shops. “But it was a stereo hi-fi to US” (to misquote Eric Idle), and the leap forward into “sound sculpted in space” was never regretted.

A Goldring magnetic cartridge followed, and a tuner, and other improvements here and there; but, as a musician-in-training, even when young, it maddened me that the music didn’t sound like it did on the radio or in the schoolroom when we played. There were ticks and pops that you didn’t get from instruments nor did you hear them on the Radio 3 concerts. Sometimes the pitch wavered slowly. Nearly always the sound had a ‘fuzz’ with it toward the end of a side, particularly on French horns, muted brass and sopranos. None of this happened when I taped a cassette off the radio, and it was a crying shame that all recorded music for the masses had this gauze of dirt, this veil in front of it. I learned where every scratch was in the quiet passages of the symphonies and chamber music on the shelf, and was almost surprised when those noises didn’t occur where expected on radio performances. My parents didn’t mind the interruptions, but I knew this rubbish was not music, even though the frequency response ran smoothly from bottom to top and the amplifier’s distortion was almost below measurement. And it was sad that one poorly set-up pickup or arm could damage a precious recording for ever.

Later, in my early teens, the record-making process was unveiled to me, and it seemed strange that good tape copies of the masters were not sold to music-lovers. “Musicassettes” too often sounded muddy, though we rescued a reel-to-reel deck to play home-recorded tapes well; but no decent tapes were available unless recorded from Radio 3 concerts or the Big Band shows on Radio 2 which were superbly presented. As you can tell, I had no idea of the economics of producing tape versus vinyl LPs.

But soon after that I was engineering or producing my own student recordings of good concerts or bands in our departmental studio at Surrey University; and almost simultaneously with my leaving home, the CD came along. Teenagers (as I was then) can tell where 20kHz brickwall filters harm music, but at last, at long last, the music was almost completely pure. It did not waver or wobble. It was not interrupted by ticks and pops, nor by fuzz. Its quality was identical from beginning to end. There was no mourning that the beautiful passages of “En Saga” were harmed by being close to the end of the LP side. There was silence between the tracks or in the rests. Just like in the concert hall or recording session. And, later, it became clear that the recorded sound did not need to be sanitized (albeit, in the hands of a mastering engineer, very sympathetically and musically indeed) to survive the transition from tape to vinyl groove. After this, all else was candlelight. We didn’t have gas in the village where I grew up. (Note to the youngest readers: find Karajan’s statement on digital audio.)

To answer the original question: apart from the gradual increase in amplifier efficiency and very occasional leaps forward in speaker technology (my main speakers are twenty years old, though a better sub was added recently), I have heard nothing since the advent of interruption-free recordings, whether digital or analogue, that improves my enjoyment of music or drama, except for one thing. The ability to compress the music to suit my listening environment is my primary nod to convenience. Where necessary, my in-car or ‘party’ music on memory sticks has broadcast-style processing added so I never touch the volume control anywhere on the road or while people are chatting over the guacamole (home made) and Cava.
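For the curious, this sort of broadcast-style levelling can be approximated with FFmpeg's loudnorm filter. A minimal sketch, where the filenames and the rather hot -16 LUFS target are my own illustrative assumptions, not a description of my actual processing chain:

```shell
# One-pass loudness normalisation towards a "hot", consistent level
# suitable for car or party listening. Filenames are placeholders.
ffmpeg -i album_track.flac \
  -af "loudnorm=I=-16:TP=-1.5:LRA=7" \
  party_track.mp3
```

A dedicated broadcast processor does rather more (multiband compression, limiting), but for memory-stick use a single loudness pass goes a long way.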

As I become older, one thing intrigues me. If I become thoroughly deaf, and need a cochlear implant, could I tune my computer, as a musical instrument, to the frequency channels and be able to hear (or compose) music arranged especially for those channels? Could that be a thing? Can there be music composed or arranged specifically to be heard at its best through the limited-pitch channels of a cochlear implant, so that permanently and profoundly deaf people might choose to try experiencing music in this way too?

As part of my efforts with the ICOMOS-UK digital committee, I’ve started to collect metadata specifications relevant to heritage and culture.

My aim is to produce a superset with copious documentation and guides to subsets, so that all data is interchangeable. After all, if the Digital Production Partnership can join European and North American delivery standards for television in this way, isn’t anything possible?

The venerable FFmpeg audio/video tool can now package its output in Avid Op-Atom format directly, without always needing to have its output wrapped by the raw2bmx tool. This method is very fast and, crucially, can be used on any computer; not just the machine with your Avid licence. However, certain features of the Avid Op-Atom MXF wrapper are either not yet tested, or not available. For these features, I still advise using the bmxlib suite.

The advantage of this method is that video and audio data is very quickly imported into your Avid, at the full rate that the FFmpeg encoder can manage. Furthermore, your Avid will be using its native formats (e.g. DNxHD), rather than converting, say, XAVC on-the-fly with the AMA functions. The disadvantage is that the metadata is quite messy, and lacks certain elements altogether, until I’ve figured out the full MXF Op-Atom metadata tags. In particular, audio and video tracks are not linked into a single multi-track clip in an Avid bin: you must synchronise them yourself.

So, in this article, I will show you how to take video and/or audio from any format that FFmpeg will read, and immediately output MXF files in Avid-friendly codecs that Avid Media Composer’s “Media Tool” will pick up for you.

This post shows my earliest tests. I have not yet fully explored how to link files, or add other attributes, that raw2bmx adds or, indeed, Avid’s own import and capture tools add. However, the advantage of not using raw2bmx is a much faster import process.

Here, we start with a file of any format with a mono soundtrack, and output Avid DNxHD files at a resolution of 1280×720 and a bit-rate of 90Mbit/s, straight into the Avid MediaFiles directory, ready for editing. This workflow assumes the frame-rate is for UK television, at 25fps, progressive encoding. I have not attempted to detect the number of audio channels in use, and therefore they are not encoded separately. This is, however, easy to achieve using FFmpeg’s “channelsplit” audio filter, which is detailed on the FFmpeg website.

We start with producing an output video MXF. Here, we ensure that the video is resized to the particular flavour of HD we’re editing in. In this case, we’re using 720p, and resizing using the algorithm I consider to be the best.

-an

The video output can contain only one track: video, in this case. This command instructs that the output must contain no audio. (Literally: “audio, none”)

-metadata project="MY PROJECT"

Here, we embed into the MXF metadata a value for “project”. This corresponds to your Avid project name. It is true that, upon analysing Avid’s own MXF files, the project name is contained within the metadata tag “project_name”, but my shorter tag appears also to work.

-metadata material_package_name="MY CLIP"

This is the name of your clip, as it will appear in your Avid bins and in the Media Tool.

-b:v 90M

Here, set the bit-rate. Using FFmpeg’s built-in DNxHD encoder, you can choose from several bit-rates, which FFmpeg’s error messages will be happy to tell you about if you set this wrongly. We don’t explicitly set the encoder itself, because FFmpeg does that for you: its default for this muxer is DNxHD.

-f mxf_opatom "M:\Avid MediaFiles\MXF\1\MY_CLIP_v1.mxf"

Finally, for the video file, here is the output file (preceded by the "-f mxf_opatom" option, explained below for the audio file, which selects the Op-Atom MXF wrapper). In this case, I’m putting it straight into Avid’s media file storage area on my ‘M’ drive, for media. There is a "_v1" suffix so that the file is marked as a video file, for my own ease of comprehension.

-vn

After the video output file is named, we start listing the options for encoding the audio file. This first option instructs FFmpeg to produce an audio-only MXF file, without a video track. (Literally: “video, none”)

-metadata project="MY PROJECT"

As with the video file, we embed into the metadata the name of the Avid project that needs this file.

-metadata material_package_name="MY CLIP"

As above, this is the clip name as it will appear in your Avid bins or Media Tool.

-ac 1

This is a quick-and-dirty kludge to mix down all the incoming audio to a single track. In real life, you’ll want to use FFmpeg’s “channelsplit” filter to handle each incoming audio channel separately. At the moment, this command line produces only a single audio file, mixing together all incoming tracks.
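For what it's worth, here is a sketch of keeping the channels separate instead. It assumes a stereo source; the input filename, clip name and paths are placeholders. Note that channelsplit (rather than asplit, which merely duplicates a stream) is the filter that separates a stereo stream into two mono streams:

```shell
# Split a stereo input into two mono Op-Atom audio files, _a1 and _a2.
ffmpeg -i "input.mov" \
  -filter_complex "[0:a]channelsplit=channel_layout=stereo[l][r]" \
  -map "[l]" -ar 48000 -metadata material_package_name="MY CLIP" \
    -f mxf_opatom "M:\Avid MediaFiles\MXF\1\MY_CLIP_a1.mxf" \
  -map "[r]" -ar 48000 -metadata material_package_name="MY CLIP" \
    -f mxf_opatom "M:\Avid MediaFiles\MXF\1\MY_CLIP_a2.mxf"
```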

-ar 48000

With this command, we convert the sample-rate of the incoming audio to the standard sampling rate for television: 48,000 samples per second. Avid can, of course, convert sample rates on-the-fly while editing, but it is better to perform this work at the import stage to give Avid less to do when editing. Again, the codec itself (pcm_s16le, meaning linear PCM, 16-bit, little-endian) is not explicitly specified because FFmpeg sets this as the default for Avid import.

-f mxf_opatom

As before, this explicitly instructs FFmpeg to wrap your data in an Op-Atom MXF wrapper, ready for your Avid Media Composer to use.

"M:\Avid MediaFiles\MXF\1\MY_CLIP_a1.mxf"

Here is the filename for the audio output. In this example, I have placed it in the same directory as the video output created earlier in this command line. It is suffixed with "_a1" for my own ease of comprehension.
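For clarity, here is the whole command assembled from the options discussed above, as a single invocation with two outputs. The input filename is a placeholder, and the lanczos scaler is my own illustrative choice, since the article doesn't name a particular algorithm:

```shell
# Produce the video and audio Op-Atom MXF files in one pass,
# straight into the Avid MediaFiles directory.
ffmpeg -i "input.mov" \
  -vf "scale=1280:720:flags=lanczos" -r 25 \
  -an -b:v 90M \
  -metadata project="MY PROJECT" \
  -metadata material_package_name="MY CLIP" \
  -f mxf_opatom "M:\Avid MediaFiles\MXF\1\MY_CLIP_v1.mxf" \
  -vn -ac 1 -ar 48000 \
  -metadata project="MY PROJECT" \
  -metadata material_package_name="MY CLIP" \
  -f mxf_opatom "M:\Avid MediaFiles\MXF\1\MY_CLIP_a1.mxf"
```

Everything before the first output filename applies to the video file; everything between the two filenames applies to the audio file.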

There are additional options associated with this FFmpeg muxer, which are listed here. I have not experimented with these, but can see that the -mxf_audio_edit_rate might need to be adjusted for non-European television or film work e.g. 30000/1001 for American television work, or 24 or 24000/1001 for 24fps film work. Also, you would probably want to set the -signal_standard bt601 or -signal_standard 1 for standard definition television work.
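As a sketch of that adjustment, for an audio-only output (untested by me, as noted above; the filenames are placeholders):

```shell
# Hypothetical variant for American 29.97fps television work:
# the audio edit rate is set to match the video frame rate.
ffmpeg -i "input.mov" -vn -ac 1 -ar 48000 \
  -mxf_audio_edit_rate 30000/1001 \
  -f mxf_opatom "MY_CLIP_a1.mxf"
```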

Roland, who make the venerable MIDI to USB interface the UM-1, and its more recent version the UM-1X, claim that they will not support Windows 10. And, indeed, when you install Windows 10 onto a machine with the UM-1X plugged in, it remains unrecognised by the new operating system.

You can fix this with a text editor. Remember also you must set your Windows installation NOT to enforce driver signing. Please see the comments (below) for how to do this.

Download from Roland the driver archive for Windows 8.1. Its filename is um1_w81d_v101.zip.

Unpack the archive.

If you have a 64-bit machine, browse within the archive to this folder: um1_w81d_v101\Files\64bit\Files

tl;dr — I had very disturbing diplacusis (double hearing) during a really bad bout of influenza, but recovered after a month.

The Diplacusis Diary

Being a Tonmeister, and loving music all my life, I didn’t understand what drove some people, even those in my family, to dislike violins. Where I enjoyed beautiful, warm, expressive singing tone, they heard “tuneless cats wailing” or worse.

Whereas the main complainant among my relatives didn’t seem to mind piano music too much, orchestras and violins in particular were, to her, the equivalent of a knife edge being dragged squealing across a china plate.

How could there be such a difference?

Until last month, I had no idea. But now I know.

For three weeks, my right ear has presented me with hideously detuned ghost orchestras, squawking organ pipes, shrieking violins and cracked bells. Music encoded using codecs such as MP3 or AAC sounded like it was being played through loudspeakers whose cones had been torn apart, and any perception of stereo was lost: everything was shifted about 40° to the left, while demonic pitchless musicians wailed over my right shoulder. In short, all pleasure in music was replaced by agony, and my work as a performing musician, occasional record producer and film editor appeared finished.

This is an essay on the ailment diplacusis, and my journey to safety through it. To be more accurate, my particular case was diplacusis dysharmonica, where pitch is perceived normally in one ear, but wrongly in the other. This article is no substitute for a professional diagnosis and a course of therapy from a medical specialist, but it is published to show how a musician and amateur physicist (me) worked through the nightmare, and was healed by the brain and body’s own resources.

Yes, I’m better now and, indeed, most people recover without intervention. But, if you have begun a similar journey, please get checked by the best professional you can find because many different causes lead to the same ailment. Most triggers that the body can’t fix on its own can be cured by pharmaceutical or surgical intervention. Please don’t hesitate.

Where did it start?

I have normal hearing for a 51-year-old, gracefully growing older. There’s a little high-frequency tinnitus but nothing to worry about. Then, in May 2015 began my worst bout of influenza ever. This brought about the kind of coughing and congestion that kills older people.

While blowing my nose rather fiercely, I felt and heard something nasty, probably mucus, shoot up my right Eustachian tube and into my middle ear. Or perhaps too much pressure was used and something inside my middle ear became damaged?

Immediately, I felt a sense of pressure as if my ear needed to ‘pop’ and, as usual, there was a dullness of hearing. This is perfectly normal when the pressures either side of the tympanum are unequal. But also, there was a new acoustic effect, as if my eardrum were in direct physical contact with my throat. Breathing and swallowing became much louder than usual in this ear alone. And popping my ears to relieve pressure changed none of this.

So, in a very short space of time, I had an ear that felt completely full of something, and that would not respond to the normal procedures. The next day, I was checked by a doctor who wanted me to visit the audiology department at the hospital if things weren’t getting better. The tympanum is translucent, and an expert can diagnose much by shining bright light onto it.

What did I notice?

Day three dawned. Outside my house, off to the right from where I sit for my everyday work, there is a church. The bell, which was being tolled to call the congregation for the morning service, had developed a problem. It sounded as if it had been cracked, which was a pity because its sound was normally very pleasant, a reminder that this is a historic and pretty town. Later that day, there was space in the diary to visit the vicar to tell him about the sad accident that had happened in his bell tower in case he’d not noticed.

Then it was time to edit and master some music for a client. Despite the feeling of pressure in the right ear, sensitivity had returned so I fearlessly began work.

The first piece of music wasn’t from the usual excellent producer whose work normally went into this particular project, and the difference certainly showed! The whole choir was way off to the left in the stereo soundstage, and the MP4 audio file sounded terribly distorted, as if encoded at a very low bitrate. The right-hand channel, particularly, had incredible harmonic distortion and countless intermodulation products. I very nearly fired off a cheery email to my friend who usually provides this material, saying “it’s easy to tell this isn’t from you!”

Then I glanced at the meters and the waveform. The audio was in dual-channel mono. In other words, both audio streams were identical and panned dead centre. What on EARTH was I hearing? Were my speakers or amplifier blown?

I plugged my headphones into a separately amplified output. The sound was just as awful. But then the real horror began: turning the cans the other way around, the balance and wild distortion inside my head were identical, as if I’d not reversed the headphones at all.

So I checked just the left channel: and it was perfect. But with the right channel alone, not only was the sound like someone singing through a comb and paper, it was nearly a semitone sharp! The vocal timbre also sounded sped up, like a tape being played through a pitch shifter.

A first response

This was deeply unpleasant. “I’m broken!” was the first thought. After a lifetime of playing and loving music, and wondering why my mother didn’t like musical sounds at all, suddenly all my own pleasure in music was lost. The glory of stereo, “sound sculpted in space”, had gone. I could no longer tell if an instrument or singer was in tune. And judgement on matters of tonal balance was impossible.

Every day in the press, we read about people whose lives have been utterly ruined by accidents. Losing part of one ear is hardly equivalent to being crippled and confined to a wheelchair for ever. And if a person suddenly disabled can find a way through, it wouldn’t be too much trouble for me with one-and-a-half ears and all my limbs still working.

A bit sad for a musician and producer, though — the end of my lifetime’s ambition.

That afternoon, I played piano for a rehearsal. The whole echo of the church appeared routed through a pitch-shifter and screamed mockingly at me like a choir in the worst kind of horror movie.

Analysis

So, that evening, there was time to analyse what was happening.

Speech? All sibilants on the left, and sounding sped-up in the right ear alone.

Sine waves? Fine up to about 2kHz, then bad intermodulation distortion when fed to both ears: and pitch shift above 2kHz in the right ear alone.

Playing the piano? Everything an octave above Middle C and higher was surrounded by a vile cluster of discordant tones.

What about fun with heavily-panned Beatles’ songs, where the vocals or an instrument are fully on one stereo channel or the other? The trumpet solo in “Penny Lane” was unlistenable in part, though the brain did a good job of pulling some of it back into pitch on its lower notes. Over this, I had no conscious control: it was rather like watching a remotely controlled machine at work.

The Nat ‘King’ Cole album “Welcome To The Club” has the vocals bizarrely panned entirely on one channel. You can see where I’m going with this! And, yes, he was singing a semitone sharp. So was my enjoyment of music and my professional judgement over for life?

Over the week that followed, experiments continued. Every morning I’d be woken by the church clock chiming with all its harmonics in the wrong pitch (though the fundamental tone was fine), then I’d try the piano: there were clusters of evil upper partials on every note, and harmonies brought no pleasure or contrast. And recorded music encoded with perceptual codecs still sounded as if played through a class B amplifier with terrible crossover distortion.

Thinking in Physics

What might have been happening inside my ear? The feeling of pressure was still there, and everything above about 1.5kHz was pitch-shifted up.

If the workings of the ear are unknown to you, I suggest that, at this point, you take a look at some Wikipedia entries particularly regarding the tympanum, the ossicles, the cochlea and the organ of Corti. Remember how standing waves are set up along the basilar membrane, turning it into a spectrum analyser.

If you have access to a tone generator, try this: feed 2kHz or 3kHz into headphones, then clench your jaw strongly. Did you hear the pitch of the tone go up? Is the pressure on your ear affecting the bone holding your cochlea and therefore changing its shape, altering the places along the basilar membrane where different frequencies resonate, thereby fooling the brain into perceiving a different pitch?
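If you have FFmpeg installed rather than a hardware generator, its built-in test source will do the job. A minimal sketch:

```shell
# Play a 2kHz sine tone for ten seconds through the default audio output,
# then exit. Change frequency=2000 to 3000 to try the higher tone.
ffplay -autoexit -f lavfi -i "sine=frequency=2000:duration=10"
```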

Maybe something, maybe mucus, was putting pressure constantly on my cochlea, possibly on its oval window, permanently changing the places where resonance occurs when frequencies are higher than about 1.5kHz? This is in line with the place theory of pitch perception.

And perhaps the audio that is normally heavily modified by the MP3 or AAC algorithms, disguised by the normal ear’s processes, is revealed in all its distortion by my suddenly revelatory but damaged cochlea? In other words, the spectral lines that these codecs decide to distort, lost in the ear’s usual perception, are shown in all their awfulness now that they are shifted for the benefit of my aural education.

How to fix my ear?

So at this point, about two weeks before writing this essay, I resolved to get through this in several ways.

Using commonly available open source software, I could have found where the frequency break in my damaged ear was, and designed a process that maps frequencies above that point to slightly lower ones, thus restoring normal pitch perception for headphone use. Perhaps even a digital hearing-aid like this is possible?
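Sticking with FFmpeg as the commonly available tool, a crude version of that corrective mapping might split the affected channel at the break frequency and pitch-shift only the top band. Very much a sketch: the 1.5kHz split, the one-semitone correction (2^(-1/12) ≈ 0.944) and the assumption of an FFmpeg build with the rubberband filter are all mine:

```shell
# Lower everything above ~1.5kHz in the right channel by a semitone,
# leaving the left channel untouched, for headphone listening.
ffmpeg -i music.wav -filter_complex "\
  [0:a]channelsplit=channel_layout=stereo[L][R]; \
  [R]asplit[Rlo][Rhi]; \
  [Rlo]lowpass=f=1500[lo]; \
  [Rhi]highpass=f=1500,rubberband=pitch=0.944[hi]; \
  [lo][hi]amix=inputs=2[Rfix]; \
  [L][Rfix]join=inputs=2:channel_layout=stereo[out]" \
  -map "[out]" corrected.wav
```

The single-pole crossover here is far from surgical, and in reality the correction would need to vary with frequency, but it shows the principle.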

Middle ear infections cause pressure in the middle ear, so I was ready to do all that is possible to detect and clear any infection.

I still had influenza and was very congested: so it would have been useful to keep using Olbas Oil and pseudoephedrine to clear any other sinus and Eustachian tube blockages.

Retrain my brain regarding pitch. After all, as a baby, only after birth could the already-formed brain have been able to compare pitch sensations generated by the two ears and, somehow, correlate them — so why not try to restart the process?

The strong upper harmonics in violins and pipe organs howled violently in my right ear: and, if my family member who hated such instruments also had unresolved diplacusis, perhaps this was the reason for her dislike of such sounds?

Cured

Now, the good news, for me at least. My ear has become decongested in the last week, and the shrill demonic orchestra and choir have faded to almost nothing. My stereo hearing is now back to its normal clean status, and music is a constant pleasure. I didn’t need to make my own hearing-aid, the decongestants seemed to work, and my self-training with tones and careful music listening perhaps helped too.

Sometimes, diplacusis can be healed in this way by the body and brain’s own natural functions. This has taken about a month for me.

If you have just experienced the very disturbing onset of diplacusis, maybe this essay has given you hope? But please get to a hearing specialist as soon as you can, in case your situation is different from mine, and you need surgical intervention.

A full house of 2nd-year and final-year students, along with distinguished staff and alumni, came to hear my stories of music production, laughed in (some of) the right places, and asked a few challenging questions. If you were there and didn’t manage to speak up in the time allocated, please make contact through this blog or through the department.

Interest was expressed in being able to hear or see again the extracts of music and film that were critiqued, so I shall upload them in a way that might be useful to you in the near future.

The IoSR kindly organised some decent playback kit; my inability to see any of my lecture notes was my fault alone, so some of the material below wasn’t used in the lecture. Nevertheless, when it is written up, it may possibly make sense.

This isn’t a blog version of my talk — you must come to the lecture for that — but you might find it helpful to have notes of the recordings I used.

Each and every extract of a recording is accompanied by a critique of the performance or technique exhibited, so can be shown publicly in this context under the doctrine of Fair Use (in the USA) or Fair Dealing (in the UK, Europe and many Commonwealth countries).

Some recordings, e.g. the Stokowski, Stravinsky and Delibes early stereo examples, and the critique of the Elgar and Duke Ellington “accidental stereo” recordings, are still to be added. As is the tape of the Walter Gieseking Beethoven concerto performance recorded in Berlin in January 1945 where you can hear the bombs falling in the slow movement, again in stereo.

Everything is copyright, but you can copy anything on this site and, provided you don't mis-represent me or do anything illegal with it, you're free to reuse it. Not that it's necessarily worth reusing of course.