Interview transcription: boring but necessary

October 19th, 2015

The interview is the most important tool journalists have to obtain information, to expand on information they may have from other sources, and to clarify facts and see things from different perspectives. What is more, the interview is their source of quotes – the exact words said by the interviewee. How do journalists get these words ready to use? There is no alchemy here: recording + transcribing.

Transcription is probably the most boring part of a journalist’s daily routine. Whereas conducting the interview, meeting and chatting with people, is always an adventure, and constructing the article is a thought-provoking task, transcribing the interview is the least exciting job on one’s to-do list. And yet this ‘necessary evil’ cannot be skipped.

To sum up, every interview we enjoy reading in magazines and newspapers has gone through the following three phases:

Experienced reporters claim that it takes an hour to transcribe 10 minutes of recording. Crazy, isn’t it? Can you imagine a prominent journalist at “The Washington Post”, “The New York Times”, “The Independent”, the BBC or any other major media outlet doing it? Certainly not – they simply don’t have the time.

This is where transcription agencies come into play. In fact, this is one of the reasons they exist. Journalists do their best to conduct the interviews and leave the rest to the transcribers. Alright, not all the rest, just the most tedious part. Once they get their interview transcriptions, they can go on doing their job: composing the interview and shaping an intriguing story out of a jumble of questions and answers.

Although there are plenty of automatic voice recognition services, which might be useful to some extent, let us remind you that:

170 million of the 540 million English speakers are not native speakers, meaning they probably have accents (and automatic transcription cannot cope with accents).

Automatic transcription software is helpless with homophones (words that are pronounced the same but differ in meaning, such as “their” and “there”).

After all, voice recognition software only captures sound, and any interview is a conversation between at least two people, each with a different pitch, tone and intonation.