Genome – Radio Times archive now live

Genome
– the BBC project to digitise the Radio Times magazines between 1923 and 2009
is now live. On the site you can find BBC
broadcast information – ‘listings’ – extracted from those editions. You can
also search individual programme titles, contributors and synopsis information.

Our aim on
this project is to curate a comprehensive history of every radio and TV
programme ever broadcast by the corporation, and make that available to the
public. Our first step has been this digitisation of the BBC radio and TV
programme schedules from the Radio Times magazine; the next phase of the
project is to incorporate what was actually broadcast, as well as the regional
and national variations. It’s
one of the most important steps we’re taking to begin unlocking the BBC’s
archive, as Genome is the closest we currently have to a comprehensive
broadcast history of the BBC.

We’re really pleased to get the site
live, not least because so many of you have been asking “when”, “how soon” and
telling us “how useful it would be”. The challenges in making available the
4.42 million programme records so far have been significant - you can read
about some of the recent ones on the Internet blog.

We need your help too, though. We’re
looking to you to help us clean up the data. The text-recognition step of the
scanning process, known as ‘Optical Character Recognition’ (OCR), has produced
plenty of errors: punctuation in the wrong places, spaces where there shouldn’t
be any or no spaces where there should, as well as fundamental
misunderstandings about who did what.

We’ve made it possible for you to
submit an edit to us, as you use the site. We’ll validate your suggested
changes and publish the ones which are approved.

We’ve also included a ‘Tell Us More’
form at the bottom of each programme listing, so we can tap into the
collective memory, insight and knowledge of our users and capture the wealth
of experience out there about our programmes.

We also know that the schedule changed
considerably on occasion because of events in the real world, and we need that
information too.

Additionally, during the process of
building Genome, we’ve identified a few ‘chunks’ of data that are missing from
the database because, due to the way in which OCR works, they didn’t get picked
up in the original scans. We will be adding these in.

The Radio Times has been published with regional variations since 1926. The magazines
we scanned and the data sets which have been included in Genome are not
exhaustive; rather, they represent the ones which we could access and which
covered the greatest areas and variations.
In the future, we will look into the implications of attempting to provide
a more complete set of regional data.

We won’t be able to reflect what you
send us straight away, but as we build on BBC’s Genome, it will come into its
own.

Now that we have published the planned
broadcast schedule, our next step is to match the records in our archive
catalogue (the programmes that we have a copy of in our physical archives) with
the Genome programme listings. This helps
us identify what proportion of the broadcasts exist in a potentially ‘playable’
form, and highlights the gaps in our archive.

It is highly likely that somewhere out
there, in lofts, sheds and basements across the world, many of these ‘missing’
programmes will have been recorded and kept by generations of TV and radio
fans. So we’re hoping to use Genome as a
way of bringing copies of those lost programmes back into the BBC archives too.

But even if we don’t have an actual
copy of the programme, we’ll look to publish related items in our
archives, such as scripts, photographs and associated paperwork. We’re looking into the logistics of making
some of these items available via Genome.
Clearly, this will in some cases be a long and painstaking task. The BBC’s
various archives contain millions of items spread over 23 archive centres
across the UK, most of them in analogue form. It’s a big job, one we’re looking
forward to reporting back on in the future.

What happens
after 2009, when the Genome data “stops”? Well, the information held at www.bbc.co.uk/programmes starts in
2007 (the birth of the iPlayer), and as the Genome data is improved and corrected
(by you!), we expect to start ‘backfilling’ the bbc.co.uk/programmes pages with
the Genome data.