Interview with Dave Rice (CUNY)

Barbican Centre, London, July 27, 2014 & e-mail correspondence.

Dave Rice is an audiovisual archivist and consultant based in New York City. He has worked for broadcasters such as Democracy Now! and Channel 13, and for organisations including the United Nations, WITNESS and Downtown Community Television. He is currently in charge of audiovisual preservation at CUNY Television1, the largest educational access channel in the United States, run by the City University of New York. In July 2014, during a workshop given by Dave Rice at Tate in London, Emanuel Lorrain (PACKED vzw) met him to talk about the digitisation work that he manages at CUNY, his work as a consultant and the open source tools for audiovisual preservation that he has been working on. The interview is followed by email correspondence in which he talks about his recent work within the European project PREFORMA and his research on sustainable presentation of video files.

PACKED: Could you describe your background and how you came to work in video preservation and digitisation?

Dave Rice: I became very interested in silent film when I was at high school. While trying to research or view different films it quickly becomes apparent that there is an extreme range of availability and quality of silent film accessible today; some looks great and with some it is an eyestrain to watch; some films are easy to find and some are impossible. Pursuing silent film research makes the role of what the archivist has done between then and now quite apparent. After getting into college and studying more about film and photography, I eventually went to the L. Jeffrey Selznick School of Film Preservation in Rochester, which is a very intensive school within a film archive. I was there for a year, and when I left my first job opportunity was as the first archivist at Democracy Now!

PACKED: What format did you start working with at Democracy Now!?

Dave Rice: We started with DV and DAT tapes. At the time I had a large born-digital but non-file-based collection comprised of minidiscs, DV, DVCam and DAT tapes. The L. Jeffrey Selznick School insists on the importance of having total technical control over archival materials in the course of conservation or preservation. Students there learn all the details about sprocket repair, splicing and print assessment; however once I found myself working in a digital collection, it was a challenge to exercise the same level of collection-control because of my unfamiliarity with how the formats worked. I had lots of training about film, but for tape-based digital media, I had very little expertise to rely on. However, I started trying to preserve it.

PACKED: How did you start preserving these DAT tapes?

Dave Rice: We played the tapes with a DAT player and recorded the audio into a computer. After a month or two we stopped because the results were very unsatisfactory. The DAT tape collection came from a group of journalists that recorded in the field. The tape would switch from 44.1 kHz to 48 kHz or 32 kHz because if the journalists were low on tape in the field they would switch to 32 kHz 12-bit in order to get twice as much duration as the data-rate is half that of 48 kHz/16bit. When we were playing a tape on a DAT player, the data wasn’t able to transfer across the S/PDIF2 connection when it switched to 32 kHz/12bit. When I went to the ARSC listserv3 about this problem, they suggested that I should just use analog cables and digitise that, but to me that sounded very inefficient because it would mean converting a born-digital object to analog in order to preserve it.
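The "twice as much duration" figure follows from the raw PCM data rates. A minimal sketch of the arithmetic, assuming two-channel (stereo) recording, which the interview does not state explicitly; the function name is illustrative only:

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=2):
    """Raw PCM bit rate in bits per second (channels=2 is an assumption)."""
    return sample_rate_hz * bits_per_sample * channels


standard_mode = pcm_bit_rate(48_000, 16)  # 48 kHz / 16-bit
long_play_mode = pcm_bit_rate(32_000, 12)  # 32 kHz / 12-bit

# The long-play mode needs half the data per second,
# so the same length of tape holds twice the duration.
print(standard_mode, long_play_mode, standard_mode / long_play_mode)
```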

Additionally, I could see from the tapes that there was a lot of metadata there; information about what the original clock was set to on the DAT recorder, as well as markers to show where the recording started and stopped. When digitising a DAT tape with this kind of traditional method, I would end up with one continuous WAVE file for three hours that would represent dozens of initial recordings.

PACKED: Did you find a solution which meant that you could extract the digital data directly in order to avoid unnecessary A/D conversion4 and keep the metadata?

Dave Rice: We started doing research to figure out how to do it in the same way that one would with an audio CD. We wanted to move the data over, rather than treat it as an audio object. We eventually learned that with a DDS3 drive we could do so. The DDS tape was a predecessor to the LTO5 tape with which you could backup 1.3 gigabytes of data on a DAT tape. The version 3 of the DDS drive supported reading audio from DAT tapes but later versions got rid of the feature. We bought about four of these DDS3 drives from eBay until we were able to get one of them working. We had to use old DOS6 tools to update the firmware, but once we had it done we had a drive with which we could read audio data from DAT tapes.

We found out that the audio is organised into frames that have headers containing clock information from the original recorder, which lets you know if it's the beginning or the end of a recording. It also has parity information which tells you if the audio data is read correctly or not. With the DDS3 drive, we could access all these reports and quantify the accuracy of our data transfer. It made cataloguing much easier because instead of working with a three-hour file, we had one file per recording. We could also know how much time elapsed between each file because if you're listening to a DAT tape without the context of the metadata, you may hear somebody speaking and then somebody else but you have no idea that perhaps a whole day has elapsed between each recording.
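The idea of turning one long tape into one file per recording can be sketched as follows. This is only an illustration of the principle described above: the `Frame` fields are hypothetical names and do not reflect the actual DAT frame layout or the output of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Frame:
    clock: datetime          # recorder clock read from the frame header
    starts_recording: bool   # hypothetical flag for a start marker
    parity_ok: bool          # whether the parity check passed
    audio: bytes


def split_recordings(frames):
    """Group consecutive frames into recordings at each start marker."""
    recordings = []
    for frame in frames:
        if frame.starts_recording or not recordings:
            recordings.append([])
        recordings[-1].append(frame)
    return recordings


def summarise(recordings):
    """Report start time, frame count and parity errors per recording."""
    return [
        {
            "start": rec[0].clock,
            "frames": len(rec),
            "parity_errors": sum(1 for f in rec if not f.parity_ok),
        }
        for rec in recordings
    ]
```

Comparing the `start` values of neighbouring recordings gives the elapsed time between them, which is the context that is lost when the tape is captured as a single continuous file.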

PACKED: Did it also make the transfer quicker than with a DAT player?

Dave Rice: Yes, with the DDS drive we could transfer it faster than in real time. We could access a four-hour tape in about an hour, because the drive is just reading as fast as it wants to read, just like a data transfer. So it's similar to when you copy audio data from a CD; it's not tied to duration as it is with a DAT player. With a DAT player workflow, we would have had to play on one machine and record on the other, and there would be discrepancies because you are only preserving a presentation of the DAT player rather than the data present on the DAT tape. With the DDS3 drive, you get a copy of the recording as it is.

PACKED: What kind of software did you use to dump the files?

David Rice: There have been different software projects written for Linux, Mac and Windows, mostly very long ago. When we were setting it up, we did a study of the different available software for Linux and Mac. For some of it, we had to go to the Internet Archive and use the Wayback Machine to find information. We wrote back to the authors of the software utilities that we were testing and one of them was very interested in our research. He made a new release of his software, which is called DAT Extract. This new version incorporated the advantages that the other utilities had over DAT Extract, so we used that going forward.

PACKED: This workflow is linked to a specific piece of hardware, the DDS3 drive.

David Rice: Yes, but generally there are always about a half-dozen on eBay at any given time, so they're not completely obsolete yet, but it's pretty close to being so.

PACKED: This type of transfer is very similar to what you can do with DV tapes.

Dave Rice: It is very similar in the sense that you can transfer the data coming out of a Firewire port on the deck into a file, so the preservation process is more like a data transfer rather than playing the tape with one device and recording with another. DV as a codec is well supported by both hardware and software. Similarly to DAT, DV has parity data, so as the video deck reads the data from tape the output of the Firewire7 cable with the DV stream contains all the preservation notes on whether the data being read is valid or not. So I was able to make a simple utility to look for all these little preservation notes in the DV stream in order to report where the glitches were.

PACKED: What did the utility do with this parity information?

Dave Rice: Typically, when there was a physical problem on a DV tape or on the head of the DV deck, it would output some sort of concealment to cover up the glitch. For a visual glitch, it would copy portions of the previous frame over to patch it up, and when an audio section had an error, it would just drop the audio to total silence. The exact sample value for silence was reserved as an error code in DV. This helped us detect where the errors were. Quite often we would be trying to do preservation work on DV tapes and the glitches would happen right at the most crucial moment because those were the ones that had been played over and over by the producers and the tape would be finicky. We would transfer it once, and the second time, the error would be in a different place. Often, we had to do multiple passes over some areas and then assess which one had the fewest errors. I used that information later when I worked at Audiovisual Preservation Solutions8 to make 'DV Analyser'9, a software analysis tool for DV codecs.
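Comparing several transfer passes of the same passage can be sketched as counting the reserved error value in each pass and keeping the pass with the fewest hits. This is a simplified illustration, not how DV Analyser works internally; the `error_code` is passed in as a parameter because the concrete value is not given in the interview, and all names are hypothetical.

```python
def count_error_samples(samples, error_code):
    """Count audio samples equal to the reserved error value."""
    return sum(1 for sample in samples if sample == error_code)


def pick_best_pass(passes, error_code):
    """Return the label of the pass with the fewest flagged samples.

    passes: dict mapping a pass label to its decoded sample values.
    """
    counts = {
        label: count_error_samples(samples, error_code)
        for label, samples in passes.items()
    }
    best = min(counts, key=counts.get)
    return best, counts
```

For example, `pick_best_pass({"pass1": first, "pass2": second}, error_code)` returns the label to keep together with the per-pass counts, which can be logged alongside the transfer.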

PACKED: Can you explain how it works?

David Rice: Yes. When you digitise a DV tape to a file using a Firewire cable, you get the raw DV codec from the original tape, so DV Analyser will read through that data to find other preservation notes that the video deck makes. Until doing this research, I couldn't find any other software that used it. DV Analyser scrubs through DV files to report on error concealment, dropouts, time code incoherencies, structural issues, to identify problems with DV, before the archivist considers the preservation work complete. It also helps find some interoperability issues.

PACKED: Did you do the first utility together with the IT department at Democracy Now! or do you have an IT education yourself?

David Rice: At Democracy Now! I learned a lot from the IT developer. I would go to the IT director and describe a preservation task I'd been working on and I would say "We have 6,000 CDs we want to digitise, and the process will require copying all the data off into a folder, naming the folder after the identifier, making an mp3 file, making a log to say when we did it, creating checksums..." and the IT director would then take cdparanoia, LAME, different command line utilities to make directories, and make a very simple program that just asks the user for the identifier of the CD, makes the folders, makes the logs, makes the .wav file, the .flac10 file, the mp3, and then ejects the disc. So from the user perspective, one enters an ID, puts in the disc, and when it is ejected one repeats the procedure. It makes the workflow simple and efficient and it reduces human error substantially. After seeing that, I got much more interested in figuring out how to automate and script preservation tasks. So I've used computer programming to support preservation activities. I credit the Selznick Film Preservation School with instilling that motivation in me through its insistence on the importance of having a very intimate level of technical knowledge of the format being preserved, but I didn't venture too much into IT beforehand.
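The bookkeeping part of such a script (a folder named after the identifier, a checksum manifest and a dated log entry) can be sketched with the Python standard library. This is not the Democracy Now! program; the ripping and encoding steps with cdparanoia and LAME are deliberately left out, and the file names `checksums.md5` and `log.txt` are assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

MANIFEST_NAME = "checksums.md5"  # hypothetical file name
LOG_NAME = "log.txt"             # hypothetical file name


def md5_of(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large audio files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_package_log(root, identifier):
    """Create <root>/<identifier>, checksum its files and append a log line."""
    folder = Path(root) / identifier
    folder.mkdir(parents=True, exist_ok=True)
    files = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.name not in (MANIFEST_NAME, LOG_NAME)
    )
    lines = [f"{md5_of(p)}  {p.name}" for p in files]
    (folder / MANIFEST_NAME).write_text("\n".join(lines) + "\n")
    stamp = datetime.now(timezone.utc).isoformat()
    with open(folder / LOG_NAME, "a") as log:
        log.write(f"{stamp} packaged {identifier} with {len(files)} file(s)\n")
    return folder
```

The operator would only supply the identifier; everything else is derived from it, which is where the reduction in human error comes from.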

Dave Rice: Angelo Sacerdote, Skip Elsheimer and I ran a presentation at AMIA12 concerning quality control within video digitisation projects. From that experience Angelo, who was the Preservation Manager at BAVC13 at the time, and I wrote an application for a National Endowment of Humanities research and development grant. After a few unsuccessful attempts, BAVC received a preservation and access research grant from the National Endowment of Humanities in late 2012 which led to the start of a comprehensive research and development project. To prevent QCTools from being supported solely by one NEH grant, we integrated much of the project with FFmpeg14. FFmpeg does all the decoding and runs the analytical filters that we wrote to create all the technical metadata from the decoded video. We submitted our filter, called 'signalstats' to FFmpeg, which hopefully provides a lot more sustainability and maintenance to the overall QC Tools project.

PACKED: How does QCTools perform quality control on digitised video?

Dave Rice: The idea of QCTools is to look for patterns in video that would be unnatural to have in an ideal digitisation of analog video. Some video decks read tape with two heads: one head reading field one and another head reading field two. We found that we would want to assess the rate of visual change happening on field one as well as the rate of change happening on field two; and if the rate of change within field one and the rate of change within field two start to diverge, then it's likely that there is a head clog. We could see field two presenting the same consistent image, but the other one would gradually turn to snow or random data. It would be unnatural to have an analog video recorded in such a way that all the odd lines are showing something totally different to the even lines.
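The field comparison can be illustrated with a small sketch. It is not the QCTools or signalstats implementation: frames are modelled as plain lists of rows of luma values, and treating even-numbered rows as field one is an assumption made for the example.

```python
def field_rows(frame, field):
    """Rows of one field: field 0 takes rows 0, 2, 4..., field 1 the rest."""
    return frame[field::2]


def mean_abs_diff(rows_a, rows_b):
    """Average absolute difference between two equally sized sets of rows."""
    total = 0
    count = 0
    for row_a, row_b in zip(rows_a, rows_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def field_change_gap(previous, current):
    """How differently the two fields changed between consecutive frames."""
    change_one = mean_abs_diff(field_rows(previous, 0), field_rows(current, 0))
    change_two = mean_abs_diff(field_rows(previous, 1), field_rows(current, 1))
    return abs(change_one - change_two)
```

A gap that stays near zero means both fields evolve together; a gap that keeps growing over many frames matches the head clog behaviour described above.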

In another test, we look at a pixel and compare it to its neighbours, and if it's too different from its neighbours, then we know it’s out of place. Normally, if you digitise analog video and you have a pixel with a greenish colour, at least one of the neighbours will have a similar greenish color. But if you have a pixel that is just a white dot in the centre of a normal scene, then we know that it is the kind of white speckle you get from a VHS tape if you have a tracking or a skew error. Those pixels are counted and reported per frame in order to identify tape creases or crackles, or other physical tape errors.
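A neighbour comparison of this kind can be sketched as below. This is an illustrative simplification, not the filter QCTools uses: it works on a single plane of values, checks only the four direct neighbours, and the `threshold` is a free parameter chosen by the caller.

```python
def count_outlier_pixels(frame, threshold):
    """Count pixels that differ from every direct neighbour by more than threshold."""
    height = len(frame)
    width = len(frame[0]) if height else 0
    outliers = 0
    for y in range(height):
        for x in range(width):
            neighbours = []
            if y > 0:
                neighbours.append(frame[y - 1][x])
            if y < height - 1:
                neighbours.append(frame[y + 1][x])
            if x > 0:
                neighbours.append(frame[y][x - 1])
            if x < width - 1:
                neighbours.append(frame[y][x + 1])
            value = frame[y][x]
            # A pixel with at least one similar neighbour is considered normal.
            if neighbours and all(abs(value - n) > threshold for n in neighbours):
                outliers += 1
    return outliers
```

Plotting this count per frame makes short bursts of speckle stand out against otherwise quiet material.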

Another test we do is for vertical line repetition. With noisy analog video it's not really possible to get the same line of pixels many times in a row. But when you send a tape through a time base corrector and the signal is too unstable, it will use a dropout compensator that repeats the same line several times over in your final file. Most time base correctors don't report on this concealment, so it's difficult for archivists to know how much of this is going on. QCTools notes when you have a group of lines that are very close to being identical, because it is indicative of a very active dropout compensator. If archivists can identify these kinds of problems they can clean their deck, try different hardware, try to clean or treat the tape, etc. QCTools tries to make a quantitative evaluation of qualities that are unlikely to be a result of a clean analog digitisation.
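Line repetition can be approximated by looking for runs of consecutive rows that are identical or nearly so. The sketch below is an assumption-laden simplification rather than the QCTools test itself; the `tolerance` parameter is hypothetical and expresses how close two rows must be to count as repeated.

```python
def longest_repeated_run(frame, tolerance=0):
    """Length of the longest run of consecutive near-identical rows."""
    if not frame:
        return 0
    longest = 1
    run = 1
    for previous_row, row in zip(frame, frame[1:]):
        if all(abs(a - b) <= tolerance for a, b in zip(previous_row, row)):
            run += 1
            longest = max(longest, run)
        else:
            run = 1
    return longest
```

In noisy analog material the result should normally stay at 1; longer runs point to a dropout compensator filling in lines.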

PACKED: How is QCTools integrated in a digitisation workflow?

David Rice: Right now, once your material is digitised, you can drag all the files into QCTools and let it make reports on all of them. This could be run as an overnight process after a day of capturing. Currently, reporting on most of these files takes about half the duration of the video, because it has to decode the entire file and send all the data through different tests in order to access all of the information.

PACKED: Can you use it with any kind of video format?

David Rice: It is intended to work with file formats that would typically come out of analog video preservation efforts, but the goal is to have it work on any file FFmpeg can decode such as FFV1, jpeg2000, ProRes, uncompressed, DV, mpeg2, etc.

PACKED: What type of analog video do you digitise at the City University of New York?

David Rice: We have Betacam, Betacam SP, and U-Matic. Then we digitise some tape-based digital formats too like Betacam SX, Digital Betacam and HDCAM. We don't do much HDCAM, as fortunately most of the operations switched to file-based production when they were starting to use HD.

PACKED: What file formats do you capture onto when digitising analog video material?

David Rice: Most video is digitised onto 10-bit lossless FFV1 video files but we have a lot of Betacam SX that we preserve on 8-bit files. Betacam SX tape stores video in an 8-bit mpeg2 encoding, so when it plays out over SDI15 and goes into the capture card, the 9th and 10th bits are always zeros. The last two bits coming over SDI don't exist on the tape; they're created during the mpeg2 to SDI conversion to pad it out, because SDI has to be 10-bit in this case. So we crop the last two bits, which are always zeros anyway and capture Betacam SX as 8-bit.
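Dropping the two padding bits is lossless only if they really are zero, which can be checked before cropping. A minimal sketch of that check on a list of 10-bit sample values; the function names are illustrative and this is not the capture software's own code:

```python
def is_padded_8bit(samples_10bit):
    """True if the two least significant bits of every sample are zero."""
    return all(sample & 0b11 == 0 for sample in samples_10bit)


def crop_to_8bit(samples_10bit):
    """Drop the two padding bits, refusing to discard real picture data."""
    if not is_padded_8bit(samples_10bit):
        raise ValueError("low bits carry data; cropping would be lossy")
    return [sample >> 2 for sample in samples_10bit]


def pad_to_10bit(samples_8bit):
    """Inverse operation: append two zero bits to each 8-bit sample."""
    return [sample << 2 for sample in samples_8bit]
```

Because `pad_to_10bit(crop_to_8bit(x))` returns `x` whenever the check passes, the 8-bit file carries the same information as the 10-bit signal that arrived over SDI.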

David Rice: Yes, we do. Black Magic is active about releasing software development kits to let people develop programs or utilities to use their hardware. Black Magic has been more developed by the open source community than other brands so there are many utilities to bridge a Black Magic card and send the raw incoming video and audio directly to Libav17 or FFmpeg or to other hardware. People can be a lot more creative about how they use this hardware compared to some of the other options.

PACKED: What are the other hardware and software components of your digitisation workflow?

David Rice: Right now, we go from the tape player, then sometimes to a TBC18 depending on the deck, and then into the Black Magic card. We use the Leitch DPS-575, which is a time base corrector that supports sending SDI out.

PACKED: And what do you use to capture the video signal from the Black Magic Card?

David Rice: Previously we have used Final Cut Pro and Adobe Premiere, but lately we have used vrecord which is a command line utility that coordinates the Blackmagic software development kit; bmdcapture, FFmpeg, and FFplay to pass video from Blackmagic hardware through FFmpeg for encoding, and then FFplay to show the video and scopes.

This has streamlined the digitisation process as it makes it easier to capture video directly to lossless codecs. Nowadays normal work computers are powerful enough to perform software-based lossless encoding safely. At an AMIA presentation recently, I had a U-matic deck, a Blackmagic box, and a Thunderbolt connection to my 11" MacBook Pro. I demoed having the raw video come in from the Black Magic card to my computer, using the Black Magic SDK to pass the raw audio and video into FFmpeg which would simultaneously make FFV1, ProRes and H.264 files. I then piped the video back to the display for a preview with waveform and vectorscope images as well. My laptop was able to do all three encodings, plus frameMD519 reports and interpret waveform, vectorscope and preview at the same time.

PACKED: Did you have to change the workflow when you arrived at CUNY, or were they already using lossless formats?

David Rice: As with many production environments, the preservation decisions were often based on production decisions which were in turn based on access requirements, so digitisation was often done using ProRes or mpeg2-based formats. As the archive grew we transitioned to workflows that resolve both preservation and access requirements, which included integrating lossless codecs into many digitisation procedures.

PACKED: Did they use ProRes because they were using Final Cut to capture and edit video?

David Rice: Yes, they were not using ProRes to do preservation work, but to record interviews or shows. ProRes is generally a popular option in production. Around the same time I was starting at the City University of New York, FFmpeg was completing their reverse engineering work on ProRes and releasing it. Archivists often note that ProRes is a bit of an obsolescence risk, because it's only supported by one company. But since FFmpeg completed this reverse engineering work, instead of there being one proprietary implementation, there's now a secondary open source reverse engineered version for the same codec. For archivists dealing with obsolescence risks, having two options is substantially better than having just one to rely on.

PACKED: In your digitisation workflow, do you create a production copy at some point in the process?

David Rice: Yes. In our case, it’s an XDCAM or IMX30 which are particular profiles of mpeg2[1]. The decision is based on the aspect ratio and size and whether it is closer to the 16:9 or 4:3 aspect ratios.

PACKED: Are the production copy and the preservation file created at the same time?

David Rice: Usually yes. FFmpeg has a nice feature which can decode the input once, and then make multiple outputs, so you can make FFV1, XDCAM, IMX, H.264... all in the same decoding step, to reduce the total amount of processing time.
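With the ffmpeg command line, one input followed by several groups of output options produces several files from a single decode. The sketch below only assembles such an argument list and does not run it. It is not CUNY's actual command: the file names are hypothetical, the XDCAM and IMX settings are omitted, audio options are left at FFmpeg's defaults (a real preservation workflow would set them explicitly), and the H.264 output assumes an FFmpeg build with libx264.

```python
def build_multi_output_command(source, basename):
    """Argument list for one ffmpeg run that writes three outputs."""
    return [
        "ffmpeg",
        "-i", source,                                     # decoded once
        "-c:v", "ffv1", "-level", "3", f"{basename}.mkv",  # FFV1 version 3
        "-c:v", "libx264", f"{basename}.mp4",              # access copy
        "-f", "framemd5", f"{basename}.framemd5",          # per-frame checksums
    ]


command = build_multi_output_command("capture.mov", "tape001")
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)` on a machine where ffmpeg is installed.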

PACKED: Did you evaluate other codecs before choosing FFV1?

Dave Rice: Yes, we also tested Huffyuv20, jpeg2000 and uncompressed. We preferred to use a lossless codec. It wasn't so much for saving storage, but because we could move the data around a lot faster with lossless, checksum it faster, and remove some workflow bottlenecks. At the time, I think Huffyuv was not able to support the particular pixel formats we needed. Because of the tape collection we had, we wanted to support both 8-bit and 10-bit YUV 4:2:2, and I think Huffyuv at the time could not do 10-bit, although now it can do both. As a lossless codec, Huffyuv is not optimised for size, but for speed, so it's one of the fastest lossless codecs.

We tried using jpeg2000 in FFmpeg via libopenjpeg21, but at the time libopenjpeg and FFmpeg were not multi-threaded, so even if your computer had 8 or 16 processors, it would not utilise the power of the computer to do the encoding. So the speed was quite limited. We were only able to encode to jpeg2000 at 6 or 7 frames per second for standard definition, whereas with FFV1 we could do 90. The speed differences were substantial. Since then, I have sent a patch to FFmpeg to support multi-threaded encoding; the frames aren't sliced up into pieces, but multiple frames can be simultaneously distributed to processors and be collected and written to a file. So now, FFmpeg via libopenjpeg can support standard definition encoding a little faster than real time, which is a big advantage over what it used to be.
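Frame-level parallelism of this kind, where whole frames are handed to separate workers and the results are collected in their original order, can be illustrated in a few lines. This is a generic sketch and not the FFmpeg patch: `zlib.compress` stands in for a real intra-frame encoder, and the worker count is arbitrary.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def encode_frame(raw_frame):
    """Stand-in for an intra-frame encoder; each frame is independent."""
    return zlib.compress(raw_frame)


def encode_frames_in_parallel(raw_frames, workers=4):
    """Encode frames concurrently and return them in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(encode_frame, raw_frames))
```

This only works because each frame is encoded without reference to its neighbours, which is the case for intra-frame codecs such as jpeg2000 and FFV1.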

PACKED: What are the new features of the version 3 of the FFV1 codec?

Dave Rice: One of the features added to FFV1 version 3 was fixity integration; Peter Bubestinger22, myself and others were trying to emulate the same fixity support that FLAC has for audio. If you take a WAV file and you encode it to FLAC, you get this lossless file that is about a third of the size, but the raw audio data from the wav file that is encoded will itself be checksummed and written into the header of the FLAC file. So the FLAC file contains a checksum of exactly what it should decode back to. Even without an external checksumming system, you can test it to see if it's valid as it has a very self-describing fixity. Additionally, the audio in FLAC is grouped into frames, each of which contains a CRC23, so if the overall file is known to be invalid, you can see exactly where the problem is.

With FFV1 version 3 we wanted the same sort of digital preservation-friendly lossless codec, so now FFV1 version 3 mandates that the encoder incorporates a frame checksum into each header. In addition to being able to at least let that error be discoverable, it helps support error concealment, because instead of showing garbage or incorrect data in the event of data corruption, the decoder can show the corresponding space of data from the previous frame, similar to the way DV conceals if the DV deck finds errors in the parity data.
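The combination of embedded checksums and concealment can be shown with a toy model. It does not reproduce the FFV1 or FLAC bitstreams: frames are plain byte strings, each stored next to a CRC-32 computed with Python's `zlib.crc32`, and the function names are invented for the example.

```python
import zlib


def pack_frames(frames):
    """Store each frame together with a CRC-32 of its data."""
    return [(zlib.crc32(data), data) for data in frames]


def find_damaged(packed):
    """Indexes of frames whose data no longer matches the stored CRC."""
    return [
        index for index, (crc, data) in enumerate(packed)
        if zlib.crc32(data) != crc
    ]


def decode_with_concealment(packed):
    """Replace a damaged frame with the previous good output, if any."""
    output = []
    for crc, data in packed:
        if zlib.crc32(data) == crc or not output:
            # Valid frame, or a damaged first frame with nothing to fall back on.
            output.append(data)
        else:
            output.append(output[-1])
    return output
```

`find_damaged` makes the error discoverable and locatable without any external checksum file, and `decode_with_concealment` mirrors the idea of showing data from the previous frame instead of garbage.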

PACKED: So you participated in the evolution of FFV1, along with other people from the preservation sector?

David Rice: Most of the credit goes to the current FFmpeg maintainer, Michael Niedermayer, who's been the principal author. If you go into FFmpeg's commit logs for FFV1 you'll see that there have probably been a few dozen authors all together. It started back in 2003. The original intent wasn't for preservation but to make it an intermediate codec so that people could take video, process it in a certain way and save it to a lossless file and then continue processing it later, instead of having a file, processing it, saving it to a lossy codec and then processing it from there later and introducing generation loss. Even though it's lossy, ProRes was intended for the same kind of work, where you can save and reuse it without substantial degradation. There are probably a few dozen lossless codecs out there all together, but for archivists, a lot of them are disqualified quickly because support for 10-bit video is rare if you look at the overall landscape of codecs, which narrows down many of the options. Some lossless codecs don't support 4:2:2, some are RGB only, etc.

PACKED: At CUNY are you doing all the digitisation work in-house?

David Rice: We do it in-house for formats that we can handle. We're doing all of the video in-house. We've been sending audio electrical transcription discs to George Blood because it would be a very complex effort for us to support this type of digitisation. We also outsource the digitisation of very old audio and film material. When there is a smaller collection there isn’t the same kind of incentive to set up the hardware and skills to digitise in-house.

PACKED: Do you have film material at CUNY and if so, do you also use FFV1 as a preservation format for film?

David Rice: Yes we do have films that have been digitised. We have the digital copies stored as both RGB 10-bit DPX24 and RGB 10-bit FFV1.

e-mail correspondence (June 24, 2015)

PACKED: As part of the European project PREFORMA25, you are part of a team working on a conformance checker26 for a preservation file format for audiovisual material. Could you explain what you and your colleagues are trying to build in this project?

Dave Rice: The PREFORMA project addresses the need for further design and development of tools to assist archivists in assessing, controlling, and fixing certain key open digital formats. These tools will include conformance checkers to assess the adherence of a file to its specifications, a policy checker to test files against a declared policy set, a metadata fixer, and a reporter. The MediaArea team is focused on providing these tools specifically for FFV1 (a lossless video codec), Matroska (an audiovisual container), and PCM audio.
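The difference between conformance checking and policy checking can be made concrete with a small sketch of the latter: a policy is a set of expected values, and a file fails if its reported metadata deviates from it. This is not the PREFORMA or MediaArea software; the metadata keys and values are hypothetical.

```python
def check_policy(file_metadata, policy):
    """Return (key, expected, actual) for every policy rule the file fails."""
    failures = []
    for key, expected in policy.items():
        actual = file_metadata.get(key)
        if actual != expected:
            failures.append((key, expected, actual))
    return failures


# Hypothetical institutional policy and hypothetical file report.
policy = {"container": "Matroska", "video_codec": "FFV1", "bit_depth": 10}
report = {"container": "Matroska", "video_codec": "FFV1", "bit_depth": 8}
print(check_policy(report, policy))
```

A conformance checker, by contrast, would test the file against the format specification itself rather than against an institution's preferences.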

In our proposal we wrote that the development of a conformance checker for Matroska and FFV1 was challenged by informalities and missing pieces within the current specifications which have been under development but not within a standards organisation. Thus we proposed working with the communities of Matroska and FFV1 with the Internet Engineering Task Force to facilitate the official standardization of these formats. FFV1 in particular is at somewhat of a tipping point where more and more archives are implementing it as a lossless preservation codec, so both the standardisation efforts and development of conformance checker tools are very timely.

One aspect of the PREFORMA project that I most appreciate is their commitment to an open source approach. Each of the PREFORMA projects is mandated to use open licenses throughout the project on documentation, test files and code. These requirements encourage participation and oversight and help to ensure that key pieces of information do not get trapped behind paywalls or within proprietary code.

PACKED: To finish, could you briefly explain the recent research you carried out on sustainable presentation of video files?

Dave Rice: Often when a digital file is played in two different playback applications there can be two different visual results including disparities in color, brightness, aspect ratio, duration, and other significant characteristics. This may be because of differences in feature support or bugs between the two players or because of how each player responds to internal contradictions or lack of technical self-descriptiveness within the file itself. Pip Laurenson of the Tate Museum asked me to do research on this topic which resulted in a workshop and a technical paper27. The paper lists specific significant characteristics of digital audiovisual files and then outlines the interoperability risks that may be associated with them in various playback scenarios. The paper also reviews the challenges of response strategies to such interoperability issues such as normalization, emulation and migration.

Notes:

2. S/PDIF (Sony/Philips Digital Interface Format) is a type of digital audio interconnect cable used in consumer audio equipment to output audio over reasonably short distances. The signal is transmitted over either a coaxial cable with RCA connectors or a fibre optic cable with TOSLINK connectors. S/PDIF interconnects components in home theatres and other digital high fidelity systems. S/PDIF is based on the professional AES3 interconnect standard. Source: Wikipedia.

5. LTO is an acronym for Linear Tape-Open, an open format developed in the late 1990s for storing data on magnetic tape. It quickly became a standard and the most widely used format for storing data.

6. DOS is short for disk operating system, an acronym for several computer operating systems that were operated by using the command line. Source: Wikipedia.

7. IEEE 1394 is an interface standard for a serial bus for high-speed communications and isochronous real-time data transfer. It was developed in the late 1980s and early 1990s by Apple, who called it FireWire. Source: Wikipedia.

10. FLAC (Free Lossless Audio Codec) is an audio coding format for lossless compression of digital audio, and is also the name of the reference codec implementation. Digital audio compressed by FLAC's algorithm can typically be reduced to 50–60% of its original size and decompressed to an identical copy of the original audio data. FLAC is an open format with royalty-free licensing and a reference implementation which is free software. FLAC has support for metadata tagging, album cover art, and fast seeking. Source: Wikipedia.

14. FFmpeg is a free software project that produces libraries and programs for handling multimedia data. FFmpeg includes libavcodec, an audio/video codec library used by several other projects, libavformat, an audio/video container mux and demux library, and the ffmpeg command line program for transcoding multimedia files. FFmpeg is published under the GNU Lesser General Public License 2.1+ or GNU General Public License 2+ (depending on which options are enabled). Source: Wikipedia.

15. Serial Digital Interface (SDI) refers to a family of video interfaces standardised by the Society of Motion Picture and Television Engineers (SMPTE).

17. Libav is a free software project that produces libraries and programs for handling multimedia data. The Libav source code is published under the GNU Lesser General Public License 2.1+. Source: Wikipedia.

18. A Time Base Corrector is an electronic device used to correct video signal instability during playback of videotape material. Source: ScreenSound Australia.

19. A framemd5-procedure is used with audiovisual data to produce one checksum per frame.

23. A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. On retrieval the calculation is repeated, and corrective action can be taken against presumed data corruption if the check values do not match. Source: Wikipedia.

24. The Digital Picture Exchange (DPX) is a file format commonly used to work with digital cinema and is an ANSI / SMPTE (268M-2003) standard.