Professional Video Editing on Linux with Cinelerra

Final Cut Pro on the Mac and Premiere on Windows both provide
professional-quality video editing. Cinelerra is the closest
and best Linux equivalent. First released in 1996 (under its original name,
Broadcast 2000), this freely distributed non-linear editor (NLE) was developed
natively and solely for Linux. The program continues to be updated and improved
to this day.

Cinelerra includes many of the features of the pricey professional editors and some extras: real-time visual effects, FireWire input/output, render-farm
capability, and even support for HDTV formats and Ogg Vorbis. The downside is
that its hardware demands are quite unforgiving; the recommended configuration
is a dual 2GHz Athlon system with 1GB of RAM and a 200GB hard drive.

Who's behind this impressive program? We don't know. Cinelerra, along with other very useful multimedia utilities for Linux released by the same author (or authors), is shrouded in mystery. Though the program's code is available for all to see and contribute to, its creator (or creators) prefers to remain anonymous, for reasons best left to "Jack Crossfire" (a pseudonym, of course) to explain. As he explained in an email interview:

"In a shrinking industry like we're in now, managers aren't ready to see staff engineers building killer apps outside their day jobs, and they aren't afraid to get rid of anyone who ignores the system. You can't release software under an individual name when that happens, so 'Heroine Virtual Ltd.' became the entity under which all our content creation tools would appear. We leave it to your imagination how many people are behind it."

The Need to Edit

Cinelerra was inspired by "some very basic, practical needs," as Crossfire describes them. "Humans need to edit video and audio. Like the typewriter, the multimedia editor makes everything possible: video email, audio email, streaming media, watching TV, virtually everything we do when we're not eating and sleeping," he elaborates philosophically. "In the late 90s, there was no multimedia content creation system on any UNIX platform for less than $100,000. That got Broadcast 2000 off the ground. Then, as a natural course, Cinelerra elaborated on that functionality."

The nebulous Heroine developers have relied on a combination of C, C++, NASM assembly, and GAS assembly throughout the project's life. They found C to be
the most useful for the coding of general-purpose libraries, using C++ for
application-specific code. Between the general-purpose libraries and the
application code, there's a "middle layer" written in C++ as well.

"Unfortunately, platform-specific assembly language is becoming more and more important as newer CPUs rely more on vectored assembly language to gain performance," says Crossfire. "We're looking into alternative languages to C and assembly, which can be easily converted to either scalar assembly or any of the vectored assembly languages out there."

However, many of these existing "vectored C" languages lock development into
IA-32 assembly, do not lead to the best optimization of vectored op-codes, can
be difficult to read, and typically cost a lot of money for the compiler. So
the Cinelerra developers are considering a derivative of Forth as the
best means to produce platform-independent, vectored object code.

Necessity as the Mother of Multimedia Invention

Heroine incorporated a few outside libraries into the Cinelerra codebase, including libdv and FreeType. The project has also spun off significant multimedia libraries. During Cinelerra's incarnation as Broadcast 2000, no general-purpose MPEG-2 decoders for Linux supported video editing. The Heroine group thus wrote Libmpeg3, a set of almost entirely refurbished MPEG reference implementations. Today Libmpeg3 is developed outside of Cinelerra.

Likewise, Heroine developed QuickTime support for Linux in 1999 out of necessity. "Today, there are many QuickTime libraries, but there are so many application-specific things in QuickTime for Linux — like libdv [and] FireWire wrappers — that it's not clear if it's going to be replaced in the near future," says Crossfire.

Cinelerra's user interface also had to be developed internally. When principal coding on Broadcast 2000 began, GTK+ was not yet capable enough for Heroine's needs, and Qt had yet to become open source. (Using open source materials is a project requirement.) So Heroine built their own GUI library, intending eventually to wrap it around GTK+.

"Six years later, however, GTK+ and Qt still involve a lot more work than necessary, just to keep up with the API changes and the growing dependencies," says Crossfire. So Cinelerra continues to use the Heroine-built GUI library, because it has reasonably fast graphics rendering, decent object orientation, and easy compilation.

Future Innovation with HDTV

This innovation-by-necessity continues to this day. The program's
developers anticipate that their future technical challenges will involve
improving (or creating) more multimedia libraries and capabilities for
Linux.

Cinelerra's background render-farm feature is one such example. Its design
involves transparent load balancing and restart detection. Every time the user
performs an edit, the network jumps to work: every node stops, re-syncs with
the editor's timeline, and balances itself with the overall load. In most
cases the user can simply drop visual effects onto the timeline and see them
rendered immediately at full frame rate, something many
modern-day commercial NLEs still struggle to pull off without the
addition of a specialized graphics card or other hardware.
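
The load-balancing idea can be sketched in a few lines of Python. This is only an illustration, not Cinelerra's actual scheduler: the function name split_frames, the node names, and the idea of a measured per-node speed in frames per second are all assumptions made for the example. It divides a range of frames among nodes in proportion to their speed, which is the kind of plan that would have to be recomputed each time an edit invalidates the rendered timeline.

```python
def split_frames(start, end, node_speeds):
    """Divide frames [start, end) among nodes in proportion to speed.

    node_speeds maps a (hypothetical) node name to its measured
    frames-per-second rate. Returns {node: (first, last_exclusive)}.
    """
    total_frames = end - start
    total_speed = sum(node_speeds.values())
    assignments = {}
    cursor = start
    nodes = list(node_speeds.items())
    for index, (node, speed) in enumerate(nodes):
        if index == len(nodes) - 1:
            # The last node absorbs any rounding remainder.
            count = end - cursor
        else:
            count = round(total_frames * speed / total_speed)
            count = min(count, end - cursor)
        assignments[node] = (cursor, cursor + count)
        cursor += count
    return assignments


# After an edit, every node would drop its work and receive a new range.
plan = split_frames(0, 900, {"node-a": 30.0, "node-b": 15.0, "node-c": 15.0})
for node, (first, last) in plan.items():
    print(node, first, last)
```

In this sketch the faster node receives twice as many frames as each slower one, so all three would finish at about the same time.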

In theory, Cinelerra can pull off this feat even with footage in HDTV
formats. "Anyone with a 100-node rack should try this, since we never could actually afford a render farm to see what was supposed to happen," says Crossfire.

The Cinelerra developers try to make a new release every three months. New
versions mainly improve the stability of the code. "The next best thing is probably going to be selective use of vectored assembly language to speed things up," says Crossfire, but he and his mysterious cohorts see Cinelerra's future in better handling of high-definition video. "The future is HDTV. Right now, you can edit HDTV broadcasts with a render farm and a certain amount of patience, but it could be a lot faster," he says.

The Developer Speaks

"Jack Crossfire" is a pseudonym for one of the developers (or is he just the
only one?) behind Cinelerra. He recently agreed to an interview with the
O'Reilly Network.

O'Reilly Network: So how would you say Cinelerra compares to
Final Cut Pro or Premiere?

Jack Crossfire: Cinelerra will probably never have the
relevance in content creation that Final Cut Pro, Premiere, and, more
importantly, Avid Express have. There isn't the marketing horsepower or the
volunteer programmer army to create a bottomless pit of features. Cinelerra is
more likely to emphasize basic features like color correction, non-destructive
editing, render-farm support, and features that rely more on software than
hardware.

In commercial software, however, you have to show big hardware boxes with
lots of circuit boards and exciting user interfaces to get the attention of the
trade show jocks. The commercial [video] editors are more likely to emphasize
eye-popping features, and features that depend more on hardware than software.
They have a lot of 3D-animated effect icons, smoothly scrolling time lines, and
talking paperclips.

ORN: Does Cinelerra have any features or technologies that we won't find in a commercial video editor?

JC: It's been such a long time since I've actually used
another system for editing content that it's not clear where the advantage is.
Cinelerra probably has a shorter learning curve than commercial packages
because they're piling on shortcut after shortcut to build up their trade show
demos.

Secondly, the commercial packages once had so many user interface bugs that
it justified a new editor. They took a long time to navigate an audio waveform.
They required many steps to accomplish the simplest importing and
exporting.

Finally, you don't have to pay for Cinelerra. The user has complete
ownership of the source code when they download it. That goes a long way
toward peace of mind. If Avid goes out of business or Apple decides to
mothball their content creation business and go pure servers, you're a lot
better off if you have the source code.

No matter what the future of binary formats and operating systems, the only
requirement for running Cinelerra is going to be a compiler.

ORN: Describe the biggest technical challenges you
faced in putting together Cinelerra.

JC: The biggest challenges are mainly software, with
somewhat smaller challenges involving hardware. The capability to route one
[video or audio] track through any other track, and layer any number of [video]
effects under the tracks, was a significant problem. The ability to render the
timeline in the background over a cluster was another problem.

Nowadays, there's a renewed frenzy in ColorModel support. In 1997, everyone
wanted 8-bit Pseudocolor so they could make animated GIFs. Now everyone wants
their own crazy color model, either 16-bit floating point, 32-bit floating
point, 16-bit fixed point, 10-bit YUV. YUV was a big one in [the year] 2000
because everyone was converting VHS to DVD, which is a pure YUV process.
Cinelerra has a choice of what seem to be the eight most useful ColorModels,
RGB and YUV, 16- and 8-bits per channel, with and without alpha.
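
The eight combinations described here can be enumerated mechanically. The short Python sketch below does so; the labels it prints and the bytes-per-pixel figure are illustrative only (they assume one full sample per channel, with no packing or chroma subsampling) and are not Cinelerra's own identifiers.

```python
from itertools import product


def color_models():
    """List (label, bytes_per_pixel) for each combination of color space,
    bit depth per channel, and presence of an alpha channel."""
    models = []
    for space, bits, alpha in product(("RGB", "YUV"), (8, 16), (False, True)):
        channels = 4 if alpha else 3
        label = f"{space}{'A' if alpha else ''}-{bits}"
        bytes_per_pixel = channels * bits // 8
        models.append((label, bytes_per_pixel))
    return models


for label, size in color_models():
    print(f"{label}: {size} bytes per pixel")
```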

Now, they're not supported in every operation. This is partly intentional
and partly unintentional in a debugging kind of way. You have to experiment to
make sure the effect you want behaves expectedly in the desired ColorModel.
Supporting YUV and RGB at runtime has proven difficult because the math
operations for each are completely different. The result is, of course, most of
the time you can do internal processing in YUV when source footage is in YUV,
and RGB when source footage is in RGB, thus eliminating several steps.
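
To make the point about differing math concrete, here is a minimal Python sketch. It assumes the full-range BT.601 coefficients used by JFIF for 8-bit pixels, which may not be what Cinelerra uses internally, and the function names are invented for the example. A brightness change in YUV touches only the Y channel, while the same change in RGB touches all three channels. Converting between the two costs several multiplications per pixel, which is the step saved by processing footage in its native color space.

```python
def clamp(value):
    """Round and clamp a channel value to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv(r, g, b):
    """Full-range BT.601 (JFIF-style) conversion for one 8-bit pixel."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return clamp(y), clamp(u), clamp(v)


def yuv_to_rgb(y, u, v):
    """Inverse of rgb_to_yuv, up to rounding."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)


def brighten_yuv(pixel, amount):
    """Brightness in YUV: only the luma channel changes."""
    y, u, v = pixel
    return clamp(y + amount), u, v


def brighten_rgb(pixel, amount):
    """Brightness in RGB: every channel changes."""
    r, g, b = pixel
    return clamp(r + amount), clamp(g + amount), clamp(b + amount)


# Same visual result on a gray pixel, reached by different arithmetic.
print(brighten_yuv(rgb_to_yuv(100, 100, 100), 20))
print(rgb_to_yuv(*brighten_rgb((100, 100, 100), 20)))
```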

ORN: When it comes to developing a video editor, what
technical hurdles are there in dealing with video on the Linux platform?

JC: Linux has virtually non-existent import and export
capability for [video] footage. To get to this point, Linux would need three
things:

1. A way to transfer the raw video data itself between an external device and the Linux box.

2. A way to seek the external device to an exact position in its storage medium from the Linux box.

3. A way to convert to and from the external device's video encoding format on the Linux box.
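
The three requirements above map naturally onto a small programming interface. The Python sketch below is purely hypothetical: ExternalVideoDevice, FakeTapeDeck, and capture are names invented for this illustration, and the "codec" is a placeholder, not a real DV or MPEG-2 decoder. It shows only how transfer, seeking, and format conversion fit together in a capture routine.

```python
from abc import ABC, abstractmethod


class ExternalVideoDevice(ABC):
    """Hypothetical interface covering the three requirements."""

    @abstractmethod
    def seek(self, frame_number):
        """Position the device at an exact frame on its medium."""

    @abstractmethod
    def read_raw(self, frame_count):
        """Transfer raw, still-encoded frames from the device."""

    @abstractmethod
    def decode(self, raw_frame):
        """Convert one frame from the device's encoding format."""


class FakeTapeDeck(ExternalVideoDevice):
    """In-memory stand-in for a deck; each frame is one byte string."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._position = 0

    def seek(self, frame_number):
        if not 0 <= frame_number <= len(self._frames):
            raise ValueError("frame out of range")
        self._position = frame_number

    def read_raw(self, frame_count):
        chunk = self._frames[self._position:self._position + frame_count]
        self._position += len(chunk)
        return chunk

    def decode(self, raw_frame):
        # Placeholder codec: a real one would produce pixels.
        return raw_frame.decode("ascii").upper()


def capture(device, start, count):
    """Seek, transfer, then convert: the three steps in order."""
    device.seek(start)
    return [device.decode(raw) for raw in device.read_raw(count)]


deck = FakeTapeDeck([b"a", b"b", b"c", b"d"])
print(capture(deck, 1, 2))
```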

Now, before we get into another war about evil murdering dictators
oppressing free software developers by withholding important technical
documents, the lack of import and export capability is probably going to solve
itself in the long term.

More and more of the compatibility between external devices and Linux boxes
is moving from hardware implementation to software implementation. For some new
high definition camcorders coming out, the interface change is purely in
software. Instead of creating new I/O boards with new register sets and logic
waveforms, they're using the existing I/O boards while changing the software to
decode MPEG-2 instead of DV. It's virtually impossible to support new I/O
boards, while it's relatively easy to support new software protocols.

That isn't to say the evil murdering dictators should continue refusing
interviews with their driver developers, but free software developers can get
their biggest gains by doing more in software instead of hardware.

ORN: What kind of help could Cinelerra use from
those willing to volunteer their skills?

JC: The biggest contributions would be detailed
explanations of how to crash it. There are a lot of crash situations, but
they're very hard to reproduce.

We're always interested in new developments in image processing, new
directions the content creation industry is headed in, and platform-independent
assembly languages.

Finally, it takes a long time to incorporate outside changes. New code has
to be verified against the current stability, and it has to be maintainable. So
that limits the amount of new code that can be integrated to bug fixes or big,
major features people are going to use all the time.

Supposedly, this is why a lot of programs have macro languages and language
bindings. The problem is, the guys who use the macro languages to add new
features are normally more interested in programming the language than using
the program to create content.

A lot of people like the Autoconf system and want to rebuild the Cinelerra
tree to use Autoconf. That system has proven to be real hard to cross compile
with and it fills the screen with huge amounts of linker wrappers, compiler
flags, compiler wrappers, concealing the important messages. Furthermore, these
systems of layered build scripts and package configurators have grown so huge
that it's become just as hard to hunt down the right script flags as it is to
configure a makefile.

ORN: Any advice for those interested in modifying the
Cinelerra code to help improve/stabilize it? Or advice in developing Linux
multimedia applications that deal with video?

JC: There are very few features that are going to justify
the amount of work required to implement them. Unless you've got millions of
dollars and a large staff of slaves, features that are big, major, and lasting
in impact are the things you'll be most rewarded for in your private software
adventures.

Finally, nowadays everyone wants to use Linux as a front-end to Win32
binaries, writing a few lines of open source code, and calling into Win32
binaries to do the real work. While this may get Linux recognized by one or
two marketing guys and Windows bigots, remember there's nothing like knowing
the code you've written is always going to work regardless of what agenda one
company or another takes.

ORN: Throughout your working on Cinelerra, what
have you personally learned as a programmer?

JC: Moore's law may have applied to CPU clock speeds, but
it doesn't apply to computer systems as a whole. In 1997, we thought general
purpose computers would be fast enough by 2002 for a C program to decode
compressed 2048x1024 video to an abstracted color model, perform any operation
you wanted, and display it in real-time in the display's color model.

Six years later, the instructions-per-clock-cycle, memory bandwidth, and
memory latency are largely unchanged since the Pentium II. The affordable hard drive still maxes out at 20MB/sec most of the time. Affordable memory still
takes 200ns per request.
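
A back-of-the-envelope calculation shows the scale of the gap. The Python sketch below uses the figures quoted in the interview (2048x1024 frames, 20MB/sec disks, 200ns memory requests) plus two assumptions that are not from the interview: 8-bit RGB without alpha and 30 frames per second. The disk comparison applies to uncompressed footage, and the memory figure is a deliberately pessimistic worst case in which every pixel costs one uncached request. Real code amortizes latency through caches and sequential access.

```python
WIDTH, HEIGHT = 2048, 1024            # frame size quoted in the interview
BYTES_PER_PIXEL = 3                   # assumption: 8-bit RGB, no alpha
FRAMES_PER_SECOND = 30                # assumption: roughly NTSC rate
DISK_BYTES_PER_SECOND = 20 * 10**6    # "20MB/sec", read as 20 million bytes
MEMORY_LATENCY_SECONDS = 200e-9       # "200ns per request"

pixels = WIDTH * HEIGHT
frame_bytes = pixels * BYTES_PER_PIXEL
stream_bytes_per_second = frame_bytes * FRAMES_PER_SECOND

# How many times faster than the disk an uncompressed stream would be.
disk_shortfall = stream_bytes_per_second / DISK_BYTES_PER_SECOND

# Worst case: one uncached memory request per pixel.
worst_case_frame_seconds = pixels * MEMORY_LATENCY_SECONDS

print(f"one frame: {frame_bytes} bytes")
print(f"uncompressed stream: {stream_bytes_per_second} bytes/sec")
print(f"disk is too slow by a factor of {disk_shortfall:.1f}")
print(f"worst-case memory time per frame: {worst_case_frame_seconds:.3f} sec")
```

Under these assumptions, an uncompressed stream needs roughly nine times the bandwidth of the quoted disk, and a naive pixel-at-a-time pass through memory could take over 0.4 seconds per frame, far from real time.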

A lot of the things that were supposed to be absorbed by Moore's law got
done by moving C code to either assembly or hardware-specific implementation.
YUV-RGB display moved from C to XVideo. Libdv is almost entirely IA-32 assembly
language. A massive permutation of color conversions is used, instead of an
abstracted color space. This is all done to get around slow performance, and
it's not very maintainable.

A lot is learned about software planning from doing free software projects
like Cinelerra that you couldn't learn any other way. Producing 150,000 lines
of useful application, finishing something you start, keeping the end in mind
through a long process, are things you can't do any other way.