The result of the past few months of hacking

As some people in the PiTiVi community noticed, I haven’t been working that much on PiTiVi over the past few months. I did mention that the work I was doing was somewhat related to PiTiVi, and that hopefully I’d be able to talk about it openly.

That day has now arrived.

GStreamer editing services

The “GStreamer Editing Services” is a library to simplify the creation of multimedia editing applications. Based on the GStreamer multimedia framework and the GNonLin set of plugins, its goal is to suit all types of editing-related applications.

The GStreamer Editing Services are cross-platform and work on most UNIX-like platforms as well as Windows. They are released under the GNU Library General Public License (GNU LGPL).

Why?

Because writing audio/video editors is a lot of work, and we should make it as easy as possible for people to write such applications, letting them leverage the power of GStreamer without requiring a PhD in nuclear engineering.

The GStreamer Editing Services (GES) introduce three concepts:

GESTimeline: This is your central container, corresponding to a timeline. You can add Tracks and Layers to it. It is also a GStreamer element, so you can use it in any GStreamer pipeline.

GESLayer: This corresponds to the user-visible part of the timeline. This is where the user lays out the various LayerObjects (files, transitions, …) they wish to use. LayerObjects can be as simple or as advanced as required (for example, a FileSource can have a ‘mute’ property, an ‘overlay’ property, a video-rotation property, …), and those objects take care of properly filling up the Tracks. Applications can create their own LayerObject subclasses for custom usage, implementing the logic of which TrackObjects to create in the background, without having to worry about anything else.

GESTrack: This corresponds to the media part of the timeline. An audio editor will only require one audio Track, a video editor will require one audio and one video Track, etc. These parts don’t have to be visible to the user… nor to the application developer.
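To make the relationship between these three concepts concrete, here is a toy Python model of how a layer object fills up the tracks. All class and method names below are invented for illustration; this is NOT the real GES API, just a sketch of the containment and delegation described above.

```python
class Track:
    """Media-level part of the timeline (e.g. audio or video)."""
    def __init__(self, media_type):
        self.media_type = media_type
        self.objects = []          # low-level track objects

class FileSource:
    """A user-visible layer object; it fills the tracks itself."""
    def __init__(self, uri, start, duration):
        self.uri, self.start, self.duration = uri, start, duration

    def create_track_objects(self, track):
        # A real implementation would create GNonLin-backed objects;
        # here we just record a placeholder entry per track.
        track.objects.append((self.uri, self.start, self.duration))

class Layer:
    """User-visible part of the timeline, holding layer objects."""
    def __init__(self):
        self.objects = []

class Timeline:
    """Central container holding both layers and tracks."""
    def __init__(self):
        self.layers, self.tracks = [], []

    def add_track(self, track):
        self.tracks.append(track)

    def add_layer(self, layer):
        self.layers.append(layer)

    def add_object(self, layer, obj):
        layer.objects.append(obj)
        # The layer object decides what to put in each track:
        for track in self.tracks:
            obj.create_track_objects(track)

# A video editor would typically use one audio and one video track:
tl = Timeline()
tl.add_track(Track("audio"))
tl.add_track(Track("video"))
layer = Layer()
tl.add_layer(layer)
tl.add_object(layer, FileSource("file:///clip.ogv", start=0, duration=5))
```

The point of the sketch is the last method: the application only deals with the user-visible layer object, and that object is responsible for populating every track.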

Why another library in addition to GNonLin?

The answer is that GNonLin will remain a media-agnostic set of elements whose goal is to make it easy to use parts of streams (i.e. from GStreamer elements) and arrange them through time. While this makes GNonLin very flexible… it also means there is quite a bit of extra code to write before getting to ‘video-editing’ concepts.

Can I write a slideshow/audio/video/cutter/<crack-editor-idea> with it?

Short answer: yes. Longer version: yes, but you might have to write your own LayerObject subclasses if you have some really specific use-cases in mind.

Where can I find it?

The git repository is located here. Documentation can be generated in docs/libs/ if you have gtk-doc, and you can find some minimalistic examples in tests/examples/. I will gradually be adding more documentation and examples.

GstDiscoverer, Profile System and EncodeBin

GES alone isn’t enough to end up with a functional editor. There are a couple of peripheral multimedia-related tasks that need to be done, and to solve those I’ve also been working on some other items. All of the following can be found in the gst-convenience repository.

GstDiscoverer

Those of you familiar with PiTiVi/Jokosher/Transmageddon/gst-python development might already be using the Python variant of this code. The goal is to get as much information as possible (duration, tags, number of streams, their types, the codecs used, …) from a given URI (file, network stream, …) as fast as possible. The code already existed, but it was Python-only, so I rewrote it in C with several improvements. It can be used synchronously or asynchronously, and comes with a command-line test application in tests/examples/.
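To illustrate the kind of result a discovery call hands back, here is a minimal sketch. The function and field names are invented for this example; the real C API lives in the gst-convenience repository, and a real implementation would preroll a short-lived GStreamer pipeline instead of returning canned data.

```python
from collections import namedtuple

# Hypothetical result types -- not the actual gst-convenience structures.
StreamInfo = namedtuple("StreamInfo", "media_type codec")
DiscovererInfo = namedtuple("DiscovererInfo", "uri duration streams tags")

def discover(uri):
    """Stand-in for a synchronous discovery call.

    A real discoverer would build a pipeline around the URI, wait for it
    to preroll, and read the duration, tags and stream topology from it.
    Here we return canned data purely to show the shape of the result.
    """
    return DiscovererInfo(
        uri=uri,
        duration=5.0,  # seconds
        streams=[StreamInfo("video/x-theora", "Theora"),
                 StreamInfo("audio/x-vorbis", "Vorbis")],
        tags={"title": "example"},
    )

info = discover("file:///clip.ogv")
```

The asynchronous variant would take a callback (or emit a signal) with the same kind of info structure once discovery finishes, instead of blocking the caller.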

EncodeBin

Creating encoding pipelines, despite what many people might think, is not a trivial business and requires constantly thinking about a lot of little details. To make this as smooth as possible, I have written a convenience element for encoding: encodebin. It has only one required property: a GstEncodingProfile. Once you have set that property, you can add the element to your pipeline, connect the various streams you wish to encode, connect your sink element… and set the pipeline to PLAYING.
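The workflow above can be sketched with a few toy classes. None of these names are the real GStreamer or encodebin API; the sketch only mirrors the sequence of steps: set the profile, add the element, link the streams and the sink, then play.

```python
class EncodeBin:
    """Toy stand-in for the encodebin convenience element."""
    def __init__(self):
        self.profile = None        # the single required "property"
        self.inputs, self.output = [], None

    def set_profile(self, profile):
        self.profile = profile

    def link_stream(self, stream):
        # Without a profile, encodebin can't know which encoders to use.
        if self.profile is None:
            raise RuntimeError("profile must be set before linking streams")
        self.inputs.append(stream)

class Pipeline:
    """Toy stand-in for a GStreamer pipeline."""
    def __init__(self):
        self.elements, self.playing = [], False

    def add(self, element):
        self.elements.append(element)

    def play(self):
        self.playing = True

# The workflow described in the text:
ebin = EncodeBin()
ebin.set_profile({"container": "application/ogg"})  # 1. set the profile
pipe = Pipeline()
pipe.add(ebin)                                      # 2. add to the pipeline
ebin.link_stream("audio-src")                       # 3. connect the streams
ebin.link_stream("video-src")
ebin.output = "filesink"                            # 4. connect the sink
pipe.play()                                         # 5. set to PLAYING
```

The design point worth noting is that everything codec-specific is pushed into the profile object; the application code never names an encoder element.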

Encoding profiles are not a new idea, and there have been many discussions in the past on how to properly solve this problem. Instead of concentrating on how to best store the various profiles for encoding… I decided to tackle the beast the other way round and, after sending an RFC to the GStreamer mailing-list and collecting as many use-cases as possible, came up with a proposal for a C-based encoding-profile description (see gst-libs/gst/profile/gstprofile.h). I still have some more use-cases to test and a few extra things to implement in encodebin, but so far the current profile system seems to fit all scenarios.

The remaining problem to solve… is to figure out how to store those encoding profiles in a persistent way for all applications to benefit from them.

Next steps

Using all of the above in PiTiVi. I’m also looking forward to comments/feedback/requests from people who wish to use any of the above in their applications.

In addition to that, I will be at the Maemo Barcelona Long Weekend starting on Friday, where we will try to nail down the UI and code requirements for creating a video editor for Maemo. All of the above should make that much easier to do than anticipated.

Well, I’m considering dropping our backend to use the stuff from the gst-convenience repository. We tried to solve the exact same problems, and your code is way better than what we have produced so far.

If the profile specification fits our needs, I think there is no reason to keep our code. Gnac will be a front-end on top of your libs.

Have you ever thought of building a GUI profile editor?
One where users can easily create their own profiles?

Will your profiles be generic (not specifying a particular encoder for a given format), such as those proposed by Transmageddon?
This is a big problem for us. Many “power users” want to be able to specify the output format precisely.

@david, the current work only handles the backend part of encoding. Think of it as a three-part system: [Actual Profiles] [GstProfile API] [EncodeBin]. Right now only the last two parts are handled.

The profile system doesn’t use specific element names, but only a combination of “media type” (ex: audio/mpeg, mpegversion=1, layer=3) and “preset” (ex: “VBR/128k”). It uses that for the audio encoding, video encoding and muxing formats. After thinking about it many times, this is the only sensible way to handle it… using hardcoded element names was just a bad idea.
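A profile built that way might look like the following sketch. The field names are illustrative only, not the actual gstprofile.h API; each stream description is just a (media type, preset) pair, with no encoder element name anywhere.

```python
from collections import namedtuple

# Hypothetical shapes, mirroring the "media type" + "preset" idea above.
StreamProfile = namedtuple("StreamProfile", "media_type preset")
EncodingProfile = namedtuple("EncodingProfile", "name muxer audio video")

mp3_profile = EncodingProfile(
    name="MP3 audio",
    # Muxing format; None preset means "use defaults".
    muxer=StreamProfile("application/x-id3", None),
    # Audio encoding format, with a named preset for the tuning.
    audio=StreamProfile("audio/mpeg, mpegversion=1, layer=3", "VBR/128k"),
    # Audio-only profile: no video stream.
    video=None,
)
```

Any element whose source pad can produce the requested media type is a candidate encoder; which one actually gets picked is left to the backend, which is exactly what keeps the profile generic.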

The remaining parts to create are (1) creating profiles and (2) storing and accessing profiles.

Creating profiles has two tricky parts. One of them is finding the right format from a user’s point of view. For that I might create some more convenience methods using gst-pbutils to list the available audio/video encoding and muxing formats in a user-readable way. The other problem concerns presets. As you mentioned… power users will want to fine-tune the exact element being used… but we only accept presets and not a collection of properties. The idea here is to be able to create some form of anonymous or explicit preset which would be stored only in the profile, but contain the various properties instead of just a preset name. I’m still pondering that last part.
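That anonymous-versus-explicit preset idea can be sketched in a few lines. This is purely illustrative of the design being pondered above, with invented names: a preset is either a registered name looked up in some registry, or an ad-hoc bag of element properties stored inline in the profile.

```python
def resolve_preset(preset, registry):
    """Return the property dict to apply to the encoder element.

    preset   -- either a preset name (str) or an inline dict of
                properties (the hypothetical "anonymous preset").
    registry -- mapping of known preset names to property dicts.
    """
    if isinstance(preset, dict):
        # Anonymous preset: the properties travel with the profile.
        return preset
    # Explicit preset: look it up by name.
    return registry[preset]

# A registry of named presets (values are invented for illustration):
registry = {"VBR/128k": {"bitrate": 128000, "vbr": True}}
```

A power user's hand-tuned settings would then survive as an anonymous dict inside the profile, while casual users keep pointing at shared named presets.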

Storing/accessing profiles is either straightforward or really tricky depending on how you look at it. You can just store/use profiles from your application (some people have requested that), in which case… you’re basically done (just store/load them as you want). The problem is storing/accessing them in a global fashion (so that all GStreamer applications can share them), allowing users to add their own profiles, packages to provide profiles, etc. Initially I wanted to leverage the GstRegistry for that… but it didn’t make much sense in the end. I’m still figuring that part out.

Having an application to view/modify/create profiles would definitely be a nice feature. We could even make standard GTK+ widgets so that all applications have a streamlined interface for accessing/creating profiles. Any help in that area would be much appreciated.

I think the media-type combination is a good idea.
It will allow the libs to evolve better in the future.

The point I’m not convinced about is the preset part.
Maybe I have misunderstood what it is about, but I think it is a really bad idea.

The point is that you’re trying to build a “standard” and convenient way to handle encoding with gstreamer. Great!

But why design a system that will not satisfy all kinds of users?
I think that power users have the right to want, and use, profiles of 192 kbit 16-bit 48’000 Hz WAV files if they want to. I’ve read the debates on the mailing lists… I still cannot understand this preset “limitation”. When will this layer be useful? How will the set of standard presets be defined? Isn’t this approach a bit of an “enlightened dictator” way? What happens if the dictator is wrong?

I think that, as it may become a standard way to handle profiles, sharing profiles between apps is a must. Why not store XML files in a specific folder? Sharing a profile with friends would then be easy: just send the XML to your neighbor :).

A profile editor can definitely be a great idea.
It fits the “standard” way to handle things.
As you said, providing standard GTK+ widgets is a benefit for the entire community!

Once more, I see an inconsistency with presets here.
What can you edit in a profile when you have predefined presets?
And can we build new presets with the profile editor?
I definitely cannot see what this preset layer brings.