Linux.com is running a very informative article on Phonon, the new multimedia layer for KDE 4. It explains the rise and fall of Phonon's predecessor aRts and elaborates on the ups and downs of an audio abstraction layer. The article also gives an overview of common use cases and provides some example code. The Phonon website itself provides more code examples and documentation for using the Phonon API and for writing a backend. In addition to the existing NMM backend by Bernhard Fuchshumer, Tim Beaulen is working on a Xine backend for Phonon.

Christian Schaller's arguments against Phonon are not valid.
He basically says: you have a high-level abstraction, but what do you do if you want to get down to the bits-and-bytes level?

The answer is simple:
Imagine you use GStreamer + Phonon.
All the basic apps, like your media player, your VoIP software and system notifications, can rely on Phonon and be sure to work in _every_ environment. But say you also want to use some special audio-editing software, and that software relies on GStreamer directly. No problem: everything will work fine, without the two getting in each other's way.

What if your special audio application uses backend XYZ? Simply choose XYZ as the backend for Phonon, and your "normal" desktop apps will keep working while your audio app is open.

No, that won't work. Phonon is a simplified wrapper for various backends. As such, it will provide the lowest (or at least, a relatively low) common denominator of features. It simply won't support everything you can do with a particular backend, even if you're using the best one.

For me, supporting many backends is a mistake. It makes things easy for distros, but end users don't care; they want the best stuff, and their distro should take care of providing it reliably. KDE should choose best-of-breed technology and adopt it wholeheartedly, rather than hedging its bets and being average.

> KDE should choose best-of-breed technology, and adopt it wholeheartedly,
> rather than hedging its bets and being average.

That's what they're doing. They decided not to rely on the broken GStreamer framework, which, by the way, has been in development since 1999 (seven years) and still offers nothing that end users can rely on.

I've been struggling with GStreamer since it became part of GNOME. It sounded like a good idea initially (and still does), but the implementation inside GNOME (third-party components) is hackish and incomplete from a developer's point of view, and ease of use is not there either: I have encountered crashes ever since the whole mess became part of GNOME. GNOME is known for half-assed implementations and incomplete features. Why adopt one desktop's mistakes on another?

The idea is sound. At the moment, lots of different multimedia projects exist, duplicating the same effort on codecs etc. This is an insane repetition of effort in an area where proven approaches have existed for decades. GStreamer is the closest we have to such an approach, and if we all get behind it, free desktop multimedia will focus and improve rapidly.

I don't like the idea of coding multimedia against a GNOME-like C API either, but sooner or later we'll all have to grow up and work together. Frankly, it's been too long in coming already.

We had the same problem with printing and the problem was solved.
Phonon is the way to go.

Furthermore, software patents are a real risk, and a dependency on one single engine could create real trouble due to legal risks.

One issue that concerned me in SUSE 10, for example: when I play an MP3 in Amarok and then play an MP4 audio file (the default format in iTunes, I guess), it opens in Kaffeine, so I end up listening to two audio files at the same time. This should not happen.

It would be nice if applications like these could (optionally!) play nice with each other. In the case of music or video players, this could mean requesting that other players fade out to a pause when the user starts playback of a song or movie, but in the case of an incoming VoIP call the volume could just be turned down a bit, for instance. Would be cool...

Still, it should be possible to just play two or more songs at the same time. If somebody wants to do that, why not?

"but in the case of an incoming VoIP call the volume could just be turned down a bit, for instance."

That's actually a planned feature of Phonon. Applications can sort themselves into categories, like Communication, Notifications, sound... player... thingies (I don't remember the official names, but stuff like that).

I thought that was for things like routing sounds to a specific piece of hardware? Ah well, maybe it can and will be used more broadly for features like these. Stuff like fading out an already-playing song when starting playback of a new one could then just be policy for that category, as a service provided by Phonon... Hmmm... Sounds pretty cool to me! Go Phonon! :-)

This was discussed in kde-artists before it went down. But I hadn't heard it was actually being planned. Can you confirm that or point me to the docs that say it? I was the one who proposed it, so this might be a bit biased, but if they're planning to do that through Phonon, then that's a unique feature that would justify a new layer, I guess.

"These categories can also become useful for an application that controls volumes automatically, like turning down the music when a call comes in, or turning down the notifications when the media player knows it's playing classical music."
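To make the idea concrete, here is a minimal sketch of what such a category-based volume policy could look like. Everything in it (the class names, the category list, the 0.3 ducking factor) is illustrative and made up for this example; it is not Phonon's actual API.

```cpp
#include <map>

// Hypothetical sketch of a category-based volume policy, loosely modelled on
// the output categories described above (Music, Communication, Notification,
// ...). Not Phonon's real interface.
enum class Category { Music, Video, Communication, Notification, Game };

class VolumePolicy {
public:
    // Each active output registers under a category, at full volume.
    void registerOutput(int id, Category cat) { outputs_[id] = {cat, 1.0}; }

    // When a Communication stream (e.g. a VoIP call) starts, duck all
    // Music/Video outputs instead of pausing them.
    void onStreamStarted(Category cat) {
        if (cat == Category::Communication) {
            for (auto& entry : outputs_) {
                Output& out = entry.second;
                if (out.category == Category::Music ||
                    out.category == Category::Video)
                    out.volume = 0.3;  // turn the music down a bit
            }
        }
    }

    double volumeOf(int id) const { return outputs_.at(id).volume; }

private:
    struct Output { Category category; double volume; };
    std::map<int, Output> outputs_;
};
```

The nice part of this design is exactly what the quote suggests: the policy lives in one place, and applications only have to declare what kind of sound they produce.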

You can't escape software patent damages. When they decide to sue you for decoding MP3, no backend will save you (the same goes for JPEG and an incredible number of audio/video/file formats).
When they enforce a patent on "listening to music with a computer", no frontend will save you either.
The only answer to software patents is... not to let them be allowed in your country!
So never give up fighting them, and spread awareness of the incredible risks among developers, who too often seem not to understand the danger.
http://wiki.ffii.org/SwpatcninoEn
http://www.nosoftwarepatents.com
http://www.ffii.org

Xine has a licensing advantage over GStreamer: xine is licensed under the GPL and enjoys the GPL's full copyleft protection.

The Free Software Foundation warned against the use of the LGPL, but the GStreamer developers didn't listen; instead they chose to sell out their users to the entertainment cartel.

Xine will always be able to remove the DRM from any crippled multimedia plugin, since the GPL ensures that all plugins must be Open Source.

I encourage all Freedom-conscious developers to stop working on GStreamer, NMM, Helix or any other backend that allows the Media Mafia to ram DRM down consumers' throats. Work for Freedom, help out xine!

Yep, Helix sucks. From day one on Linux, they've done nothing that cooperates with Linux multimedia efforts (such as GStreamer), except where it makes their own product more famous, since their Windows market is dying quickly.

"The Free Software Foundation warned against the use of the LGPL, but the GStreamer developers didn't listen; instead they chose to sell out their users to the entertainment cartel."

I agree actually. You can't reconcile the two. Does free software exist to allow usage without pointless and artificial restrictions, or does it exist to enforce those restrictions by allowing people to put them in? Xine is also a quality back-end.

Linux needs DRM to survive in the home.
DRM is neither a good nor a bad thing, as long as you know what you are getting.
What Linux users are certainly not getting today is any commercial content (unless you steal it). Linux in the workplace or the geek's bedroom may not need it, but I wonder how long it will make sense to use Linux in the living room without it.

No. DRM is a slippery slope, which will change everything if you give an inch. DRM *IS* bad; it enforces "rights" that content producers aren't legally entitled to, and thereby illegally infringes on users' freedom. Under no circumstances should anyone give in to that just because it makes it easier to watch another Hollywood moneyspinner with no real depth. There are already real alternatives, like podcasting and vodcasting, which will give you more to watch and listen to than ever before, without giving up your rights.

Look, when you use a stick to lever a rock, you either move the rock or break the stick. DRM is too heavy a rock to be moved by the Linux stick, so please don't break what market share Linux has by stripping the ability to play DRM media out of Linux.

There is a simple reason not to integrate DRM into the Linux kernel: Linux is Open Source and under the GPL.

A license text does not stand above the applicable law, and in some nations (e.g. Germany) the law says you are not allowed to write or alter software in a way that enables it to bypass a DRM system.

If we integrate a DRM system into OSS, then we de facto destroy its OSS status.

So even without the GPLv3 DRM terms, DRM and OSS are implicitly incompatible.

Unfortunately, MP3 is NOT a software patent. It is a patent for a compression algorithm. As I see it, the problem is not that the patent exists but that their licensing method doesn't work for OSS. That is, I have no objection to paying Thomson my $2.50 for a license for the Fraunhofer patents. The problem is that they don't do business that way.

"Unfortunately, MP3 is NOT a software patent. It is a patent for a compression algorithm."

True, and that's a huge misconception people have. Because it is a patent on a compression and audio algorithm, what on Earth can you apply it to? Can you apply it to a format that perhaps uses the same principles? The water just keeps getting muddier.

"That is, I have no objection to paying Thomson my $2.50 for a license for the Fraunhofer patents. The problem is that they don't do business that way."

True, and it's a trap open source projects have fallen into, to everyone's total detriment.

One issue that concerned me in SUSE 10, for example: when I play an MP3 in Amarok and then play an MP4 audio file (the default format in iTunes, I guess), it opens in Kaffeine, so I listen to two audio files at the same time.

This should not happen.

It's easy: you stop playback in Amarok, then you play your other file in Kaffeine. Since you're already listening to the first one, you'd have to be quite dumb not to realise that something is being played.

Plus, what prevents you from changing the order of programs associated with mp4 so that it's played in Amarok instead of Kaffeine?

mp4 is usually a video format, so Kaffeine is just fine -- as long as it is video.

It goes like this: you click on the file in your playlist and it opens in Kaffeine.

"you'd have to be quite dumb not to realise that there's something being played."

Come on, you don't manually stop song A and then play song B. You are playing song A and then want to listen to some special audio file B (who cares about codecs) while song A is still playing. Since you usually listen to only one song at a time, the default is that your music player stops song A and starts song B. This is the way it *always* works.

That it opens in Kaffeine is no problem, but that Amarok does not automatically stop playing the previous file is. The same usually goes for videos. You don't want to listen to Beethoven and watch a video from LinuxTag at the same time; at least, my multitasking capabilities are limited.

It is, as you point out, just an annoyance, not a real problem that couldn't be worked around manually. But it does mean that I don't like sound on Linux, which for me is not fully ready because of that small usability annoyance.

Just for your information: The example code in the article is from the ArtsPlayer class in JuK. Of course there's a lot more code around that (for example the setVolume code in the original ArtsPlayer expands to some heavy code for inserting the volume control into the signal path). Phonon hides all those details in the backends and, of course, remembers the output volume between media files.

That's really impressive. There is one thing I'm wondering about: shouldn't the destructor delete the m_media, m_path and m_output objects created in the constructor? Or is there some sort of parent/child relationship going on?
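For what it's worth, Qt's answer to that question is yes: a QObject deletes its children in its own destructor, so Phonon objects constructed with a parent need no explicit delete. Here is a minimal mock of that one ownership rule (a sketch only, not Qt's QObject; real QObject also handles a child being deleted before its parent, which this mock does not):

```cpp
#include <vector>

// Simplified model of Qt-style ownership: an Object constructed with a
// parent is deleted by the parent's destructor. Children must not be
// deleted manually in this sketch.
class Object {
public:
    explicit Object(Object* parent = nullptr) {
        if (parent) parent->children_.push_back(this);
    }
    virtual ~Object() {
        for (Object* child : children_) delete child;  // parent owns children
    }
private:
    std::vector<Object*> children_;
};

// Instrumentation so we can observe the rule working.
static int liveObjects = 0;

struct Tracked : Object {
    explicit Tracked(Object* parent = nullptr) : Object(parent) { ++liveObjects; }
    ~Tracked() override { --liveObjects; }
};
```

With this rule, a player class that creates its media object, path and output with `this` as parent needs an empty destructor: deleting the player tears down the whole tree.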

Those two implementations aren't comparable. In the ArtsPlayer::play() method, there are null checks that handle the situation and proceed, while in the Phonon implementation the method just returns without doing anything, as if it hadn't even been called. Furthermore, the ArtsPlayer implementation seems to handle the engine being dysfunctional, whereas the Phonon implementation presumably handles multiple engines internally but is set up and checked for functionality elsewhere in JuK. Having said all that, it does look a little nicer to work with.

..My final objection to Phonon is that even if they manage to prove me wrong about their ability to provide a truly useful limited cross-framework API, and demonstrate that a menu option offering your grandma the choice of playing her music with framework X, Y or Z actually solves more problems than it creates, I still think it falls short. It wouldn't provide an API for building applications like PiTiVi, Diva, Jokosher, Buzztard, Flumotion and so on, which I think is where we want to be today in order to provide a competitive desktop. Mac OS X and Windows Vista are showing us that this is the role the desktop is heading towards.

What I still wonder is: will there be mutual exclusion, or can these media things coexist?
From what I understood, Phonon will become the media layer of KDE 4: every audio and video app will output to it, and everything will then pass to the chosen media engine, which can be system-wide or application-specific, right?
Then how do non-KDE apps play and connect to their media engine (GStreamer for GNOME, I suppose)? And can particular apps, like music software, which most of the time needs realtime support, continue to use the hardware directly, or use a low-latency server like JACK?
This is a little confusing to me: how do those different media layers interact with each other? (Note I'm talking about the Linux platform, since it's the one I use.)

1. Phonon is not a server; it's a thin wrapper API around whatever media framework you like (btw: GStreamer doesn't have a sound "server" component either).

2. Thanks to ALSA dmix, it is possible to run multiple frameworks side by side. Thus there is no reason why Phonon can't coexist peacefully with each of the multimedia frameworks when they are used via their native APIs.

No server => no significant latency. I have no deeper knowledge of JACK and GStreamer, but from what I understand you could have a JACK sink in GStreamer that just pours the sound into JACK. But again: there is no need for that if you use ALSA on Linux (which all modern distributions do).
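To illustrate the "thin wrapper, not a server" point: in a wrapper design, the application codes against an abstract interface, and the concrete engine is swapped in at runtime. A rough sketch with made-up class and function names (this is the general pattern, not Phonon's actual backend API):

```cpp
#include <memory>
#include <string>

// Abstract engine interface: the only thing application code ever sees.
// All names below are illustrative.
class MediaBackend {
public:
    virtual ~MediaBackend() = default;
    virtual std::string name() const = 0;
    virtual void play(const std::string& url) = 0;
};

class XineBackend : public MediaBackend {
public:
    std::string name() const override { return "xine"; }
    void play(const std::string&) override { /* hand off to libxine */ }
};

class GStreamerBackend : public MediaBackend {
public:
    std::string name() const override { return "gstreamer"; }
    void play(const std::string&) override { /* build a GStreamer pipeline */ }
};

// The desktop (or the user) picks the backend; applications never care which.
std::unique_ptr<MediaBackend> makeBackend(const std::string& choice) {
    if (choice == "gstreamer") return std::unique_ptr<MediaBackend>(new GStreamerBackend());
    return std::unique_ptr<MediaBackend>(new XineBackend());  // default engine
}
```

Because there is no daemon in the middle, nothing about this adds latency; the wrapper just forwards calls to whichever engine was loaded.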

Reading all that noise sounded familiar to me, and then I remembered: CORBA vs. DCOP. When KDE chose DCOP over CORBA, exactly the same kind of objections were raised, and the same kind of noise was made. We were dooming the project; CORBA is better; DCOP will never make it; you need a super-good IPC mechanism that is hardware-independent, programming-language-independent and network-transparent or you are dead; the overhead of maintaining DCOP will kill you while CORBA and ORBit are right there.

The result:
Four years later, GNOME has struggled with CORBA to the point where they almost ditched Bonobo. Very few GNOME applications use the features that the super-CORBA-powered Bonobo was supposed to provide. On the other hand, DCOP has been so successful that it was picked up and rewritten by the freedesktop.org folks as the basis of a wider Linux/Unix IPC mechanism. All KDE applications support DCOP transparently, and you can do a lot of cool stuff with it.

So, let them talk, let them brag, let them predict the doom of KDE and its multimedia framework. KDE developers have shown that they know how to pick excellent technical solutions that last over the years (aRts being a notable exception). Phonon is the way to go, and the people who refuse to see it now will be happy that KDE did this when they look back a few years from now.

That's exactly why Phonon is the right choice for KDE 4. It may be fine for GNOME developers to change their software every time GStreamer changes its API/ABI (which is too often, sadly), but it's not for KDE.

If KDE used GStreamer, either KDE app developers would stick with GStreamer 0.10.5 for the whole KDE 4 lifetime, or they would have to chase GStreamer 0.10.6, 0.12, 0.14, etc.

GStreamer releases so many versions with so many changes so often that it's a very fast-moving target, and that's definitely BAD.

Heh, yeah, it's kind of ironic that the GStreamer devs would want KDE 4 to use GStreamer 0.10. If it actually did, KDE would pretty much end up forking GStreamer once GStreamer's development moved on in a few years.

I don't see the problem here. If a new version of GStreamer comes out with a new API, then it's up to KDE to port to it, like they would for any other new API, like CUPS etc. This talk that's going around in blogs, that GStreamer folks should maintain KDE's audio layer etc. is totally backwards.

Now, if KDE needs a certain interface for the lifetime of KDE 4, all they have to do is link against a specific GStreamer library version. It has always worked this way: a simple symlink points the requested version at the latest compatible one. And if building with multiple GStreamer versions present is an issue, it only takes a configure argument to say where the older header files are located.

Hmm, I think you should mind your own business. The KDE people have decided to use Phonon, and that's it. We should stop forcing a broken multimedia framework like GStreamer down users' throats.

Same here: xine is the only backend that's at all reliable for me in Amarok on both of my systems (GStreamer has NEVER worked once), and Kaffeine has always been great at playing back any movie I throw at it (using the xine backend, of course). I've often wished that I could replace aRts on my laptop with a xine-based program (when aRts plays a sound it's all crackly, but not when the same sound is played with a xine-based player).

The problem is, xine is just a playback framework. It doesn't help people who want to do other multimedia things, like actually make a video. And mplayer duplicates much of that effort. GStreamer, on the other hand, breaks it all down into components, so that people who know how to encode/decode QuickTime really well can do that, people who know MIDI can do that, etc. Even if each person has their own project, like Amarok, or xine, or Blender, they could all contribute whatever improvements they want to GStreamer, and EVERY project would benefit. So having a common multimedia layer that everyone puts their weight behind is much better in the long run, even if it takes a little longer to make it really come to life. It's been too long already; we really need to start working together on this.