Like the previously featured articles on new KDE 4 technologies for Job Processes or SVG Widgets, today we
feature the shiny new multimedia technology Phonon. Phonon is designed to take
some of the complications out of writing multimedia applications in
KDE 4, and ensure that these applications will work on a multitude
of platforms and sound architectures. Unfortunately, writing about
a sound technology produces very few snazzy screenshots, so instead
this week has a few more technical details. Read on for the details.

Phonon is a new KDE technology that offers a consistent API to use audio or video within multimedia applications. The API is designed to be Qt-like, and as such, it offers KDE developers a familiar style of functionality. (If you are interested in the Phonon API, have a look at the online docs, which may or may not be up to date at any given moment.)

Firstly, it is important to state what Phonon is not: it is not a new sound server, and will not compete with xine, GStreamer, ESD, aRts, etc. Rather, due to the ever-shifting nature of multimedia programming, it offers a consistent API that wraps around these other multimedia technologies. Then, for example, if GStreamer decided to alter its API, only Phonon needs to be adjusted, instead of each KDE application individually.
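
The wrapping idea can be sketched in a few lines. This is only a toy illustration in Python, not Phonon's actual API (which is C++ and Qt-like); the names `Engine`, `XineEngine`, `GStreamerEngine` and `MediaPlayer` are hypothetical and exist only for this example.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical backend interface; the only thing applications depend on."""

    @abstractmethod
    def play(self, url: str) -> str:
        ...


class XineEngine(Engine):
    def play(self, url: str) -> str:
        # A real engine would call into the xine library here.
        return f"xine playing {url}"


class GStreamerEngine(Engine):
    def play(self, url: str) -> str:
        # If the backend's own API changes, only this method needs updating.
        return f"gstreamer playing {url}"


class MediaPlayer:
    """Stands in for an application: it only ever talks to Engine."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def play(self, url: str) -> str:
        return self.engine.play(url)
```

Swapping `XineEngine()` for `GStreamerEngine()` when constructing `MediaPlayer` changes the backend without touching the application code, which is the property the wrapper is meant to provide.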

Phonon is powered by what the developers call "engines" and there is one engine for each supported backend. Currently there are four engines in development: xine, NMM, GStreamer and avKode (the successor to aKode). You may rest comfortably in the knowledge that aRts is now pretty much dead as a future sound server, and no aRts engine is likely to be developed. However, aRts itself may live on in another form outside of KDE. The goal for KDE 4.0 is to have one 'certified to work' engine, and a few additional optional engines.

Other engines that have been suggested include MPlayer, DirectShow (for the Windows platform), and QuickTime (for the Mac OS X platform). Development on these additional engines has not yet started, as the Phonon core developers are more concerned with making sure that the API is feature-complete before worrying about additional engines. If the Phonon developers attempt to maintain too many engines at once while the API is still in flux, the situation could become quite messy. (If you would like to contribute by writing an engine, jump into the #phonon channel at irc.freenode.org.)

When the user or an application selects an engine, Phonon uses it to determine which file formats and codecs that backend supports, and then dynamically allows the KDE application to play your media. In the KDE 3 series, by contrast, the user has to change engines manually in each application (Kaffeine, Amarok, JuK, etc.) rather than being able to select an engine for use across KDE.

Once an engine is selected for Phonon, it allows the programs to do the standard multimedia operations for that engine. This includes the usual actions performed in a media player, like Play, Stop, Pause, Seek, etc. Support also exists in Phonon for higher-level functions, like defining how tracks fade into one another, so that applications can share this functionality instead of re-implementing it each time. Of course, some applications will want more control over their cross-fading, and so are still free to design their own implementation.

The engine with the greatest progress so far is xine, which I was able to set up and run on my system. I was unable to get the NMM (notoriously hard to compile and set up) or GStreamer engines to compile on my system, whilst avKode is currently disabled by default. I would show you a screenshot of JuK or Noatun playing audio with Phonon, but right now these applications look just like their KDE 3.x versions (only with a somewhat ugly/broken interface!). When they are getting polished for release, I will show them off in a later article.

Matthias Kretz offers a short video which, if you turn your speakers on while watching, demonstrates device switching. Phonon lets you switch audio devices on the fly, and you can hear the specific moment when the music switches between his various outputs (headphones, speakers, etc.).

Matthias also submits the following screenshot of output device selection using Phonon's configuration module. This is also a work in progress, so take it with a grain of usability salt.

There are not many things that I can take a screenshot of which show Phonon in use (screenshots of an audio framework are notoriously difficult to compose!), but I can describe one of the neat side effects of using Phonon: network transparency. KDE has long used KIOSlaves to access files over the network as easily as if they were stored on your local computer. Multimedia apps like JuK or Amarok should be able to add files transparently over the network to their collections without having to be concerned about whether or not the back-end engine is aware of how to deal with ioslaves. This support is already partially implemented in KDE 4, and is most visible through audio thumbnails, which are working for many people over any KIO protocol, including sftp:// and fish://, two popular protocols among KDE power users. They do not yet work for me due to some instability in the fish:// KIOSlave of my current compilation, but the developers in the #phonon IRC channel claim that this functionality will be ready and working when fish:// is more stable.

So, Phonon, while still in development, is going to be a great pillar technology for KDE application programmers, making their job easier and removing the redundancy and instability caused by constantly-shifting back-end technologies, and (eventually) making support for other platforms a piece of cake. This means that those developers can spend more time working on other parts of their applications to ensure KDE Multimedia applications shine even more brightly than they currently do.

A couple of quickies here to note: Mark Kretschmann, lead developer for Amarok, has officially opened up Amarok 2.0 development this week, and seems to be quite interested in what Phonon can do for Amarok 2.0. He doesn't rule out keeping their own engine implementations, as they currently do in the Amarok 1.4 series. However, given its early stage of development, Phonon can likely be adjusted to ensure that it will do everything Amarok asks of it.

If you're looking for a way to help out with KDE and are not a programmer, Matthias Kretz, lead developer of Phonon (Vir on IRC), has requested some help in keeping the Phonon website up-to-date.

And lastly, a few translations of these articles have been popping up around the world in various languages. Sometimes more than one translation is happening for a specific language. If you are translating or plan to translate these articles, send me a message so that we can save everyone some work and avoid redundancy (let's keep the redundancy-reduction spirit of Phonon alive!).

Please, STOP doing this. I suppose Digg users already read the Digg page, so you don't have to post links here. Same applies for del.icio.us, reddit, spurl, and the boring list of sites that do the same functionality.

Right, but you don't need to post on the dot linking to the digg story about the dot story. Everyone that reads it is already here, reading the actual story; they don't need to read a summary on another website.

the idea of asking people to "digg this" is not to have them go to digg.com and read the article *there*. The idea is to have those few friends of KDE (to be found *here*) who *do* frequent digg.com, to give the entry there a "thumbs up", so it may enter one of the front pages of digg for a while.

Very often, Digg posts that get a dozen diggs within their first hour (only a dozen!) do get a lot of attention in their second and third hour....

And once on a Digg front page, a *lot* more Digg readers will see it there, and a part of them will come over to the Dot and read the full article here.

Got this?

Still thinking it is a bad idea to tell people about a Digg entry?

If yes, let it be told to you, that your trying to censor friendly people who are commenting on Dot stories and your asking them to stop helping to promote KDE is also a ... bad idea. Got *that*?

Hi,
certainly, having jackd interfaced somehow into Phonon would be great for everybody who wants to make music on Linux.
As far as I understand, jackd itself is not an engine... It is the way to get into your soundcard/midi and stream the data from/to there. So it is roughly at the level of ALSA or OSS...
On the contrary, Phonon relies on "hefty" backends such as xine, capable of doing all the file-decoding for mpegs, oggs, etc; and it is xine that would actually need to talk to jackd (rather than to ALSA, OSS, or so)...
So: it might be a question to the XINE developers to support jackd...

Correct. As far as I know xine supports audio output to jackd since version 1.1.3. As soon as xine tells Phonon it supports jack, Phonon will list jack in the list of output devices (where in the screenshot above you already see arts and esd as soundservers).

The idea is that Phonon will try to guess the correct defaults and give you the possibility to adjust those.

Regarding soundcard setup I'll try to make it smart. For example it currently uses dmix automatically, if that doesn't work fall back to hw and if that doesn't work fall back to plughw. Now the next step is to first test whether hw does mixing already, then dmix might be unnecessary or even counterproductive.

Regarding the soundservers I'd like to be able to help the user configure the soundserver correctly if he wants to use one. All that needs a lot of testing and information about different hardware and setups so that I will never be able to get it right on my own. It would be great if people can help out here.
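
The fallback order described above (try dmix, then hw, then plughw) amounts to "take the first device that opens". A minimal sketch in Python follows; it is not Phonon's code, and `pick_device`, `PREFERENCE` and `fake_probe` are hypothetical names, with the probe standing in for actually opening an ALSA device.

```python
from typing import Callable, Iterable, Optional


def pick_device(candidates: Iterable[str],
                can_open: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that can be opened, or None if all fail."""
    for name in candidates:
        if can_open(name):
            return name
    return None


# Order taken from the description: dmix first, then hw, then plughw.
PREFERENCE = ["dmix", "hw", "plughw"]


def fake_probe(name: str) -> bool:
    # Stand-in for really opening the device; in this example only "hw" works.
    return name == "hw"


chosen = pick_device(PREFERENCE, fake_probe)
```

The planned refinement (checking first whether hw already does mixing) would then be a matter of reordering or filtering the candidate list before calling the same function.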

As far as I know, it needs alsa to work properly, so it works on top. Or am I wrong?

From the documentation:

What is JACK?

JACK is a low-latency audio server, written primarily for the GNU/Linux
operating system. It can connect a number of different applications to an audio
device, as well as allowing them to share audio between themselves. Its clients
can run in their own processes (ie. as normal applications), or they can
run within the JACK server (ie. as a "plugin").

Jack has two sets of parameter options. The first part are specific to running
the jack server. The second part are run time options for how jack interfaces
with the sound driver - currently only ALSA.

The easiest way to start jack is to run this command:

jackd -d alsa -d hw:0

Of course that gives you very little control over what jack does to the audio
stream and which device you use. You can specify a card name by setting up an
.asoundrc file. Visit the online ALSA docs for your card/device to get one.

No. Rosegarden and other audio/video editing software are not going to use Phonon. Phonon is NOT designed for them. It's designed to support the basic functionality that __ALL__ of its engines support. For example, playing and pausing audio is supported by all engines, but the more advanced features needed for editing are not implemented in all engines.
Every video/audio editing application (usually) uses one engine.
Sorry if I'm wrong. I'm just a user, and I know this from reading mailing lists/talks...

As a musician I use all the applications like Rosegarden, Ardour, ... directly with jackd, no need for Phonon. For that, however, I have to connect the jackd daemon via ALSA to my preferred sound device.
Now I would be really glad if KDE via Phonon could make use of jackd. If not, I would have to connect KDE to a different sound device because the one used by jackd is already blocked. Right?

Hopefully somebody will implement a jackd backend for Phonon. However, ALSA has a plugin called dmix which basically allows multiple 'apps' to open the sound device, so it won't be blocked.

The precompiled packages won't work since you need to apply a patch first which lies in kdemultimedia/phonon-nmm/NMM-patches/ :(

So you have to compile NMM yourself (after patching) and then you can compile the backend.

It would be great if somebody could put more effort into the NMM backend. It sure is a great technology but currently nobody is working on it (I keep it compiling and running, but don't have time for more at the moment).

That's sad to hear, I hope more effort goes into its development again. NMM is seriously cool, and it would provide KDE with functionality making it a level above other desktops in the multimedia area too. I think I would like NMM as the 'certified to work' engine.

I have one question: will there be a way to control the volume for a single (kde) application at a time? Say for instance that I am watching a dvd whose audio is quite low, so I want to push up the volume for the dvd player, but don't want to jump out of my seat or get deaf if I get a mail or some other event occurs, whose notification sound is comparatively louder..

Use http://pulseaudio.org and you'll be able to do this already!
My Gentoo system is already configured to use PulseAudio for all audio output and it works great! No special application support needed. It works as an ALSA plugin, a fake esd and other things, so all programs that support either ALSA or esd will play through PulseAudio. Apps that only support OSS can be used with the padsp utility (like esddsp or aoss). Monty of Xiph (author of ogg/vorbis) has also written an oss2pulse daemon which creates a fake /dev/dsp and routes it to PA.
Fedora Core 7 will be 100% PulseAudio. It's the compiz/beryl/xgl of desktop audio.

BTW, it does not try to compete with Phonon, GStreamer, Xine, Jack etc. It complements them.

Firstly, thanks as always to Troy for preparing this for the Hordes hungering for KDE4 news :)

To save everyone the trouble of rooting through the API, I have a couple of questions:

1) I heard rumblings of the existence of per-application volume settings so that (e.g.) you don't have your eardrums blown out when a buddy signs in to MSN just as you're watching the quiet part of your film ;) How will this work, exactly? Any mockups of e.g. kmix?

2) Would it be possible to, for an arbitrary piece of video supported by the currently used engine, extract frames (plus accompanying sound for that frame) one-by-one in some format (YUV+PCM, maybe), process them (maybe adding a watermark, or doing your own custom effects) and then send this altered audio and video frame into another Phonon-based encoder for encoding to, say, XVid?

First of all (as you can see in the screenshot) device preference is set per category of application. The distinction between Music and Video probably needs a better name for the category, though.

If that isn't enough for your needs the application still can override the global setting per category. So if the application provides the switch (it's almost no effort to implement) then you can have that, too.
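
As a rough illustration of "global preference per category, with an optional per-application override", here is a small Python sketch. It does not reflect how Phonon stores its settings; the category names, device names and the `preferred_devices` function are made up for the example.

```python
from typing import Dict, List, Optional

# Hypothetical global settings: an ordered device list per category.
GLOBAL_PREFERENCES: Dict[str, List[str]] = {
    "Music": ["usb-speakers", "onboard"],
    "Notifications": ["onboard"],
}


def preferred_devices(category: str,
                      app_override: Optional[List[str]] = None) -> List[str]:
    """Use the application's own list if it supplies one, else the global one."""
    if app_override:
        return app_override
    # Unknown categories simply have no preference in this sketch.
    return GLOBAL_PREFERENCES.get(category, [])
```

An application that does not expose the switch passes no override and inherits the category setting; one that does can hand in its own ordered list.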

1) Every Phonon AudioOutput object can be told over dbus to change its volume. For now the only way to remote control the volume of an app is to use qdbus:
% qdbus org.kde.knotify /AudioOutputs/0 Get org.kde.Phonon.AudioOutput volume
1
% qdbus org.kde.knotify /AudioOutputs/0 Set org.kde.Phonon.AudioOutput volume 0.2
% qdbus org.kde.knotify /AudioOutputs/0 Get org.kde.Phonon.AudioOutput volume
0.200000002980232

The value is stored as float in case you wonder about the 0.2.

2) It's on the todo list. But it's not likely to be ready for 4.0 unless more people help out. The idea is to use AudioDataOutput and VideoDataOutput objects to capture the media data and then an AvWriter object to encode and write to a file.
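
The 0.2 that reads back as 0.200000002980232 in answer 1) is what one would expect if the volume is held in a 32-bit float (an assumption; the reply says only "float"). The short Python sketch below reproduces the effect with the standard `struct` module; `as_float32` is a helper name invented for the example.

```python
import struct


def as_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


stored = as_float32(0.2)
# 0.2 has no exact binary representation, so the 32-bit value differs slightly.
print(f"{stored:.15g}")  # prints 0.200000002980232
```

Values that are exactly representable in binary, such as 0.5 or 1, survive the round trip unchanged, which is why the first `Get` above prints a plain 1.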

Just a small update since I wrote this article: I discovered I was missing a package on my system to get the phonon-gstreamer engine to build. After installing libgstreamer-plugins-base0.10-dev, it now builds... I previously had only the libgstreamer0.10-dev package and had assumed that it would be enough to build the engine, but tbscope and christoph4 in #phonon helped me to realize what I was missing.

Phonon is (for me) the best part of KDE4. Ever since I heard the proposal, I've been looking forward to KDE4. I have a USB sound card for my laptop which I use with my speakers in one room, but if I take my laptop elsewhere there's no point having that sound card. It's a nuisance to have to manually change the audio output each time (doubly so when said change needs to be done in a config file since xine_part doesn't like to load when it can't access the output).

As much as I'm looking forward to all the other cool stuff coming for KDE4, Phonon has to be the best one for me.

There has never been an official release period, so it cannot have been 'pushed back'. Unlike Microsoft, KDE has not published a roadmap for its next major release. The only previous answers to that question have been various people's personal thoughts on the issue, as opposed to a consensus amongst developers.

KDE 4 consists of several parts. The two principal parts are the platform (the libraries which KDE applications use), and the applications themselves. The platform absolutely must be completed before KDE 4.0 is released. The applications can be scaled up or down in terms of features for the release, depending on how much time the developers have. A lot of work has been done on the platform, and as this article discusses, much of it has been rolled into the main development branch. However, there are still some items of work outstanding. Some developers feel uncomfortable committing themselves to a release period until this work is completed.

A release team has recently been formed, so a more concrete answer will probably be forthcoming in the next month or so.

Well, fair enough, but Microsoft did wait until the last minute to put a date on the release of Vista, and it (like many of the stories reporting on KDE4) said "it will be released sometime around early/mid/late 2004-2006", now 2007. I'm a fan of KDE (I use it primarily on my laptop), but KDE4 is looking like Netscape 6 and Vista: lots of pretty concept art and then a late schedule.

In no way am I trying to push development on the backend ... it's better to ship with a stable core than to release crap, especially in this community (the open source/free software one).

Well, as anyone who was around for the KDE 1.1.2 -> KDE 2.0 series transition knows, backend changes can mean a long development cycle. In the KDE 2 series, however, great KDE technologies like kio slaves, kparts, dcop, etc. were born, and it was worth the wait.

(KDE 2->3 was less of a big deal, a lot more code cleanups, less huge structural changes...)

KDE 4 is introducing many new great technologies, and like KDE 2.0 (or OS X 10.0), may lack a little polish when it's first released, but should be a precursor of great things to come. The downside is that the 4.0 transition cycle will be quite a lot longer than 3.0's transition cycle. Honestly, I think that we'd be lucky to see KDE 4.0 this year. However I'd wager we'll see our first alpha released this July for Akademy. As noted above, however, this is an opinion, and there is not an announced release schedule for 4.0 yet.

Note also that 3.5.7 is likely also on its way, within 3 months if irc rumours are true.

Sorry, as far as I can recall 2004, 2005 and 2006 were _never_ mentioned as possible release dates for KDE4. In the very beginning, when Qt4 came out and the libs were branched to work on the KDE4 target, it was mid-2007 to end-2007, and this still seems to be a good guess if you consider the progress which has been made up to now. Really, the dates you mention have never ever been connected to a guessed KDE4 release date.
(duh, 2004/2005 .. how would that have been possible?)

You should also remember that KDE doesn't have thousands of full-time developers working on KDE 4. Considering how relatively few developers are actually working on KDE, it is amazing how cool a desktop they have created.

I don't really see how KDE 4 can be compared to Vista? There has been no official comment about when KDE 4 will be released and there is a lot more than just concept art available.

I am certainly looking forward to KDE 4 and these articles just make it worse ;) but KDE 4 will be ready when it's done.

Umm, it's free software, programmed by volunteers. Since when have any of these volunteers ever needed an industry coalition in order to contribute? These sorts of coalitions generally tend to be PR groups only. The Linux kernel exists without some sort of coalition, and would continue to do so even if no one in industry cared.