I've restarted the project of putting together a simple stereo audio mixer. So far, it supports volume and panning settings for wavs and clips. Output is to a single stereo SourceDataLine.

My hope is that if this Java code sits on top of something like libgdx, it can route this line to whatever it is that Android supports for playback, making it look to Android like a single outgoing wav or something. (Haven't tested this yet. Am in process of learning my way around Linux--have Ubuntu now--and plan to set up an Android emulator there in the next week or two.)

Unlike the previous version, this one never processes a frame (sample) of sound from a track unless it is the current frame. I iterate across the tracks and only read one sound sample from each, rather than a buffer's worth. There were doubts expressed that this would cause all sorts of performance problems, but tests seem to indicate all is fine. Last night, I ran 16 wav files simultaneously, and in another, 32 simultaneous clips all at different pan positions. No problems to report.

I basically have a wav wrapper and a clip-type wrapper supported so far. The clip is in two parts: a class that stores the clip data in RAM, and another that manages a set of cursors for multiple playback. There's a nifty non-blocking queue used for storing cursors that are ready to play (when they finish, they are re-entered into the queue automatically).
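As a rough illustration of the cursor queue described above, here is a minimal sketch built on java.util.concurrent.ConcurrentLinkedQueue. The class names ClipCursor and CursorPool, and their fields, are hypothetical and are not taken from the mixer's source; the actual implementation may differ.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical playback cursor: a read position into shared clip data. */
final class ClipCursor {
    int position;      // current frame index into the clip
    boolean running;   // true while the mixer should read from this cursor
}

/** Hypothetical pool of idle cursors backed by a non-blocking queue. */
final class CursorPool {
    private final ConcurrentLinkedQueue<ClipCursor> idle = new ConcurrentLinkedQueue<>();

    CursorPool(int size) {
        for (int i = 0; i < size; i++) {
            idle.add(new ClipCursor());
        }
    }

    /** Returns an idle cursor reset to the clip start, or null if all are busy. */
    ClipCursor acquire() {
        ClipCursor c = idle.poll();   // poll() never blocks; it returns null when empty
        if (c != null) {
            c.position = 0;
            c.running = true;
        }
        return c;
    }

    /** Called when a cursor reaches the end of the clip: it goes back into the queue. */
    void release(ClipCursor c) {
        c.running = false;
        idle.offer(c);
    }

    int idleCount() {
        return idle.size();
    }
}
```

With a pool of N cursors, at most N overlapping playbacks of the same clip are possible; a play request that finds the queue empty is simply dropped in this sketch.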

clip.play(speed, volume, pan); // of course, there is some setup first

speed is a multiplier--for example, 2 will play the sound twice as fast, 0.75 will slow it down some. (No, I didn't support negatives--but it should be quite easy to add, actually, just a matter of adjusting the start point to the end of the clip!)

volume goes from 0 to 1, a multiplier. (I plan to add a VolumeMapping function so that 0.5 actually sounds like it is at half volume.)

pan goes from -1 to 1, with 0 as center.
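One common way to get a speed multiplier like this is a fractional read position with linear interpolation between neighbouring samples. The sketch below assumes that technique; the post does not say how the mixer actually does it. SpeedCursor is a hypothetical name, and the data is mono floats for brevity, whereas the mixer itself works on 16-bit stereo.

```java
/** Hypothetical mono read head that advances by a fractional step per output frame. */
final class SpeedCursor {
    private final float[] data;   // clip samples, assumed normalized to -1..1
    private final double speed;   // 2.0 = twice as fast, 0.75 = somewhat slower
    private double position;      // fractional index into data

    SpeedCursor(float[] data, double speed) {
        this.data = data;
        this.speed = speed;
    }

    boolean hasNext() {
        return position <= data.length - 1;
    }

    /** Linearly interpolates between the two samples around the current position. */
    float next() {
        int i = (int) position;
        double frac = position - i;
        float a = data[i];
        float b = (i + 1 < data.length) ? data[i + 1] : a;
        position += speed;
        return (float) (a + (b - a) * frac);
    }
}
```

Backwards playback would follow the same pattern with the position starting at the last sample and a negative step, which matches the remark about moving the start point to the end of the clip.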

All the tests have been using Thread.sleep() increments to space things out and the response is pretty good, probably fine for most game applications. There is a bit of variability that is probably directly related to the size of the cpu slices. I wouldn't want to use it for reading a musical score that requires playing a series of clips in perfect time. I just started working on an event-reader that is accurate to the frame (e.g., 1/44100th of a second). Will report progress on that. It makes use of nsigma's advice to handle these events in a single audio thread shared with the mixer, to avoid blocking problems.

No support for ogg or mp3. If you want to load an ogg or mp3 into RAM though, for clip playback, that should be easy to add.

Biggest drawback is perhaps that I haven't written in the ability to add or drop tracks yet while the mixer is running. Currently, the mixer iterates through all tracks for each frame, skipping those that are not "running". But once the audio-event-reader works, it should be doable to add this as part of that process.

Reinventing the wheel, again. There are a lot of great audio tools already in existence! But I want something for my games and that can play my Java FM synth sounds, and I want to learn about audio programming.

"We all secretly believe we are right about everything and, by extension, we are all wrong." W. Storr, The Unpersuadables

Not posted yet. I only got it working last night, and probably should do a bit of cleanup to make it more presentable. Also, is it sufficient to be useful as it stands? It has just the wav & clip wrappers, no volume maps, and you need to turn the mixer off to add/delete tracks (though you can stop and start existing tracks dynamically).

It would be awesome if you were interested in testing/trying out what I have. I could maybe put together a jar by Monday. (I work all day tomorrow, and have a concert to attend in the evening where a new composition of mine is going to be played!)

Also, no audio-event-handler yet. First step: I now have a PriorityBlockingQueue that holds crude AudioMixerEvents, but the only thing that happens so far is that if an event's frameTime matches the frame being processed, it prints "click" and the time and the frame #. It's not set up to, say, play a clip track yet. Have to figure out how to set that up. Probably next week unless a simple solution pops into my head out of nowhere.
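A minimal sketch of the kind of queue described above, assuming events are ordered by frameTime. Only the names AudioMixerEvent and frameTime come from the post; the label field, the EventReader class and its methods are hypothetical. Unlike the exact-match check described, this version also fires events whose frame has already passed, so a late event is not stuck in the queue forever.

```java
import java.util.concurrent.PriorityBlockingQueue;

/** Crude event: something that should happen at a given audio frame. */
final class AudioMixerEvent implements Comparable<AudioMixerEvent> {
    final long frameTime;   // frame on which the event is due
    final String label;     // hypothetical payload, e.g. "click"

    AudioMixerEvent(long frameTime, String label) {
        this.frameTime = frameTime;
        this.label = label;
    }

    @Override
    public int compareTo(AudioMixerEvent other) {
        return Long.compare(frameTime, other.frameTime);
    }
}

/** Hypothetical reader that the mixer would call once per frame. */
final class EventReader {
    private final PriorityBlockingQueue<AudioMixerEvent> queue = new PriorityBlockingQueue<>();

    /** May be called from any thread. */
    void post(AudioMixerEvent e) {
        queue.add(e);
    }

    /** Fires every event due at or before currentFrame; returns how many fired. */
    int process(long currentFrame) {
        int fired = 0;
        while (true) {
            AudioMixerEvent head = queue.peek();       // peek() does not block
            if (head == null || head.frameTime > currentFrame) {
                break;
            }
            AudioMixerEvent due = queue.poll();        // poll() does not block either
            if (due == null) {
                break;
            }
            System.out.println(due.label + " at frame " + currentFrame);
            fired++;
        }
        return fired;
    }
}
```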

So all cuing right now is real time commands to the clip, independent of the mixer thread.

Which do you suggest as a best first try for integration with the new mixer?

As for progress on the mixer, I was working on the javadoc and have one more class to build, to allow clips to loop, when the 6-month-old graphics card blew. Just got computer back from the shop. Hoping to get the jar up for tryouts soon.

The source is included, under GPL for now, and also includes a single .wav file of a bell that I have used in several test/demo examples (and takes up 99% of the jar). I made an attempt at creating JavaDoc for all the public methods--hopefully it will be helpful.

"Per frame" means per audio frame, or audio sample. The mixer only reads one frame (sample) at a time from the mixer tracks.

Click the jar to hear the sample tests:

1) single PFWavTrack playback of the a4.wav (a bell)

2) backwards PFClipShooter playback of the same bell

3) 32 bells spread out, left to right, with pitch variation, spaced by repeated Thread.sleep(100) commands. The timing is not shabby, though not dead accurate enough for a real music system. That will come once the "per frame audio event queue" is implemented.

5) Combo of (a) ClipLooper of the bell (pan center), where the ends of the file are overlapped and cross-faded (creating a potentially infinitely long ringing) with (b) ClipShooter of a low bell (left) and some high bells that iterate from right to left, each with a smaller volume. At the end, the volume of the ClipLooper is brought down via a series of 0.005 volume increments spaced 10 millis apart.

Only one format currently supported: .wav, 44100fps, 16-bit, stereo, little endian signed PCM
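For reference, the format listed above can be written as a javax.sound.sampled.AudioFormat, and each 16-bit sample is two bytes with the low byte first. The helper class PcmFormat and its method names are hypothetical, offered only as a sketch of how such data is typically packed and unpacked.

```java
import javax.sound.sampled.AudioFormat;

/** Hypothetical helpers for 44100 fps, 16-bit, stereo, signed, little-endian PCM. */
final class PcmFormat {
    /** Arguments: sample rate, bits per sample, channels, signed, bigEndian. */
    static AudioFormat supported() {
        return new AudioFormat(44100f, 16, 2, true, false);
    }

    /** Combines two little-endian bytes into one signed 16-bit sample. */
    static short decode(byte lo, byte hi) {
        return (short) ((hi << 8) | (lo & 0xFF));
    }

    /** Writes one signed 16-bit sample as two little-endian bytes. */
    static void encode(short sample, byte[] out, int offset) {
        out[offset] = (byte) sample;            // low byte first
        out[offset + 1] = (byte) (sample >> 8); // then high byte
    }
}
```

One stereo frame in this format is 4 bytes: left low, left high, right low, right high.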

Mono NOT supported--you have to mix your mono to stereo for use in this system. Mono for some clips would be a good thing, and I hope to get to it before too long.

Caution: No checking for overflow!! If you exceed 16 bits capacity, you will hear a really ugly loud noise. When that happens, figure out which of your files to play at a lower volume. This can be worked out prior to shipping your game.
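The ugly noise comes from wraparound: when the mixed sum is cast to 16 bits, a value just above the maximum flips to a large negative number. One possible guard, which is an assumption here and not something the mixer currently does, is to sum into an int and clamp before the cast. SampleLimiter is a hypothetical name.

```java
/** Hypothetical hard limiter for mixed 16-bit samples. */
final class SampleLimiter {
    /** Clamps an int sum of several tracks into the signed 16-bit range. */
    static short clamp(int mixed) {
        if (mixed > Short.MAX_VALUE) {
            return Short.MAX_VALUE;
        }
        if (mixed < Short.MIN_VALUE) {
            return Short.MIN_VALUE;
        }
        return (short) mixed;
    }
}
```

For example, two samples of 30000 and 10000 sum to 40000; a plain (short) cast gives -25536, while clamp gives 32767. Hard clipping still distorts, so lowering the track volumes remains the better fix.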

Volume and pan changes that are "too large" will cause clicks. I recommend making a max volume change of maybe 0.005 at a time. Not sure what the pan tolerances are yet.
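A simple way to keep volume changes under a maximum step is to move toward a target a little on every update instead of jumping. The sketch below is one hypothetical way to do that; VolumeRamp is not a class in the posted jar, and how often next() is called (per frame, or every few milliseconds from the calling thread) is left open.

```java
/** Hypothetical smoother that limits how far the volume moves per update. */
final class VolumeRamp {
    private final double maxStep;   // e.g. 0.005, the step size suggested above
    private double current;
    private double target;

    VolumeRamp(double initial, double maxStep) {
        this.current = initial;
        this.target = initial;
        this.maxStep = maxStep;
    }

    /** Requests a new volume in 0..1; the change is applied gradually by next(). */
    void setTarget(double volume) {
        target = Math.max(0.0, Math.min(1.0, volume));
    }

    /** Moves at most maxStep toward the target and returns the new volume. */
    double next() {
        double diff = target - current;
        if (Math.abs(diff) <= maxStep) {
            current = target;
        } else {
            current += Math.signum(diff) * maxStep;
        }
        return current;
    }
}
```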

Volume does NOT map well to dynamics as we hear them. Coming: some volume maps so that volume changes are more evenly distributed between 0 and 1. Pan seems to behave better, so I'm not sure if/when I'll bother with working out a mapping for it.

For some reason, I can play 16 PFWavTracks within the Eclipse IDE, but I can't play even 2 as a stand-alone jar. I do not understand this, but perhaps it has something to do with the audio files being compressed, and thus slower to read? For now, I recommend sticking with sounds you can load into memory (clips).

I haven't tested how many PFClip.. tracks can run concurrently yet.

COMING: an event queue within the audio mixer thread, for audio-frame-specific events. That means there will be accuracy up to 1/44100th of a second, if you know precisely during which audio-frame you want the event to occur. Also coming: various tools for constructing sound environments using parameterized randomness (e.g., windchimes), and some nice FM Synth sounds, procedurally generated. (Those are my ambitions.)

Looks interesting. For your volume control, this is quite an interesting article - http://www.dr-lex.be/info-stuff/volumecontrols.html Or to summarise, volume should be logarithmic not linear to better suit how we hear. A usable approximation is to use the power of 4 of the volume setting passed into the setVolume() method. You'll probably need to relook at your panning algorithm too - google for equal power panning.
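Both suggestions are short to write down. The sketch below shows the power-of-4 volume approximation and one common form of equal power panning (cosine and sine of an angle between 0 and pi/2). The class and method names are hypothetical, and these are not necessarily the formulas the mixer ends up using.

```java
/** Hypothetical helpers for perceptual volume and equal-power panning. */
final class Loudness {
    /** Maps a slider value 0..1 to an amplitude multiplier using volume^4. */
    static double volumeToGain(double volume) {
        double v = Math.max(0.0, Math.min(1.0, volume));
        return v * v * v * v;
    }

    /** Maps pan -1..1 to {leftGain, rightGain} with left^2 + right^2 == 1. */
    static double[] equalPowerPan(double pan) {
        double p = Math.max(-1.0, Math.min(1.0, pan));
        double angle = (p + 1.0) * Math.PI / 4.0;   // 0 (hard left) .. pi/2 (hard right)
        return new double[] { Math.cos(angle), Math.sin(angle) };
    }
}
```

With this pan law, the center position gives about 0.707 on each side rather than 0.5, so the perceived loudness stays roughly constant as a sound moves across the stereo field.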

I'm not sure I understand your reasoning for processing this on a per-sample basis. It might work OK, but it is still an inefficient way to do this. You've basically got the JavaSound buffer being filled as quickly as possible every 1/20 of a second (your 1/5 in the comments is wrong - each sample frame is 4 bytes). Therefore, you can only post events into your sample accurate queue every 1/20 second anyway, really. Internally you can use arrays of floats as buffers, but you don't have to process all the buffer in one go - at the beginning of each cycle, order the events, work out how many samples to the first one, and process that many samples, then to the next event, etc. Hope that makes sense.
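The splitting idea in the last few sentences can be sketched as a function that turns one buffer plus a set of event frames into a list of segment lengths. Each segment is processed as a contiguous block, and the event at the boundary is applied before the next segment starts. BlockSplitter and its parameters are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that cuts one buffer into segments at event boundaries. */
final class BlockSplitter {
    /**
     * @param bufferStart  absolute frame number of the first frame in this buffer
     * @param bufferFrames number of frames in this buffer
     * @param eventFrames  absolute frame numbers of pending events, in any order
     * @return segment lengths in frames; they always sum to bufferFrames
     */
    static List<Integer> segments(long bufferStart, int bufferFrames, long[] eventFrames) {
        long[] sorted = eventFrames.clone();
        Arrays.sort(sorted);                     // "order the events"
        List<Integer> lengths = new ArrayList<>();
        long cursor = bufferStart;
        long end = bufferStart + bufferFrames;
        for (long ev : sorted) {
            if (ev <= cursor || ev >= end) {
                continue;                        // no cut needed inside this buffer
            }
            lengths.add((int) (ev - cursor));    // frames up to the next event
            cursor = ev;
        }
        lengths.add((int) (end - cursor));       // remainder of the buffer
        return lengths;
    }
}
```

For a 2048-frame buffer starting at frame 0 with events at frames 500, 100 and 3000, this yields segments of 100, 400 and 1548 frames; the event at 3000 waits for a later buffer.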

Quote

Looks interesting. For your volume control, this is quite an interesting article - http://www.dr-lex.be/info-stuff/volumecontrols.html Or to summarise, volume should be logarithmic not linear to better suit how we hear. A usable approximation is to use the power of 4 of the volume setting passed into the setVolume() method.

I'll check it out. I've been experimenting with exponential/logarithmic/trig volume mapping algorithms, and used them for synth envelopes, e.g., the FM "SpiderBell" and found them quite helpful. It just didn't make this iteration because I didn't want to complicate the api, and it is something that can well be implemented at the 'trigger' end--external to the mixer.

Yes, this is something I haven't addressed yet, and having the search terms is very helpful. But I still think that the simple algorithm being used may often suffice. To my ear, the perceptual distortion is not as in-your-face as the distortion of the volumes.

Quote

I'm not sure I understand your reasoning for processing this on a per-sample basis. It might work OK, but it is still an inefficient way to do this. You've basically got the JavaSound buffer being filled as quickly as possible every 1/20 of a second (your 1/5 in the comments is wrong - each sample frame is 4 bytes). Therefore, you can only post events into your sample accurate queue every 1/20 second anyway, really.

Ahh, got it. The output buffer is set to 8192 bytes, but yes, that does come to a much smaller latency given 4 bytes per frame. My oops.
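The arithmetic behind the correction, as a small sketch (BufferLatency is a hypothetical helper): 8192 bytes at 4 bytes per frame is 2048 frames, and 2048 frames at 44100 fps is about 46 ms, roughly 1/20 of a second. Treating the 8192 bytes as 8192 frames gives about 186 ms, which is where a figure near 1/5 of a second would come from.

```java
/** Hypothetical helper for converting a byte buffer size into playback time. */
final class BufferLatency {
    static int frames(int bufferBytes, int bytesPerFrame) {
        return bufferBytes / bytesPerFrame;
    }

    static double millis(int bufferBytes, int bytesPerFrame, double frameRate) {
        return 1000.0 * frames(bufferBytes, bytesPerFrame) / frameRate;
    }
}
```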

OK, here is my reasoning, and I welcome your giving it a going over, shooting it down if it is faulty!

I'm seeing latency as a multiple stage problem, the sum of various contributors. I'm taking the view that there are three latencies in a mixer: the read, the processing, and the write. There is also the inherent latency or real-time variability/unpredictability caused by JVM switching.

I don't see any way to reduce the write latency--that's currently set by the 8192 byte arrays being sent to the SourceDataLine.

However, by making the read and the processing be single sound frames, the data reaching the audio thread is getting there at the earliest possible frame, limited to the JVM thread switching timing constraints.

Example: suppose we have 4 tracks and an input read buffer the same size as our write buffer (1/20th of a second). I am assuming the audio thread can be interrupted by the JVM at any time. Suppose a "play" command originates from the GUI thread and the JVM switches after processing two of the four mixer tracks, making a volatile "running" boolean change to TRUE. This will now be visible to the mixer track as soon as the JVM switches back. If the mixer track in question is one of the two that have already been processed, the earliest the "running = TRUE" will take effect is after the end of this block of time being processed and the next block is initiated. But if the "block" being processed is a single frame, it will be processed during the very next frame.

Yes? No?

The relevant question, it seems to me, is just how much we slow down the efficiency of the audio processing by using this admittedly odd method. (Golden rule being violated: It is *always* better to grab and use contiguous blocks of data.) I am not able to measure that, and I've been repeatedly finding Java making a fool of me when I try to optimize. So I took the plunge and tried this method to see/hear what would happen. It seems to me that it kind of works and is worth further investigation.

I'm willing to sacrifice a bit of processing efficiency if it makes the "real-time" behavior a little tighter. The question is how much.

Quote

Internally you can use arrays of floats as buffers, but you don't have to process all the buffer in one go - at the beginning of each cycle, order the events, work out how many samples to the first one, and process that many samples, then to the next event, etc. Hope that makes sense.

This makes perfect sense and is one of the plans I have been considering for audio event-queue processing. But before locking in an event-queue cycle as a component of the built-in latency, as well as adding the complications and costs of processing varying blocks of data in the read and processing stages, I wanted to try this "per frame" algorithm.

Possible danger of using the event queue cycle: the events don't necessarily materialize in the real-time order of occurrence. A late-arriving event could have its "real-time" mismatch compounded by just missing an event-queue cycle.

*****************

There's a fair bit to be done to test if this api is practical or not. I'm not clear yet if it works to require the mixer be "off" when adding or deleting tracks. Also, have to test what happens if you attempt 20 or 30 tracks for a complicated level. Even if only a half dozen are playing at a given time, touching each one via a per-frame algorithm might balloon the inherent inefficiencies and cpu cost of this method.

Thanks as always for the feedback!

The major problem I see with your approach is that you're assuming the buffer processing is more spread out than it probably is, whereas it is more likely being processed in one quick burst. I think the majority of the time, your buffer will be processed completely before a context shift (particularly as your audio thread should be highest priority), though on a multicore system you'll be adding in more variability.

Therefore, most of the time your latency is going to be your buffer size. What your method doesn't account for, and possibly makes worse, is jitter. As you're probably aware from using MIDI, a small but constant delay in triggering feels more natural than a varying one. One approach you could take is to timestamp your events with System.nanoTime(). Then process the events in your audio thread roughly 1/20th of a second behind - the aim is to schedule them sample accurate as close to one buffer time after the event was triggered. You could look into the source code of projects like Frinika or Gervill which take a somewhat similar approach to sample accuracy (from recollection).

Well, I just wanted to throw out any assumptions about buffer processing.

I recall testing two buffer sizes by printing nano times with each call, and surprise, surprise, the JVM simply allowed the version with a smaller buffer to be called more times in sequence before it switched--the switching durations were fairly consistent between the two versions.

I didn't think to put a second call at the end of the method, to determine whether the JVM ever switched at a point in the middle of this method! (One can peruse the gaps in the time stamps.)

Jitter is a good term. When I referred to the buffer processing loop as having three stages, perhaps the combination of stages contributes more to jitter than latency.

Big yes on time-stamping! That is an essential element of the event-queue that I wrote for the Theremin and have in progress for this mixing system. Also, one can use the moment that the audio mixer first goes on to create its first audio frame "The Epoch", so-to-speak, and use it to cross calculate between animation frames, sound frames/samples and real time.
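Putting the time-stamping and "Epoch" ideas together, here is a minimal sketch that converts a System.nanoTime() stamp into a target frame number, adding a constant delay of one buffer so that every event is late by the same amount instead of by a varying one. FrameClock and its fields are hypothetical, and the sketch ignores drift between the system clock and the sound card's clock.

```java
/** Hypothetical converter between System.nanoTime() stamps and audio frames. */
final class FrameClock {
    private final long epochNanos;    // nanoTime() reading when frame 0 was produced
    private final double frameRate;   // e.g. 44100
    private final long delayFrames;   // constant scheduling delay, e.g. one buffer

    FrameClock(long epochNanos, double frameRate, long delayFrames) {
        this.epochNanos = epochNanos;
        this.frameRate = frameRate;
        this.delayFrames = delayFrames;
    }

    /** Frame on which an event stamped at eventNanos should take effect. */
    long frameFor(long eventNanos) {
        long elapsed = eventNanos - epochNanos;
        return Math.round(elapsed * frameRate / 1_000_000_000.0) + delayFrames;
    }

    /** Inverse mapping: the time stamp that corresponds to a scheduled frame. */
    long nanosFor(long frame) {
        return Math.round((frame - delayFrames) * 1_000_000_000.0 / frameRate) + epochNanos;
    }
}
```

The same epoch could also be used to line up animation frames with audio frames, since both can be expressed as elapsed nanoseconds since the mixer started.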

Thanks for trying it. I am assuming you are running OpenJDK, is that right?

Today I learned about $PATH in Linux, and learned how to make a link to allow calls to javac. That shows what a beginner I am with Linux--but one has to start somewhere. I also now have both OpenJDK 7 and Oracle's Java 7 on the Linux partition.

But I can't test my sound programs yet, because the Linux is not recognizing the Windows sound card that I have installed...so that has to be solved soon.

I can't recall what "crackle" tends to imply diagnostically. Does it happen on all the sound tests?

It could be defaulting to a less efficient java audio implementation than what you normally use. (Is the Java Sound Audio Engine involved in any way in your setup?) At this point, I call standard library sound code, e.g., javax.sound.sampled.SourceDataLine.

I would think if the problem were the execution speed of the code, we'd more likely hear dropouts. Perhaps my volume settings on the test code are too high. There also might be some artifacts related to the pitch shifting--that could cause some sizzle. If that were true, it would only sizzle on the tests where the playback speed (pitch of the bell) is altered.

Ah, another try might be to make the buffer for the SourceDataLine writes larger. I can't be an effective help until I get my Linux sound solved. But you are welcome to change the source, to try a larger buffer, for example. The source was included in the jar.

I get a similar issue with this code, and in Praxis LIVE, during the first few seconds of opening a line. Because each of the tests opens a new line, this recurs during the demo. It seems to be the fault of the PulseAudio mixer in OpenJDK / IcedTea, which seems to be set as default in most Linux distros. The standard mixers from Java actually work much better, and because in Java 7 they're now fixed to use the default ALSA device, they play through PulseAudio anyway - go figure!

@gouessej - does launching the JAR with the following line work better for you? Seems to for me.

I spent a few minutes looking through the tutorials and via search to find the parameter "-D" for jars, and could find nothing. Can you explain what is happening here, or where I might find the spec or api for it?

Is there a way to do this in the source code, so that we don't have to do it in the jar?

When I put "com.sun.media.sound.DirectAudioDeviceProvider" in my Eclipse IDE, I get the message that DirectAudioDeviceProvider is not available due to a restriction on rt.jar.

-D is a standard command line option for Java to set system properties. See here, and it's the same on Windows. You can do the same in code by calling System.setProperty().

This way just saves having to recompile your JAR to test it - I'm lazy!

The system properties supported by JavaSound are documented here - http://docs.oracle.com/javase/7/docs/api/javax/sound/sampled/AudioSystem.html Using system properties overrides whatever is set in the sound.properties file in the JRE, which on a lot of OpenJDK installs seems to default to org.classpath.icedtea.pulseaudio.PulseAudioMixerProvider. You can also override these defaults system-wide by editing sound.properties.
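A minimal sketch of the in-code equivalent. The property name javax.sound.sampled.SourceDataLine is one of the keys documented for AudioSystem, and the provider class name is the one mentioned earlier in the thread. MixerDefaults is a hypothetical helper, and the command line in the comment is only a guess at the general shape with a placeholder jar name, since the exact line from the earlier post is not preserved here.

```java
/** Hypothetical helper that asks JavaSound to prefer a specific mixer provider. */
final class MixerDefaults {
    static final String KEY = "javax.sound.sampled.SourceDataLine";
    static final String PROVIDER = "com.sun.media.sound.DirectAudioDeviceProvider";

    /**
     * Set this before requesting the line from AudioSystem.
     * Roughly equivalent to launching with (jar name is a placeholder):
     *   java -Djavax.sound.sampled.SourceDataLine=com.sun.media.sound.DirectAudioDeviceProvider -jar yourmixer.jar
     */
    static void preferDirectAudio() {
        System.setProperty(KEY, PROVIDER);
    }
}
```

Because the class name is passed as a plain string, nothing from com.sun.media.sound has to be imported, which sidesteps the Eclipse access restriction on rt.jar.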

Doh! I was looking for -D under the jar command, not the java command. Thanks for including the line of code already written. Looking at the AudioSystem spec, I was trying to figure out how to use their example with Properties.load rather than System.setProperty method.

NOW: I think I'm going to go ahead and install this in my own game, to get a better idea of its practicality, but am debating whether to implement a "simple" audio event queue first.

But I'm also thinking: what would be a good way to test the performance cost of not using a larger input/processing buffer? It would be nice to have some sort of measure of the tradeoff I am making by processing single audio frames.

It is easy to write a second version with a buffer. But how can they be compared meaningfully? I'm concerned that the accuracy will be distorted by the fact that writing to a SourceDataLine blocks.

Ah, I could write to some array (sized to the buffer size, and nonblocking) instead of a SourceDataLine! Then, let both run unhindered as fast as they can go. Will report back on this when I get a chance to run it.
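A rough sketch of such a test harness: mix a number of tracks one frame at a time into a reused float array and time it with System.nanoTime(). Everything here is hypothetical (MixBenchmark, the constant-valued "tracks"), and the numbers from a micro-benchmark like this depend heavily on JIT warm-up, so it should be run several times and only the later runs compared.

```java
/** Hypothetical timing harness that mixes into an array instead of a SourceDataLine. */
final class MixBenchmark {
    /**
     * Mixes `tracks` constant-valued stereo tracks, one frame at a time,
     * into `out` (interleaved left/right, reused as a ring), and returns
     * the elapsed time in nanoseconds. out.length must be even and >= 2.
     */
    static long nanosToMix(int tracks, int frames, float[] out) {
        int framesInBuffer = out.length / 2;
        long start = System.nanoTime();
        for (int f = 0; f < frames; f++) {
            float left = 0f;
            float right = 0f;
            for (int t = 0; t < tracks; t++) {
                float sample = (t + 1) * 0.001f;   // stand-in for reading a track
                left += sample;
                right += sample;
            }
            int slot = (f % framesInBuffer) * 2;
            out[slot] = left;
            out[slot + 1] = right;
        }
        return System.nanoTime() - start;
    }
}
```

Calling nanosToMix(16, 44100, new float[4096]) measures the cost of one second of audio for 16 tracks; a buffered variant would keep the same output array and change only the inner read.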

Alternate suggestions for testing happily accepted!

Well, you could have a look at the JavaSound implementation in the JAudioLibs code - http://code.google.com/p/jaudiolibs/source/browse/?repo=audioservers. This offers two alternative timing mechanisms that don't block on the write to the SDL - instead using a large output buffer but never writing to all of it, and using either System.nanoTime() or getFramePosition() to control writes. NB. the getFramePosition() option doesn't really work on Windows.

In Praxis I further split buffers from the server into 64 sample chunks to process through the audio pipeline. This seems to be solid everywhere I've tested so far, and offers a fairly good compromise to sample accurate processing.

If by "let both run unhindered as fast as they can go" you mean running multiple threads then forget it. Thread contention will just interfere. Run everything off the primary audio thread, and make sure it's set to maximum priority too!

The spec for setting the system property says this simply makes the option a "first choice". If the option does not exist, then the normal default is used. So, I don't think I did anything to introduce new problems by adding this to the code.

Now, I assume you (Julien) are able to get uncrackling sound normally. At the risk of trying your patience, would you be willing to check the 'options' menu I added to the Theremin? You might have done this once before.

This menu will list all the available output "mixers" on your system. I recall you had experienced crackle on my audio programs before. To diagnose, it would be helpful to me to have you report back what mixer options are presented by the dropdown. Also, if you could try each and tell me if any of them eliminate the crackle.

If one of the mixer options does eliminate the crackle, I can possibly give it a higher priority--make it the default for those systems that have it.

Phil - is your code for the theremin mixer choice menu up somewhere? I'm not sure it's offering all the options on my machine. Julien should probably be seeing at least 2 options.

What about a version of the code above that allows people to select buffersize, mixer and samplerate? I'm particularly wondering about samplerate, as there are a variety of issues around resampling that can cause crackling. PulseAudio or ALSA can be potentially running at a different samplerate to that you request from the output line, if it happens to be already running, or if the soundcard doesn't properly support 44100Hz. I've noticed this cause crackling with lower latencies, and also heard of problems with audio clipping because of the resampling. The important option is probably 48kHz.

I am quite taken by surprise if there are cards that don't support 44100 or try to use 48kHz in place of 44100. I was putting off implementing the suggestions for additional options due to not running into concrete examples of problems caused. This is a concrete example, though!

I'd be really tempted to limit input to the mixer to 44100 Hz and say to folks use Audacity to convert your sound resources. That is simple enough to do, or to explain how to do. But I am at a bit of a loss as to how to play back 44100 Hz data on a 48k line. There could well be pre-built converters in Java but it is annoying to have to add another level of processing. (Reality is a bitch, yeah.)

By the way, as I said "for grins," I tried running a version of the audio mixer that omits the last step of sending the filled buffer to the SourceDataLine, and it takes about 6 milliseconds of processing per audio second.

I still have to write the "buffered input" version, and to create tests with more audio tracks. However, this preliminary test suggests that the loss of performance by working on a per-sample basis from the inputs may not be of killing significance. For example if the buffer inputs were twice as performant, that saves only 3 milliseconds. Either way, the cpu would still have 99% of its capacity free for other tasks. (I don't know what the % free would drop to yet, when adding more tracks.)

Well, I tried your code offline and it shows all the mixers fine. A bit of Googling suggests that the IcedTea policy files have a bug which doesn't allow opening the PulseAudio mixer (because of its native lib). That means the default mixers in an applet and offline are different - joy!

I was going to say your code is probably longer than it needs to be and you could use mixer.isLineSupported(), however I've just noticed that the JavaDoc has some ambiguous documentation about some lines not being supported until the mixer is open. Hmm ... never had a problem with that so far, but maybe trying to get the line is a safer bet after all, though it gives me the same output as your current code.
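For comparison, here is a sketch of the shorter isLineSupported() variant mentioned above. It lists the names of mixers that report support for a SourceDataLine. MixerLister is a hypothetical name, and given the documentation caveat just noted, actually trying to get the line may still be the safer check.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.SourceDataLine;

/** Hypothetical helper that lists mixers usable for output. */
final class MixerLister {
    /** Returns the names of all installed mixers that report SourceDataLine support. */
    static List<String> outputMixerNames() {
        Line.Info wanted = new Line.Info(SourceDataLine.class);
        List<String> names = new ArrayList<>();
        for (Mixer.Info info : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(info);
            if (mixer.isLineSupported(wanted)) {
                names.add(info.getName());
            }
        }
        return names;
    }
}
```

The returned names could populate a dropdown like the one in the Theremin's options menu; the list may be empty on a machine with no audio devices.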


I get an error message about that even offline in JOAL:

AL lib: pulseaudio.c:612: Context did not connect: Access denied
