http://genesisdaw.org/RSS (Fri, 31 Mar 2017 05:15:46 GMT)
As a music producer, it's hard to find the line between creativity and
blindly reusing other people's work. If you use presets from a synthesizer,
you're using someone else's synth programming. If you use someone else's synth,
you're using someone else's Digital Signal Processing code. You can purchase
melodies and chord progressions from a MIDI store. You can use sample packs for
musical instruments that somebody else recorded. You can use pre-made drum loops.

At each step along the way, it becomes less clear what your role is in the creation
process.

I have decided to solve the problem by adding restrictions
to the creative process. These restrictions include:

Use only samples that I personally have recorded with a microphone.

Use only synthesizers that I personally have coded the DSP and programmed the presets.

Use only DAW software that I personally have coded, including recording
software for recording samples.

Use only melodies and chord progressions that I personally have created.

The music I release under these restrictions will be associated with the artist name Bit Trouper.

Google gives 2 definitions for trouper:

an actor or other entertainer, typically one with long experience.

a reliable and uncomplaining person.

So now it's time to build a Digital Audio Workstation. I'm calling it Genesis.

http://genesisdaw.org/post/bit-trouper.html (Wed, 13 Mar 2013 01:38:29 GMT)
I started working on Genesis in March 2013. The first commit:

Qt solves a lot of platform abstraction problems for the programmer. You might even say it solves too much.
The downside is that it is a hulking behemoth of a dependency with a nontrivial build process.
But, it enabled me to make some windows and fiddle around with some code.

Rust seemed like a cool alternative to C++ because it had fancy new programming language features
like not requiring .h files. More seriously, the 3 problems with using Go are resolved with
Rust, since it can call C code directly, C code can call it, and there is no garbage collector.
Further, the safety guarantees it made were promising.

But nothing in life is free. GTK is sufficiently complicated that a Rust "safety wrapper" is in order.
The one available was not complete.

At this time I began to have a new philosophy: Stop depending on libraries that hold the project down.

This led me to ditch the idea of using GTK or Qt and instead code the user interface toolkit from scratch.
This would work out better anyway if I wanted to sandbox plugins and provide a user-interface API for them
to use.

At this point I felt like I needed a direction, something to work towards. So I created a goal:

Open an audio file with libav.

Display the waveform of the audio file in the display.

Next I began to experience Rust development. I went back and forth between GLFW and glutin for
the window library, as first one and then the other seemed better.
I created groove-rs - bindings to
libgroove.
I wrote my own
3D math library because
the existing one's API was too hard for me to understand.
I fixed a lot of code after updating to the latest Rust compiler each day.

Finally, I got font rendering working. It took me 16 days, but I felt like I had learned enough
about Rust to finally let me develop efficiently.

And then I tried to abstract the font rendering code into a widget concept, and everything broke down.
The Rust compiler has many false positives - situations where it reports a compile error in the
name of safety, even though it's pretty obvious that no safety problem exists.

This was so frustrating and demotivating that I realized the benefits of Rust did not outweigh
the slow development pace that I had taken on.

I felt guilty for allowing myself to get distracted from actually making progress all this time.
From now on I would be laser focused and in general avoid depending on other people's code.
If it's a bug in my code, then I know how to fix it. New roadmap:

Load all the audio file's channels into memory.

Display all of the audio file's channels in a list.

Render channels graphically.

Ability to make a selection.

Playback to speakers.

Delete selection.

Undo/Redo

Multiplayer

So I knew I would be switching to C++. But I did not want to fall into the same trap in C++-land
that I fell into in Rust-land: spending my time understanding how the language works and getting
my code to compile instead of solving the actual problem of making a DAW.

But some C++ features are too good to ignore, such as template data structures. For example, in C
if you want to create a list data structure, you have 3 options:

Make it generic using preprocessor directives.

Make it generic using void * instead of actual types.

Don't make it generic. Reimplement the same data structure over and over for each type.

This sucks. Preprocessor directives are the devil, and too much void * leads to
runtime memory safety errors that the compiler can't catch. And what if it isn't a simple list
data structure, but a hash table or something? It's crazy to reimplement the
same data structure over and over.

Another example: in C you use malloc to allocate memory. It typically looks something
like malloc(sizeof(MyType) * count). If you forget to multiply by count or by sizeof - oops,
segfault. Meanwhile, if you can use templates, you can define a function like this:

So I wanted templates from C++. And atomics
from C++11. But that's about it. Other than that, I want to code like I'm in C. The beauty of C-style
coding is that the control flow is so simple, it's impossible to not know what code is doing. Contrast
that with fancy C++ stuff and you never know what's going to happen or what each line of code is going
to do.

I figured out how to have my cake and eat it too. I discovered that you can compile with g++
and link with gcc, like so:

I implemented the ability to delete a selection, save, and play back audio with PortAudio.

At this point I had been using the default Unity Ubuntu user interface. I noticed that when vsync is on,
switching to another window or switching back was really laggy. This sucks.
I filed a bug for that, and then switched to
XFCE where I am now much happier, and the lag is gone.

I have some complaints about PortAudio. First of all, it spits out a bunch of garbage to stderr
that you can't do anything about, so I'm already biased against it. Secondly, it does not give you
the low-level control that a DAW application needs. For example, it doesn't have PulseAudio
support, so audio travels from Genesis to PortAudio to the PulseAudio ALSA wrapper
to PulseAudio to ALSA. And finally, I'm getting some crashes, and my favorite way to fix crashes
is to delete code.

So I'm turning to the PulseAudio API directly.
This means that when I want to support OSX I need to create a CoreAudio backend, and when
I want to support Windows I need to create an ASIO or DirectSound backend. I might also want to create
an ALSA backend for Linux installations that don't have PulseAudio. I haven't yet investigated what
the audio situation is for the various BSDs.

Oh yeah, and a lot of Linux audio community members say I should only support JACK because PulseAudio
will have too high a latency. JACK support is planned, but one of my design philosophies is that
Genesis should be easy to install and run right off the bat without having to separately set up and
run a JACK server. So the final strategy will be: try JACK, and then fall back to PulseAudio.

But for now, PulseAudio is a build requirement.

After I got zooming and scrolling to work, I decided that it was time to create the core audio
pipeline and start working on something that lets us create music rather than just edit audio files.

Most operating systems do not attempt to make any real-time guarantees. This
means that various operations do not guarantee a maximum bound on how long it
might take. For example, when you allocate memory, it might be very quick, or
the kernel might have to do some memory defragmentation and cache invalidation
to make room for your new segment of memory. Writing to a hard drive might
cause it to have to warm up and start spinning.

This can be a disaster if one of these non-bounded operations is in the
audio rendering pipeline, especially if the latency is low. The buffer of audio
going to the sound card might empty before it gets filled up, causing a nasty
audio glitch sound known as a "buffer underrun".

In general, all syscalls are suspect when it comes to real-time guarantees. The
careful audio programmer will avoid them all.

libgenesis meets this criterion with one exception. libgenesis takes advantage
of hardware concurrency by using one worker thread per CPU core to render
audio. It carefully uses a lock-free queue data structure to avoid making
syscalls, but when there is not enough work to do and some threads are sitting
idly by, those threads need to suspend execution until there is work to do.

So if there is more work to be done than worker threads, no syscalls are made.
However, if a worker thread has nothing to do and needs to suspend execution,
it makes a FUTEX_WAIT syscall, and then is woken up by another worker thread
making a FUTEX_WAKE syscall.

Compatibility

libgenesis follows semver. Major version is bumped when
API or ABI compatibility is broken. Minor version is bumped when new features
are added. Patch version is bumped only for bug fixes. Until 1.0.0 is released
no effort is made toward backward compatibility whatsoever.

Genesis Audio Studio has an independent version from libgenesis. Major version
is bumped when a project file will no longer generate the same output audio as
the previous version. Minor version is bumped when new features are added.
Patch version is bumped only for bug fixes. Until 1.0.0 is released no effort
is made toward project files being backward compatible to previous versions.

Coordinates

Positions in the audio project are in floating point whole notes. This is to
take into account the fact that the tempo and time signature could change at
any point in the audio project. You can convert from whole notes to frames by
calling a utility function which takes into account tempo and time signature
changes.

Multiplayer and Peer-to-Peer

When a user opens Genesis, there should be a pane which has a set of rooms that
users can gather in and chat. For example, a lobby. Users can create other
rooms, perhaps private, to join as well. Users can invite other users to join
their open project. When a user joins the project, a peer-to-peer connection is
established so the edits do not go through a third party. Further, if the peers
are connected via LAN, network speeds will be very fast.

The server(s) that provide this chat are also peers. Individuals or businesses
could donate server space, similar to seeding a torrent, by running the server
software and adding their node to the pool.

When two (or more) users are simultaneously working on a project, the playback
head is not synchronized. The users are free to roam about the project, making
changes here and there. However, each person will see "where" in the project
the other person is working, and see the changes that they are making. So it
would be trivial, for example, for both users to look at a particular bassline,
both listening to it on loop, albeit at different offsets, while one person
works on the drums, and the other person works on the bass rhythm.

Plugin Registry and Safety

Plugins must be provided as source code and published to the Genesis registry.
The Genesis registry will not be a single server, but once again a peer-to-peer
network. Downloading from the plugin registry will be like downloading a
torrent. By default Genesis will act as a peer on LANs when other instances of
Genesis request plugins over the LAN.

It's not clear how this goal will be accomplished, but we will attempt to build
a system where these constraints are met:

Plugins are provided as source code that is guaranteed to build on all
supported platforms. It's not possible to have a plugin that works on one
person's computer and not another.

Plugins either have compile-time protection against malicious code and
crashes (such as segfaults) or run-time protection.

One idea: instead of one sandboxed process per plugin, have one sandboxed
process that runs all the untrusted plugin code; the entire real-time
execution path.

DRM will never be supported although paid plugins are not out of the question,
as long as the constraint is met that if a user wants another user to join their
project, the other user is able to use the plugin with no restrictions.

Project Network

Users could browse published projects and samples on the peer-to-peer network.
A sample is a rendered project, so if you want to edit the source to the sample
you always can.

Publishing a project requires licensing it generously so that it is always safe
to use projects on the network for any purpose without restriction.

The network would track downloads and usages so that users can get an idea of
popularity and quality. Projects would be categorized and tagged and related to
each other for easy browsing and searchability.

So, one user might publish some drum samples that they made as projects, one
project per sample. Another user might use all of them, edit one of them and
modify the effects, and then create 10 projects which are drum loops using the
samples. A third user might use 2-3 of these drum loops, edit them to modify
the tempo, and produce a song with them and publish the song. Yet another user
might edit the song, produce a remix, and then publish the remix.

This project, sample, and plugin network should be easily browsable directly
from the Genesis user interface. It should be very easy to use content from the
network, and equally easy to publish content to the network. It should almost
be easier to publish work as open source than to keep it private.

License

Genesis is licensed with the Lesser General Public License. A full copy of the
license text is included in the LICENSE file, but here's a non-normative
summary:

As a user you have access to the source code, and if you are willing to compile
from source yourself, you can get the software for free.

As a company you may freely use the source code in your software; the only
restriction is that if you modify Genesis source code, you must contribute those
modifications back to the Genesis project.

When working on widget user interface code, I realized that I was implementing inheritance
without actually using C++'s inheritance features, and it was more error-prone than just
giving in and depending on libstdc++.

The core backend and the GUI are decoupled. The core backend is in a shared
library called libgenesis which does not link against any GUI-related
libraries - not even libstdc++.

Meanwhile, the GUI depends on libgenesis and puts a user-interface on top of it.

libgenesis is intended to be a general-purpose utility library for doing
digital audio workstation related things, such as using it as the backend for
a headless computer-created music stream.

There is also a normalize_audio example, which reads an audio file of any format,
increases the volume as much as possible without clipping, and then saves to any format.
This demonstrates the audio import and export capabilities.

http://genesisdaw.org/post/interesting-commits.html (Thu, 26 Mar 2015 21:29:02 GMT)
I spent three days in a row working on a many writer, many reader, fixed-size, thread-safe,
first-in-first-out lock-free queue.

The queue makes no syscalls except under one condition: if a reader attempts to dequeue an item
and the queue is empty, it makes a syscall to go to sleep. Then, if and only if this has
happened, a writer makes a syscall to wake up a sleeping reader when it enqueues an item.

I wrote some unit tests and they all pass, even when I run 10 instances of
the unit tests at once on repeat for 10 minutes.

The C++11 atomics came in quite handy.

It depends on a Linux-specific feature called
futex
for causing threads to sleep and wake up. Other operating systems have
similar features, and I will need to create an OS-specific port of this
data structure for each operating system when the time comes.

http://genesisdaw.org/post/thread-safe-lock-free-queue.html (Fri, 27 Mar 2015 21:29:02 GMT)
Today I reached a motivating milestone: I can play a polyphonic sine wave
using a USB MIDI keyboard.
Download Video

It's just a simple sine wave and there are no parameters to configure. I wasn't sure how to
modulate the volume when there were multiple notes being pressed at the same time. Should
it do simple addition? Then the synth could output samples above 1.0. That seems broken.
Should it average the notes together? Then when you play 2 notes at once, each note is half
volume of a single note. If one of the simultaneous notes had a low velocity, that seems weird.

Another problem is the clicking when you release a note. It happens because the sine wave ends
abruptly instead of gracefully fading towards zero.

These are problems for another day. Moving on.

I also got a delay example working. It connects your default recording device to
a delay (echo) filter to your default playback device. Like the synth example, it's a C program
which only depends on libgenesis. I took all the error checking code out for clarity.

The project file starts with some magic bytes that I got from /dev/random.
These uniquely identify the file as a Genesis project file. The default file
extension is .gdaw.

After this comes an ordered list of transactions. A transaction is a set of
edits. There are 2 types of edits: put and delete.

A put contains a key and a value, each of which is a variable number of bytes.
A delete contains only a key, which is a variable number of bytes.

A transaction looks like this:

Offset  Description
0       uint32be length of transaction in bytes including this field
4       uint32be crc32 of this transaction
8       uint32be number of put edits in this transaction
12      uint32be number of delete edits in this transaction
16      the put edits in this transaction
-       the delete edits in this transaction

A put edit looks like:

Offset  Description
0       uint32be key length in bytes
4       uint32be value length in bytes
8       key bytes
-       value bytes

A delete edit looks like:

Offset  Description
0       uint32be key length in bytes
4       key bytes

That's it. To read the file, apply the transactions in order. To update a file,
append a transaction. Periodically "garbage collect" by creating a new project
file with a single transaction with all the data, and then atomically rename
the new project file over the old one.

This structure uses triple buffering to provide a single reader, single writer atomic value, where
the value can be anything. In Genesis, it is used to hold the data structure that contains all the
MIDI events. When a user edits the project, the pointer atomically flips to an unused buffer for
writing, similar to how screen rendering works.

http://genesisdaw.org/post/playback.html (Thu, 28 May 2015 06:06:40 GMT)
I struggled with buffer underruns for a long time. I seemed to be dealing
with some enigmatic audio problems.

After many failed attempts to solve the audio problems I was having, I decided
to tackle the problem head on. I extracted the audio engine code and started a
new project, one in which I was dedicated to becoming an expert at how audio
input and output is handled on every platform. Hopefully after solving the crap
out of sound I/O in a way independent from the other issues I was having, I
would be able to solve the Genesis problems.

libsoundio is a lightweight abstraction over various sound drivers. It provides a
well-documented API that operates consistently regardless of the sound driver it
connects to. It performs no buffering or processing on your behalf; instead, it exposes
the raw power of the underlying backend.

libsoundio is appropriate for games, music players, digital audio workstations,
and various utilities.

Features & Limitations

Exposes both raw devices and shared devices. Raw devices give you the best
performance but prevent other applications from using them. Shared devices
are default and usually provide sample rate conversion and format
conversion.

Exposes both a device id and a friendly name. The id is suitable for saving in a config file
because it persists across devices being plugged and unplugged, while the friendly name is
suitable for showing to users.

Supports optimal usage of each supported backend. The same API does the right thing whether the backend has a fixed buffer size, such as on JACK and CoreAudio, or whether it allows directly managing the buffer, such as on ALSA or PulseAudio.

C library. Depends only on the respective backend API libraries and libc. Does not depend on libstdc++, and does not have exceptions, run-time type information, or setjmp.

Errors are communicated via return codes, not logging to stdio.

Supports channel layouts (also known as channel maps), important for surround sound applications.

Ability to monitor devices and get an event when available devices change.

Ability to get an event when the backend is disconnected, for example when the JACK server or PulseAudio server shuts down.

Detects which input device is default and which output device is default.

Ability to connect to multiple backends at once. For example you could have an ALSA device open and a JACK device open at the same time.

I am pleased to report that libsoundio integration is complete and it has
indeed solved the buffer underrun problems and other issues I was running
into before.

In this screenshot you can see that Genesis is connected to multiple sound
drivers at once - JACK, PulseAudio, ALSA, and Dummy. On top of that it is
listening and will automatically refresh when you insert or remove a device
such as a USB microphone.

Here is a screenshot of the Project Settings Pane.
Changing the sample rate does in fact change the sample rate of the
underlying audio pipeline. Next up is implementing rendering to disk,
which uses these settings.

Today I refactored the audio file format code and added this dock which is used
to start a render of the project. The options here are persisted in the user settings
file. Next up is making the Render button actually start a render.

http://genesisdaw.org/post/render-project-dock.html (Thu, 15 Oct 2015 09:42:35 GMT)
I've been experimenting with broadcasting myself coding Genesis live.
Here are some
archived videos.
In these videos we worked on:

Move audio graph code out of project code.

Prepare codebase for ability to render project.

Fix memory leaks.

If you, like me, do not have Flash installed, you can use the excellent
livestreamer project to
view the content.

I think I'm going to try to start doing these streams daily.
I'm not really sure where to announce when the stream starts. I'll try
twitter. My username is
andy_kelley.

http://genesisdaw.org/post/live-coding-2015-10-20.html (Wed, 21 Oct 2015 01:42:54 GMT)
I was aware of Single Instruction Multiple Data optimizations, but I wasn't
sure how to take advantage of them, or whether or not it was something I needed
to directly take advantage of, versus relying on the compiler to do so for me.