tl;dr: Write a custom GSource if you have a non-file-descriptor-based event source to integrate with a GMainContext. It’s a matter of writing a few virtual functions.

What is GSource?

A GSource is an expected event with an associated callback function which will be invoked when that event is received. An event could be a timeout or data being received on a socket, for example.

GLib contains various types of GSource, but also allows applications to define their own, allowing custom events to be integrated into the main loop.

The structure of a GSource and its virtual functions are documented in detail in the GLib API reference.

A message queue source

As a running example, a message queue source will be used which dispatches its callback whenever a message is enqueued to a queue internal to the source (potentially from another thread).

This type of source is useful for efficiently transferring large numbers of messages between main contexts. The alternative is transferring each message as a separate idle GSource using g_source_attach(). For large numbers of messages, this means a lot of allocations and frees of GSources.

Structure

Firstly, a structure for the source needs to be declared. This must contain a GSource as its parent, followed by the private fields for the source: the queue and a function to call to free each message once finished with.

Next, the prepare function for the source must be defined. This determines whether the source is ready to be dispatched. As this source is using an in-memory queue, this can be determined by checking the queue’s length: if there are elements in the queue, the source can be dispatched to handle them.

return (g_async_queue_length (message_queue_source->queue) > 0);

Check function

As this source has no file descriptors, the prepare and check functions essentially have the same job, so a check function is not needed. Setting the field to NULL in GSourceFuncs bypasses the check function for this source type.

Dispatch function

For this source, the dispatch function is where the complexity lies. It needs to dequeue a message from the queue, then pass that message to the GSource’s callback function. No messages may be queued: even through the prepare function returned true, another source wrapping the same queue may have been dispatched in the mean time and taken the final message from the queue. Further, if no callback has been set for the GSource (which is allowed), the message must be destroyed and silently dropped.

If both a message and callback are set, the callback can be invoked on the message and its return value propagated as the return value of the dispatch function. This is FALSE to destroy the GSource and TRUE to keep it alive, just as for GSourceFunc — these semantics are the same for all dispatch function implementations.

/* Pop a message off the queue. */
message = g_async_queue_try_pop (message_queue_source->queue);
/* If there was no message, bail. */
if (message == NULL)
{
/* Keep the source around to handle the next message. */
return TRUE;
}
/* @func may be %NULL if no callback was specified.
* If so, drop the message. */
if (func == NULL)
{
if (message_queue_source->destroy_message != NULL)
{
message_queue_source->destroy_message (message);
}
/* Keep the source around to consume the next message. */
return TRUE;
}
return func (message, user_data);

Callback functions

The callback from a GSource does not have to have type GSourceFunc. It can be whatever function type is called in the source’s dispatch function, as long as that type is sufficiently documented.

Normally, g_source_set_callback() is used to set the callback function for a source instance. With its GDestroyNotify, a strong reference can be held to keep an object alive while the source is still alive:

However, GSource has a layer of indirection for retrieving this callback, exposed as g_source_set_callback_indirect(). This allows GObject to set a GClosure as the callback for a source, which allows for sources which are automatically destroyed when an object is finalized — a weak reference, in contrast to the strong reference above:

It also allows for a generic, closure-based ‘dummy’ callback, which can be used when a source needs to exist but no action needs to be performed in its callback:

g_source_set_dummy_callback (source);

Constructor

Finally, the GSourceFuncs definition of the GSource can be written, alongside a construction function. It is typical practice to expose new source types simply as GSources, not as the subtype structure; so the constructor returns a GSource*.

The example constructor here also demonstrates use of a child source to support cancellation conveniently. If the GCancellable is cancelled, the application’s callback will be dispatched and can check for cancellation. (The application code will need to make a pointer to the GCancellable available to its callback, as a field of the callback’s user data set in g_source_set_callback()).

Sources can be more complex than the example given above. In libnice, a custom GSource is needed to poll a set of sockets which changes dynamically. The implementation is given as ComponentSource in component.c and demonstrates a more complex use of the prepare function.

Another example is a custom source to interface GnuTLS with GLib in its GTlsConnection implementation. GTlsConnectionGnutlsSource synchronizes the main thread and a TLS worker thread which performs the blocking TLS operations.

Day 5, and the DX and docs hackfest in Collabora HQ, Cambridge has drawn to a close. It’s been great to have everyone here, and there have been a lot of in-depth discussions over the last few days about the details of app sandboxing, runtimes, Builder integration with various new services, the development of an IDE abstraction layer, approaches for making build systems accessible to Builder, lots of new things to statically analyse, and some fairly fundamental additions to GLib in the form of G_DECLARE_[FINAL|DERIVABLE]_TYPE and general-purpose reference counted memory areas. Whew! We even had a fleeting visit by Richard Hughes to discuss packaging issues for apps.

I can’t do justice to the work of the docs team, who put in consistent, solid effort throughout the hackfest. See the blogs by Petr, Bastian, Kat and Jim for all the details. They even left me with a seemingly endless supply of Mallard balls to throw around the office!

Dave and I have spent a little while working on further deprecating gnome-common. More details to come once the migration guide is finished.

Today was a bit of a slow start, since people were still arriving throughout the day. Regardless, there have been various discussions, with Ryan, Emmanuele and Christian discussing performance improvements in GLib, Christian and Allan plotting various different approaches to new UI in Builder, Cosimo and Carlos silently plugging away at GTK+, and Emmanuele muttering something about GProperty now and then.

Tomorrow, I hope we can flesh out some of these initial discussions a bit more and get some roadmapping down for GLib development for the next year, amongst other things. I am certain that Builder will feature heavily in discussions too, and apps and sandboxing, now that Alex has arrived.

I’ve spent a little time finishing off and releasing Walbottle, a small library and set of utilities I’ve been working on to implement JSON Schema, which is the equivalent of XML Schema or RELAX-NG, but for JSON files. It allows you to validate JSON instances against a schema, to validate schemas themselves and, unusually, to automatically generate parser unit tests from a schema. That way, you can automatically test json-glib–based JsonReader/JsonParser code, just by passing the JSON schema to Walbottle’s json-schema-generate utility.

It’s still a young project, but should be complete enough to be useful in testing JSON code. Please let me know of any bugs or missing features!

tl;dr: Use g_set_object() to conveniently update owned object pointers in property setters (and elsewhere); see the bottom of the post for an example.

A little while ago, GLib gained a useful function called g_clear_object(), which clears an owned object pointer (a pointer which owns a reference to a GObject). GLib also just gained a function called g_set_object(), which works similarly but can either clear an object pointer or update it to point to a new object.

Why is this valuable? It saves a few lines of code each time an object pointer is updated, which isn’t much in itself. However, one thing it gets right is the order of reference counting operations, which is a common mistake in property setters. Instead of:

because otherwise, if (new_object == object_ptr)(or if the objects have some other ownership relationship) and the object only has one reference left, object_ptr will end up pointing to a finalised GObject (and g_object_ref() will be called on a finalised GObject too, which it really won’t like).

So how does g_set_object() help? We can now do:

g_set_object (&object_ptr, new_object);

which takes care of all the reference counting, and allows new_object to be NULL. &object_ptr must not be NULL. If you’re worried about performance, never fear. g_set_object() is a static inline function, so shouldn’t adversely affect your code size.

Even better, the return value of g_set_pointer() indicates whether the value changed, so we can conveniently base GObject::notify emissions off it:

/* This is how all GObject property setters should look in future. */
if (g_set_object (&priv->object_ptr, new_object))
g_object_notify (self, "object-ptr");

Hopefully this will make property setters (and other object updates) simpler in new code.

For the past several months, Olivier Crête and I have been working on a project using libnice at Collabora, which is now coming to a close. Through the project we’ve managed to add a number of large, new features to libnice, and implement hundreds (no exaggeration) of cleanups and bug fixes. All of this work was done upstream, and is available in libnice 0.1.8, released recently! GLib has also gained a number of networking fixes, API additions and documentation improvements.

Firstly, what is libnice? It’s a GLib implementation of ICE, the standard protocol for NAT traversal. Briefly, NAT traversal is needed when two hosts want to communicate peer-to-peer in a network where there is at least one NAT translator between them, meaning that at least one of the hosts cannot directly address the other until a mapping is created in the NAT translator. This is a very common situation (due to the shortage of IPv4 addresses, and the consequence that most home routers act as NAT translators) and affects virtually all peer-to-peer communications. It’s well covered in the literature, and the rest of this post will assume a basic understanding of NAT and ICE, a topic about which I recently gave a talk.

Conceptually, libnice exists just to create a reliable (TCP-like) or unreliable (UDP-like) socket which connects your host with a remote one in a manner that traverses any intervening NATs. At its core, it is effectively an implementation of send(), recv(), and some ancillary functions to negotiate the ICE stream at startup time.

Highly related, the original receive API has been augmented with scatter–gather support in the form of a recvmmsg()-like API: nice_agent_recv_messages(). Along with appropriate improvements to libnice’s underlying socket implementations (the most obscure of which are still to be plumbed in), this allows performance improvements by batching messages, reducing the number of system calls needed for communication. Furthermore (perhaps more importantly) it reduces memory copies when assembling and parsing packets, by allowing the packets to be split across multiple non-contiguous buffers. This is a well-studied and long-known performance technique in networking, and it’s nice that libnice now supports it.

So, if you have an ICE connection (stream 1 on agent, with 2 components) exchanging packets with 20B headers and variable-length payloads, instead of:

libnice has also gained non-blockingvariants of its I/O functions. Previously, one had to explicitly attach a libnice stream to a GMainContext to start receiving packets. Packets would be delivered individually via a callback function (set with nice_agent_attach_recv()), which was inefficient and made for awkward control flow. Now, the non-blocking I/O functions can be used with a custom GSource from g_pollable_input_stream_create_source() to allow for more flexible reception of packets using the more standard GLib pattern of attaching a GSource to the GMainContext and in its callback, calling g_pollable_input_stream_read_nonblocking() until all pending packets have been read. libnice’s internal timers (used for retransmit timeouts, etc.) are automatically added to the GMainContext passed into nice_agent_new() at construction time, which you must run all the time as before.

Finally, FIN/ACK support has been added to libnice’s pseudo-TCP implementation. The code was originally based on Google’s libjingle pseudo-TCP, establishing a reliable connection over UDP by encapsulating TCP-like packets within UDP. This implemented the basics of TCP, but left things like the closing FIN/ACK handshake to higher-level protocols. Fine for Google, but not for our use case, so we added support for that. Furthermore, we needed to layer TLS over a pseudo-TCP connection using GTlsConnection, which required implementing half-duplex close support and fixing a few nasty leaks in GTlsConnection.

After a couple of discussions at the DX hackfest about cross-platform-ness and deployment of GLib, I started wondering: we often talk about how GNOME developers work at all levels of the stack, but how much of that actually qualifies as ‘core’ work which is used in web servers, in cross-platform desktop software1, or commonly in embedded systems, and which is security critical?

On desktop systems (taking my Fedora 19 installation as representative), we can compare GLib usage to other packages, taking GLib as the lowest layer of the GNOME stack:

Package

Reverse dependencies

Recursive reverse dependencies

glib2

4001

–

qt

2003

–

libcurl

628

–

boost-system

375

–

gnutls

345

–

openssl

101

1022

(Found with repoquery --whatrequires [--recursive] [package name] | wc -l. Some values omitted because they took too long to query, so can be assumed to be close to the entire universe of packages.)

Obviously GLib is depended on by many more packages here than OpenSSL, which is definitely a core piece of software. However, those packages may not be widely used or good attack targets. Higher layers of the GNOME stack see widespread use too:

Package

Reverse dependencies

cairo

2348

gdk-pixbuf2

2301

pango

2294

gtk3

801

libsoup

280

gstreamer

193

librsvg2

155

gstreamer1

136

clutter

90

(Found with repoquery --whatrequires [package name] | wc -l.)

Widely-used cross-platform software which interfaces with servers2 includes PuTTY and Wireshark, both of which use GTK+3. However, other major cross-platform FOSS projects such as Firefox and LibreOffice, which are arguably more ‘core’, only use GNOME libraries on Linux.

How about on embedded systems? It’s hard to produce exact numbers here, since as far as I know there’s no recent survey of open source software use on embedded products. However, some examples:

Qt pulls in GLib as an optional dependency (automatically enabled if compiled for X11, QWS or QPA) for its main loop, and is used widely in embedded systems.

So there are some sample points which suggest moderately widespread usage of GNOME technologies in open-source-oriented embedded systems. For more proprietary embedded systems it’s hard to tell. If they use Qt for their UI, they may well use GLib’s main loop implementation. I tried sampling GPL firmware releases from gpl-devices.org and gpl.nas-central.org, but both are quite out of date. There seem to be a few releases there which use GLib, and a lot which don’t (though in many cases they’re just kernel releases).

Servers are probably the largest attack surface for core infrastructure. How do GNOME technologies fare there? On my CentOS server:

I can’t find much evidence of other GNOME libraries in use, though, since there isn’t much call for them in a non-graphical server environment. That said, there has been heavy development of server-grade features in the NetworkManager stack, which will apparently be in RHEL 7 (thanks Jon).

So it looks like GLib, if not other GNOME technologies, is a plausible candidate for being core infrastructure. Why haven’t other GNOME libraries seen more widespread usage? Possibly they have, and it’s too hard to measure. Or perhaps they fulfill a niche which is too small. Most server technology was written before GNOME came along and its libraries matured, so any functionality which could be provided by them has already been implemented in other ways. Embedded systems seem to shun desktop libraries for being too big and slow. The cross-platform support in most GNOME libraries is poorly maintained or non-existent, limiting them to use on UNIX systems only, and not the large OS X or Windows markets. At the really low levels, though, there’s solid evidence that GNOME has produced core infrastructure in the form of GLib.

As much as 2014 is the year of Linux on the desktop, Windows and Mac still have a much larger market share. ↩

To begin with, what is ‘the right context’? Taking a multi-threaded GLib program, let’s assume that each thread has a single GMainContext running in a main loop — this is the thread default main context.((Why use main contexts? A main context effectively provides a work or message queue for a thread — something which the thread can periodically check to determine if there is work pending from another thread. It’s not possible to pre-empt a thread’s execution without using hideous POSIX signalling). I’m ignoring the case of non-default contexts, but their use is similar.)) So ‘the right context’ is the one in the thread you want a function to execute in. For example, if I’m doing a long and CPU-intensive computation I will want to schedule this in a background thread so that it doesn’t block UI updates from the main thread. The results from this computation, however, might need to be displayed in the UI, so some UI update function has to be called in the main thread once the computation’s complete. Furthermore, if I can limit a function to being executed in a single thread, it becomes easy to eliminate the need for locking a lot of the data it accesses((Assuming that other threads are implemented similarly and hence most data is accessed by a single thread, with threads communicating by message passing, allowing each thread to update its data at its leisure.)), which makes multi-threaded programming a whole lot simpler.

For some functions, I might not care which context they’re executed in, perhaps because they’re asynchronous and hence do not block the context. However, it still pays to be explicit about which context is used, since those functions may emit signals or invoke callbacks, and for reasons of thread safety it’s necessary to know which threads those signal handlers or callbacks are going to be invoked in. For example, the progress callback in g_file_copy_async() is documented as being called in the thread default main context at the time of the initial call.

The core principle of invoking a function in a specific context is simple, and I’ll walk through it as an example before demonstrating the convenience methods which should actually be used in practice. A GSource has to be added to the specified GMainContext, which will invoke the function when it’s dispatched. This GSource should almost always be an idle source created with g_idle_source_new(), but this doesn’t have to be the case. It could be a timeout source so that the function is executed after a delay, for example.

As described previously, this GSource will be added to the specified GMainContext and dispatched as soon as it’s ready((In the case of an idle source, this will be as soon as all sources at a higher priority have been dispatched — this can be tweaked using the idle source’s priority parameter with g_source_set_priority(). I’m assuming the specified GMainContext is being run in a GMainLoop all the time, which should be the case for the default context in a thread.)), calling the function on the thread’s stack. The source will typically then be destroyed so the function is only executed once (though again, this doesn’t have to be the case).

Data can be passed between threads in this manner in the form of the user_data passed to the GSource’s callback. This is set on the source using g_source_set_callback(), along with the callback function to invoke. Only a single pointer is provided, so if multiple bits of data need passing, they must be packaged up in a custom structure first.

Here’s an example. Note that this is to demonstrate the underlying principles, and there are convenience methods explained below which make this simpler.

That’s a lot of code, and it doesn’t look fun. There are several points of note here:

This invocation is uni-directional: it calls my_func() in thread1, but there’s no way to get a return value back to the main thread. To do that, the same principle needs to be used again, invoking a callback function in the main thread. It’s a straightforward extension which isn’t covered here.

Thread safety: This is a vast topic, but the key principle is that data which is potentially accessed by multiple threads must have mutual exclusion enforced on those accesses using a mutex. What data is potentially accessed by multiple threads here? thread1_main_context, which is passed in the fork call to thread1_main; and some_object, a reference to which is passed in the data closure. Critically, GLib guarantees that GMainContext is thread safe, so sharing thread1_main_context between threads is fine. The other code in this example must ensure that some_object is thread safe too, but that’s a topic for another blog post. Note that some_string and some_int cannot be accessed from both threads, because copies of them are passed to thread1, rather than the originals. This is a standard technique for making cross-thread calls thread safe without requiring locking. It also avoids the problem of synchronising freeing some_string. Similarly, a reference to some_object is transferred to thread1, which works around the issue of synchronising destruction of the object.

Specificity: g_idle_source_new() was used rather than the simpler g_idle_add() so that the GMainContext the GSource is attached to could be specified.

With those principles and mechanisms in mind, let’s take a look at a convenience method which makes this a whole lot easier: g_main_context_invoke_full().((Why not g_main_context_invoke()? It doesn’t allow a GDestroyNotify function for the user data to be specified, limiting its use in the common case of passing data between threads.)) As stated in its documentation, it invokes a callback so that the specified GMainContext is owned during the invocation. In almost all cases, the context being owned is equivalent to it being run, and hence the function must be being invoked in the thread for which the specified context is the thread default.

Modifying the earlier example, the invoke_my_func() function can be replaced by the following:

That’s a bit simpler. Let’s consider what happens if invoke_my_func() were to be called from thread1, rather than from the main thread. With the original implementation, the idle source would be added to thread1’s context and dispatched on the context’s next iteration (assuming no pending dispatches with higher priorities). With the improved implementation, g_main_context_invoke_full() will notice that the specified context is already owned by the thread (or can be acquired by it), and will call my_func_idle() directly, rather than attaching a source to the context and delaying the invocation to the next context iteration. This subtle behaviour difference doesn’t matter in most cases, but is worth bearing in mind since it can affect blocking behaviour (i.e. invoke_my_func() would go from taking negligible time, to taking the same amount of time as my_func() before returning).

How can I be sure a function is always executed in the thread I expect? Since I’m now thinking about which thread each function could be called in, it would be useful to document this in the form of an assertion:

g_assert (g_main_context_is_owner (expected_main_context));

If that’s put at the top of each function, any assertion failure will highlight a case where a function has been called directly from the wrong thread. This technique was invaluable to me recently when writing code which used upwards of four threads with function invocations between all of them. It’s a whole lot easier to put the assertions in when initially writing the code than it is to debug the race conditions which easily result from a function being called in the wrong thread.

This can also be applied to signal emissions and callbacks. As well as documenting which contexts a signal or callback will be emitted in, assertions can be added to ensure that this is always the case. For example, instead of using the following when emitting a signal:

As well as asserting emission happens in the right context, this improves type safety. Bonus! Note that signal emission via g_signal_emit() is synchronous, and doesn’t involve a main context at all. As signals are a more advanced version of callbacks, this approach can be applied to those as well.

Before finishing, it’s worth mentioning GTask. This provides a slightly different approach to invoking functions in other threads, which is more suited to the case where you want your function to be executed in some background thread, but don’t care exactly which one. GTask will take a data closure, a function to execute, and provide ways to return the result from this function; and will then handle everything necessary to run that function in a thread belonging to some thread pool internal to GLib. Although, by combining g_main_context_invoke_full() and GTask, it should be possible to run a task in a specific context and effortlessly return its result to the current context:

GMainContext is at the core of almost every GLib application, yet it was only recently that I took the time to fully explore it — the details of it have always been a mystery. Doing some I/O work required me to look a little closer and try to get my head around the ins and outs of GMainContext, GMainLoop and GSources. Here I’ll try and write down a bit of what I’ve learned. If you want to skip to the conclusion, there’s a list of key points for using GMainContexts in libraries at the bottom of the post.

What is GMainContext? It’s a generalised implementation of an event loop, useful for implementing polled file I/O or event-based widget systems (i.e. GTK+). If you don’t know what poll() does, read about that first, since GMainContext can’t be properly understood without understanding polled I/O. A GMainContext has a set of GSources which are ‘attached’ to it, each of which can be thought of as an expected event with an associated callback function which will be invoked when that event is received; or equivalently as a set of file descriptors (FDs) to check. An event could be a timeout or data being received on a socket, for example. One iteration of the event loop will:

Prepare sources, determining if any of them are ready to dispatch immediately.

Poll the sources, blocking the current thread until an event is received for one of the sources.

At its core, GMainContext is just a poll() loop, with the preparation, check and dispatch stages of the loop corresponding to the normal preamble and postamble in a typical poll() loop implementation, such as listing 1 from http://www.linux-mag.com/id/357/. Typically, some complexity is needed in non-trivial poll()-using applications to track the lists of FDs which are being polled. Additionally, GMainContext adds a lot of useful functionality which vanilla poll() doesn’t support. Most importantly, it adds thread safety.

GMainContext is completely thread safe, meaning that a GSource can be created in one thread and attached to a GMainContext running in another thread. A typical use for this might be to allow worker threads to control which sockets are being listened to by a GMainContext in a central I/O thread. Each GMainContext is ‘acquired’ by a thread for each iteration it’s put through. Other threads cannot iterate a GMainContext without acquiring it, which guarantees that a GSource and its FDs will only be polled by one thread at once (since each GSource is attached to at most one GMainContext). A GMainContext can be swapped between threads across iterations, but this is expensive.

Why use GMainContext instead of poll()? Mostly for convenience, as it takes all the grunt work out of dynamically managing the array of FDs to pass to poll(), especially when operating over multiple threads. This is done by encapsulating FDs in GSources, which decide whether those FDs should be passed to the poll() call on each ‘prepare’ stage of the main context iteration.

So if that’s GMainContext, what’s GMainLoop? Ignoring reference counting and locking gubbins, it is essentially just the following three lines of code (in g_main_loop_run()):

Plus a fourth line in g_main_loop_quit() which sets loop->is_running = FALSE and which will cause the loop to terminate once the current main context iteration ends. i.e. GMainLoop is a convenient, thread-safe way of running a GMainContext to process events until a desired exit condition is met, at which point you call g_main_loop_quit(). Typically, in a UI program, this will be the user clicking ‘exit’. In a socket handling program, this might be the final socket closing.

It is important not to confuse main contexts with main loops. Main contexts do the bulk of the work: preparing source lists, waiting for events, and dispatching callbacks. A main loop just iterates a context.

One of the important features of GMainContext is its support for ‘default’ contexts. There are two levels of default context: the thread-default, and the global-default. The global-default (accessed using g_main_context_default()) is what’s run by GTK+ when you call gtk_main(). It’s also used for timeouts (g_timeout_add()) and idle callbacks (g_idle_add()) — these won’t be dispatched unless the default context is running!

What are the thread-default contexts then? These are a later addition to GLib (since version 2.22), and are generally used for I/O operations which need to run and dispatch callbacks in a thread. By calling g_main_context_push_thread_default() before starting an I/O operation, the thread-default context has been set, and the I/O operation can add its sources to that context. The context can then be run in a new main loop in an I/O thread, causing the callbacks to be dispatched on that thread’s stack rather than on the stack of the thread running the global-default main context. This allows I/O operations to be run entirely in a separate thread without explicitly passing a specific GMainContext pointer around everywhere.

Conversely, by starting a long-running operation with a specific thread-default context set, your code can guarantee that the operation’s callbacks will be emitted in that context, even if the operation itself runs in a worker thread. This is the principle behind GTask: when a new GTask is created, it stores a reference to the current thread-default context, and dispatches its completion callback in that context, even if the task itself is run using g_task_run_in_thread().

For example, the code below will run a GTask which performs two writes in parallel from a thread. The callbacks for the writes will be dispatched in the worker thread, whereas the callback from the task as a whole will be dispatched in the interesting context.

From the work I’ve been doing recently with GMainContext, here are a few rules of thumb for using main contexts in libraries which I’m going to follow in future:

Never iterate a context you don’t own, including the global-default or thread-default contexts, or you can cause the user’s sources to be dispatched unexpectedly and cause re-entrancy problems.

Always remove GSources from a main context once you’re done with them, especially if that context may have been exposed to the user (e.g. as a thread-default). Otherwise the user may keep a reference to the main context and continue iterating it after your code expects it to have been destroyed, potentially causing unexpected source dispatches in your code.

If your API is designed to be used in threads, or in a context-aware fashion, always document which context callbacks will be dispatched in. For example, “callbacks will always be dispatched in the context which is the thread-default at the time of the object’s construction”. Users of your API need to know this information.

Use g_main_context_invoke() to ensure callbacks are dispatched in the right context. It’s much easier than manually using g_idle_source_new().

Libraries should never use g_main_context_default() (or, equivalently, pass NULL to a GMainContext-typed parameter). Always store and explicitly use a specific GMainContext, even if that reduces to being some default context. This makes your code easier to split out into threads in future, if needed, without causing hard-to-debug problems with callbacks being invoked in the wrong context.

Always write things asynchronously internally (using the amazing GTask where appropriate), and keep synchronous wrappers to the very top level, where they can be implemented by calling g_main_context_iteration() on a specific GMainContext. Again, this makes future refactoring easier. You can see it in the above example: the thread uses g_output_stream_write_async() rather than g_output_stream_write().

Always match pushes and pops of the thread-default main context.

In a future post, I hope to explain in detail what’s in a GSource, and how to implement one, plus do some more in-depth comparison of poll() and GMainContext. Any feedback or corrections are gratefully received!

This past week, I’ve been working on a Clang plugin for GLib and GNOME, with the aim of improving static analysis of GLib-based C projects by integrating GIR metadata and assumptions about common GLib and GObject coding practices. The idea is that if running Clang for static analysis on a project, the plugin will suppress common GLib-based false positives and emit warnings for common problems which Clang wouldn’t previously have known about. Using it should be as simple as enabling scan-build, with the appropriate plugin, in your JHBuild configuration file.

I’m really excited about this project, which I’m working on as R&D at Collabora. It’s still early days, and the project has a huge scope, but so far I have a Clang plugin with two major features. It can:

Load GIR metadata1 and use it to add nonnull attributes based on the presence (or absence) of (allow-none) annotations.

Add warn_unused_result attributes to functions which have a (transfer container) or (transfer full) return type.

Add malloc attributes to functions which are marked as (constructor).

Constify the return types of (transfer none) functions which return a pointer type.

By modifying Clang’s abstract syntax tree (AST) for a file while it’s being scanned, the plugin can take advantage of the existing static analysis for NULL propagation to function calls with nonnull attributes, plus its analysis for variable constness, use of deprecated functions, etc.

So far, all the code is in two ‘annotaters’, which modify the AST but don’t emit any new warnings of their own. The next step is to implement some ‘checkers’, which examine (but don’t modify) the AST and emit warnings for common problems. Some of the problems checked for will be similar to those handled above (e.g. warning about a missing deprecated attribute if a function has a Deprecated: gtk-doc tag), but others can be completely new, such as checking that if a function has a GError parameter then it also has a g_return_if_fail(error == NULL || *error == NULL) precondition (or something along those lines).

I’ve got several big ideas for features to add once this initial work is complete, but the next thing to work on is making the plugin effortless to use (or, realistically, nobody will use it). Currently, the plugin is injected by a wrapper script around Clang, e.g.:

This injects the -load $PREFIX/lib64/clang/3.5/libclang-gnome.so -add-plugin gnome arguments which load and enable the plugin. By specifying GIR namespaces and versions in GNOME_CLANG_GIRS, those GIRs will be loaded and their metadata used by the plugin.

Please note that it’s alpha-quality code at the moment, and may introduce program bugs if used during compilation, so should only be used with Clang’s -analyze option. Furthermore, it still produces some false positives, so please be cautious about submitting bug reports to upstream projects ‘fixing’ problems detected by static analysis.

Actually, it loads the metadata from GIR typelib files, rather than using the source code annotations themselves, so there is a degree of data loss incurred due to the limitations of the typelib binary format. ↩

Recently, I've been looking into how character encoding and locales work on Linux, and I thought it might be worthwhile to write down my findings; partly so that I can look them up again later, and partly so that people can correct all the things I've got wrong.

To begin with, let's define some terminology:

Character set: a set of symbols which can be used together. This defines the symbols and their semantics, but not how they're encoded in memory. For example: Unicode. (Update: As noted in the comments, the character set doesn't define the appearance of symbols; this is left up to the fonts.)

Character encoding: a mapping from a character set to an representation of the characters in memory. For example: UTF-8 is one encoding of the Unicode character set.

Nul byte: a single byte which has a value of zero. Typically represented as the C escape sequence ‘\0’.

NULL character: the Unicode NULL character (U+0000) in the relevant encoding. In UTF-8, this is just a single nul byte. In UTF-16, however, it's a sequence of two nul bytes.

Now, the problem: if I'm writing a (command line) C program, how do strings get from the command line to the program, and how do strings get from the program to the terminal? More concretely, what actually happens with argv[] and printf()?

Let's consider the input direction first. When the main() function of a C program is called, it's passed an array of pointers to char arrays, i.e. strings. These strings can be arbitrary byte sequences (for example, file names), but are generally intended/assumed to be encoded in the user's environment's character encoding. This is set using the LC_ALL, LC_CTYPE or LANG environment variables. These variables specify the user's locale which (among other things) specifies the character encoding they use.

So the program receives as input a series of strings which are in an arbitrary encoding. This means that all programs have to be able to handle all possible character encodings, right? Wrong. A standard solution to this already exists in the form of libiconv. iconv() will convert between any two character encodings known to the system, so we can use it to convert from the user's environment encoding to, for example, UTF-8. How do we find out the user's environment encoding without parsing environment variables ourselves? We use setlocale() and nl_langinfo().

setlocale() parses the LC_ALL, LC_CTYPE and LANG environment variables (in that order of precedence) to determine the user's locale, and hence their character encoding. It then stores this locale, which will affect the behaviour of various C runtime functions. For example, it will change the formatting of numbers outputted by printf() to use the locale's decimal separator. Just calling setlocale() doesn't have any effect on character encodings, though. It won't, for example, cause printf() to magically convert strings to the user's environment encoding. More on this later.

nl_langinfo() is one function affected by setlocale(). When called with the CODESET type, it will return a string identifying the character encoding set in the user's environment. This can then be passed to iconv_open(), and we can use iconv() to convert strings from argv[] to our internal character encoding (which will typically be UTF-8).

At this point, it's worth noting that most people don't need to care about any of this. If using a library such as GLib – and more specifically, using its GOption command line parsing functionality – all this character encoding conversion is done automatically, and the strings it returns to you are guaranteed to be UTF-8 unless otherwise specified.

So we now have our input converted to UTF-8, our program can go ahead and do whatever processing it likes on it, safe in the knowledge that the character encoding is well defined and, for example, there aren't any unexpected embedded nul bytes in the strings. (This could happen if, for example, the user's environment character encoding was UTF-16; although this is really unlikely and might not even be possible on Linux — but that's a musing for another blog post).

Having processed the input and produced some output (which we'll assume is in UTF-8, for simplicity), many programs would just printf() the output and be done with it. printf() knows about character encodings, right? Wrong. printf() outputs exactly the bytes which are passed to its format parameter (ignoring all the fancy conversion specifier expansion), so this will only work if the program's internal character encoding is equal to the user's environment character encoding, for the characters being outputted. In many cases, the output of programs is just ASCII, so programs get away with just using printf() because most character encodings are supersets of ASCII. In general, however, more work is required to do things properly.

We need to convert from UTF-8 to the user's environment encoding so that what appears in their terminal is correct. We could just use iconv() again, but that would be boring. Instead, we should be able to use gettext(). This means we get translation support as well, which is always good.

gettext() takes in a msgid string and returns a translated version in the user's locale, if possible. Since these translations are done using message catalogues which may be in a completely different character encoding to the user's environment or the program's internal character encoding (UTF-8), gettext() helpfully converts from the message catalogue encoding to the user's environment encoding (the one returned by nl_langinfo(), discussed above). Great!

But what if no translation exists for a given string? gettext() returns the msgid string, unmodified and unconverted. This means that translatable string literals in our program need to magically be written in the user's environment encoding…and we're back to where we were before we introduced gettext(). Bother.

I see three solutions to this:

The gettext() solution: declare that all msgid strings should be in US-ASCII, and thus not use any Unicode characters. This works, provided we make the (reasonable) assumption that the user's environment encoding is a superset of ASCII. This requires that if a program wants to use Unicode characters in its translatable strings, it has to provide an en-US message catalogue to translate the American English msgid strings to American English (with Unicode). Not ideal.

The gettext()++ solution: declare that all msgid strings should be in UTF-8, and assume that anybody who's running without message catalogues is using UTF-8 as their environment encoding (this is a big assumption). Also not ideal, but a lot less work.

The iconv() solution: instruct gettext() to not return any strings in the user's environment encoding, but to return them all in UTF-8 instead (using bind_textdomain_codeset()), and use UTF-8 for the msgid strings. The program can then pass these translated (and untranslated) strings through iconv() as it did with the input, converting from UTF-8 to the user's environment encoding. More effort, but this should work properly.

An additional complication is that of combining translatable printf() format strings with UTF-8 string output from the program. Since printf() isn't encoding-aware, this requires that both the format string and the parameters are in the same encoding (or we get into a horrible mess with output strings which have substrings encoded in different ways). In this case, since our program output is in UTF-8, we definitely want to go with option 3 from above, and have gettext() return all translated messages in UTF-8. This also means we get to use UTF-8 in msgid strings. Unfortunately, it means that we now can't use printf() directly, and instead have to sprintf() to a string, use iconv() to convert that string from UTF-8 to the user's environment encoding, and thenprintf() it. Whew.

Here's a diagram which hopefully makes some of the journey clearer (click for a bigger version):

So what does this mean for you? As noted above, in most cases it will mean nothing. Libraries such as GLib should take care of all of this for you, and the world will be a lovely place with ponies (U+1F3A0) and cats (U+1F431) everywhere. Still, I wanted to get this clear in my head, and hopefully it's useful to people who can't make use of libraries like GLib (for whatever reason).

Exploring exactly what GLib does is a matter for another time. Similarly, exploring how Windows does things is also best left to a later post (hint: Windows does things completely differently to Linux and other Unices, and I'm not sure it's for the better).