Ramblings on computers…

Tag Archives: GLib

From a discussion on #gtk+ this morning: if you’re using recent versions of GLib with structured logging support, and you want to work out which bit of your code is causing a certain message to be printed to the terminal, run your application in gdb and add a breakpoint on g_log_writer_standard_streams.

(This assumes you’re using the default log writer function; if not, you need to add a breakpoint on something in your writer function.)

tl;dr: Visualise your main context and sources using Dunfell. Feedback and ideas welcome.

At the DX hackfest, I’ve been working on a new tool for instrumenting and visualising the behaviour of the GLib main context (or main contexts) in your program.

It’s called Dunfell (because I’m a sucker for hills) and at a high level it works by using SystemTap to record various GMainContext interactions in your program, saving them to a log file. The log file can then be examined using a viewer program.

The source is available on GitLab or GitHub because I still haven’t decided which is better.

In the screenshot above, each vertical line is a thread, each blue box is one dispatch phase of the main context which is currently running on that thread, each orange blob is a new GSource being created, and the green blob is a GSource which has been selected for closer inspection.

At the moment, it requires a couple of GLib patches to add some more SystemTap probe points, and it also requires a recent version of GTK+. It needs SystemTap, and I’ve only tested it on Fedora, so it might need some patching to work with the SystemTap installed on other distributions.

More visualisation ideas are welcome! At the moment, what Dunfell draws is quite simplistic. I hope it will be able to solve various common debugging problems eventually but suggestions for ways to do this intuitively, or for other problems to visualise, are welcome. Here are the use cases I was initially thinking about (from the README):

Detect GSources which are never added to a GMainContext.

Detect GSources which are dispatched too often (i.e. every main context iteration).

Detect GSources which are never removed from their GMainContext after being dispatched (but which are never dispatched again).

Detect GMainContexts which have GSources attached or (especially) events pending, but which aren’t being iterated.

Monitor the load on each GMainContext, such as how many GSources it has attached, and how many events are processed each iteration.

Monitor ongoing asynchronous calls and GTasks, giving insight into their nesting and dependencies.

Monitor unfinished or stalled asynchronous calls.

Allow users to record logs to send to the developers for debugging on a different machine. The users may have to install additional software to record these logs (some component of Dunfell, plus its dependencies), but should not have to recompile or otherwise modify the program being debugged.

Work with programs which purely use GLib, through to programs which use GLib, GIO and GTK+.

Allow visualisation of this data, both in a standalone program, and in an IDE such as GNOME Builder.

Allow visualising differences between two traces.

Minimise runtime overhead of logging a program, to reduce the risk of disturbing race conditions by enabling logging.

Connecting to an already-running program is not a requirement, since by the time you’ve decided there’s a problem with a program, it’s already in the wrong state.

The misconception seems to be that they cause assertion failures. I think that arises from the fact that G_IS_OBJECT is commonly used with g_return_if_fail(), which does cause an assertion failure if G_IS_OBJECT returns FALSE.

Similarly, this all applies to the macros for GObject subclasses, like GTK_WIDGET and GTK_IS_WIDGET, or G_FILE and G_IS_FILE, etc.
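To make the distinction concrete, here is a minimal sketch in plain C rather than real GObject code. The `Toy*` names and `toy_return_val_if_fail()` are hypothetical stand-ins I made up for illustration. The type check only evaluates to true or false; the complaint comes from the precondition macro wrapped around it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy type system, standing in for GType. */
typedef enum { TOY_TYPE_OBJECT, TOY_TYPE_WIDGET } ToyType;
typedef struct { ToyType type; } ToyObject;

/* Type check: evaluates to 0 or 1. It never warns and never aborts. */
#define TOY_IS_WIDGET(obj) \
  ((obj) != NULL && ((const ToyObject *) (obj))->type == TOY_TYPE_WIDGET)

int toy_warning_count = 0;

/* Precondition helper, similar in spirit to g_return_val_if_fail():
 * this is the part which complains when the check is false. */
#define toy_return_val_if_fail(expr, val) \
  do { \
    if (!(expr)) \
      { \
        toy_warning_count++; \
        fprintf (stderr, "precondition failed: %s\n", #expr); \
        return (val); \
      } \
  } while (0)

int
toy_widget_get_width (const ToyObject *widget)
{
  toy_return_val_if_fail (TOY_IS_WIDGET (widget), -1);
  return 42;
}
```

Calling `TOY_IS_WIDGET()` on a non-widget leaves `toy_warning_count` untouched; only `toy_widget_get_width()` bumps it.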

The interface is designed around polling, potentially-blocking I/O. What’s ‘potentially-blocking’ about it? The timeout parameter. Set it to zero for non-blocking behaviour, where the functions will return G_IO_ERROR_WOULD_BLOCK if they would block. Set it negative for blocking behaviour, where the functions will not return until they can do at least some I/O. Set it positive for timeout behaviour, where the functions will block for the given number of microseconds, then return G_IO_ERROR_TIMED_OUT if they managed to perform no I/O.
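The three timeout cases can be sketched with plain POSIX poll(). This is an analogue under my own assumptions, not the GIO implementation: `WaitResult`, `wait_readable()` and `wait_on_pipe()` are hypothetical names, and the result codes only echo G_IO_ERROR_WOULD_BLOCK and G_IO_ERROR_TIMED_OUT.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
typedef enum {
  WAIT_READY,
  WAIT_WOULD_BLOCK,
  WAIT_TIMED_OUT,
  WAIT_FAILED
} WaitResult;

/* timeout_us == 0: non-blocking; timeout_us < 0: block until readable;
 * timeout_us > 0: block for at most that many microseconds. */
WaitResult
wait_readable (int fd, int64_t timeout_us)
{
  struct pollfd pfd;
  int timeout_ms;
  int n_ready;

  assert (fd >= 0);

  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  /* poll() takes milliseconds; round up so short timeouts still wait. */
  if (timeout_us < 0)
    timeout_ms = -1;
  else if (timeout_us >= (int64_t) INT_MAX * 1000)
    timeout_ms = INT_MAX;
  else
    timeout_ms = (int) ((timeout_us + 999) / 1000);

  n_ready = poll (&pfd, 1, timeout_ms);

  if (n_ready < 0)
    return WAIT_FAILED;
  if (n_ready > 0)
    return WAIT_READY;

  return (timeout_us == 0) ? WAIT_WOULD_BLOCK : WAIT_TIMED_OUT;
}

/* Helper for trying it out: waits on a fresh pipe, optionally after
 * writing one byte to it. */
WaitResult
wait_on_pipe (int write_first, int64_t timeout_us)
{
  int fds[2];
  WaitResult result = WAIT_FAILED;

  if (pipe (fds) != 0)
    return WAIT_FAILED;

  if (!write_first || write (fds[1], "x", 1) == 1)
    result = wait_readable (fds[0], timeout_us);

  close (fds[0]);
  close (fds[1]);

  return result;
}
```

With an empty pipe, a zero timeout gives `WAIT_WOULD_BLOCK` immediately and a positive one gives `WAIT_TIMED_OUT` after the wait; once a byte has been written, any timeout gives `WAIT_READY`.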

Currently, the API (particularly GInputMessage and GOutputMessage, due to the way they are used as in-out parameters) doesn’t support introspection. This can be added in future if needed by creating some convenience API for allocating and freeing the message structures as boxed types.

The grand, overarching plan is for this to appear in a libnice version near you, some time soon, exposing the whole of an ICE connection as a GDatagramBased.

tl;dr: Write a custom GSource if you have a non-file-descriptor-based event source to integrate with a GMainContext. It’s a matter of writing a few virtual functions.

What is GSource?

A GSource is an expected event with an associated callback function which will be invoked when that event is received. An event could be a timeout or data being received on a socket, for example.

GLib contains various types of GSource, but also allows applications to define their own, allowing custom events to be integrated into the main loop.

The structure of a GSource and its virtual functions are documented in detail in the GLib API reference.

A message queue source

As a running example, a message queue source will be used which dispatches its callback whenever a message is enqueued to a queue internal to the source (potentially from another thread).

This type of source is useful for efficiently transferring large numbers of messages between main contexts. The alternative is transferring each message as a separate idle GSource using g_source_attach(). For large numbers of messages, this means a lot of allocations and frees of GSources.

Structure

Firstly, a structure for the source needs to be declared. This must contain a GSource as its parent, followed by the private fields for the source: the queue and a function to call to free each message once finished with.

Next, the prepare function for the source must be defined. This determines whether the source is ready to be dispatched. As this source is using an in-memory queue, this can be determined by checking the queue’s length: if there are elements in the queue, the source can be dispatched to handle them.

return (g_async_queue_length (message_queue_source->queue) > 0);

Check function

As this source has no file descriptors, the prepare and check functions essentially have the same job, so a check function is not needed. Setting the field to NULL in GSourceFuncs bypasses the check function for this source type.

Dispatch function

For this source, the dispatch function is where the complexity lies. It needs to dequeue a message from the queue, then pass that message to the GSource’s callback function. No messages may be queued: even though the prepare function returned true, another source wrapping the same queue may have been dispatched in the meantime and taken the final message from the queue. Further, if no callback has been set for the GSource (which is allowed), the message must be destroyed and silently dropped.

If both a message and callback are set, the callback can be invoked on the message and its return value propagated as the return value of the dispatch function. This is FALSE to destroy the GSource and TRUE to keep it alive, just as for GSourceFunc — these semantics are the same for all dispatch function implementations.

/* Pop a message off the queue. */
message = g_async_queue_try_pop (message_queue_source->queue);

/* If there was no message, bail. */
if (message == NULL)
  {
    /* Keep the source around to handle the next message. */
    return TRUE;
  }

/* @func may be %NULL if no callback was specified.
 * If so, drop the message. */
if (func == NULL)
  {
    if (message_queue_source->destroy_message != NULL)
      {
        message_queue_source->destroy_message (message);
      }

    /* Keep the source around to consume the next message. */
    return TRUE;
  }

return func (message, user_data);
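To see the prepare and dispatch logic in one place without pulling in GLib, here is a toy model in plain C. It is a sketch under my own assumptions: the `Toy*` names are hypothetical, the queue is a fixed-size ring buffer instead of a GAsyncQueue, and it is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_QUEUE_CAPACITY 8

typedef bool (*ToyMessageFunc) (void *message, void *user_data);
typedef void (*ToyDestroyFunc) (void *message);

/* Toy stand-in for the source structure: a queue, plus a function to
 * free each message once finished with. */
typedef struct {
  void *messages[TOY_QUEUE_CAPACITY];
  size_t head;
  size_t length;
  ToyDestroyFunc destroy_message;  /* may be NULL */
} ToyQueueSource;

bool
toy_source_push (ToyQueueSource *source, void *message)
{
  if (source->length == TOY_QUEUE_CAPACITY)
    return false;

  source->messages[(source->head + source->length) % TOY_QUEUE_CAPACITY] = message;
  source->length++;
  return true;
}

/* Prepare: the source is ready if the queue is non-empty. */
bool
toy_source_prepare (const ToyQueueSource *source)
{
  return source->length > 0;
}

/* Dispatch: pop one message and hand it to the callback. Returns true
 * to keep the source alive, false to destroy it. */
bool
toy_source_dispatch (ToyQueueSource *source, ToyMessageFunc func, void *user_data)
{
  void *message;

  /* The queue may have been emptied since prepare was called. */
  if (source->length == 0)
    return true;

  message = source->messages[source->head];
  source->head = (source->head + 1) % TOY_QUEUE_CAPACITY;
  source->length--;

  /* No callback set: destroy the message and drop it. */
  if (func == NULL)
    {
      if (source->destroy_message != NULL)
        source->destroy_message (message);
      return true;
    }

  return func (message, user_data);
}

/* Example callback: counts and frees each message. */
bool
toy_count_message (void *message, void *user_data)
{
  (*(int *) user_data)++;
  free (message);
  return true;
}

/* Pushes three messages, then dispatches until the queue is empty.
 * Returns the number of messages delivered, or -1 on failure. */
int
toy_source_demo (void)
{
  ToyQueueSource source = { { NULL }, 0, 0, free };
  int n_delivered = 0;
  int i;

  for (i = 0; i < 3; i++)
    {
      int *message = malloc (sizeof *message);

      if (message == NULL)
        return -1;
      *message = i;
      if (!toy_source_push (&source, message))
        {
          free (message);
          return -1;
        }
    }

  while (toy_source_prepare (&source))
    toy_source_dispatch (&source, toy_count_message, &n_delivered);

  return n_delivered;
}
```

In the real source, the main context calls prepare and dispatch for you; the loop in `toy_source_demo()` only imitates that.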

Callback functions

The callback from a GSource does not have to have type GSourceFunc. It can be whatever function type is called in the source’s dispatch function, as long as that type is sufficiently documented.

Normally, g_source_set_callback() is used to set the callback function for a source instance. With its GDestroyNotify, a strong reference can be held to keep an object alive while the source is still alive:

However, GSource has a layer of indirection for retrieving this callback, exposed as g_source_set_callback_indirect(). This allows GObject to set a GClosure as the callback for a source, which allows for sources which are automatically destroyed when an object is finalized — a weak reference, in contrast to the strong reference above:

It also allows for a generic, closure-based ‘dummy’ callback, which can be used when a source needs to exist but no action needs to be performed in its callback:

g_source_set_dummy_callback (source);

Constructor

Finally, the GSourceFuncs definition of the GSource can be written, alongside a construction function. It is typical practice to expose new source types simply as GSources, not as the subtype structure; so the constructor returns a GSource*.

The example constructor here also demonstrates use of a child source to support cancellation conveniently. If the GCancellable is cancelled, the application’s callback will be dispatched and can check for cancellation. (The application code will need to make a pointer to the GCancellable available to its callback, as a field of the callback’s user data set in g_source_set_callback()).

Sources can be more complex than the example given above. In libnice, a custom GSource is needed to poll a set of sockets which changes dynamically. The implementation is given as ComponentSource in component.c and demonstrates a more complex use of the prepare function.

Another example is a custom source to interface GnuTLS with GLib in its GTlsConnection implementation. GTlsConnectionGnutlsSource synchronizes the main thread and a TLS worker thread which performs the blocking TLS operations.

Day 5, and the DX and docs hackfest in Collabora HQ, Cambridge has drawn to a close. It’s been great to have everyone here, and there have been a lot of in-depth discussions over the last few days about the details of app sandboxing, runtimes, Builder integration with various new services, the development of an IDE abstraction layer, approaches for making build systems accessible to Builder, lots of new things to statically analyse, and some fairly fundamental additions to GLib in the form of G_DECLARE_[FINAL|DERIVABLE]_TYPE and general-purpose reference counted memory areas. Whew! We even had a fleeting visit by Richard Hughes to discuss packaging issues for apps.

I can’t do justice to the work of the docs team, who put in consistent, solid effort throughout the hackfest. See the blogs by Petr, Bastian, Kat and Jim for all the details. They even left me with a seemingly endless supply of Mallard balls to throw around the office!

Dave and I have spent a little while working on further deprecating gnome-common. More details to come once the migration guide is finished.

Today was a bit of a slow start, since people were still arriving throughout the day. Regardless, there have been various discussions, with Ryan, Emmanuele and Christian discussing performance improvements in GLib, Christian and Allan plotting various different approaches to new UI in Builder, Cosimo and Carlos silently plugging away at GTK+, and Emmanuele muttering something about GProperty now and then.

Tomorrow, I hope we can flesh out some of these initial discussions a bit more and get some roadmapping down for GLib development for the next year, amongst other things. I am certain that Builder will feature heavily in discussions too, and apps and sandboxing, now that Alex has arrived.

I’ve spent a little time finishing off and releasing Walbottle, a small library and set of utilities I’ve been working on to implement JSON Schema, which is the equivalent of XML Schema or RELAX-NG, but for JSON files. It allows you to validate JSON instances against a schema, to validate schemas themselves and, unusually, to automatically generate parser unit tests from a schema. That way, you can automatically test json-glib–based JsonReader/JsonParser code, just by passing the JSON schema to Walbottle’s json-schema-generate utility.

It’s still a young project, but should be complete enough to be useful in testing JSON code. Please let me know of any bugs or missing features!

tl;dr: Use g_set_object() to conveniently update owned object pointers in property setters (and elsewhere); see the bottom of the post for an example.

A little while ago, GLib gained a useful function called g_clear_object(), which clears an owned object pointer (a pointer which owns a reference to a GObject). GLib also just gained a function called g_set_object(), which works similarly but can either clear an object pointer or update it to point to a new object.

Why is this valuable? It saves a few lines of code each time an object pointer is updated, which isn’t much in itself. However, one thing it gets right is the order of reference counting operations, which is a common mistake in property setters. Instead of:

because otherwise, if (new_object == object_ptr) (or if the objects have some other ownership relationship) and the object only has one reference left, object_ptr will end up pointing to a finalised GObject (and g_object_ref() will be called on a finalised GObject too, which it really won’t like).

So how does g_set_object() help? We can now do:

g_set_object (&object_ptr, new_object);

which takes care of all the reference counting, and allows new_object to be NULL. &object_ptr must not be NULL. If you’re worried about performance, never fear. g_set_object() is a static inline function, so shouldn’t adversely affect your code size.

Even better, the return value of g_set_object() indicates whether the value changed, so we can conveniently base GObject::notify emissions off it:

/* This is how all GObject property setters should look in future. */
if (g_set_object (&priv->object_ptr, new_object))
  g_object_notify (self, "object-ptr");
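To show why the order of the reference counting operations matters, here is a toy model in plain C. It is only a sketch: `ToyObject` and the `toy_*` functions are hypothetical, and a real GObject would be freed on finalisation rather than just flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reference-counted object. Finalisation only sets a flag here,
 * so the result can be inspected afterwards. */
typedef struct {
  int ref_count;
  bool finalized;
} ToyObject;

ToyObject *
toy_object_ref (ToyObject *object)
{
  object->ref_count++;
  return object;
}

void
toy_object_unref (ToyObject *object)
{
  assert (object->ref_count > 0);

  if (--object->ref_count == 0)
    object->finalized = true;
}

/* Safe order: reference the new object before releasing the old one.
 * Returns true if the pointer changed. */
bool
toy_set_object (ToyObject **object_ptr, ToyObject *new_object)
{
  ToyObject *old_object = *object_ptr;

  if (old_object == new_object)
    return false;

  if (new_object != NULL)
    toy_object_ref (new_object);

  *object_ptr = new_object;

  if (old_object != NULL)
    toy_object_unref (old_object);

  return true;
}

/* Buggy order, for comparison: release the old object first. If both
 * pointers are the same object holding its last reference, it is
 * finalised before being referenced again. */
void
toy_set_object_unref_first (ToyObject **object_ptr, ToyObject *new_object)
{
  if (*object_ptr != NULL)
    toy_object_unref (*object_ptr);

  if (new_object != NULL)
    toy_object_ref (new_object);

  *object_ptr = new_object;
}
```

Setting a pointer to the object it already holds is the interesting case: `toy_set_object()` leaves the reference count alone and returns false, while `toy_set_object_unref_first()` leaves the pointer aimed at a finalised object.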

Hopefully this will make property setters (and other object updates) simpler in new code.

For the past several months, Olivier Crête and I have been working on a project using libnice at Collabora, which is now coming to a close. Through the project we’ve managed to add a number of large, new features to libnice, and implement hundreds (no exaggeration) of cleanups and bug fixes. All of this work was done upstream, and is available in libnice 0.1.8, released recently! GLib has also gained a number of networking fixes, API additions and documentation improvements.

Firstly, what is libnice? It’s a GLib implementation of ICE, the standard protocol for NAT traversal. Briefly, NAT traversal is needed when two hosts want to communicate peer-to-peer in a network where there is at least one NAT translator between them, meaning that at least one of the hosts cannot directly address the other until a mapping is created in the NAT translator. This is a very common situation (due to the shortage of IPv4 addresses, and the consequence that most home routers act as NAT translators) and affects virtually all peer-to-peer communications. It’s well covered in the literature, and the rest of this post will assume a basic understanding of NAT and ICE, a topic about which I recently gave a talk.

Conceptually, libnice exists just to create a reliable (TCP-like) or unreliable (UDP-like) socket which connects your host with a remote one in a manner that traverses any intervening NATs. At its core, it is effectively an implementation of send(), recv(), and some ancillary functions to negotiate the ICE stream at startup time.

Highly related, the original receive API has been augmented with scatter–gather support in the form of a recvmmsg()-like API: nice_agent_recv_messages(). Along with appropriate improvements to libnice’s underlying socket implementations (the most obscure of which are still to be plumbed in), this allows performance improvements by batching messages, reducing the number of system calls needed for communication. Furthermore (perhaps more importantly) it reduces memory copies when assembling and parsing packets, by allowing the packets to be split across multiple non-contiguous buffers. This is a well-studied and long-known performance technique in networking, and it’s nice that libnice now supports it.
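As a rough illustration of scatter–gather I/O in general, here is a sketch using the POSIX readv() and writev() calls over a pipe. It does not use libnice or nice_agent_recv_messages(); the pipe stands in for a socket, and `scatter_gather_roundtrip()` and `HEADER_LEN` are hypothetical names. The header and payload live in separate buffers, yet each direction needs only one system call and no intermediate copy.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define HEADER_LEN 20

/* Sends a header and payload from two separate buffers with one writev()
 * call, then receives them into two separate buffers with one readv()
 * call. Returns the number of payload bytes received, or -1 on error. */
ssize_t
scatter_gather_roundtrip (const char *payload, size_t payload_len,
                          char *payload_out, size_t payload_out_len)
{
  int fds[2];
  unsigned char header_out[HEADER_LEN], header_in[HEADER_LEN];
  struct iovec out[2], in[2];
  ssize_t sent, received = -1;

  memset (header_out, 0xAB, sizeof header_out);
  memset (header_in, 0, sizeof header_in);

  if (pipe (fds) != 0)
    return -1;

  /* Gather: header and payload are sent from non-contiguous buffers. */
  out[0].iov_base = header_out;
  out[0].iov_len = HEADER_LEN;
  out[1].iov_base = (void *) payload;
  out[1].iov_len = payload_len;

  sent = writev (fds[1], out, 2);

  if (sent == (ssize_t) (HEADER_LEN + payload_len))
    {
      /* Scatter: the header lands in one buffer, the payload in another. */
      in[0].iov_base = header_in;
      in[0].iov_len = HEADER_LEN;
      in[1].iov_base = payload_out;
      in[1].iov_len = payload_out_len;

      received = readv (fds[0], in, 2);
    }

  close (fds[0]);
  close (fds[1]);

  if (received < HEADER_LEN ||
      memcmp (header_in, header_out, HEADER_LEN) != 0)
    return -1;

  return received - HEADER_LEN;
}
```

The payload buffer passed in needs to be at least as large as the payload, otherwise the tail of the message is left unread in the pipe.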

So, if you have an ICE connection (stream 1 on agent, with 2 components) exchanging packets with 20B headers and variable-length payloads, instead of:

libnice has also gained non-blocking variants of its I/O functions. Previously, one had to explicitly attach a libnice stream to a GMainContext to start receiving packets. Packets would be delivered individually via a callback function (set with nice_agent_attach_recv()), which was inefficient and made for awkward control flow. Now, the non-blocking I/O functions can be used with a custom GSource from g_pollable_input_stream_create_source() to allow for more flexible reception of packets using the more standard GLib pattern of attaching a GSource to the GMainContext and in its callback, calling g_pollable_input_stream_read_nonblocking() until all pending packets have been read. libnice’s internal timers (used for retransmit timeouts, etc.) are automatically added to the GMainContext passed into nice_agent_new() at construction time, which you must run all the time as before.
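The ‘read until nothing is pending’ pattern looks roughly like this in plain POSIX terms. This is an analogue I wrote for illustration, not libnice or GIO code: `drain_nonblocking()` and `drain_demo()` are hypothetical names, and EAGAIN plays the role that G_IO_ERROR_WOULD_BLOCK plays for g_pollable_input_stream_read_nonblocking().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from a non-blocking descriptor until it would block (or hits
 * end of file). Returns the total number of bytes read, or -1 on error.
 * This is what a source callback would do each time it is dispatched. */
ssize_t
drain_nonblocking (int fd)
{
  char buf[64];
  ssize_t total = 0;

  for (;;)
    {
      ssize_t n = read (fd, buf, sizeof buf);

      if (n > 0)
        total += n;
      else if (n == 0)
        return total;  /* end of file */
      else if (errno == EAGAIN || errno == EWOULDBLOCK)
        return total;  /* nothing more pending */
      else if (errno != EINTR)
        return -1;
    }
}

/* Helper for trying it out: writes n_bytes to a pipe, then drains it. */
ssize_t
drain_demo (size_t n_bytes)
{
  int fds[2];
  char data[256] = { 0 };
  ssize_t total = -1;
  int flags;

  if (n_bytes > sizeof data || pipe (fds) != 0)
    return -1;

  flags = fcntl (fds[0], F_GETFL);

  if (flags != -1 &&
      fcntl (fds[0], F_SETFL, flags | O_NONBLOCK) == 0 &&
      write (fds[1], data, n_bytes) == (ssize_t) n_bytes)
    total = drain_nonblocking (fds[0]);

  close (fds[0]);
  close (fds[1]);

  return total;
}
```

The loop returns as soon as a read would block, so control goes back to the main loop rather than stalling the thread.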

Finally, FIN/ACK support has been added to libnice’s pseudo-TCP implementation. The code was originally based on Google’s libjingle pseudo-TCP, establishing a reliable connection over UDP by encapsulating TCP-like packets within UDP. This implemented the basics of TCP, but left things like the closing FIN/ACK handshake to higher-level protocols. Fine for Google, but not for our use case, so we added support for that. Furthermore, we needed to layer TLS over a pseudo-TCP connection using GTlsConnection, which required implementing half-duplex close support and fixing a few nasty leaks in GTlsConnection.

After a couple of discussions at the DX hackfest about cross-platform-ness and deployment of GLib, I started wondering: we often talk about how GNOME developers work at all levels of the stack, but how much of that actually qualifies as ‘core’ work which is used in web servers, in cross-platform desktop software[1], or commonly in embedded systems, and which is security critical?

On desktop systems (taking my Fedora 19 installation as representative), we can compare GLib usage to other packages, taking GLib as the lowest layer of the GNOME stack:

Package         Reverse dependencies    Recursive reverse dependencies
glib2           4001                    –
qt              2003                    –
libcurl         628                     –
boost-system    375                     –
gnutls          345                     –
openssl         101                     1022

(Found with repoquery --whatrequires [--recursive] [package name] | wc -l. Some values omitted because they took too long to query, so can be assumed to be close to the entire universe of packages.)

Obviously GLib is depended on by many more packages here than OpenSSL, which is definitely a core piece of software. However, those packages may not be widely used or good attack targets. Higher layers of the GNOME stack see widespread use too:

Package         Reverse dependencies
cairo           2348
gdk-pixbuf2     2301
pango           2294
gtk3            801
libsoup         280
gstreamer       193
librsvg2        155
gstreamer1      136
clutter         90

(Found with repoquery --whatrequires [package name] | wc -l.)

Widely-used cross-platform software which interfaces with servers[2] includes PuTTY and Wireshark, both of which use GTK+[3]. However, other major cross-platform FOSS projects such as Firefox and LibreOffice, which are arguably more ‘core’, only use GNOME libraries on Linux.

How about on embedded systems? It’s hard to produce exact numbers here, since as far as I know there’s no recent survey of open source software use on embedded products. However, some examples:

Qt pulls in GLib as an optional dependency (automatically enabled if compiled for X11, QWS or QPA) for its main loop, and is used widely in embedded systems.

So there are some sample points which suggest moderately widespread usage of GNOME technologies in open-source-oriented embedded systems. For more proprietary embedded systems it’s hard to tell. If they use Qt for their UI, they may well use GLib’s main loop implementation. I tried sampling GPL firmware releases from gpl-devices.org and gpl.nas-central.org, but both are quite out of date. There seem to be a few releases there which use GLib, and a lot which don’t (though in many cases they’re just kernel releases).

Servers are probably the largest attack surface for core infrastructure. How do GNOME technologies fare there? On my CentOS server:

I can’t find much evidence of other GNOME libraries in use, though, since there isn’t much call for them in a non-graphical server environment. That said, there has been heavy development of server-grade features in the NetworkManager stack, which will apparently be in RHEL 7 (thanks Jon).

So it looks like GLib, if not other GNOME technologies, is a plausible candidate for being core infrastructure. Why haven’t other GNOME libraries seen more widespread usage? Possibly they have, and it’s too hard to measure. Or perhaps they fulfill a niche which is too small. Most server technology was written before GNOME came along and its libraries matured, so any functionality which could be provided by them has already been implemented in other ways. Embedded systems seem to shun desktop libraries for being too big and slow. The cross-platform support in most GNOME libraries is poorly maintained or non-existent, limiting them to use on UNIX systems only, and not the large OS X or Windows markets. At the really low levels, though, there’s solid evidence that GNOME has produced core infrastructure in the form of GLib.

[1] As much as 2014 is the year of Linux on the desktop, Windows and Mac still have a much larger market share.