Appdash, an open-source perf tracing suite

By Quinn Slack on May 30, 2016

Every developer knows they should instrument their app to identify perf bottlenecks, but it’s hard to actually get around to doing it, especially when you’re focused on shipping the latest and greatest features of your site.

Now, point your browser at localhost:8699. This loads the main page of the sample web app, which issues three API requests on the backend. You can view the trace of this page load by clicking on the link in the interface, which opens up the Appdash UI for the trace.

The sample app has two routes:

/: The root route, which users visit in their web browser.

/endpoint: An API endpoint.

The API endpoint code pauses for 200ms, to simulate slowness that in a real application might be due to hitting the database or some external service. When a user visits the root route (/), the backend makes three outbound API requests before responding to the user.
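The behavior described above can be approximated with the standard library alone. This is a minimal sketch, not the sample app's actual code: the function names (endpoint, home, loadRootOnce) are made up for illustration, and it serves on a throwaway test port rather than localhost:8699.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// endpointDelay simulates a slow database or external service call.
const endpointDelay = 200 * time.Millisecond

// endpoint is the API handler: it pauses, then replies.
func endpoint(w http.ResponseWriter, r *http.Request) {
	time.Sleep(endpointDelay)
	fmt.Fprintln(w, "slept for 200ms")
}

// home builds the root handler. It calls the API endpoint three times,
// one after another, before responding to the user.
func home(client *http.Client, apiBase string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		for i := 0; i < 3; i++ {
			resp, err := client.Get(apiBase + "/endpoint")
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadGateway)
				return
			}
			resp.Body.Close()
		}
		fmt.Fprintln(w, "made 3 API requests")
	}
}

// loadRootOnce starts the app on a throwaway local port, requests "/" once,
// and reports the status code and how long the page load took.
func loadRootOnce() (status int, elapsedMS int64, err error) {
	mux := http.NewServeMux()
	srv := httptest.NewServer(mux)
	defer srv.Close()
	mux.HandleFunc("/endpoint", endpoint)
	mux.HandleFunc("/", home(srv.Client(), srv.URL))

	start := time.Now()
	resp, err := srv.Client().Get(srv.URL + "/")
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, time.Since(start).Milliseconds(), nil
}

func main() {
	status, ms, err := loadRootOnce()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("GET / -> %d in about %dms\n", status, ms)
}
```

Because the three API calls run sequentially, a single page load takes at least 600ms, which is exactly the kind of cost a trace makes visible.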

Appdash data model

Appdash is heavily influenced by the Dapper data model and has 4 main concepts:

Span: A span refers to an operation and all of its children. For example, an HTTP handler handles a request by calling other components in your system, which in turn make various API and DB calls. The HTTP handler’s span includes all downstream operations and their descendants; likewise, each downstream operation is its own span and has its own descendants. In this way, Appdash constructs a tree of all of the operations that occur during the handling of the HTTP request.

Event: Your application records the various operations it performs (in the course of handling a request) as Events. Events can be arbitrary messages or metadata, or they can be structured event types defined by a Go type (such as an HTTP ServerEvent or an SQLEvent).

Recorder: Your application uses a Recorder to send events to a Collector (see below). Each Recorder is associated with a particular span in the tree of operations that are handling a particular request, and all events sent via a Recorder are automatically associated with that context.

Collector: A Collector receives Annotations (which are the encoded form of Events) sent by a Recorder. Typically, your application’s Recorder talks to a local Collector (created with NewRemoteCollector). This local Collector forwards data to a remote Appdash server (created with NewServer) that combines traces from all of the services that compose your application. The Appdash server in turn runs a Collector that listens on the network for this data, and it then stores what it receives.
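To make the four concepts concrete, here is a toy model of how they fit together. It is a sketch under simplifying assumptions, not Appdash's implementation: every type below is defined locally for illustration, the memCollector and buildTrace names are hypothetical, and span IDs come from a simple counter.

```go
package main

import "fmt"

// SpanID locates one span in the tree: the trace it belongs to, its own ID,
// and its parent's ID (0 for the root span).
type SpanID struct{ Trace, Span, Parent uint64 }

// Annotation is the encoded key/value form of an event.
type Annotation struct{ Key, Value string }

// Collector receives annotations for a span.
type Collector interface {
	Collect(id SpanID, anns ...Annotation) error
}

// memCollector is a toy Collector that keeps everything in a map.
type memCollector struct{ bySpan map[SpanID][]Annotation }

func (c *memCollector) Collect(id SpanID, anns ...Annotation) error {
	c.bySpan[id] = append(c.bySpan[id], anns...)
	return nil
}

// Recorder is bound to one span; everything it records lands on that span.
type Recorder struct {
	id   SpanID
	c    Collector
	next *uint64 // shared counter; a real tracer would not use sequential IDs
}

// Msg records an arbitrary message event on the recorder's span.
func (r *Recorder) Msg(msg string) error {
	return r.c.Collect(r.id, Annotation{Key: "Msg", Value: msg})
}

// Child returns a Recorder for a new span directly below this one.
func (r *Recorder) Child() *Recorder {
	*r.next++
	id := SpanID{Trace: r.id.Trace, Span: *r.next, Parent: r.id.Span}
	return &Recorder{id: id, c: r.c, next: r.next}
}

// buildTrace records a root span with three child spans and reports how many
// spans were collected and how many of them are children of the root.
func buildTrace() (spans, children int) {
	c := &memCollector{bySpan: map[SpanID][]Annotation{}}
	next := uint64(1)
	root := &Recorder{id: SpanID{Trace: 1, Span: 1}, c: c, next: &next}
	root.Msg("GET /")
	for i := 0; i < 3; i++ {
		root.Child().Msg("GET /endpoint")
	}
	for id := range c.bySpan {
		spans++
		if id.Parent == root.id.Span {
			children++
		}
	}
	return spans, children
}

func main() {
	spans, children := buildTrace()
	fmt.Printf("%d spans, %d children of the root\n", spans, children)
}
```

The parent pointer in each span ID is what lets a UI reassemble the flat stream of annotations into a tree.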

Appdash supports storing data in different underlying Stores. The example code uses an appdash.MemoryStore wrapped in an appdash.RecentStore. This means that the data is stored in memory for 20 seconds before being evicted and discarded (which is useful for applications with storage limitations). You can also store data in a SQL database. It’s easy to add support for other databases, as well. You just need to implement the PersistentStore interface.
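The idea of keeping only recent data can be illustrated with a small age-based store. This is a sketch of the eviction concept only, not how appdash.RecentStore is written: the recentStore type, its fields, and the injectable clock are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// recentStore keeps traces in memory and forgets those older than minEvictAge.
type recentStore struct {
	minEvictAge time.Duration
	now         func() time.Time     // injectable clock, so the demo needs no sleeping
	firstSeen   map[uint64]time.Time // trace ID -> time of its first event
	events      map[uint64][]string  // trace ID -> recorded events
}

func newRecentStore(age time.Duration, now func() time.Time) *recentStore {
	return &recentStore{
		minEvictAge: age,
		now:         now,
		firstSeen:   map[uint64]time.Time{},
		events:      map[uint64][]string{},
	}
}

// Collect stores an event for a trace, noting when the trace was first seen.
func (s *recentStore) Collect(trace uint64, event string) {
	if _, ok := s.firstSeen[trace]; !ok {
		s.firstSeen[trace] = s.now()
	}
	s.events[trace] = append(s.events[trace], event)
}

// evict deletes every trace first seen more than minEvictAge ago and
// returns how many traces were removed.
func (s *recentStore) evict() int {
	removed := 0
	for trace, t := range s.firstSeen {
		if s.now().Sub(t) > s.minEvictAge {
			delete(s.firstSeen, trace)
			delete(s.events, trace)
			removed++
		}
	}
	return removed
}

// demoEviction stores two traces 15 seconds apart, advances the fake clock,
// and evicts with a 20-second minimum age.
func demoEviction() (removed, remaining int) {
	clock := time.Unix(0, 0)
	s := newRecentStore(20*time.Second, func() time.Time { return clock })
	s.Collect(1, "GET /")
	clock = clock.Add(15 * time.Second)
	s.Collect(2, "GET /endpoint")
	clock = clock.Add(10 * time.Second) // trace 1 is now 25s old, trace 2 is 10s old
	removed = s.evict()
	return removed, len(s.events)
}

func main() {
	removed, remaining := demoEviction()
	fmt.Printf("evicted %d trace(s), %d still in memory\n", removed, remaining)
}
```

In a long-running process, eviction like this would typically run on a timer so that memory use stays bounded.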

Integrating Appdash into your app

There are two ways to run the Appdash web UI. You can either use appdash serve on the command line or embed the UI directly into the application that is being monitored, in which case Appdash will run in-process and listen on a separate port. The example app does it the second way.

In a production environment, you could either use a centralized Appdash server or simply block the Appdash port from external access via your firewall. To embed the web UI into our app, we use the appdash/traceapp package.

To integrate Appdash into your app, you’ll need to create a Collector to collect Annotations (the encoded form of Events) for a given Span provided by a Recorder. In the example code, the collector runs in process, but it can also be a remote service run via appdash serve.

To create our local collector, we simply give it an appdash.Store.

The httptrace Middleware

Lastly, we need to create and configure our appdash/httptrace.Middleware, which hooks into the web app’s HTTP handler, pulls the necessary timing information and metadata (e.g., HTTP headers and status codes), and generates Events for the Collector to consume.

The RouteName field is simply a function that returns the name of the URL route for a given HTTP request. In the case of the example app, we simply use the request’s path (r.URL.Path). If you are using a routing library like Gorilla mux, you would probably assign it the name of the mux.Route. The route name will be displayed on the span in the web UI.

The SetContextSpan field lets you store the appdash.SpanID of an HTTP request. In our case, we simply use gorilla/context to associate it locally with the request for future use. We will explain below why you would want to know the span ID.
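What such a middleware does can be sketched with the standard library. This is an illustration of the pattern, not the httptrace.Middleware API: the trace function, the event type, and the spanKey context key are hypothetical, the span ID is a plain counter, and the sketch stores the ID in the request's context.Context rather than in gorilla/context.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// spanKey is the context key under which the middleware stores the span ID.
type spanKey struct{}

// event is what the middleware reports for each request.
type event struct {
	Route    string
	SpanID   uint64
	Status   int
	Duration time.Duration
}

// statusWriter remembers the status code the wrapped handler sends.
type statusWriter struct {
	http.ResponseWriter
	status int
}

func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// trace wraps next. For each request it assigns a span ID, makes the ID
// available to the handler through the request context, times the handler,
// and passes the result to record.
func trace(next http.Handler, routeName func(*http.Request) string, record func(event)) http.Handler {
	var lastID uint64
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := atomic.AddUint64(&lastID, 1)
		r = r.WithContext(context.WithValue(r.Context(), spanKey{}, id))
		sw := &statusWriter{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(sw, r)
		record(event{Route: routeName(r), SpanID: id, Status: sw.status, Duration: time.Since(start)})
	})
}

// traceOneRequest sends a single in-memory request through the middleware
// and returns the recorded event and whether the handler saw a span ID.
func traceOneRequest() (got event, handlerSawSpan bool) {
	app := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, handlerSawSpan = r.Context().Value(spanKey{}).(uint64)
		w.WriteHeader(http.StatusTeapot)
	})
	byPath := func(r *http.Request) string { return r.URL.Path } // route name = request path
	h := trace(app, byPath, func(e event) { got = e })
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/endpoint", nil))
	return got, handlerSawSpan
}

func main() {
	e, saw := traceOneRequest()
	fmt.Printf("route=%s span=%d status=%d handlerSawSpan=%v\n", e.Route, e.SpanID, e.Status, saw)
}
```

Passing the route-naming function in as a parameter mirrors the role of RouteName: the middleware stays independent of whichever router the app uses.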

The httptrace Transport

If your web app makes HTTP requests to other services (these can be external services or an internal API if it’s run as a separate service), Appdash can keep track of a thread of execution across HTTP boundaries. This means if an application endpoint (that generates dynamic HTML) makes a call to an API endpoint over the course of its execution, Appdash will be able to associate the specific API request with the specific application request.

This is done by supplying a special http.Transport to the HTTP client that issues the request inside your app. To do this, wrap the existing http.Transport in an instance of appdash/httptrace.Transport. You can then use the http.Client as you would any other http.Client.
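The wrapping technique itself is plain net/http. The sketch below shows a RoundTripper that tags every outbound request with the current span ID so the receiving service can link the two requests. It is not httptrace.Transport: the tracingTransport type is made up for the example, and the X-Example-Span-ID header name is a placeholder, not the header Appdash uses.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// spanHeader is a placeholder header name chosen for this example.
const spanHeader = "X-Example-Span-ID"

// tracingTransport wraps another RoundTripper and adds the span ID of the
// request being handled to every outbound request.
type tracingTransport struct {
	Base   http.RoundTripper // underlying transport; nil means http.DefaultTransport
	SpanID string
}

func (t *tracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	base := t.Base
	if base == nil {
		base = http.DefaultTransport
	}
	// A RoundTripper must not modify the caller's request, so tag a clone.
	out := req.Clone(req.Context())
	out.Header.Set(spanHeader, t.SpanID)
	return base.RoundTrip(out)
}

// spanSeenByAPI calls a throwaway API server through the tracing transport
// and returns the span ID that the server found in the request headers.
func spanSeenByAPI(spanID string) (string, error) {
	api := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get(spanHeader))
	}))
	defer api.Close()

	client := &http.Client{Transport: &tracingTransport{SpanID: spanID}}
	resp, err := client.Get(api.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	got, err := spanSeenByAPI("1/2/1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("API server saw span ID:", got)
}
```

Since the tagging lives in the transport, the calling code keeps using client.Get and friends unchanged.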

Linking to the Trace

Let’s say that a user of your application has reported to you that it’s responding very slowly to their requests. With thousands of people using the service every day, browsing through the list of all traces hoping to find the trace corresponding to their request is like trying to find a needle in a haystack.

Appdash solves this problem by giving you access to the appdash.SpanID for a given request in the SetContextSpan function of the httptrace.Middleware described above. Earlier, we used gorilla/context to associate the SpanID with the request. Now we can render the span ID right into our very simple web page (perhaps in an HTML comment, if you don’t want it to be visible to all users).
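Embedding the ID in an HTML comment might look like the following sketch. The pageHTML function and the page markup are hypothetical, and the ID is shown as a bare 64-bit number formatted as hex. Note that it uses fmt rather than html/template, because html/template strips HTML comments from templates; that is safe here only because the inserted value is a number.

```go
package main

import "fmt"

// pageHTML renders a minimal page with the trace ID hidden in an HTML
// comment, so it appears in "view source" but not in the rendered page.
func pageHTML(traceID uint64) string {
	return fmt.Sprintf("<!-- trace ID: %016x -->\n<p>Hello!</p>\n", traceID)
}

func main() {
	fmt.Print(pageHTML(0xabc))
}
```

A user reporting a slow page can then copy the ID from the page source, and you can look up that exact trace.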

Now, a user who is experiencing performance issues with our site can directly give us the trace ID of a slow request. Alternatively, you could create an automated system to do this whenever a user reports an issue from a slow page.

Start tracing today!

Appdash is an incredibly versatile and easy-to-deploy performance and debug tracing suite for web applications. It supports Go and Python, and we’d love to add more languages with help from the community. It’s being used today in production applications at Sourcegraph, and we hope you’ll find it useful for your own web app.