Pages

Posts by Nicholas Blumhardt

Things have been quiet on my blog for the last few months, but certainly lively elsewhere around the web. I thought I’d post a few pointers to what’s new in the worlds of Seq and Serilog.

Hey— I’ve started cross-posting anything related to Seq on the Seq blog; this (nblumhardt.com) site will continue as my personal blog with a bit more varied material including most Seq content for now, so stay tuned in!

Event-processing angles

I thought it was cool to see this sample from the XSockets.NET team showing how to connect to a running (Serilog-enabled) application to collect logs.

Unlike other sinks we’ve seen to date, in this case the sink is a server, and clients connecting via XSockets request the level of log events they wish to be sent. It’s a really exciting angle that I think deserves serious exploration (think on-demand real-time diagnostics…)

Site redesigns!

Both Serilog and Seq themselves got new websites in the last couple of months.

Serilog is now an eye-catching red design:

I’m really happy with the way this one turned out – it’s nice and simple, and keeps all of the important info (how to use Serilog; where to find documentation and so-on) front-and-center.

Seq now has a new layout that makes it easier to organize content:

Currently the biggest improvement, other than the livelier design, is better access to older versions and release notes via the improved Downloads area. It’s also nice to have complete setup instructions back on the front page to show just how easy it is to get started.

TL;DR: Seq 1.4 is out, and with it we’re fine tuning the licensing options.

Building a software product is a continual exercise in listening, learning and adapting. Since releasing v1.0 of Seq nearly five months ago, we’ve watched keenly as customers put it to work and to be frank, we now have a much better understanding of how Seq fits in to the tooling landscape.

Part of this learning has shown us how customers fit Seq into their own environments, including how those environments dictate licensing requirements. Feedback has highlighted some discrepancies between how we license Seq and how it’s used, so for the past month we’ve been working on a revision to our model.

First, two important things:

If you’re using Seq today, you don’t need to change anything – the license you use Seq under is unchanged unless you upgrade to a newer version (outside of the one-year included upgrades period)

If you’re worse off under the new model, or have questions/suggestions please email support@getseq.net – we will help you across

Changes to our free tier

The challenge with licensing is to make a fair cost/benefit proposal: Seq should be less expensive to use when the benefit is less and when the effort in supporting it is less. When the benefit is greater, or when the support requirements are greater, asking more helps us to provide the level of service required.

Unfortunately, the unlimited-usage “Developer Edition” we launched with thwarts this requirement – how much Seq costs to use ended up coming down to whether the production environment was fully-secured or not. In a production environment secured by other means (e.g. a VPN or other network-level security) the price to use Seq might be zero (via the “Developer Edition”), while a small difference in network topology might require the top-level “Enterprise Edition” for the exact same use of Seq – that is, the exact same benefit.

To fix this, we’ve bitten the bullet and decided to license Seq purely on a per-user basis. The old Developer Edition has been replaced with a new, fully-securable Single-User license. Many, if not most, customers using the old Developer Edition will be able to switch to the new Single-User license and take advantage of authentication and SSL to put Seq in more places.

No model is perfect, but we think setting the price on the basis of the number of users is a much better measure of both how much value Seq will provide, and how much support/development we will need to sustain to make the best experience possible. Basing this on whether or not authentication is required was a mistake on our part.

Some teams will be using the old Developer Edition with more than one user, so won’t be able to take advantage of the new Single-User license. If you’re in this position, and can’t switch to one of our subscriptions or per-user tiers, get in touch – supporting you is important to us and we’ll help you across so that you can continue to benefit from new Seq releases.

New per-user tiers

Previously we supplied Seq in five, fifteen and twenty-five user tiers. There turns out to be a big difference between a six-user team and a fourteen-user one, so the new model is more fine-grained, providing five-user increments from 5 all the way to 25 users.

Three server installations with all tiers

Perhaps more importantly, Seq previously included a single server installation with each of the five- and fifteen-user tiers. It turns out that this penalised users with segregated environments: if “Dev” and “Test” are on-premises, but “Prod” in the cloud, requiring two licenses seems fairly arbitrary – where an environment is situated shouldn’t make such a big impact on pricing.

In the new model, each tier includes three Seq servers. This means a single license should cover most teams regardless of how their environments are organised.

New unlimited-user Enterprise license

In Enterprise environments, counting seats can be annoying and lead to more time spent in the purchasing process as teams change size and shift focus. For Enterprise customers we’re now offering an unlimited-user license, installable on up to ten servers, to make life simpler.

We’re also looking at ways to make an Enterprise SLA available alongside this option – please get in touch via sales@getseq.net if this is needed in your organization.

Subscription pricing

Along with our up-front pricing, we’re now offering Seq on a pay-as-you go basis for those who’d rather pay by the month. We’re making this available primarily as a convenience – the product you get is exactly the same, i.e. you will still host Seq on your own servers.

Subscriptions come with full email support and access to all upgrades during the subscription period.

Serilog’s often used to log out objects as structured data. For example:

log.Information("Executing {@Command}...", cmd);

will attach a structured Command property to the event that can later be queried or processed.

Sometimes we don’t want all of the properties of an object included. Where cmd is, say, a LoginCommand:

publicclass LoginCommand{publicstring Username { get; set;}

publicstring Password { get; set;}

publicbool RememberMe { get; set;}}

We’d like to log the username attached to the command, but not the password. The same situation arises with credit card numbers and other sensitive information. Without some care we’ll end up logging events like:

The problem with this approach is pretty obvious — it centralizes destructuring information in a way that might not always co-operate with assembly dependencies and so-on, and gets cluttered as more types need customization.

Why ‘destructuring’ anyway?

Here’s a chance to explain one of the more unusual pieces of terminology in Serilog. The name Serilog is derived from serializing logger, and destructuring pretty much equates with serialization, so why not call it that?

The original design for Serilog was going to tackle the problem we’re observing here using a syntax like this:

log.Information("Executing {@Command(Username,RememberMe)}...", cmd);

This would have the same effect as the configuration we just looked at, selecting just the Username and RememberMe properties from the command object, but doesn’t suffer the same “centralization” issues.

This style of describing “what you want” is actually implemented in some programming languages with a feature called destructuring assignment. Hence, the name! It’s not likely that we’ll add this feature to Serilog now; a more familiar way to achieve the same result is to use attributes, as we’ll see next.

Other goodies in Serilog.Extras.Attributed

There’s one other little thing that you might find handy in this package. Some types, while they’re “complex data” in some sense, don’t serialize well for storage in events. These types need to be treated like scalar values (int, string and so-on) or by conversion with ToString() before logging.

You can force this behavior using the LogAsScalar attribute; here’s an example that appears in Seq:

Without LogAsScalar, the FilterExpression would be serialized as a (potentially) large tree-shaped object, that while interesting, isn’t valuable in a log. Using the attribute let’s us keep the property in the log, but use its more friendly ToString() representation.

Perhaps the most useful, but seldom-noticed benefit of a structured log is having the first-class notion of an event type.

When analyzing a traditional text-based log, there’s no concrete relationship between the messages:

Pre-discount tax total calculated at $0.14
Pre-discount tax total calculated at $5.20
Customer paid using CreditCard

From a tooling perspective, each might as well be a unique block of arbitrary text. A lot of work’s required using tools like logstash to go further than that.

In a structured log from Serilog, the message template passed to the logging function is preserved along with the event. Since the first two come from the template:

Pre-discount tax total calculated at {TaxAmount}

While the third comes from:

Customer paid using {PaymentMethod}

We can use this information to unambiguously find or exclude either kind of event. It’s so simple and straightforward it is mind-boggling not to see this technique used everywhere!

Working with message templates is verbose though, so Seq produces a 32-bit (Murmur) hash of the message template that can be referred to using hex literals, for example Pre-discount tax total calculated at {TaxAmount}→ $A26D9943, while Customer paid using {PaymentMethod}→ $4A310040.

Seq calls this the hash the “event type”. Here’s Seq selecting all events of the first type on my test machine:

This was for me, actually, one of the core reasons to justify spending a bunch of time on Serilog and Seq – application logs no longer seemed like a hack or a crutch: working with “real” events that have an actual type elevates them to something that can be usefully manipulated and programmed against.

So, what else can you do with a log imbued with first-class event types?

Event types and “interestingness”

When I log into any of the Seq servers I maintain, they look much like this:

All seems pretty good – nothing out of the ordinary. Well, at least in the last minute or two…?! It’s really hard looking at a small window on a large event stream to make any proactive decisions, because by definition, the interesting events are a minority.

A compromise I use often is to create a view just for “Warnings and Errors” – unfortunately, on a large stream, this view’s not so useful either: the same kinds of things show up again and again. Diligently excluding uninteresting events by type can help here, but that’s a very manual process and isn’t much fun in practice.

Seq 1.4 introduces some simple local storage for hosted apps, so as a test case for that I’ve spiked a different approach to finding “interesting” events. Using some app-local storage, why not keep track of which event types we’ve seen before, and emit an event whenever something “new” happens?

For example, in the case:

for(var i =0; i <10;++i){
Log.Information("The value is {Counter}", i);}

We’d detect the event “The value is 0″ and then ignore further events with this event type (for values 1, 2, …). By keeping tabs on new events I might spot something unusual, or, I might just decide to incorporate the new event type into a counter or chart.

There’s a broader notion of “interestingness” than this of course, but for a simple example it’s still quite powerful.

First of Type

Here’s a new view on the same server shown earlier:

Rather than list the events themselves, we see the first occurrence of each event type and a hyperlink to view the event itself (shown in the expanded property view).

The new feature that enables this is the StoragePath property, which simply provides a unique filesystem folder for the running app instance:

var stateFile = Path.Combine(StoragePath, StateFilename);

To keep runtime performance of the app and the size of the state file constant, we use a Bloom filter to track which event types we’ve seen:

_filter =new UInt32BloomFilter(File.ReadAllBytes(stateFile));

The spike of the app as currently written uses a slightly smaller filter, and fewer hashes than desirable – if anyone’s interested in “doing the math” and setting up a better filter, PRs are accepted for this!

The language of the Bloom filter’s API – “may contain”, stresses that an approximation is being used:

if(!_filter.MayContain(evt.EventType))

This does mean that new types can be missed. Tuning the filter can bring this down to a negligible percentage.

We don’t have to do anything special to get evt.Id hyperlinked in the Seq UI – the UI’s smart enough to detect this.

The rest of the app simply maintains the state file as event types are seen.

What’s Next?

I’ve often written log messages thinking “well, if I ever see this message I’ll know there’s a bug” – using this app there’s actually the possibility that I might notice a needle in the haystack!

It’s not a big leap from here to imagine a similar app that could track property values on an event type, either testing for uniqueness or observing variance from an historical range. Having local storage within Seq apps opens up a lot of possibilities I hadn’t given much thought to in the past. Happy hacking! :)

Collecting too much log data makes the “needle in the haystack” all the more difficult to find, but for practical reasons it is often easier to collect as much data as possible anyway and “sort it out later” via Seq retention policies.

These have served us well since the early versions of Seq, and selective retention is a feature that’s been widely appreciated by users. Retention policies do come at a cost however – as events are removed from the stream, space in the storage engine needs to be compacted and this can use significant resources on the server (CPU, RAM and I/O).

In many cases, it is preferable to drop uninteresting events as they arrive. This facility is now available in Seq 1.4.2-pre as “input filters”.

Each Seq API key can now specify a filter expression – just like the event stream supports – that identifies events that will be accepted when logged with the key:

A typical use case may be to drop events below a certain level, but any criterion can be included.

If for any reason it becomes necessary to change the filter during operation, updating it takes immediate effect so that more or fewer events can be recorded.

It’s a simple feature right now that we’re looking for feedback on – if you can grab the new preview from the download page (preview link is at the bottom) and try it out it would be great to hear your thoughts!

Not currently in the preview, but also coming in 1.4 is a MinLevel() query function that will tidy up filtering by level like the example shows.

A few days ago I took an impromptu snapshot of one of the Azure machines I have running Seq. Cloud storage being what it is, an I/O problem during the snapshot process crashed the Seq service, but I didn’t notice until the following day. The machine was running horrifically slowly at this point, so I restarted it and all came good.

When I logged into Seq to check some queries, I expected to see a big gap during the outage. It took me a few seconds to realize why there wasn’t one – all of the events raised during the outage period were there! – and that made me think this feature of the Seq client for Serilog might not be as well-known as I’d assumed.

When you set up Serilog to write events to Seq, you use code typically like:

If a buffer location is specified, the Seq client will write events to a set of rolling files like "myapp-2014-06-27.jsnl" (newline-separated JSON) and then ship them to the server when it’s possible. Almost nothing to configure, and voila! Reliable log collection.

There’s still no substitute for a redundant, transactional data store for truly critical events: machines can be lost, disks fill up and so-on. For general application monitoring and diagnostics however, I’m really pleased with how much “bang for the buck” this feature provides.

Look carefully at the string formatting though – particularly the way we give names to each of the parameters. The “message” part isn’t destructively rendered into text, but preserved as an object with property values. It can be formatted to the console etc. as usual:

[inf] Hi Nick, your flight leaves in 45 minutes

But more interestingly, it can be serialized in a format like JSON for processing and querying:

Preserving the structure of log data, and making it trivially easy to enrich that data with more contextual information, makes it possible to do some really clever things with the resulting logs.

Try finding all messages where the user’s name starts with “N”, or where the wait time is greater than one hour in a traditional log file if you’re skeptical here. In some trivial cases a regular expression might handle this cleanly, but it quickly becomes complex and cumbersome when more data is involved. Using Serilog it can be as simple as startswith(name, "N") or wait > 60.

Preserving the message template itself ("Hi, {name}...") also has some interesting and very useful results: the message template becomes a kind of “message type” that can be used to find – or exclude – messages of a specific type. Again very hard, often impossible with traditional loggers.

What’s going on in JavaScript?

In JavaScript as elsewhere, there are already some sophisticated logging frameworks, with varied levels of structured data support. A fairly new project called node-bunyan interests me the most, with its clever DTRACE integration and support for serializing objects instead of messages, but even here the new hybrid-style API and DSL provided by Serilog is missing. With node-bunyan it’s possible to log an object and/or a message, but messages are still formatted out in a lossy manner like the earlier loggers on .NET did.

Whether it grows into a standalone project in its own right, or encourages more established frameworks to take a look at this new style of logging, bringing Serilog’s API to JavaScript seems like a fun and worthwhile thing to try.

serilog.js

Tinkering away over a month of so I’ve sketched out what I think serilog.js should look like.

Yes, serilog.js terminal output is type-aware, just like syntax highlighting in an IDE! It’s useful when dealing with structured events to see the difference between a number 5 (magenta for numbers) and a string "5" (cyan for strings). In itself this feature is a nice trimming, but it’s not just a party-trick; below the surface, the event is carrying all of this information.

Serilog “sinks” are destinations for log events. This example uses the terminal sink, which writes to the system terminal/console. At the moment, the sinks are all zipped up directly in the serilog package, but they need to be split out.

Tour of the API

At present, with very little of the back end ‘sinks’ built, serilog.js is all about the API.

Configuring the pipeline

One of the first things to notice is that the logger is a pipeline. Thinking of log events as structured data leads naturally to an API more like what you’d find in event processing frameworks than the typical logging library.

serilog.configuration() kicks things off by creating a new configuration object

.writeTo(terminal()) emits all events to the terminal sink

.minimumLevel('WARNING') filters out any that are below the WARNING level (we’ll cover levels below)

.writeTo(file({path: 'log.txt'}) emits the events that reach it to a log file, and

.createLogger() takes all of these pipeline stages and bundles them up into an efficient logger object

The result is log, which we’ll use below. (As in Serilog for .NET, there’s no “global” logger – multiple loggers can be created and are completely independent, unless you link them together with something like .writeTo(log2)).

You might be wondering – why not Node.js streams or RxJS etc.? Simple really – generic APIs are great for glueing together solutions, but for something that’s used extensively in a codebase, maintaining complete control over dependencies, performance and semantics is a good idea long-term, and that’s the path taken here.

Logging levels

serilog.js has four simple logging levels, listed here in order of importance:

serilog.js can handle structured storage of objects and arrays nicely, and these can support some really cool processing scenarios – just don’t forget the @ (if you watch the events roll by on the terminal, you’ll notice the difference too – objects are colored green, while strings render in cyan).

Context and enrichment

The strength of structured logs is in correlation; this is one of the great challenges when instrumenting distributed apps, and probably the #1 reason that interest in structured logs has picked up again in recent times.

When searching a large number of events produced across many machines, it’s vital to have properties to filter on so that a single location or thread of interaction can be followed.

In a simple case, you might want to instrument all events with the name of the machine producing them. This can be done when configuring the logger:

Filtering

A final part of the API to check out is filtering. While most filtering will happen server-side wherever you collect your logs, it can be useful to apply some filters in the logging pipeline if you have a component producing a lot of noise, or if you need to avoid passing events with sensitive data to logs.

Building serilog.js

If you want to tinker with the project, you can grab the source from GitHub. The build process uses gulp to drive some mocha tests. Just type:

npm install
gulp

…and you’re away!

The browser

serilog.js also runs in the browser (though version support will be patchy and is completely untested!). There are some notes on the GitHub site linked above on how to do this, and a test.html page in the repository that sets some of it up.

The catch

Serilog in .NET is out there in production use today. It’s being extended and refined by a core group of contributors and takes what time I have available to coordinate.

I’ll be using serilog.js in a few small JavaScript projects and extending it for my own needs, and I’ll be building a sink that writes to Seq at some point. But, I don’t have the capacity to really drive this project beyond the prototype that it is right now.

If you’re interested in JavaScript, event processing, application monitoring and/or API design, serilog.js needs you. Kick the tyres, check out the code, send a pull request! It’s a project “up for grabs” in the largest sense. Can you help change the face of JavaScript logging?

The features planned for Seq 1.4 are starting to come together. There’s a new preview available to download on the Seq website (direct download).

While 1.4 is still a little way off “done”, many of the improvements we’ve made are usable today; here’s a run through what you can expect from the new version.

What’s Seq? Seq is the fastest way for .NET teams to improve visibility using structured application logs. If you’re using Serilog, you’ll find Seq is the perfect partner – trivially easy to set up, and built for structured log events so querying them is simple from “day 1” of a project. You already use CI and automated deployment: Seq is the next piece of the puzzle, closing the loop to show your team what’s happening in development, testing and production.

Charts

New in 1.4:

Charts show the count of matching events in a time period. You can view the last day (by hour), last week (by day), last month (by day) or last year (by week). It’s a simple feature that I’m sure we’ll expand on, but already a great way to keep a “finger on the pulse”.

To add a chart, filter the “events” view to show just what you want to chart, then drop down the “refresh” button and select “Add to dash”. By default, queries are shown on the dash as counters; to toggle between the counter and chart display, there’s a little icon in the top left that appears on hover.

(If you’re already using counters with Seq 1.3, you’ll need to recreate them before historical data will be indexed by the 1.4 preview.)

Quick(er) switching between views

A much-requested improvement: when no view is selected, the first 10 views will be shown in a list (saving a few clicks when switching between views).

Extended text operators

A bunch of work has gone into better string comparison functions in this version. Seq queries now support:

Contains(text, pattern)

StartsWith(text, pattern) and EndsWith(text, pattern)

IndexOf(text, pattern), and

Length(text)

In each of these, “pattern” can be case-insensitive, case-sensitive, or a regular expression. There’s some more information and examples in the documentation.

Date functions and @Timestamp

Events in Seq now carry a queryable @Timestamp property, and we’ve added a DateTime() function that can be used with it. Check out the documentation for some examples here.

It’s fun watching Serilog presented by people who obviously “get it”. And, compared with the alternatives shown earlier in the session, it’s great to see just how easy Serilog is for generating quality structured logs.

(If you’re not using Serilog and interested in the demo code, it’s on GitHub, but please check out this tiny PR if it hasn’t made it through yet.)

If you look carefully you’ll note an extra couple of single quotes (') in there around the the {0} token that will be replaced with the search term being logged. I got into this habit because when reading logs I find it less jarring to see random string data quoted, so that my brain doesn’t try to join the whole lot up in a single bewildering sentence:

Could not find any documents matching my pants

vs.:

Could not find any documents matching 'my pants'

String formatting in Serilog message templates

When I started writing Serilog I thought I’d save a few characters, and encourage consistency, by making this little idiom part of its string formatting engine:

I’m sure there are plenty of people who’d rather a different default, but so far no one’s raised it as particularly problematic. If you need to match a specific output format you can use a special “formatting” specifier to turn this off – just append :l (yes, a lowercase letter L) to the token:

Reusing message templates for output

The formatting above is applied when rendering log messages into text. A bit later in the design process, I think some time around when Serilog became a public project, we needed another formatting mechanism, this time to control how the different parts of the log event including the message get laid out e.g. for display at the console.

You’ll notice the same style of format string appears in there. The engine used to write the text is the same; there were (and are) a lot of advantages to reusing this code for “output templates” like this.

Unfortunately, some of the decisions that work well when rendering a structured message, get in the way when presenting text like this. You’ll see above one of the oddities – the pre-defined NewLine and Exception properties are represented as strings under the hood, so the :l literal specifier has to appear in there to avoid unsightly quotation marks appearing in the output. Not a big deal really, but too much to have to grasp as a new user.

The other issue is that in message templates, i.e. when calling Log.Information() and friends, it’s an error to provide less arguments than the number of tokens that appear in the string. Serilog helpfully prints out the whole token as text – {Token} so that these issues are obvious when reading the rendered message.

In output templates this isn’t so desirable. Using {HttpRequestId} in the format string should just be ignored if the event doesn’t have such a property attached.

Serilog 1.3.7 changes

I almost versioned this one “1.4”, given there’s quite a change being made, but since the API is binary-identical this one’s out as a patch.

Put simply, Serilog now has slightly different handling of message templates vs. output templates. Message templates keep their current behaviour – quotation marks and all. When used to configure a sink as an output template however:

The :l specifier is now “on by default”

Missing tokens will simply be ignored

The default output templates all used :l formatting anyway, and so will almost all custom ones, so for 99% of current users I’d say this change will go unnoticed. If you define your own output templates though you can now clean up a bit of clutter and remove the literal specifier from fields like {Exception} and {NewLine}.

If you’re coming to Serilog as a new user, hopefully these changes give you one less reason to go looking for explanatory posts like this one!