Manage your passwords with pass
Brendan Abolivier · Mon, 21 May 2018 · https://brendan.abolivier.bzh/password-store/

Let’s talk about passwords. Basically, they’re the things you’re supposed to keep different for each account you have on the Internet. Either you don’t do it, do it partially (like a mix between a leet-speak version of the service’s name and a fixed part, with an uppercase letter and a character that’s neither a letter nor a number somewhere, such as mySup3rw3bs!t3MyUsualPassword), or have a password manager do it for you. This week, I’m writing about pass, a simple and minimal password manager mainly consisting of a 699-line-long bash script, which I’ve been using for some months.

I’ve had quite a hard time finding a password manager that fits my needs. Over the past few years, I’ve tried quite a few of them, and eventually stopped using them one after the other: LastPass because of its poor UX on points that mattered to me, and because I never felt safe entrusting that much to such a centralised and closed service; Keepass because it was a pain to synchronise my database between all my devices; Passbolt because it focuses on a team use case and I want something designed for individuals. You name it.

After a while, I started trying to put together a description of what I wanted. To me, the ideal password manager must be:

free software

security audited

synchronisable across devices

self-hostable

easy to set up

easy/quick to use

I realised that was quite an idealistic description, and thought I was done with password managers. To be fair, to this day, I still haven’t found one that matches all of my criteria, though the one I’ll be talking about in this post gets quite close.

Also, let me get things straight first: the last two points in the list above use a relative definition of “easy”, i.e. what’s easy to set up/use for me, as someone who has some technical knowledge and background. Specifically, the solution I’ll be writing about in this post would be labelled as quite painful to use by somebody who isn’t used to bash, git et al.

It’s all about simplicity

Pass is a minimal and very simple password manager which consists of a 699-line-long bash script (comments included). It stores your passwords as files in a given directory (the “store”) and encrypts them using GnuPG. That way, you can organise your passwords however you want, in as many sub-directories as you wish, and they will be stored, possibly along with some metadata, in a somewhat-secure fashion.

Notice that I made a compromise here on my criteria for an ideal password manager because, as far as I know, pass hasn’t had a security audit yet (only GnuPG has). I consider it safe enough for my personal use, though.

Pass also has both CLI and GUI clients for most platforms, including OS X, Android, iOS and Windows, as well as some browser extensions, but I’ll only cover the basic command-line use of the bash script here. All clients and extensions can be found here, though.

Creating the store

I won’t cover installation, which is already covered on pass’s website and should be quite easy on most systems.

You’ll also need to generate a GPG key, which is the pass equivalent of the store’s master key/passphrase, if you haven’t got one. I won’t cover that here either, since there are already great resources for it on the Internet.

Once pass is installed, let’s initialise a store with

pass init GPG-ID

Here, GPG-ID is the identifier of the key you’ll use to encrypt your passwords. It can be the key’s fingerprint (in the case of my own key, E1D4B7457A829D771FBA8CACE860157274A28D7E) or one of its associated email addresses (which, in my case, can be hello@brendanabolivier.com).
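For example, using the email address above, that would be:

pass init hello@brendanabolivier.com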

It will then initialise a store in a directory whose path is ~/.password-store, creating it if it doesn’t exist. This is the directory pass will work in for every call you make in the future. This value can be overridden by setting the PASSWORD_STORE_DIR environment variable.

Adding an existing password

Because you had accounts on the Internet before you started using pass, you might want to store their passwords in your brand new password store.

To insert a password into your password store, just run

pass insert PASSWORD-NAME

Where PASSWORD-NAME is the name you’ll give to this entry. If you want to organise your entries with sub-directories, the entry name can also be a path relative to the password store’s root (e.g. pass insert hostProviders/ovh will create an entry in the “hostProviders” sub-directory). If a sub-directory doesn’t exist, pass will create it for you.

It will then prompt you for your password, which you can just paste and validate, and an encrypted copy of it will be stored in the password store. For example, if the entry name is hostProviders/ovh, pass will store an encrypted copy of the password in ~/.password-store/hostProviders/ovh.gpg.
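Since the store is just a directory tree, at this point it looks like this:

~/.password-store/
└── hostProviders/
    └── ovh.gpg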

You might also want to add metadata to your password, such as the account’s login or the service’s URL, which some pass clients can use. You can do that by adding the -m flag to your pass insert call (before the PASSWORD-NAME), which lets you write your entry over multiple lines and save it with Ctrl+D.

For multiline entries, it’s usually better to make the password the first line’s only content, and add your metadata on the following lines. The reason is that, to pass, a non-multiline entry is just a one-line file with the password as its only content. Keeping the first line to just the password lets pass handle multiline entries the same way as single-line ones.

In the end, your multiline entry would look like this:

mySup3rw3bs!t3P4ssw0rd
login: me@me.tld
url: mysuperwebsite.com

It might be worth noting that if you come from another password manager, there might be a script to migrate all of your entries to pass instead of doing it manually, one at a time. Migration scripts for most password managers can be found here.

Creating passwords

Of course, one of the good things with having a password manager is having it generate different strong passwords for each service you have an account on. Generating a password with pass is as easy as calling:

pass generate PASSWORD-NAME

As with pass insert, this will create a .gpg file at the desired location, this time filled with a 25-character-long password. If you want a length other than 25, you can tell pass by appending the desired length after the PASSWORD-NAME.

Once the password is generated, pass will print it into the terminal, so you can copy it. If you don’t want it to appear on your screen, you can also append the -c flag to your call, right before the PASSWORD-NAME. Pass will then copy it into your clipboard, which it will clear after 45 seconds (the delay can be changed by setting the environment variable PASSWORD_STORE_CLIP_TIME to the number of seconds you want).
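For example, to generate a 32-character password for the entry we created earlier and copy it straight to the clipboard:

pass generate -c hostProviders/ovh 32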

Another useful trick is adding metadata to a newly generated password, like we’ve seen before. It’s obviously possible to edit an existing password (using pass edit PASSWORD-NAME, which opens an unencrypted copy of the entry in vim), but I personally prefer to never have pass print my password on a screen.

To achieve that, we can first call pass insert -m PASSWORD-NAME, which will prompt for the password and its metadata; leave the first line blank and fill the following ones with metadata before hitting Ctrl+D. We can then call pass generate -ci PASSWORD-NAME. Note the -i flag (which stands for “in place”): it means the entry we want to generate a password for already exists, in which case pass will replace the entry’s first line with the newly generated password and leave the rest of the file as it was.
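Put together, the whole dance looks like this (reusing the entry name from earlier):

pass insert -m hostProviders/ovh
# in the prompt: leave the first line blank, add the metadata lines, hit Ctrl+D
pass generate -ci hostProviders/ovh
# the blank first line is now replaced with a generated password, copied to the clipboard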

You now have your newly generated strong password copied to your clipboard, and the desired metadata in its file.

Retrieving passwords

It would be quite useless to have all your passwords stored in your store without being able to retrieve and use them. As with everything in pass, this is quite easy:

pass show PASSWORD-NAME

Which you can even shorten as:

pass PASSWORD-NAME

Pass will then print the corresponding password, along with its metadata (if any), in the terminal. If you don’t want the password printed but copied to your clipboard instead, just add -c before the PASSWORD-NAME, just like with pass generate. And just like pass generate, it will clear the clipboard after 45 seconds; again, this delay can be overridden using the PASSWORD_STORE_CLIP_TIME environment variable.

You might also prefer not having to fire up a terminal and type a command in order to get a password you’ll then copy to the website. In that case, you might be interested in one of the few browser extensions available, such as passwe for Firefox and Chrome, PassFF for Firefox or Browserpass for Chrome, which you can use to automatically fill in login forms using passwords from your store and their metadata. For what it’s worth, I’ve been using PassFF for quite a while now, and it works pretty well.

Synchronising passwords

Because I always have more than one device, one thing I really look for in a password manager is its ability to synchronise across devices easily. This is the reason I stopped using Keepass: having to manually copy the database across all of your devices each time you add/remove/change an entry was really painful.

Where I become really picky is that I don’t want to be stuck with a proprietary service’s hosting such as LastPass’s or Dashlane’s. I want to control where I send my passwords, who can access them, etc.

Once again, pass chooses simplicity, by providing great compatibility with git and letting it do all of the versioning and networking, which is, obviously, optional.

If you want to synchronise your own password store with a git repository, create an empty one somewhere (I personally did that on one of my own servers, but a GitHub/GitLab/Gitea/etc. repository will, of course, work as well), grab its URL and run

pass git init
pass git remote add origin REMOTE-URL

Where REMOTE-URL is the repo’s URL.

This will initialise a local git repository at the root of your password store, and create a commit containing all of the store’s content. Note that the pass git commands’ syntax follows that of the standard git commands. That’s because pass git actually runs every git command you give it inside the store, whatever your current working directory is. This means you can use basically any git command you want; as long as you prefix it with pass, it will affect your password store and nothing else.

Now that the git repository is initialised in the password store, each time you create, remove or edit a password, pass will automatically create a commit for it, so you only have to run pass git push now and then to synchronise your local password store with the remote copy.
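Day to day, the workflow then looks like this (the entry name is just an example):

pass generate -c someService/account  # pass commits the new entry automatically
pass git push                         # synchronise the store with the remote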

In my case, I like to have a copy of my password store on my phone, and to manage it using the Password Store Android app (available on F-Droid and Google Play), to which I just have to give the URL and credentials required to clone the repository, and the GPG key to use when trying to decrypt passwords, and I can instantly use my passwords on my smartphone.

Of course, since pass manages your passwords files and directories, you can have multiple sub-directories in your password store, each one of them having a different git remote. For example, most of my passwords are pushed to a remote repository on a server I own, except for one folder containing internal passwords we use at CozyCloud, which are synchronised with an internal repository we have.

To infinity and beyond

Of course, I haven’t described all the features pass has. This post only describes the few I personally use, along with some setup instructions, and doesn’t really cover the various ways in which one can use it. Now it’s your turn to play with it! 😉

Also, the length and complexity of my latest post brought some fatigue with it, which explains this one’s lateness. Taking that into account, and given the fact that I’m working really hard on the Trancendances presents immersion{s} party in Brest that’s taking place in less than two weeks, I don’t think I’ll be publishing any more posts in the next couple of weeks (except maybe a very small one on a couple of tools I discovered recently, but that’s far from sure).

I’ll see you after that, most likely in a bit less than three weeks, for a brand new blog post (of which I already know the topic, and it’ll be a completely non-tech one, for a change!). See you then!

Enter the Matrix
Brendan Abolivier · Sun, 13 May 2018 · https://brendan.abolivier.bzh/enter-the-matrix/

As you might know if you’ve been following me on Twitter for some time (or if you know me in real life), I’m very fond of free software and decentralisation. I love free software because it matches the philosophy I want to live by, and decentralisation because it enlarges a user’s freedom and individuality, and I find working on decentralised systems fascinating. Doing so forces one to change their way of designing a system entirely, since most of the Internet now consists of centralised services, which leads people to only learn how to design and engineer these.

Today I want to tell you about one of my favorite decentralised free software projects right now: Matrix. Let’s get things straight first: I’m talking about neither the science-fiction franchise nor the nightclub in Berlin. Matrix is a protocol for decentralised, federated and secure communications, created and maintained by New Vector, a company split between London, UK and Rennes, France (which I joined for an internship in London last summer). It’s based on RESTful HTTP/JSON APIs, documented in open specifications, and is designed to be usable for anything that requires real-time-ish communications, from instant messaging to IoT. Some people are also experimenting with using Matrix for blogs, RSS readers, and other stuff that’s quite far from what you’d expect to see with such a project. Despite that, however, it’s currently mainly used for instant messaging, especially through the Riot client (which is also developed by New Vector).

Matrix also distances itself from the “yet another comms thing” argument with its philosophy: it’s not just another standard for communications, but one that aims at binding all communication services together, using bridges, integrations et al. For example, at CozyCloud, we have a Matrix room that’s bridged to our public IRC channel, meaning that every message sent to the Matrix room gets to the IRC channel as well, and vice-versa. I’m even fiddling around in my free time to bridge this room with a channel on our Mattermost instance, to create a Mattermost<->Matrix<->IRC situation and allow the community to interact with the team without the team having to spend time running yet another chat client alongside internal communications.

There’s also been quite some noise around Matrix lately, with the French government announcing its decision to go full Matrix for its internal communications, using a fork of Riot it might also release as free software to the wider world in the future.

Under the hood

It’s great to introduce the topic, but I guess you were expecting a more technical and practical post, so let’s get into how Matrix works. A quick disclaimer, though: I won’t go too deep into how Matrix works (if I did, this post would be far too long and I’d never finish it in a week), and will mainly focus on its core principles and how to use it in the most basic way.

As I mentioned before, Matrix is decentralised and federated. The decentralised bit means you can run a Matrix server on your own machine (much like other services such as Mattermost), and the federated bit means two Matrix servers can talk to one another. This means that if someone (let’s call her Alice) hosts her own Matrix server at matrix.alice.tld and wants to talk to a friend of hers (let’s call him Bob), who also hosts his own Matrix server at matrix.bob.tld, that’s possible: matrix.alice.tld will know how to talk to matrix.bob.tld to forward Alice’s message to Bob.

Glossary break:

There are a few server types in the Matrix specifications. Homeservers (HS) are the servers that implement the client-server and federation APIs, i.e. the ones that allow actual messages to be sent from Alice to Bob. In my example, in which I was referring to homeservers as “Matrix servers”, matrix.alice.tld and matrix.bob.tld are homeservers. Among the other server types are identity servers (IS), which host third-party identifiers (such as an email address or a phone number) so people can be reached through them, and application services (AS), which are mainly used to bridge an existing system to Matrix (but are not limited to that). In this post, I’m only going to cover the basic use of homeservers, since knowledge of the other types isn’t required to understand the basics of how Matrix works.

In the Matrix spec, both Alice and Bob are identified by a Matrix ID, which takes the form @localpart:homeserver. In our example, their Matrix IDs could respectively be @Alice:matrix.alice.tld and @Bob:matrix.bob.tld. Matrix IDs actually follow a broader form, shared by any Matrix entity, which is *localpart:homeserver, where * is a “sigil” character used to identify the entity’s type. Here, the sigil character @ states that the entity is a Matrix ID.

Three roomies on three servers

Now that we have our two users talking with each other, let’s take a look at how a third user (let’s call him Charlie), also hosting his own homeserver (at matrix.charlie.tld), can chat with both of them. This is done using a room, which can be thought of as the Matrix equivalent of an IRC channel. Like any entity in Matrix, a room has an ID, which takes the general form we saw earlier with the ! sigil character. However, although it contains a homeserver’s name in its ID, and unlike a user ID, a room isn’t bound to any homeserver. The homeserver in the room ID is simply the homeserver hosting the user that created the room.

Technically speaking, if Alice wants to send a message to the room where both Bob and Charlie are, she’ll ask her homeserver to send a message to that room. Her homeserver will look up in its local database which other homeservers are part of that room (in our example, Bob’s and Charlie’s), and send the message to each of them individually (and each of them will display the message to their users in the room, i.e. Bob’s server will display it to Bob). Each homeserver then keeps track of the message in its local database. This means two things:

Every homeserver in a room keeps a copy of the room’s history.

If a homeserver in a room goes down for any reason, even if it’s the homeserver which has its name in the room’s ID, all of the other homeservers in the room can keep on talking with each other.

Broadly speaking, a room can be schematised as follows:

This image is a capture of the interactive explanation of how Matrix works, named “How does it work?”, on Matrix’s homepage, which I’d really recommend checking out. That’s why the Matrix IDs and homeservers’ names aren’t the same as in my example.

For what it’s worth, I took a shortcut earlier since, in the Matrix spec, 1-to-1 chats are also rooms. So technically speaking, Alice and Bob were already in a room before Charlie wanted to chat with them.

It might also be worth noting that a room can have an unlimited number of aliases, acting as addresses for the room, which users can use to join it if it’s public. Their syntax takes the general form we saw earlier, using # as the sigil character. This way, !wbtZVAjTSFQzROqLrx:matrix.org becomes #cozy:matrix.org, which, let’s be honest, is quite a bit easier to read and remember. As with a room’s ID, an alias’s homeserver part is the homeserver hosting the user who created the alias, which means I could create #cozycloud:matrix.trancendances.fr if I have a high enough power level, since that’s the homeserver I’m using.

As I quickly hinted at, a room can be either public or private. Public rooms can be joined by anyone knowing one of the room’s aliases (or getting it from a homeserver’s public rooms directory if it’s published there), and private rooms work on an invite-only basis. In both cases, if the homeserver doesn’t already have a user in the room, it will ask another homeserver to make the join happen (either the homeserver whose name is in the alias’s homeserver part for a public room, or the homeserver the invite originated from for a private room).

Events, events everywhere

Now that we know what a room is, let’s talk about what’s passing inside of one. Earlier, I’ve been talking about messages, which are actually called “events”. Technically speaking, a Matrix event is a JSON object that’s sent in a room and dispatched to all other members of the room. It, of course, has an ID, generated by the homeserver hosting the user who sent the message, taking the general form we saw earlier with the $ sigil character. This JSON carries metadata, such as a class name to identify different event types, an author, a creation timestamp, etc. It basically looks like this (reconstructed here with illustrative values):
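{
  "type": "m.room.message",
  "sender": "@Alice:matrix.alice.tld",
  "origin_server_ts": 1526198400000,
  "event_id": "$15261984001234AbCdE:matrix.alice.tld",
  "content": {
    "msgtype": "m.text",
    "body": "Hello Bob and Charlie!"
  },
  "unsigned": {
    "age": 1234
  }
}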

The example above is an event sent from Alice to Bob and Charlie in the room they’re all in. It’s a message, as hinted at by the m.room.message class name in the type property. The content property, which must be an object, contains the event’s actual content. In this case, we can see that the message is text, and the text itself. This precision is needed because an m.room.message can be text, but also an image, a video, a notice, etc., as mentioned in the spec.

The unsigned property here only means that the data in it mustn’t be taken into account when computing and verifying the cryptographic signature homeservers use when passing the event to one another.

The Matrix spec defines three kinds of events that can pass through a room:

Timeline events, such as messages, which form the room’s timeline that’s shared between all homeservers in the room.

State events, which contain an additional state_key property and form the current state of the room. They can describe room creation (m.room.create), topic changes (m.room.topic), join rules (i.e. either invite-only or public, m.room.join_rules), or membership updates (i.e. join, leave, invite or ban, m.room.member, with the Matrix ID of the user whose membership is being updated as the state_key). Just like timeline events, they’re part of the room’s timeline; unlike them, the latest event for a given {type, state_key} pair is easily retrievable, as is the room’s current state, which is actually a JSON array containing the latest events for all {type, state_key} pairs. The Matrix APIs also allow one to easily retrieve the full state the room was in when a given timeline message was propagated through the room, and each state event refers to its parent.

Ephemeral events, which aren’t included in the room’s timeline, and are used to propagate information that doesn’t last over time, such as typing notifications (“[…] is typing…”).

Now, one of the things I really like about Matrix is that, besides the base event structure, you can technically put whatever you want into an event. There’s no constraint on its class name (except that it can’t start with m., which is a namespace reserved for events defined in the spec), nor on its content, so you’re free to create your own events as you see fit, whether they are timeline events, state events or both (I’m not sure about ephemeral events, though). That’s how you can create whole systems using only Matrix as the backend.

Matrix events can also be redacted. This is the equivalent of a deletion, except the event isn’t actually deleted but stripped of its content, so it doesn’t mess with the room’s timeline. The redaction is then dispatched to every homeserver in the room so they can redact their local copy of the event as well. As for editing an event’s content, it’s not possible yet, but it’s a highly requested feature and should be available in the not-so-distant future.

A very basic client

Now I guess you’re wondering how you can use Matrix for your project, because learning the core principles is great but that doesn’t explain how to use the whole thing.

In the following steps, I’ll assume a few things:

The homeserver you’re working with is matrix.project.tld, and its client-server API is available on port 443 through HTTPS.

Your user is named Alice. Note that you must change this value for real life tests, because the Matrix ID @Alice:matrix.org is already taken.

Your user’s password is 1L0v3M4tr!x.

Note that I’ll only cover some basic use of the client-server spec. If you want to go further, you should have a look at the full spec or ask any question in the #matrix-dev room. I also won’t cover homeserver setup here (though I might do just that in a future post). My goal here is mainly to give you a look at how the client-server API globally works, rather than creating a whole shiny app, which would take too long for a single blog post.

It might also be worth noting that each Matrix API endpoint I’ll name in the rest of this post is a clickable link to the related section of the Matrix spec, which you can follow if you want more complete documentation on a specific endpoint.

Registering

Of course, your user doesn’t exist yet, so let’s register it against the homeserver.

The endpoint for registration is /_matrix/client/r0/register, which you should request using a POST HTTP request. In our example, the request’s full URL is https://matrix.project.tld/_matrix/client/r0/register.

Note that every endpoint in the Matrix spec always starts with /_matrix/.

The request body is a JSON which takes the following form:

{"username":"Alice","password":"1L0v3M4tr!x"}

Here, the username and password properties are exactly what you think they are. The Matrix ID generated for a new user uses what’s provided in the username property as the localpart.
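If you’re following along with curl, firing it looks something like this:

curl -X POST https://matrix.project.tld/_matrix/client/r0/register -d '{"username":"Alice","password":"1L0v3M4tr!x"}'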

Fire this request. You’ll now get a 401 status code along with some JSON describing the supported authentication flows, looking something like this (the exact flows depend on the homeserver’s configuration):
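{
  "flows": [
    {"stages": ["m.login.dummy"]},
    {"stages": ["m.login.email.identity"]}
  ],
  "params": {},
  "session": "xxxxxx"
}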

Now, this endpoint uses a part of the spec called the User-Interactive Authentication API. This means that authentication can be seen as flows of consecutive stages. That’s exactly what we have here: two flows, each containing one stage. This example is a very simple one, but it can get quite a bit more complex, such as:
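{
  "flows": [
    {"stages": ["m.login.password"]},
    {"stages": ["m.login.recaptcha", "m.login.password"]}
  ],
  "params": {
    "m.login.recaptcha": {"public_key": "aRecaptchaPublicKey"}
  },
  "session": "xxxxxx"
}

(The stage names and the public_key value in this reconstruction are illustrative.)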

Here we can see two flows, one with a single stage, the other one with two stages. Note that there’s also a parameter in the params object, to be used with the m.login.recaptcha flow.

Because I want to keep things as simple as possible here, let’s get back to our initial simple example and use the first one-stage flow. The only stage in there is m.login.dummy, which describes a stage that will succeed every time you send it a correct JSON object.

To register against this stage, we’ll only add a few lines to our initial request’s JSON:
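With the session value from the response above, that gives us:

{
  "username": "Alice",
  "password": "1L0v3M4tr!x",
  "auth": {
    "type": "m.login.dummy",
    "session": "xxxxxx"
  }
}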

Note that the value of the session property in the newly added auth object is the value of session taken from the homeserver’s response to our initial request. This auth object tells the homeserver that this request is a follow-up to the initial one, using the stage m.login.dummy. The homeserver will automatically recognise the flow we’re using, and will succeed (because we use m.login.dummy), returning, along with a 200 status code, JSON of this shape (the device_id value here is illustrative):
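{
  "user_id": "@Alice:matrix.project.tld",
  "home_server": "matrix.project.tld",
  "device_id": "GHTYAJCE",
  "access_token": "olic0yeVa1pore2Kie4Wohsh"
}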

The home_server property contains the address of the homeserver you’ve registered on. This can feel like a duplicate, but the Matrix spec allows a homeserver’s name to differ from its address, which is why it’s mentioned here.

The device_id property contains the ID for the device you’ve registered with. A device is bound to an access token and E2E encryption keys (which I’m not covering in this post).

The access_token property contains the token you’ll use to authenticate all your requests to the Matrix client-server APIs. It’s usually much longer than the one shown in the example, I’ve shortened it for readability’s sake.

Registering a user instantly logs them in, so you don’t have to do it right now. If, for any reason, you get logged out, you can log back in using the endpoint documented here.

Creating our first room

Now that we have an authenticated user on a homeserver, let’s create a room. This is done by sending a POST request to the /_matrix/client/r0/createRoom endpoint. In our example, the request’s full URL is https://matrix.project.tld/_matrix/client/r0/createRoom?access_token=olic0yeVa1pore2Kie4Wohsh. Note the access_token query parameter, which must contain the access token the homeserver previously gave us.

There are a few JSON parameters available, which I won’t cover here because none of them are required to perform the request. So let’s send the request with an empty object ({}) as its body.
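With curl, that’s:

curl -X POST 'https://matrix.project.tld/_matrix/client/r0/createRoom?access_token=olic0yeVa1pore2Kie4Wohsh' -d '{}'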

Before responding, the homeserver will create the room and fire a few state events into it (such as the initial m.room.create state event, or a join event for your user). It should then respond with a 200 status code and a JSON body looking like this:

{"room_id":"!RtZiWTovChPysCUIgn:matrix.project.tld"}

Here you are, you have created and joined your very first room! As you might have guessed, the value for the room_id property is the ID of the newly created room.

Messing with the room’s state

Browsing the room’s state is completely useless at this stage, but let’s do it anyway. Fetching the whole room state, for example, is as easy as a simple GET request on the /_matrix/client/r0/rooms/{roomId}/state endpoint, where {roomId} is the room’s ID. If you’re following these steps using curl requests in bash, you might want to replace the exclamation mark (!) in the room’s ID with its URL-encoded variant (%21). Don’t forget to append your access token to the full URL as shown above.
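In curl form, that’s something like:

curl 'https://matrix.project.tld/_matrix/client/r0/rooms/%21RtZiWTovChPysCUIgn:matrix.project.tld/state?access_token=olic0yeVa1pore2Kie4Wohsh'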

The request should return a JSON array containing state events such as these (trimmed here to the relevant fields; real events also carry an event_id, an origin_server_ts, etc.):
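[
  {
    "type": "m.room.create",
    "state_key": "",
    "sender": "@Alice:matrix.project.tld",
    "content": {
      "creator": "@Alice:matrix.project.tld"
    }
  },
  {
    "type": "m.room.member",
    "state_key": "@Alice:matrix.project.tld",
    "sender": "@Alice:matrix.project.tld",
    "content": {
      "membership": "join"
    }
  }
]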

Now let’s try to send our own state event to the room, shall we? In order to do that, you’ll need to send a PUT request to the /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey} endpoint, replacing the room’s ID, the event’s type and its state key with the right values. Note that if your state key is an empty string, you can just omit it from the URL. Again, don’t forget to append your access token!

The body for our request is the event’s content object.

Let’s create a tld.project.foo event with bar as its state key, and {"baz": "qux"} as its content. To achieve that, let’s send a PUT request to /_matrix/client/r0/rooms/!RtZiWTovChPysCUIgn:matrix.project.tld/state/tld.project.foo/bar?access_token=olic0yeVa1pore2Kie4Wohsh (from which I’ve stripped the protocol scheme and FQDN so it doesn’t appear too long in the post) with the following content:

{"baz":"qux"}

The homeserver then responds with an object only containing an event_id property, which contains the ID of the newly created state event.

If we retry the request we previously made to retrieve the whole room state, we can now see our event (again trimmed to the relevant fields):
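{
  "type": "tld.project.foo",
  "state_key": "bar",
  "sender": "@Alice:matrix.project.tld",
  "content": {
    "baz": "qux"
  }
}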

Note that sending an update of a state event is done the same way as sending a new state event with the same class name and the same state key.

Sending actual messages

Sending timeline events is almost the same as sending state events, except it’s done through the /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId} endpoint, and it uses one parameter we haven’t seen yet: txnId, aka the transaction ID. That’s simply a unique ID identifying this specific request among all requests made with the same access token. You’re free to put whatever you want here, as long as you don’t use the same value twice with the same access token.

Regarding the request’s body, once again, it’s the event’s content.
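For example, sending a text message to our room could look like this with curl (the transaction ID, txn1 here, is arbitrary):

curl -X PUT 'https://matrix.project.tld/_matrix/client/r0/rooms/%21RtZiWTovChPysCUIgn:matrix.project.tld/send/m.room.message/txn1?access_token=olic0yeVa1pore2Kie4Wohsh' -d '{"msgtype":"m.text","body":"Hi Bob!"}'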

Retrieving timeline events, though, is a bit more complicated, and is done using a GET request on the /_matrix/client/r0/sync endpoint. Where it gets tricky is that this endpoint isn’t specific to a room, so it returns every event received in any room you’re in, along with some presence events, invites, etc.

Once you’ve made such a request (again, with your access token appended to it), you can locate timeline events from your room in the JSON it responds with by looking at the rooms object, which contains an object named join, which in turn contains one object for each room you’re in. Locate the !RtZiWTovChPysCUIgn:matrix.project.tld room (the one we created earlier), and in the corresponding object you’ll see the state, timeline and ephemeral events for this room.

Inviting a folk

So far, Alice has registered on the homeserver and created her room, but she feels quite alone, to be honest. Let’s cheer her up by inviting Bob in there.

Inviting someone into a room is also quite simple, and only requires a POST request on the /_matrix/client/r0/rooms/{roomId}/invite endpoint. The request’s body must contain the invited Matrix ID as such:

{"user_id":"@Bob:matrix.bob.tld"}

Note that the request is the same if Bob has registered on the same server as Alice.

If all went well, the homeserver should respond with a 200 status code and an empty JSON object ({}) as its body.

In the next request he makes to /_matrix/client/r0/sync, Bob will see an invite object inside the rooms one, containing the invite Alice sent him along with a few events, including the invite event. Trimmed to the relevant part, that looks like:
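{
  "rooms": {
    "invite": {
      "!RtZiWTovChPysCUIgn:matrix.project.tld": {
        "invite_state": {
          "events": [
            {
              "type": "m.room.member",
              "state_key": "@Bob:matrix.bob.tld",
              "sender": "@Alice:matrix.project.tld",
              "content": {
                "membership": "invite"
              }
            }
          ]
        }
      }
    }
  }
}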

Alice meets Bob

So here we are, with a fresh room where Alice and Bob are able to interact with one another, with everything done using HTTP requests that you could do with your terminal using curl. Of course, you don’t always have to do it that manually, and there are Matrix SDKs for various languages and platforms, including JavaScript, Go, Python, Android, iOS, and a lot more. The full list is available right here.

If you want to dive a bit deeper into the Matrix APIs, I’d advise you to have a look at the spec (even though it still needs a lot of work) and what the community has done with it on the Try Matrix Now! page on Matrix’s website.

I hope you found this journey into Matrix’s APIs as interesting as I did when I first heard of the project. Matrix is definitely something I’ll keep playing with for a while, and I might have some big news about Matrix-related projects I’m working on to share here in the coming months.

As always, I’d like to thank Thibaut for proofreading this post and giving me some useful early feedback on it. If you want to share your feedback on this post with me too, don’t hesitate to do so, either via Twitter or through Matrix, my own Matrix ID being @Brendan:matrix.trancendances.fr!

See you next week for a new post 🙂

Centralising logs with rsyslog and parsing them with Graylog extractors
Brendan Abolivier · Sat, 05 May 2018 · https://brendan.abolivier.bzh/logs-rsyslog-graylog/

Once again, we’re up for a monitoring-related post. This time, let’s take a look at logs. Logs are really useful for a lot of things, from investigating issues to monitoring stuff that can’t be watched efficiently by other monitoring tools (such as detailed traffic stats), and some of us even live in a country where it’s illegal to trash logs until a given amount of time has passed.

When it comes to storing them, a lot of solutions are available, depending on what you need. At CozyCloud, our main need was to be able to store them somewhere safe, preferably outside of our infrastructure.

Earth, lend me your logs! says syslog-dev

We started by centralising logs using rsyslog, an open log management system described by its creators as a “swiss army knife of logging”. The feature I’ll be writing about the most in this post is UDP and TCP forwarding. Using it, we (well, my colleagues, since I wasn’t there at that time) created a host for each of our environments whose task is to keep a copy of every log emitted from every host and by every application in the given environment.

I’ll take a quick break here to explain what I mean by “environment” in case it’s not clear: our infrastructure’s architecture is replicated 4 times in 4 different environments, each with a different purpose: dev (dedicated to experimentation and prototyping, aka our playground), int (dedicated to running the developers’ integration tests, aka their playground), stg (dedicated to battle-testing features before we push them to production) and prod (I’ll let you guess what its purpose is). End of the break.

On each host of the whole infrastructure, we added this line to rsyslog’s configuration:

*.* @CENTRAL_LOG_HOST:514

Here, CENTRAL_LOG_HOST is the IP of the host centralising the logs for the given environment, on the infrastructure’s private local network. This line tells rsyslog to forward every log it gets to that host over UDP on port 514, which is rsyslog’s default port for UDP forwarding.

Then a colleague set up a Graylog instance to try and work out the processing part. He did all the setup and plugged in the dev environment’s logs output before getting drowned under a lot of higher-priority tasks, and since I was just finishing setting up a whole monitoring solution, we figured I’d take over from there.

Let’s plug things

Of course, the first thing to do on your own setup is to install and configure Graylog, along with its main dependencies (which are MongoDB and Elasticsearch). The Graylog documentation covers this quite nicely with a general documentation and a few step-by-step guides offering some useful details on installation and configuration. Once your Graylog instance is set up, open your browser on whatever you set as the Web UI’s URI. In most cases, it will look like http://YOUR_SERVER:9000.

Once you’re authenticated, you’ll need to add an input source. Click on “Systems” in the navigation bar, then “Inputs” in the dropdown menu that just appeared. You’ll then be taken to a page from which you’ll be able to configure Graylog’s inputs.

Click on the “Select Input” dropdown, look for “Syslog TCP” and click “Launch new input”. Fill the form that appears according to your needs; however, you might want to check “Store full message” at the very bottom. Graylog understands the Syslog protocol’s syntax, and the message it stores is a stripped version of what rsyslog actually sent. Because you might want to use some of the stripped-out parts, it can be wise to tell Graylog to store the full message somewhere before processing it.

You’ll then have to configure rsyslog to send the logs it gets to Graylog. Because we centralise all of our logs, we only need to configure one rsyslog daemon, by adding this line to its configuration:

*.* @@GRAYLOG_HOST:PORT;RSYSLOG_SyslogProtocol23Format

Here, the host is your Graylog server’s address and the port is the one you previously configured while setting up your Syslog TCP input.

There are two things to notice here. First, there are two @ symbols before Graylog’s host name, which means the logs are going to be forwarded to Graylog using TCP. We previously saw a forwarding configuration line with a single @ sign, which means rsyslog will use UDP. The second thing to notice is the ;RSYSLOG_SyslogProtocol23Format part. The semicolon (;) tells rsyslog that what follows defines how to send logs, and RSYSLOG_SyslogProtocol23Format is a built-in template telling rsyslog to send logs using the Syslog protocol as defined in RFC 5424.

Restart rsyslog to apply the new configuration, and check it works by generating some logs while running

tcpdump -Xn host GRAYLOG_HOST and port PORT

with the same values for GRAYLOG_HOST and PORT as in the bit of configuration above. This tcpdump command can be run from either the Graylog host or the rsyslog host. If those are the same, remember to add -i lo between tcpdump and -Xn to watch the loopback interface (in this case you can also remove the “host GRAYLOG_HOST and” part of the command).

Once you’ve created your input, you might want to add streams. I’m not covering this part in this post as I didn’t get to play with these, and there’s a default stream where all messages go anyway.

Now that logs are coming in, let’s process them!

Stranger in a Strange Land

There are several ways to configure logs processing in Graylog. One of them is pipelines, which are, as you can guess by the name, processing pipelines you can plug to a stream. I played around with them a bit, but gave them up quite quickly because I couldn’t figure out how to make them work properly, and I was getting some weird behaviour with their rules editor.

Another way to process logs is to set up extractors. A Graylog extractor is a set of rules defining how logs coming from a given input will be processed, using one of many possible processing mechanisms, from JSON parsing to plain copy, including splitting, substrings, regular expressions and Grok patterns.

Now let’s talk about the latter in case it doesn’t ring a bell, because I’ll be talking a lot about this type of pattern in the rest of the post. Grok patterns are a kind of overlay over regular expressions, addressing the issue of their complexity. I’m sure that, just like me, you don’t find the thought of parsing 300-character-long log entries in a custom format using standard regular expressions very exciting.

Grok patterns take the form of a string (looking like %{PATTERN}) you include in your parsing instruction that will correspond to either a plain regular expression, or a concatenation between other Grok patterns. For example, %{INT}, a common pattern matching any positive or negative integer, corresponds to the regular expression (?:[+-]?(?:[0-9]+)). Another pattern, included in Graylog’s base patterns, is %{DATESTAMP} which is defined as %{DATE}[- ]%{TIME}, which is a concatenation between a regular expression and two Grok patterns. These patterns are very useful as they make your parsing instructions way easier to read than if they were only made of common regular expressions.

Graylog, like other pieces of software, allows you to describe a log entry as a concatenation of patterns and regular expressions. For example, here’s the kind of line we use to parse Apache CouchDB’s logs (a reconstruction for illustration; the field names and the exact CouchDB log layout are assumptions):
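\[%{DATA:timestamp}\] \[%{WORD:level}\] \[%{DATA}\] %{IPORHOST:client} - - %{WORD:method} %{URIPATHPARAM:path} %{INT:response_code}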

Note the colons inside the patterns’ brackets followed by lower-case text: these are named captures, which means that what’s captured by the pattern will be labelled with this text. In this case, it will create a new field in the log entry’s Elasticsearch document (since Graylog uses Elasticsearch as its storage backend) with this label as the field’s name. We can even tell Graylog to ignore all un-named captures when creating an extractor.

Dissecting logs

The easiest way to create a new extractor is to browse to Graylog’s search, which can be done by clicking on the related button in the navigation bar. There you’ll see a list of all messages sent from your input.

Find a log entry you want processed, and click on it. If you have more than one input set up, you might want to double-check that the entry comes from the input you want to plug the extractor into, in order to avoid plugging it into the wrong input. Now locate the field you want to process (here we’ll use the full_message field, which is only available if “Store full message” is checked in the input’s configuration). Click on the down-arrow icon on its right.

A dropdown menu appears; move your cursor over “Create extractor for field…”. Because that’s close to being the only extractor type I got to use while working with Graylog, I’ll only cover extractors using Grok patterns here, so select “Grok pattern”.

Clicking on it will take you to the extractor creation page, using the entry you previously selected as an example to test the extractor against.

You can then enter your Grok pattern in the “Grok pattern” field. You can even ask Graylog to extract named captures only, by checking the related checkbox.

Now you might think of an issue with this setup: your extractor will be applied to all incoming messages from this input. To address that, let’s look at two points. First, extractors fail silently, meaning that if a log entry doesn’t match an extractor’s pattern, Graylog will just stop trying that extractor against this specific entry.

Making sure only entries from a specific program and/or host match is the reason we’re creating the extractor for the full_message field, since it contains the original host and the program which emitted the entry at the beginning of the message. These pieces of info are, of course, parsed as soon as the log reaches Graylog and saved in the appropriate fields, but Graylog doesn’t allow an extractor to define execution conditions based on other fields’ values.

Using values contained in the full_message field, the Grok pattern parsing CouchDB log entries I used as an example above now looks something like this (again a reconstruction; the prefix follows the usual “timestamp host program:” syslog layout):
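%{SYSLOGTIMESTAMP} %{HOSTNAME:source} couchdb: \[%{DATA:timestamp}\] \[%{WORD:level}\] \[%{DATA}\] %{IPORHOST:client} - - %{WORD:method} %{URIPATHPARAM:path} %{INT:response_code}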

Now that’s a first step, but it still means every log entry will be tested against the pattern, which is a waste of CPU resources. That’s where my second point comes in.

Graylog allows you to set some basic conditions defining whether a log entry must be tested against the pattern: you can check whether the field contains a given string, or matches a given regular expression, which can be very basic. I chose the string check for lack of time, but I’d recommend checking against a basic regular expression to better match the log entries you want to target.

One last thing to choose is the “Extraction strategy”, which I usually set to “Copy” to better comply with the WORM (Write Once, Read Many) philosophy. You must also give the extractor a name so you can easily identify it in the list of existing extractors.

Now your extractor should look like this:

All that’s left to do is to click “Create extractor” and that’s it! Your extractor is up and running!

You might want to check that it runs correctly by going back to the “Search” page and selecting a log entry the extractor should target. If the extractor ran correctly, you should see your new fields added to the entry. Note that an extractor only runs against entries received after its creation.

If you want to edit an extractor, click on the “System” link in the navigation bar, then select “Inputs” in the dropdown menu that appears. Locate the input your extractor is plugged into, and click on the blue “Manage extractors” button next to it. You’ll then be taken to a list of existing extractors for this input:

Click “Edit” next to the extractor you want to edit and you’ll be taken to a screen very similar to the creation screen, where you’ll be able to edit your extractor.

In the next episode

Now we have a copy of all of our logs in the same place, and we process them at a single location in our infrastructure, which is great but creates a sort-of SPOF (single point of failure). Well, only a partial one, since the logs are only copied from their original hosts, so if something happens to one of these locations, “only” the processing can be permanently impacted. Anyway, it doesn’t address one of our needs, which is to do all this outside of our infrastructure.

But this is a story for another week, since this post is already quite long. Next time I’ll tell you about logs, we’ll see how we moved our logs processing and forwarding to a remote service, without losing all the work we did with rsyslog and Graylog. This won’t be next week, though, because I already have next week’s topic, and it’s not even monitoring-related!

Anyway, thanks for bearing with me as I walked you through an interesting (I hope) journey into log processing. If you’re not aware of it, this post is part of my One post a week series, in which I challenge myself to write a whole blog post each week in order to re-evaluate the knowledge I have and get better at sharing it. If you’ve enjoyed it, or if you have any feedback about it, make sure to hit me up on Twitter, I’ll be more than happy to discuss it with you 🙂

Thanks to Thibaut and Sébastien for giving this post a read before I published it and giving me some nice feedback.

See you next week!

Grafana Dashboards Manager
Brendan Abolivier · Sat, 28 Apr 2018 · https://brendan.abolivier.bzh/grafana-dashboards-manager/

At CozyCloud, most of my work orbits around monitoring and supervision. That’s the main reason I was tasked with dealing with Zabbix supervision on a remote infrastructure we’re setting up, and it also explains why I’ll write some more on monitoring solutions in the future.

As you already know, some of it is done using Zabbix, and the rest using OVH’s Metrics Data Platform, which, once again, I’ll write about in a future post. Since OVH hosts a Grafana instance to let their customers visualise their data, we use it to do just that. We actually have one dashboard for each kind of metric we send to the platform, e.g.:

a dashboard named “Infra” to visualise system metrics from each host in our infrastructure

a dashboard named “Cozy Stack” to visualise metrics specific to Cozy, CozyCloud’s product, including the evolution of the number of created instances, resources usage from the stack, etc.

etc.

I created most of these dashboards myself as part of prototyping and deploying the solution we’re using to push metrics to OVH’s platform (which I won’t be describing here as it deserves its own post). In fact, for my first couple of months working on this task, I was the only person creating, modifying or deleting dashboards in our Grafana organisation.

Then Nicolas started working with dashboards too, and we stumbled upon one big issue: because Grafana doesn’t embed a version control system (aka VCS, i.e. what Git, SVN et al. are), it became quite difficult to work on a dashboard: if a colleague modifies a dashboard you’re currently working on, you can only either overwrite their changes or give up yours (or merge both manually, which can be really painful).

Another situation where I disliked the lack of a VCS was when editing huge and complex WarpScripts: if you save the dashboard with a faulty script by mistake, you’re going to have a very painful time finding and fixing it. Add to this that the dashboard is actively used by other teams in your company, which pressures you to patch it quickly, and compare that to the ease of reverting to an older version and investigating calmly.

Considering all the burden this lack could create, I decided to start working on a tool for my team, which I later released as free software as the Grafana Dashboards Manager.

What is it?

The Grafana Dashboards Manager is a tool written in Go that aims to help you manage your Grafana dashboards using Git. It takes advantage of the fact that Grafana describes a dashboard as JSON, making it easy to save and edit as a file.

Its goal is to let you retrieve your existing dashboards into a Git repository, and then edit them within your local Git repository, so merging two versions of the same dashboard doesn’t become a living hell. Once changes have been committed and pushed to the Git repository’s master branch, the Grafana Dashboards Manager can handle synchronising the changes with your Grafana instance. And since only the master branch is watched, you can take advantage of Git’s workflows, such as working on a separate branch and then merging it into master, with or without a Pull/Merge request; only then will its changes be synchronised with Grafana (if you want them to be, of course).

So that’s the big picture; now let’s look at how it works. It is split in two parts: a puller and a pusher. Basically, the whole thing is designed to work like this:

In this diagram, the puller, a CLI tool, will fetch changes in the current Grafana dashboards, commit them to a local Git repository, push to a Git remote, then exit.

In the meantime, the pusher will look for new commits in the repository, retrieve them, and push changed files to Grafana as new or modified dashboards. If requested, it will also delete from Grafana all dashboards that were removed from the Git repository. It will, of course, ignore all commits created by the puller.

This check for new commits can be done in two ways. The first one starts a small web server exposing a single route that can be used to receive webhooks. Because we use GitLab internally, which means our dashboards are versioned there, the dashboards manager currently only supports GitLab webhooks (and that’s also the reason the Grafana Dashboards Manager uses Git rather than another VCS). Does this mean you can only use the pusher with GitLab, you may ask? Of course not, I answer! The second available mode lets you specify any Git repository URL, which it will poll at a given frequency. In both modes, it runs as a daemon.

By the way, thanks to the refactoring work required to implement this “git pull” mode, if you really want to use a GitHub/Bitbucket/etc. webhook, it shouldn’t be too hard to add support for that in the pusher’s code. Any pull request is, of course, more than welcome!

I don’t want all dashboards to be pulled and pushed, how can I do that?

Let’s say you want to edit a complex dashboard whose JSON representation is thousands of lines long, so you’d rather edit it using Grafana’s GUI. Using this setting, you can change the dashboard’s name in the JSON file (it’s at the end of the file) so it starts with the given prefix, then import it, and you won’t be bothered by the puller committing your WIP changes or by the pusher overwriting them.

It’s worth keeping in mind that this “ignore prefix” will be replaced with a regular expression in a future release.

What if I just want a back-up tool?

The reason the Grafana Dashboards Manager is split in two parts is that each is independent from the other. If you want it to work only one way, that’s possible. If you want to use it only to upload JSON descriptions of your dashboards to Grafana, that’s possible. If you want to use it only to back up your dashboards and push them to a Git repository, that’s possible. Just run the appropriate binary with the appropriate configuration.

Wait, and if I don’t want to use Git at all?

Of course, if you don’t want to get a Git repository involved, the pusher won’t work, since its main feature is to interact with one.

But if you just want to back up your dashboards on your disk, well, that’s also possible! The puller has a second mode, called “simple sync”, that only writes files to disk, allowing you to back up your dashboards as JSON files.

I’m sold! How do I get it?

The whole thing is available on GitHub as free software (AGPLv3-licensed), with instructions on how to build it, configure it and run it. If you want to skip the “building” part, here are some pre-built linux-amd64 binaries. All that’s left for you to do is download them, create a configuration file from the provided example, and run the puller, the pusher, or both, in whatever configuration you want.

Thanks a lot to Nicolas, who gave me the idea to work on this tool, and to Gilles, who gave me a lot of amazing feedback on it 🙂 And as with the previous post, thanks also to Thibaut for his early feedback on this one.

See you next week for a new post, and in the meantime feel free to tweet me some feedback about this one!

Zabbix supervision on a remote infrastructure with proxy and PSK-based encryptionhttps://brendan.abolivier.bzh/zabbix-proxy-encryption/
Brendan AbolivierFri, 20 Apr 2018 00:00:00 +0200https://brendan.abolivier.bzh/zabbix-proxy-encryption/At CozyCloud, we recently had to set up Zabbix supervision on a new infrastructure which could only speak to our Zabbix server over the Internet. As a result, we had to install a Zabbix proxy on the new infrastructure and configure it to use PSK-based encryption when talking to the server. Bear with me as I explain to you the steps we followed.All of CozyCloud’s production and development infrastructure is hosted in OVH’s datacenters. We monitor this infrastructure in two ways: by sending data points on various metrics to OVH’s Metrics Data Platform (I’ll write about that in a future post), and also by using a self-hosted Zabbix server.

All of our OVH hosts are connected to a virtual local network (vRack) that cannot be accessed from the outside world, so on-host Zabbix agents use it to send their unencrypted data to the Zabbix server, which is also connected to this local network. It was a very simple setup, which looked like this:

A new challenger approaches

Recently, we’ve been tasked with the setup of a new production infrastructure on another hosting provider. The question of how we were going to set up Zabbix monitoring in this new environment came up quickly. We decided not to set up another Zabbix server on the new hosting provider’s infrastructure, as it would make things painful to set up and we’d have two places to watch instead of only one. So we decided that all Zabbix agents monitoring hosts on the remote infrastructure would send their data to the Zabbix server we had already set up on OVH’s infrastructure.

Now, this brought up an issue that needed solving before we could do anything: there’s no private local network linking the two hosting providers, so the traffic between the two goes through the Internet, with neither encryption nor integrity checks. Luckily, Zabbix provides an encryption feature, as well as proxy software which forwards data from agents to a server. So we decided we would set up a Zabbix proxy on the remote infrastructure and turn encryption on between the proxy and the Zabbix server. The resulting setup would look like this:

Let’s encrypt stuff

Let’s have a look at how we’ll encrypt the traffic between the proxy and the server. Zabbix actually provides three modes to describe encryption for incoming or outgoing connections:

unencrypted: the data is sent in plain text over the Internet (aka what we don’t want).

PSK (aka Pre-Shared Key): an encryption key that must be shared between the proxy and the server and is used to encrypt and decrypt the data.

Certificate-based: a PEM certificate signed by a certification authority (either public or in-house) must be generated; the CA’s certificate must be provided to the Zabbix server and is used to validate the certificates used by the proxy.

Because it was simpler to set up, we went with the PSK option. However, our Zabbix server was built and installed from the sources with the --with-openssl option, and Zabbix’s documentation on encryption states the following:

If you plan to use pre-shared keys (PSK) consider using GnuTLS or mbed TLS libraries in Zabbix components using PSKs. GnuTLS and mbed TLS libraries support PSK ciphersuites with Perfect Forward Secrecy. OpenSSL library (versions 1.0.1, 1.0.2c) does support PSKs but available PSK ciphersuites do not provide Perfect Forward Secrecy.

And since we had to update the server anyway, one of my colleagues thought he would create an unofficial package (for internal use) from the sources. Why not use the official Debian packages, you ask? Because the packages in the official Debian repos are outdated, and we couldn’t find out whether the packages from Zabbix’s official repos were built using OpenSSL or GnuTLS. This way, we were sure to use the latest Zabbix version with the best encryption settings.

I’m mentioning this because it means we’re not using the official packages, which means that, although the setup process should be roughly the same, some steps may differ from an official package-based install.

At this point, we had our internal packages of the Zabbix server, proxy and agent, and I was tasked with setting up the whole thing on the remote infrastructure.

The proxy: a walkthrough

I’ll begin with the assumption that you already have a running Zabbix server somewhere on the Internet.

First, you need to install the Zabbix proxy. This should be as simple as running

sudo apt install zabbix-proxy-BACKEND

but can be a bit more complicated if you’re installing the proxy from the sources (BACKEND being the database backend you want the proxy to use, e.g. pgsql, mysql or sqlite3). Either way, it’s all documented.

In my case, once I had created the proxy’s PostgreSQL user and database, I also had to manually load the database schema into PostgreSQL, or else the proxy wouldn’t start. If that’s your case, find the schema.sql or schema.sql.gz file installed on the proxy’s host by the sources or the package, un-compress it using gunzip if necessary, then enter the PostgreSQL shell (psql -U PROXY_USER -d PROXY_DATABASE) and run \i /path/to/schema.sql. This will do all the necessary operations to make the database usable by the proxy.
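For reference, here’s roughly what those steps look like as shell commands (the path, user and database names below are placeholders, not the ones we actually used):

# Un-compress the schema shipped with the package or the sources
gunzip /usr/share/zabbix-proxy/schema.sql.gz

# Load it into the proxy's database; running \i /usr/share/zabbix-proxy/schema.sql
# from the psql shell works just as well
psql -U PROXY_USER -d PROXY_DATABASE -f /usr/share/zabbix-proxy/schema.sql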

Now let’s configure the proxy. The configuration file we use, located at /etc/zabbix/zabbix_proxy.conf, looks like this:

# Proxy operating mode.
# 0 - proxy in the active mode
# 1 - proxy in the passive mode
ProxyMode=0
# IP address (or hostname) of Zabbix server.
Server=ZABBIX SERVER IP/HOSTNAME
# Unique, case sensitive Proxy name.
Hostname=zabbix-proxy
# Log file name
LogFile=/var/log/zabbix-proxy/zabbix_proxy.log
# Database name.
DBName=POSTGRES DB NAME
# Database user.
DBUser=POSTGRES USER
# Database password.
DBPassword=POSTGRES PASSWORD
# How often proxy retrieves configuration data from Zabbix Server in seconds.
# For a proxy in the passive mode this parameter will be ignored.
# The default is 3600, which is an hour. We don't want to wait up to an hour
# for a new host to start being supervised.
ConfigFrequency=300
# How long we wait for agent, SNMP device or external check (in seconds).
Timeout=4
# How long a database query may take before being logged (in milliseconds).
# Only works if DebugLevel set to 3 or 4.
LogSlowQueries=3000
# How the proxy should connect to Zabbix server, aka the encryption mode we want
# to use.
TLSConnect=psk
# Unique, case sensitive string used to identify the pre-shared key.
TLSPSKIdentity=psk_remote
# Full pathname of a file containing the pre-shared key.
TLSPSKFile=/etc/zabbix/zabbix_proxy.psk

Some values have been censored because they contain sensitive data (such as secrets or passwords).

The DBName, DBUser and DBPassword parameters tell the proxy how to connect to its database. In this case we’re using PostgreSQL.

# How the proxy should connect to Zabbix server, aka the encryption mode we want
# to use.
TLSConnect=psk
# Unique, case sensitive string used to identify the pre-shared key.
TLSPSKIdentity=psk_remote
# Full pathname of a file containing the pre-shared key.
TLSPSKFile=/etc/zabbix/zabbix_proxy.psk

Now here’s the interesting part: the part where we set up encryption for outgoing connections. We don’t set up any encryption for incoming connections, because we’re running our proxy in the active mode, which means connections between the server and the proxy are always initiated by the proxy.

The first parameter is TLSConnect, which tells the proxy what mode it should use to connect to the server. It can either be unencrypted, psk or cert.

Once we’ve told our proxy that we want PSK encryption when talking to the server, there are two parameters we must define:

TLSPSKIdentity: the “identity” of the pre-shared key, aka a non-secret string identifier. You can basically input whatever you want here.

TLSPSKFile: the file containing your secret pre-shared key.

Zabbix’s documentation provides two ways to generate the PSK, which is basically a random 32-byte long string, using either OpenSSL or GnuTLS. I used GnuTLS, which looked like this:
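# Generate a 32-byte PSK for the identity "psk_identity" (the command is taken
# from Zabbix's documentation; the key below is only a sample)
psktool -u psk_identity -p database.psk -s 32

cat database.psk
# psk_identity:9b8eafedfaae00cece62e85d5f4792c7d9c9bcc851b23216a1d300311cc4f7cb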

Let’s just clarify a point here: the key above isn’t the one we’re actually using. The command simply comes from Zabbix’s documentation, and the key is only a sample.

Now that we have generated our database.psk file, we’ll need to transform it a bit so Zabbix can read it, by removing the identity and the colon, leaving only the key in the file. Using the file generated in the previous example, it should now look like this:
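# One way to strip the identity and the colon, leaving only the key
sed -i 's/^psk_identity://' database.psk

cat database.psk
# 9b8eafedfaae00cece62e85d5f4792c7d9c9bcc851b23216a1d300311cc4f7cb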

You may of course rename the file and move it elsewhere on the proxy’s host. The next step is to re-open the proxy’s configuration file, set the .psk file’s absolute path as the value of the TLSPSKFile parameter, restart the proxy and voilà! The proxy should now be able to talk to the server! Or at least try to, because the server doesn’t know about our proxy yet. Let’s see how we can fix this.

Server meets proxy

Now you’ll need to log into your Zabbix server’s web interface (as an administrator), and click on the “Proxies” sub-menu from the “Administration” menu. From there, click “Create proxy”.

Fill in your proxy’s name, but don’t click “Add” yet. Also, make sure the name is exactly the same as the Hostname you specified in the proxy’s configuration (it’s case-sensitive).

Then click “Encryption” (at the top of the gray block, next to “Proxy”), uncheck “No encryption”, check “PSK”, fill in the PSK’s identity (again, this needs to be exactly the same as the value you set to TLSPSKIdentity, and is case-sensitive), and the PSK (which is the content of the .psk file we generated just before).

Now you can click “Add”, and voilà! Your server now knows your proxy and will be happy to talk to it, using the PSK to encrypt all communications.

A few words on the agents

Now, this whole setup won’t disturb on-host agents that much: they talk to a proxy the same way they talk to a server. However, you’ll need to make them talk to the proxy, which is done in two steps:

In the agent’s configuration file, set the Server parameter to the proxy’s address instead of the Zabbix server’s (see the snippet after this list).

In the server’s web interface, when creating the host, make sure to select the proxy in the “Monitored by proxy” dropdown at the bottom of the main view.
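Here’s what the first step looks like in the agent’s configuration file, usually /etc/zabbix/zabbix_agentd.conf (ServerActive is only needed if your agents run active checks):

# Send monitoring data to the proxy instead of the Zabbix server
Server=PROXY IP/HOSTNAME
ServerActive=PROXY IP/HOSTNAME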

There’s one special case, though: the agent on the proxy’s host. If you use it with the same configuration as the other agents in your remote infrastructure, the proxy will end up forwarding its own monitoring data, which is not good if you want to be able to investigate incidents efficiently (and can lead to countless issues). So I’d advise making it talk (in an encrypted fashion) directly to the Zabbix server. The agent’s encryption configuration is almost exactly the same as the proxy’s; in fact, we can even use the same encryption key. At CozyCloud, we only append these lines to the proxy’s agent configuration:
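# Encrypt outgoing connections to the Zabbix server, reusing the proxy's PSK
# (a sketch: same TLS parameters as in zabbix_proxy.conf above)
TLSConnect=psk
TLSPSKIdentity=psk_remote
TLSPSKFile=/etc/zabbix/zabbix_proxy.psk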

And voilà!
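To make sure the traffic between the proxy and the server is actually encrypted, you can eavesdrop on it with tcpdump, with something like this (a sketch, to be adjusted to your setup):

# Dump the content of the traffic leaving the proxy for the Zabbix server
sudo tcpdump -i eth0 -X dst port 10051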

Make sure this command is run from the proxy’s host. You may want to change the interface (here eth0) and the port the Zabbix server listens on (here 10051) according to your own setup.

If encryption is indeed turned on, all of the decoded content sent from the proxy to the server (the right-hand part of the output) should be unintelligible gibberish.

If no traffic goes between your proxy and your server (i.e. if tcpdump shows nothing), you might want to update the firewall rules on your Zabbix server’s host to allow incoming connections on port 10051 (or any other port you might have configured the server to listen on).
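With iptables, for example, opening that port on the server’s host could look like this (a minimal sketch; adapt it to your own firewall setup):

# On the Zabbix server's host: accept incoming connections on the trapper port
sudo iptables -A INPUT -p tcp --dport 10051 -j ACCEPT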

If you were not aware of it, this blog post was the first episode of my One post a week series, in which I try to write a blog post each week to help me get better at sharing my knowledge. If you have any feedback on this post, make sure to hit me up on Twitter, I’ll be more than happy to discuss it with you 🙂

I’d also like to thank Nicolas, who spent so much time helping me with this setup and explaining so many things about Zabbix to me, along with Thibaut and Sébastien for their early feedback on this post, which helped me make it even better.

See you next week for a new post!

One post a weekhttps://brendan.abolivier.bzh/one-post-a-week/
Brendan AbolivierSat, 14 Apr 2018 00:00:00 +0000https://brendan.abolivier.bzh/one-post-a-week/When I was at BreizhCamp, a 3-day long tech conference in the West of France that happened a couple of weeks ago, I attended a talk that gave me the idea to share on this blog, each week, some new stuff I learned along the way.My name is Brendan Abolivier. I’m a young guy from Brest, France, working as a junior system administrator at CozyCloud, a small French company building an open personal cloud platform that aims to give people back ownership of their personal data.

When I was at BreizhCamp, a 3-day long tech conference in the West of France that happened a couple of weeks ago, I attended a talk called “Teaching is learning: become a better dev by sharing your knowledge”. During this talk, the speaker, Céline Martinet Sanchez, spoke about her journey in software development and how she used knowledge that was shared by others and slowly became the one to share her own knowledge with random people on Internet forums. The full 28-min long talk is available right here.

In the “sharing” part of the talk, she described the different ways in which you can share knowledge with other people (forum posts, blog posts, talks, etc.), and remarked that we usually refrain from sharing such knowledge. We sometimes use excuses such as “I’m not good at explaining” or “I don’t have anything interesting to share with people”. She actually listed most of the excuses she used to either hear or say herself, and explained how most of them were just that: excuses with no real basis. She explained that you won’t get better at explaining stuff by not doing anything about it, and that most of the time you actually have something interesting to share (you must have learned something at work this week, or while talking with friends or colleagues, that helped you in your projects), but you usually consider it not interesting enough to share with the rest of the world.

While listening to her speak, I noticed that, most of the time, when I was considering going to a conference, I always had a small moment of indecision about how to attend (speaker? attendee? volunteer?), and always quickly rejected the speaker option because I thought I had nothing worth sharing. The same goes for writing blog posts. Most of the excuses she listed during her talk were excuses I had heard coming from myself, and it made me think that maybe I devalue what I know too much, and that maybe what I learn each day/week/month is worth sharing with the rest of the world. This thought became even more real as I got to speak with Céline Martinet Sanchez later that day, when she told me she had actually been pushed by her colleagues towards doing a talk, went through this whole thought process herself, and came up with an amazing talk that really stood out to me.

Realising all of this, I thought it would be a great exercise to finally make use of this blog, which I set up without a real goal a few months ago, and, each week, share something I learned at work or while working on personal projects, or just something I have in mind and want to share in this space. The posts can be tutorials, feedback, or even reflections on non-technical aspects of stuff I work on. Some weeks there might even be nothing, because I won’t write random stuff if I have nothing to talk about (even though that’s very unlikely).

I hope you’ll hang here with me, and I’ll see you next week for the first post from this series!