Introduction

With the advent of distributed applications, we see new storage solutions emerging constantly.
They include, but are not limited to, Cassandra, Redis, CockroachDB, Consul or RethinkDB.
Most of you probably use one, or more, of them.

They seem to be really complex systems, because they actually are; this can't be denied.
But it's pretty easy to write a simple, one-value database featuring high availability.
You probably wouldn't use anything like it in production, but it should be a fruitful learning experience for you nevertheless.
If you’re interested, read on!

Dependencies

Small overview

What will we build? We'll build a one-value clustered database, meaning numerous instances of our application will be able to work together.
You'll be able to set or get the value using a REST interface. The value will then shortly be spread across the cluster using the Gossip protocol.
That means every node tells a part of the cluster about the current state of the variable in set intervals, and because each of those nodes in turn tells another part of the cluster about the state, the whole cluster ends up being informed shortly.

It'll use Serf for easy cluster membership, which uses SWIM under the hood. SWIM is a more advanced Gossip-like algorithm, which you can read more about here.

Following this, it’s time to write a simple thread-safe, one-value store.
An important thing is that the database will also hold the generation of the variable. This way, when one instance gets notified about a new value, it can check whether the incoming notification actually has a higher generation count. Only then will it change the current local value.
So our database structure will hold exactly this: the number, generation and a mutex.

We’ll also need a way to set and get the value.
Setting the value will also advance the generation count, so when we notify the rest of the cluster, we will overwrite their values and generation counts.

Finally, we will need a way to notify the database of changes that happened elsewhere, if they have a higher generation count.
For that we’ll have a small notify method, which will return true, if anything has been changed:
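A minimal sketch of such a store, with the set/get functions and the notify method, might look like this (the type and method names are my own, not necessarily the article's):

```go
package main

import (
	"fmt"
	"sync"
)

// Database holds a single value together with its generation count.
// The generation lets a node decide whether an incoming value is
// newer than the one it already has.
type Database struct {
	mutex      sync.Mutex
	value      int
	generation int
}

// Set stores a new value and advances the generation, so gossiping it
// will overwrite older values and generations on other nodes.
func (db *Database) Set(v int) {
	db.mutex.Lock()
	defer db.mutex.Unlock()
	db.value = v
	db.generation++
}

// Get returns the current value and its generation.
func (db *Database) Get() (int, int) {
	db.mutex.Lock()
	defer db.mutex.Unlock()
	return db.value, db.generation
}

// Notify accepts a value seen elsewhere in the cluster. It only
// applies it if the incoming generation is higher, and reports
// whether anything was changed.
func (db *Database) Notify(v, gen int) bool {
	db.mutex.Lock()
	defer db.mutex.Unlock()
	if gen > db.generation {
		db.value = v
		db.generation = gen
		return true
	}
	return false
}

func main() {
	db := &Database{}
	db.Set(42)
	fmt.Println(db.Notify(7, 0)) // stale generation, ignored
	fmt.Println(db.Get())
}
```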

We’ll also create a const describing how many nodes we will notify about the new value every time.

const MembersToNotify = 2

Now let's get to the actual functioning of the application. First we'll have to start an instance of Serf, using two variables: the address of our instance in the network and the (optional) address of the cluster to join.

As we can see, we are creating the cluster, only changing the advertise address.

If the creation fails, we of course return the error.
If the joining fails though, it means that we either didn’t get a cluster address,
or the cluster doesn’t exist (omitting network failures), which means we can safely ignore that and just log it.

Moving on, we initialize the database and the REST API:
(I’ve really chosen the number at random… really!)

It’s also here where we start our server and print some debug info when getting notified of new values by other members of our cluster.

Great, we’ve got a way to talk to our service now. Time to make it actually spread all the information.
We’ll also be printing debug info regularly.

To begin with, let's initialize our context (that's always a good idea in the main function).
We'll also put a value into it: the name of our host, just for the debug logs.
It's a good fit for the context, as it's not crucial for the functioning of our program,
and the context will get passed along anyway.

If there are only two members, we send the notifications to both of them; otherwise we choose a random index in the members array and pick subsequent members from there on.
How does the errgroup work? It’s a nifty library Brian Ketelsen wrote a great article about. It’s basically a wait group which also gathers errors and aborts when one happens.
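The member-selection logic on its own can be sketched like this (the function name and the split between random index and selection are my own; the real code passes actual Serf members rather than indices):

```go
package main

import (
	"fmt"
	"math/rand"
)

// How many cluster members we notify about the new value each time.
const MembersToNotify = 2

// chooseMembers picks which member indices to gossip to. If there are
// at most MembersToNotify members we notify all of them; otherwise we
// start at a random index and take subsequent members, wrapping around
// the end of the slice.
func chooseMembers(numMembers, randomIndex int) []int {
	if numMembers <= MembersToNotify {
		indices := make([]int, numMembers)
		for i := range indices {
			indices[i] = i
		}
		return indices
	}
	indices := make([]int, MembersToNotify)
	for i := range indices {
		indices[i] = (randomIndex + i) % numMembers
	}
	return indices
}

func main() {
	fmt.Println(chooseMembers(5, rand.Intn(5)))
}
```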

We craft a path with the formula {nodeAddress}:8080/notify/{curVal}/{curGen}?notifier={selfHostName}
We add the context to the request, so we get the timeout functionality, and finally make the request.
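Building that URL and attaching the context might look like the following sketch (the helper name is mine; the article formats the path inline):

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// notifyURL builds the notification path described above:
// {nodeAddress}:8080/notify/{curVal}/{curGen}?notifier={selfHostName}
func notifyURL(nodeAddr string, curVal, curGen int, selfHostName string) string {
	return fmt.Sprintf("http://%v:8080/notify/%v/%v?notifier=%v",
		nodeAddr, curVal, curGen, selfHostName)
}

func main() {
	// The context gives us the timeout functionality when we make
	// the request with an http.Client.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	url := notifyURL("10.0.0.2", 5, 3, "node-1")
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String())
}
```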

Next, you can test your deployment by stopping and starting containers, and setting/getting the value at:

localhost:8080/set/5
localhost:8082/get
etc...

Conclusion

What's important: this is a really basic distributed system, and it may become inconsistent (if you update the value on two different machines simultaneously, the cluster will end up with two values, depending on the machine).
If you want to learn more, read about CAP, consensus, Paxos, Raft, gossip, and data replication; they are all very interesting topics (at least in my opinion).

Anyways, I hope you had fun creating a small distributed system and encourage you to build your own, more advanced one, it’ll be a great learning experience for sure!

Practical Golang: Getting started with NATS and related patterns

Introduction

Microservices… the never disappearing buzzword of our times. They promise a lot, but can be slow or complicated if not implemented correctly. One of the main challenges when developing and using a microservice-based architecture is getting the communication right. Many will ask, why not REST? As I did at some point. Many will actually use it. But the truth is that it leads to tighter coupling and is synchronous, whereas microservice architectures are meant to be asynchronous. REST is also blocking, which isn't good on many occasions.

What are we meant to use for communication? Usually we use:
– RPC – Remote Procedure Call
– Message BUS/Broker

In this article I’ll write about one specific Message BUS called NATS and using it in Go.

There are also other message BUSes/brokers. Some popular ones are Kafka and RabbitMQ.

Why NATS? It’s simple, and astonishingly fast.

Setting up NATS

To use NATS you can do one of the following things:
1. Use the NATS Docker image
2. Get the binaries
3. Use the public NATS server nats://demo.nats.io:4222
4. Build from source

Also, remember to

go get github.com/nats-io/nats

the official Go library.

Getting started

First, let's write one of the key usages of microservices: a frontend that lists information from other microservices, but doesn't care if one of them is down. It will respond to the user anyway. This makes microservices swappable live, one at a time.

Now, let's write the first provider service. It will receive a user Id and answer with a user name. For that we'll need a transport structure to send its data over NATS. I wrote this short proto file for that:

Notice that it's a QueueSubscribe. This means that if we start 10 instances of this service in the userNameByIdProviders group, only one will get each message sent over UserNameById. Another thing to note is that this function call is asynchronous, so we need to block somehow. This select {} will provide an endless block:

Now if you actually test it, you'll notice that if one of the provider services isn't active, the frontend will respond anyway, putting a zeroed value in place of the unavailable resource. You could also make a template that shows an error in that place.

Ok, that was already an interesting architecture. Now we can implement…

The Master-Slave pattern

This is such a popular pattern, especially in Go, that we really should know how to implement it. The workers will do simple operations on a text file (count the usage amounts of each word in a comma-separated list).

Now you could think that the Master, should send the files to the Workers over NATS. Wrong. This would lead to a huge slowdown of NATS (at least for bigger files). That’s why the Master will send the files to a file server over a REST API, and the Workers will get it from there. We’ll also learn how to do service discovery over NATS.

First, the File Server. I won't really go through the file handling part, as it's a simple get/post API. I will, however, go over the service discovery part.

Our Master will hold a list of tasks, each with its UUID (at the same time the name of the file) and an id (the position in the master's Tasks slice), plus a pointer which holds the position of the last not-finished Task and gets updated on new Task retrieval. It's pretty similar to the Task storage in my Microservice Architecture series.

How do we get the next Task? We just loop over the Tasks to find one that is not started. If the tasks at our pointer are all finished, then we also move the pointer up. Remember the mutex, as this function may be run in parallel:
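A sketch of that lookup, with the state constants and field names being my own assumptions:

```go
package main

import (
	"fmt"
	"sync"
)

// Task states, mirroring the conventions used elsewhere in the series.
const (
	NotStarted = 0
	InProgress = 1
	Finished   = 2
)

type Task struct {
	UUID  string
	State int
}

type Master struct {
	mutex             sync.Mutex
	tasks             []Task
	oldestNotFinished int // position of the last not-finished Task
}

// getNextTask finds the first task that hasn't been started, marks it
// in progress and returns its index. While scanning, it advances the
// pointer past tasks that are already finished. It returns -1 when
// there is no work to do.
func (m *Master) getNextTask() int {
	m.mutex.Lock()
	defer m.mutex.Unlock()
	for i := m.oldestNotFinished; i < len(m.tasks); i++ {
		// Everything up to the pointer is finished, so move it up.
		if i == m.oldestNotFinished && m.tasks[i].State == Finished {
			m.oldestNotFinished++
			continue
		}
		if m.tasks[i].State == NotStarted {
			m.tasks[i].State = InProgress
			return i
		}
	}
	return -1
}

func main() {
	m := &Master{tasks: []Task{
		{State: Finished}, {State: InProgress}, {State: NotStarted},
	}}
	fmt.Println(m.getNextTask(), m.oldestNotFinished)
}
```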

Awesome, our Master-Slave setup is ready, you can test it if you’d like. After you do, we can now check out the last architecture.

The Events pattern

Imagine you have servers which keep connections to clients over websockets. You want these clients to get live news updates. With this pattern you can. We'll also learn about a few convenient NATS client abstractions, like using an encoded connection, or using channels for sending/receiving.

Wait… what's that at the end!? It's an encoded connection! It will automatically encode our structs into raw data. We'll use the protobuf one, but there are a default one, a gob one, and a JSON one too.

Introduction

In our microservice architectures we always need a method for communicating between services. There are various ways to achieve this. They include, but are not limited to: Remote Procedure Calls, REST APIs, and message BUSses. In this comprehensive tutorial we'll write a service which you can use to distribute messages/events across your system.

Design

How will it work? It will accept registering subscribers (other microservices). Whenever it gets a message from a microservice, it will send it further to all subscribers, using a REST call to the other microservices /event URL.

Subscribers will need to call a keep-alive URL regularly, otherwise they will get removed from the subscriber list. This protects us from sending messages to too many ghost subscribers.

Implementation

Let’s start with a basic structure. We’ll define the API and set up our two main data structures:
1. The subscriber list with their register/lastKeepAlive dates.
2. The mutex controlling access to our subscriber list.

We initialize our subscriber list and mutex, and also launch, on another thread, a function that will regularly delete ghost subscribers.

So far so good!
We can now start getting into each function's implementation.

We can begin with registerAndKeepAlive, which does both things: registering a new subscriber and updating an existing one. This works because in both cases we just update the map entry with the subscriber address to contain the current time.

The register function should be called with a POST request. That's why the first thing we do is check if the method is right; otherwise we answer with an error. If it's ok, then we register the client:

We then lock the mutex for read. That's important so that we can handle huge amounts of messages efficiently. Basically, it means that we allow others to read while we are reading, because concurrent reading is supported by maps. We can do this as long as no one is modifying the map.

While we lock the map for read, we check the list of subscribers we have to send the message to, and start concurrent functions that will do the sending. As we don’t want to lock the map for the entire sending time, we only need the addresses.

It's important to notice that we have to create a buffer from the data, as the http.Post(…) function needs an io.Reader.

We'll also implement the function which makes it possible to list all the subscribers, mainly for debugging purposes. There's nothing new in it. We check if the method is alright, lock the mutex for read, and finally print the map with the register times correctly formatted.

We just range over the subscribers and delete those that haven’t kept their subscription alive.

To add to that, if you wanted you could first make a read-only pass over the subscribers, and immediately after that, make a write-locked deletion of the ones you found. This would allow others to read the map while you’re finding subscribers to delete.

Conclusion

That’s all! Have fun with creating an infrastructure based on such a service!

Introduction

In this part we will finally finish writing our application. We will implement the last two services:
1. The Worker
2. The Frontend

The Worker

The worker will communicate with the Master to get new Tasks. When it gets a Task it will get the corresponding data from the storage and will start working on the task. When it finishes it will send the finished data to the storage service, and if that succeeds it will register the Task as finished to the Master.

That means that you can easily scale the workforce, by turning on additional machines, with the worker service on them. Easy scaling is good!

Implementation

As usual we will start with a basic structure which is similar to the structure of our previous services. Although there is one big difference. There won’t be any API here as the worker will be a client. You could, if you wanted, add an API for debugging purposes. Things like getting the processor usage. But this can also be implemented using 3rd party health checking services.

Ok, what are we doing there? We create a waitGroup. The main function has to wait for the goroutines and not just finish execution; that's why we create a waitGroup and add the thread count. You could add functionality to break the endless for loop, and after that use the Done() function on the waitGroup. We won't be adding this, as we just want endless work loops.

Now we will write down the execution process for each Task.
First we get a new Task:

The work is largely irrelevant, but I'll explain it anyway. First we create an RGBA. That's something like a canvas for drawing, and we create it with the size of our image. Then we draw on the canvas, swapping the red with the green channel. Finally we use the RGBA to return a new modified image, created from our canvas with the size of our original image.

After working on the image we have to send it back to the storage system. So let’s implement the sendImageToStorage function:

We create a data byte slice, and from that a data buffer, which lets us use it as a ReadWriter. We then encode our image as PNG into this interface, and finally send it using a POST to the server. If everything works out, we just return.

When we successfully saved the image, we can register to the Master that we finished the Task.

We’ve got the code of our index web page here, and we also have the API declared. After starting the program we check if we have the k/v store address. If we do (we hope so), then we can get the master address from it.

Now we can go on to implementing the functions. We'll start with the simplest, the index handler:

After writing this we can go on to writing the more complicated functions. We will start with the handleTask function. This function is responsible for parsing the user form and sending the raw image data to the master.

Ok, what do we have here? We check the method as we always do, and later parse the multipart form. We've got a nice lovely magic number there. The number is responsible for setting the max size of the form held in RAM; the rest will be stored in temporary files. We later make the request using the file we got:

It just checks the id, sends the request, and copies the response to answer the user.

Conclusion

So now it’s all finished. You can start all applications and they will work. You can contact the frontend and make requests, they will all start working and you will get your modified images. Six microservices working beautifully together.

This may be the end of this series, so have fun extending this system alone.

I may add a finishing part about deployment to container infrastructures, or another extension series about refactoring the system with 3rd party libraries in the future, but I'm not sure.

All in all, good luck!

UPDATE: Also remember that when running the worker, it's good to launch it with 2-4x as many goroutines as your system threads. They have near-zero switching overhead, and this way you're not unnecessarily blocking while waiting for HTTP responses.

Introduction

In this part we will implement the next part of the microservices needed for our web app. We will implement the:
* Storage system
* Master

This way we will have the Master API ready when we write the slaves/workers and the frontend, and we'll already have the database, k/v store, and storage when writing the master. So every time we write something, we'll already have all its dependencies.

The storage system

Ok, this one will be pretty easy to write. It's just handling files. Let's build the basic structure, which will include a function to register in our k/v store. For a reference on how that works, check out the previous part. So here's the basic structure:

We create a file in the tmp/state directory with the right id. Another thing we do is check if the id really is a valid int: we parse it to an int to see if that succeeds, and if it does, we use it, as a string.

We use the io.Copy function to put all the data from the request into the file. That means that the body of our request should be a raw image.

Next we can write the function to serve images which is pretty similar:

That’s it. The new task will be created, the storage will get a file into the working directory with the name of the file being the id, and the client gets back the id. The important thing here is that we need the raw image in the request. The user form has to be parsed in the frontend service.

There’s not much to explain. They are both just passing further the request and responding with what they get.

You could think the workers should communicate directly with the database to get new Tasks. And with the current implementation it would work perfectly. However, if we wanted to add some functionality the master wanted to do for each of those requests it would be hard to implement. So this way is very extensible, and that’s nearly always what we want.

Conclusion

Now we have finished the Master and the Storage system. We now have the dependencies to create the workers and frontend which we will implement in the next part. As always I encourage you to comment about your opinion. Have fun extending the system to do what you want to achieve!

Introduction

In this part we will implement part of the microservices needed for our web app. We will implement the:
* key-value store
* Database

This will be a pretty code heavy tutorial so concentrate and have fun!

The key-value store

Design

The design hasn’t changed much. We will save the key-value pairs as a global map, and create a global mutex for concurrent access. We’ll also add the ability to list all key-value pairs for debugging/analytical purposes. We will also add the ability to delete existing entries.

The key shouldn't have a length of 0, hence the length check. We also check if the method is GET; if it isn't, we print an error and set the status code to bad request.
We answer with an explicit Error: before each error message so it doesn’t get misinterpreted by the client as a value.

It’s the same as setting a value, but instead of setting it we delete it.

The database

Design

After thinking through the design, I decided that it would be better if the database generated the task Id's. This will also make it easier to get the last non-finished task and generate consecutive Id's.

How it will work:
* It will save new tasks, assigning consecutive Id's.
* It will allow getting a new task to do.
* It will allow getting a task by Id.
* It will allow setting a task by Id.
* The state will be represented by an int:
* 0 – not started
* 1 – in progress
* 2 – finished
* It will change the state of a task to not started if it's been in progress for too long (maybe someone started working on it but crashed).
* It will allow listing all tasks for debugging/analytical purposes.

Implementation

First, we should create the API and later we will add the implementations of the functionality as before with the key-value store. We will also need a global map being our data store, a variable pointing to the oldest not started task, and mutexes for accessing the datastore and pointer.

We check if the GET method has been used. Later we parse the id argument and check if it’s proper. We then get the id as an int using the strconv.Atoi function. Next we make sure it is not out of bounds for our datastore, which we have to do using mutexes because we’re accessing a map which could be accessed from another thread. If everything is ok, then, again using mutexes, we get the task using the id.

After that we use the JSON library to marshal our struct into a JSON object and if that finishes without problems we send the JSON object to the client.

It’s also time to implement our Task struct:

type Task struct {
	Id    int `json:"id"`
	State int `json:"state"`
}

It’s all that’s needed. We also added the information the JSON marshaller needs.

Nothing new. We get the request and try to unmarshal it. If it succeeds we put it into the map, checking if it isn’t out of bounds or if the state is invalid. If it is then we print an error, otherwise we print success.

If we already have this we can now implement the finish task function, because it’s very simple:

It's pretty similar to the getById function. The difference is that here we update the state, and only if the task is currently in progress.

And now to one of the most interesting functions. The getNewTask function. It has to handle updating the oldest known finished task, and it also needs to handle the situation when someone takes a task but crashes during work. This would lead to a ghost task forever being in progress. That’s why we’ll add functionality which after 120 seconds from starting a task will set it back to not started:

First we try to find the oldest task that hasn't started yet. Along the way, we update the oldestNotFinishedTask variable: if a task is finished and is pointed at by the variable, the variable gets incremented. If we find something that's not started, we break out of the loop and send it back to the user, setting it to in progress. On the way, though, we start a function on another thread that will change the state of the task back to not started if it's still in progress after 120 seconds.

Now the last thing. A database is useless… when you don’t know where it is! That’s why we’ll now implement the mechanism that the database will use to register itself in the key-value store:

We check if there are at least 3 arguments (the first being the executable). We read the current database address from the second argument and the key-value store address from the third. We use them to make a POST request which adds a databaseAddress key to the k/v store, setting its value to the current database address. If the status code of the response isn't OK, then we know we messed up, and we print the error we got. After that we quit the program.

Conclusion

We now have finished our k/v store and our database. You can even test them now using a REST client. (I used this one.) Remember that the code is subject to change if necessary, though I don't expect it will be. I hope you enjoyed the tutorial! I encourage you to comment, and if you have an opposing view to mine, please make sure to express it in a comment too!

UPDATE: I changed the sync.Mutex to sync.RWMutex, and in the places where we only read data I changed mutex.Lock/Unlock to mutex.RLock/RUnlock.

UPDATE2: For some reason I used a slice for the database code although I tested with a map. Sorry for that, corrected it already.

Introduction

Recently it’s a constantly repeated buzzword – Microservices. You can love ’em or hate ’em, but you really shouldn’t ignore ’em. In this short series we’ll create a web app using a microservice architecture. We’ll try not to use 3rd party tools and libraries. Remember though that when creating a production web app it is highly recommended to use 3rd party libraries (even if only to save you time).

We will create the various components in a basic form. We won’t use advanced caching or use a database. We will create a basic key-value store and a simple storage service. We will use the Go language for all this.

UPDATE: as there are comments regarding overcomplication: this is meant to show a scalable and working skeleton for a microservice architecture. If you only want to add some filters to photos, don’t design it like that. It’s overkill.

On further thought and another comment, (Which you can find on the golang Reddit) do design it this way. Software usually lives much longer than we think it will, and such a design will lead to an easily extendable and scalable web app.

The functionality

First we should decide what our web app will do. The web app we'll create in this series will get an image from a user and give back a unique ID. The image will get modified using complicated and highly sophisticated algorithms, like swapping the blue and red channel, and the user will be able to use the ID to check if the work on the image has been finished already or if it's still in progress. If it's finished, the user will be able to download the altered image.

Designing the architecture

We want the architecture to be microservices, so we should design it like that. We'll for sure need a service facing the user, the one that provides the interface for communication with our app. This could also handle authentication, and should be used as the service redirecting the workload to the right sub-services. (useful if you plan to integrate more functionality into the app)

We will also want a microservice which will handle all our images. It will get the image, generate an ID, store information related to each task, and save the images. To handle high workloads it’s a good idea to use a master-slave system for our image modification service. The image handler will be the master, and we will create slave microservices which will ask the master for images to work on.

We will also need a key-value datastore for various configuration, a storage system, for saving our images, pre- and post-modification, and a database-ish service holding the information about each task.

This should suffice to begin with.

Here I’d like to also state that the architecture could change during the series if needed. And I encourage you to comment if you think that something could be done better.

Communication

We will also need to define the method the services communicate by. In this app we will use REST everywhere. You could also use a message BUS or Remote Procedure Calls – short RPC, but I won’t write about them here.

Designing the microservice API’s

Another important thing is to design the APIs of your microservices. We will now design each of them to get an understanding about what they are for.

The key-value store

This one’s mainly for configuration. It will have a simple post-get interface:

POST:

Arguments:

Key

Value

Response:

Success/Failure

GET:

Arguments:

Key

Response:

Value/Failure

The storage

Here we will store the images, again using a key-value interface and an argument stating if this one’s pre- or post-modification. For the sake of simplicity we will just save the image to a folder named, depending on the state of the image, finished/inProgress.

POST:

Arguments:

Key

State: pre-/post-modification

Data

Response:

Success/Failure

GET:

Arguments:

Key

State: pre-/post-modification

Response:

Data/Failure

Database

This one will save our tasks: their Id's and whether they are waiting to start, in progress, or finished.

POST:

Arguments:

TaskId

State: not started/ in progress/ finished

Response:

Success/Failure

GET:

Arguments:

TaskId

Response:

State/Failure

GET:

Path:

not started/ in progress/ finished

Response:

list of TaskId’s

The Frontend

The frontend is there mainly to provide a communication way between the various services and the user. It can also be used for authentication and authorization.

POST:

Path:

newImage

Arguments:

Data

Response:

Id

GET:

Path:

image/isReady

Arguments:

Id

Response:

not found/ in progress / finished

GET:

Path:

image/get

Arguments:

Id

Response:

Data

Image master microservice

This one will get new images from the frontend/user and send them to the storage service. It will also create a new task in the database, and orchestrate the workers who can ask for work and notify when it's finished.

Frontend interface:

POST:

Path:

newImage

Arguments:

Data

Response:

Id

GET:

Path:

isReady

Arguments:

Id

Response:

not found/ in progress / finished

GET:

Path:

get

Arguments:

Id

Response:

Data/Failure

Worker interface:

GET:

Path:

getWork

Response:

Id/noWorkToDo

POST:

Path:

workFinished

Arguments:

Id

Response:

Success/Failure

Image worker microservice

This one doesn't have any API. It is a client to the master image service, which it finds using the key-value store. It gets the image data to work on from the storage service.

Scheme

Conclusion

This is basically everything regarding the design. In the next part we will write part of the microservices. Again, I encourage you to comment expressing what you think about this design!