Surprisingly, I keep answering this question over and over. I think the first time I wrote a definition online, in a forum, was ten years ago. I still see the question everywhere, and it’s not unusual to answer it at work several times. So I decided to add a quick summary here, one more time.

Let’s analyze each part individually. REST is Representational State Transfer: basically, the web. As you probably know, the principles of REST are simple and can be summarized like this:

Connect to the server

Send a VERB and the path to a URI

Obtain an answer

It’s surprising how many things you can do with such a simple scheme. To consume websites you only need one or two verbs: GET to obtain data and POST to send data. The POST used in many web forms is not the one we use for APIs; it’s called an “overloaded POST”.

RESTful refers to a full implementation of the verbs, and in general it doesn’t make much sense unless you add “ROA” (Resource Oriented Architecture) to it. Without the ROA part, using the verbs is not precise enough and tends to drift toward different concepts, like SOAP, which is mainly RPC calls wrapped in an envelope. Most of the time you will find RPC or hybrid REST services that are just calls to methods doing a lot of things internally.

Even when RESTful alone is too abstract, the concept of a RESTful ROA API is more accurate, and the two work together. ROA means you represent and expose every resource and give access to the possible actions (verbs) on that resource. A resource must have at least one URI that represents it, and it could be the result of an algorithm (GET /search/?something).

Another principle of RESTful is statelessness. You don’t have to coerce the server into any particular state before requesting something. That’s a principle broken very often: if I have to go from step 1 to step 2 in that order, because something happens on the server in step 1 that lets me continue, I am probably breaking the principle.

The VERBS are PUT, POST, GET and DELETE, plus HEAD, PATCH and OPTIONS, which exist but are not used very often.
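As a sketch, a resource-oriented API might map those verbs to a hypothetical /orders resource like this (the resource name is mine, for illustration only):

```
GET     /orders        list orders
GET     /orders/42     retrieve order 42
POST    /orders        create a new order
PUT     /orders/42     replace order 42
PATCH   /orders/42     partially update order 42
DELETE  /orders/42     delete order 42
HEAD    /orders/42     headers only (does it exist?)
OPTIONS /orders        which verbs are allowed here
```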

There are enough other concepts to fill a book; unfortunately even a lot of pages are not sufficient to explain RESTful APIs, but I hope this summary is useful enough.

We have all been in that situation. Unless you started working a year ago or less, the odds of dealing with legacy code are really high in every single industry. And most of the time you will end up like this guy.

But there is one question I always love to clarify.

What is legacy code?

Legacy code is a broad expression that may make you think of old AS/400 computers (which have not died yet) or ancient languages. The reality is very different: legacy code is any piece of code that has not been modified by you, or any other member of the team, in the last six months.

I propose an exercise: open a project you worked on a while ago and tell me if the code looks familiar. You could think it was written by somebody else, the underlying logic might not come back to you immediately, and even the code style has changed (I hope so!). We get better, we find better ways to do things, our old habits don’t stick (luckily), and we tend to write better code as we learn new patterns and practices.

If the code was written by other people it can be worse. We don’t all solve software problems in the same way, so first we need to adapt to other people’s logic. It is not unusual for long-lasting projects to have been modified by several developers; now you need to swim across totally different concepts and ideas.

The biggest issue when you deal with legacy code is that nobody wants to touch it, because it is hard to guarantee you won’t break anything else.

That leads to the second definition I always like to remark on.

Legacy code is code without unit testing

I totally agree with this definition. If we don’t have unit tests, we cannot guarantee any path of execution, any logic, or any input/output validation. What we do guarantee is that we will finish our code changes and immediately throw the code over the fence to QA, who will spend two or three days working through a huge spreadsheet of test cases to prove we didn’t actually break anything.

That costs money, it is inefficient, and it impacts our delivery; the agility of software delivery is heavily compromised. Not only that, let’s imagine the scenario of any application in production. During peak season the client finds a bug; it is very critical and it’s impacting sales, or payments, or something else. Phones are ringing, managers are being called to urgent meetings, and you get the bug assigned to fix ASAP.

First, you panic: the code is huge, messy and complicated, there were 20 programmers before you, and the logic is complex. After struggling for two hours you seem to find the bug; by this moment your boss has drunk a third coffee in a row and is feeling anxious for an answer.

You modify the code, cross your fingers, it builds. Run some tests, it seems to be fixed. Now what?

QA….

QA doesn’t want to approve anything that hasn’t been properly tested. As a developer you cannot guarantee you didn’t break anything; there is nothing to prove it. So the boss goes to QA and asks how fast they can validate the new version. The answer is two days, or one day if they don’t test everything. By this time somebody has called an ambulance while your boss lies on the floor. The client will continue losing money for a couple of days, if we are lucky enough to have fixed the bug.

That is the case for unit testing.

Unit testing

Let’s quickly review some principles of unit testing. Our code has only four reasons to change:

Adding a feature

Fixing a bug (most of the time)

Improving design (refactoring)

Optimizing resource usage

The first two cases fall under the category of behavioral changes: they change how the application behaves. The other two must not change behavior; they are non-behavioral changes.

Unit testing must ensure behavior doesn’t change. Let’s look at a visual representation of how it looks to add a feature and a bug-fix to current code.

Despite my natural tendency to choose ugly colors, the largest bar represents the current codebase, and the tiny portions on the right show what it costs to change the code. Without unit tests we risk the entire project every time a change is introduced. But what if we have some unit tests, not even a lot? We create something called a “test harness”: we wrap our code with a proof that the logic is still valid, so at least we know some logic is not broken.

If we had some unit tests we could at least ensure some logic was not affected by our changes. The more logic we wrap with tests, the more secure we feel every time we make a change. But to reach that goal we will have to apply some good practices and principles.

Another great side effect of adding unit tests is that reviewing the logic builds domain knowledge and a deeper understanding of our application. The tests also reflect the intention of the logic; our test cases indicate what we are trying to prove in every single case.

Principles

To be effective unit tests must follow the principles below:

Tests MUST run fast

Tests MUST run in isolation

Tests DO NOT depend on the environment

Tests ONLY validate behavior

A practical way to determine whether this holds is the following: go to the repository where your code is located, clone it for the first time in a clean environment, build the code, and run the tests. They MUST run, even if you disconnect from the network, change computers, or don’t have access to any external resource. And they MUST run fast. That is clear proof our unit tests apply the correct principles.

The last item on the list is crucial to being totally effective. A test only validates behavior. It is very common to find unit tests that create an object and check its properties; that doesn’t prove anything (we are not validating that our language can create an object). We only validate behavior, which means the code logic.

Isolation refers to dependencies, mostly services or resources. Our tests cannot depend on a database, a folder, a specific file, a third-party service active somewhere, etc. Our code will almost always depend on something else, but tests must create fake implementations of any external dependency. Again, we test our logic, not whether a database can save a record properly, to give an example. We assume the dependency works as our logic expects: if our logic depends on a value returned from the database, we assume the database will return a valid value, using a fake object that mimics that behavior. At the same time we can execute a negative test case: what happens if the database doesn’t return a valid value, or even if an exception is thrown downstream (the logic must manage scenarios with partial operations)?
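As a minimal sketch of that idea (the interface and class names here are hypothetical, not from any real project), a fake can replace the database dependency so the logic runs in full isolation:

```csharp
using System;

// Hypothetical abstraction over the real database access.
public interface ICustomerRepository
{
    string GetCustomerName(int id);
}

// Fake used only by tests: no database, no network, runs fast.
public class FakeCustomerRepository : ICustomerRepository
{
    public string GetCustomerName(int id)
    {
        // Negative test case: simulate a downstream failure.
        if (id <= 0) throw new InvalidOperationException("Simulated downstream failure");
        return "Jane Doe"; // canned value the logic under test expects
    }
}

// The logic under test depends on the abstraction, not the database.
public class GreetingService
{
    private readonly ICustomerRepository _repository;
    public GreetingService(ICustomerRepository repository) => _repository = repository;

    public string Greet(int customerId)
    {
        try
        {
            return $"Hello, {_repository.GetCustomerName(customerId)}!";
        }
        catch (InvalidOperationException)
        {
            return "Hello, guest!"; // partial-operation scenario handled
        }
    }
}
```

A positive test passes a valid id and expects the greeting; a negative test passes an invalid id and expects the fallback, exercising the exception path without any real database.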

Now… where to start

To be honest, this is the simplest question to answer. If we never added unit tests to the project, it means we are not very familiar with it, so we have to take a very conservative approach and go for the low-hanging fruit:

Start with methods that calculate, compute or process an input and produce an output without any other dependency.

I’ve never seen a project, legacy or not, where we cannot find this kind of code. It is very simple. I will add an example, part of code I wrote some years ago, which constitutes a great example of this:
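The original listing is not reproduced here, so below is a hypothetical sketch of that kind of method; the names, ranges and defaults are mine, not Walmart’s actual API:

```csharp
using System;

public class SearchParametersFactory
{
    // Validates inputs, applies defaults, and produces an output object:
    // no database, no network, no file system — an ideal first target for tests.
    public SearchParameters Create(string query, int start = 0, int pageSize = 25)
    {
        if (string.IsNullOrWhiteSpace(query))
            throw new ArgumentException("A search phrase is required", nameof(query));
        if (start < 0)
            throw new ArgumentOutOfRangeException(nameof(start));
        if (pageSize < 1 || pageSize > 100)
            pageSize = 25; // default value required by the end system

        return new SearchParameters { Query = query.Trim(), Start = start, PageSize = pageSize };
    }
}

public class SearchParameters
{
    public string Query { get; set; }
    public int Start { get; set; }
    public int PageSize { get; set; }
}
```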

The code above represents an ideal case to start unit testing legacy code. Based on its input, it validates certain values, throws controlled exceptions if those values are not in the expected range, and returns an output. The rationale behind this code is tightly tied to the API provided by Walmart, where certain values must be within a range and there is some logic to create default values required by the end system.

In fact, this code also gives us the ability to write positive and negative tests: what happens if I pass correct values, and what happens if I pass incorrect ones. In the specific code I wrote there are controlled exceptions, so tests must validate that certain parameters will throw exceptions.

When we write unit tests there is a very good naming convention I recommend. It immediately shows any reader what is intended to be tested, and gives the test writer a good mechanism for writing good unit tests:

[System under test]_[Scenario]_[Expected output or behavior]

Using this convention we can immediately recognize what we are testing. Let’s see a simple example:
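The original test listing is not shown here; the sketch below assumes a hypothetical SearchParametersFactory with a Create method, and uses MSTest attributes since the post mentions the Visual Studio unit test framework:

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class SearchParametersFactoryTests
{
    // [System under test]_[Scenario]_[Expected output or behavior]
    [TestMethod]
    public void SearchParameters_OnlySearchPhrase_QueryPopulated()
    {
        var factory = new SearchParametersFactory();

        var result = factory.Create("bicycle");

        Assert.AreEqual("bicycle", result.Query);
    }

    // A second test validating that a default was applied internally.
    [TestMethod]
    public void SearchParameters_OnlySearchPhrase_DefaultPageSizeApplied()
    {
        var factory = new SearchParametersFactory();

        var result = factory.Create("bicycle");

        Assert.AreEqual(25, result.PageSize);
    }
}
```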

As you can see, I followed the naming convention: my system under test is the SearchParameters method inside a factory; the scenario indicates I only pass the search phrase; the expected output is an object with the Query property populated. Looking at this specific example might be confusing: I said before we don’t test properties, only behavior, so why do I test just an object creation? Because there is logic within that method; passing only one parameter is accepted, and internally all the default values are set after passing certain validations. In fact, a second test will validate a default.

We can continue validating those scenarios. A different example is the negative case: what happens if I enter an invalid input? If we look at the code, those cases will throw exceptions. Let’s see one example:
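Again assuming the hypothetical factory sketched earlier in this post (the names are mine), a negative test in MSTest style could look like:

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class SearchParametersFactoryNegativeTests
{
    [TestMethod]
    [ExpectedException(typeof(ArgumentOutOfRangeException))]
    public void SearchParameters_NegativeStart_ThrowsException()
    {
        var factory = new SearchParametersFactory();

        factory.Create("bicycle", start: -1); // invalid start must throw
    }
}
```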

Passing an invalid start (-1 in this case) throws an exception. I don’t want to dig into the syntax of unit tests here. I am using the Visual Studio unit test framework, which is very simple; other frameworks don’t use attributes like the example does, but the principles are the same: we will somehow have the ability to test a property or determine that the code throws an exception.

Summary

Legacy code is tough to modify. It might be the time nobody has spent updating it, or the many people who worked on it using different styles and practices, or simply that it’s too complicated to understand. Adding unit tests serves two purposes: first, you can ensure you don’t break anything when modifying the code; second, it helps you understand the current functionality as you add tests.

In future posts I will explain more complex techniques for unit testing legacy code. The cases mentioned here are simple; most of the time we will find dependencies that make code hard to test. Luckily, there are several known patterns to deal with those issues.

Framing using Pipelines

One of the latest additions to .NET Core 2.x was Pipelines, which I consider a “Microsoft to the rescue” concept. In a previous article I talked about manual framing, which was the most popular solution, despite its optimizations and nuances, until Microsoft released the Pipelines package.

If you haven’t read the previous article about framing, I suggest you review it; it will help you understand pipelines much better.
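The original listing is not reproduced here, so below is a sketch of the orchestration following Microsoft’s documented Pipelines pattern; the method names are the conventional ones from that pattern, with placeholder bodies to keep the sketch self-contained:

```csharp
using System.IO.Pipelines;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class Connection
{
    // Orchestrates a pipe: one task fills it from the socket,
    // another drains it and frames messages.
    public static async Task ProcessAsync(Socket socket)
    {
        var pipe = new Pipe();

        Task writing = FillPipeAsync(socket, pipe.Writer);
        Task reading = ReadPipeAsync(pipe.Reader);

        // Both tasks run until the connection is closed.
        await Task.WhenAll(reading, writing);
    }

    // The writer and reader methods discussed in the following sections;
    // placeholder bodies here so the sketch compiles on its own.
    static Task FillPipeAsync(Socket socket, PipeWriter writer) => Task.CompletedTask;
    static Task ReadPipeAsync(PipeReader reader) => Task.CompletedTask;
}
```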

This code creates a Pipe and starts a writer and a reader method. Something very interesting is the decoupling of reader and writer. In theory, we don’t need to process anything we read from the socket, allowing a potential null writer implementation.

The names can be a bit confusing in this case. The “writer” means a PipeWriter: it consumes the socket and writes to the Pipe. The Pipe is a highly optimized circular queue intended to manage back pressure and flow control efficiently. It saves us from building our own buffers and also handles very important aspects of network communication.

Both tasks execute and do not end until the connection is closed. It will be easier to see in the following code.
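The original filling code is not reproduced here; this sketch follows the standard pattern from Microsoft’s Pipelines documentation, matching the steps described below (buffer from the writer, socket read, Advance, FlushAsync, Complete):

```csharp
using System;
using System.IO.Pipelines;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class Filling
{
    public static async Task FillPipeAsync(Socket socket, PipeWriter writer)
    {
        const int minimumBufferSize = 512;

        while (true)
        {
            // Ask the pipe for buffer space instead of allocating our own.
            Memory<byte> memory = writer.GetMemory(minimumBufferSize);
            try
            {
                int bytesRead = await socket.ReceiveAsync(memory, SocketFlags.None);
                if (bytesRead == 0)
                    break; // remote side closed the connection

                // Tell the pipe how much of the buffer we actually used.
                writer.Advance(bytesRead);
            }
            catch (SocketException)
            {
                break; // real-life code would log the failure here
            }

            // Make the data available to the reader. IsCompleted means the
            // reading side will not consume any more data, so we can exit.
            FlushResult result = await writer.FlushAsync();
            if (result.IsCompleted)
                break;
        }

        // Signal the reader that no more data is coming.
        writer.Complete();
    }
}
```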

At first this code might seem intimidating, but it is actually very simple. First, we request a buffer at the application level from the PipeWriter. Socket reading is exactly the same as before; the difference is that now we write our bytes to the Pipe, which manages our buffer efficiently. Exception handling is added as a real-life example.

Once we write data to the writer, we need to inform the reader that data is ready using FlushAsync. This method has two goals: making data available, and returning a flag indicating whether the communication is finished. IsCompleted means there is no more data to read and we can exit, which is what this code does. The writer signals when it is done with the Complete method.

Part 3: Pipe reading – Framing

I have to tell you the truth: framing is not entirely eliminated. It is still necessary to determine boundaries, but pipelines result in much simpler code that lets you focus on the main part, which is trivial after offloading the heavy work to the pipes.

The code is extremely simple and much cleaner. We read from the PipeReader, which handles the buffer for us, extracting a ReadOnlySequence of bytes. The code that searches for boundaries uses slicing and markers to determine messages.

In the same way we advance the writer position in the pipe, we need to do it in the reader, indicating where we stopped reading. The same concept applies to stopping: if the completed flag is set, the socket has finished and it’s safe to leave. The pipe reader also needs to flag the pipe as complete before ending.
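The reading loop described above can be sketched as follows; the original listing is not reproduced here, so this follows the documented Pipelines reading pattern, using a single `]` end marker as an example boundary (the marker choice and ProcessMessage are mine):

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

public static class Reading
{
    public static async Task ReadPipeAsync(PipeReader reader)
    {
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Slice complete messages out of the sequence using the marker.
            SequencePosition? position;
            while ((position = buffer.PositionOf((byte)']')) != null)
            {
                ReadOnlySequence<byte> message = buffer.Slice(0, position.Value);
                ProcessMessage(message); // application-specific handling

                // Skip past the consumed message, including the marker byte.
                buffer = buffer.Slice(buffer.GetPosition(1, position.Value));
            }

            // Tell the pipe how much we consumed and how much we examined.
            reader.AdvanceTo(buffer.Start, buffer.End);

            // Writer finished and no data is left: safe to leave.
            if (result.IsCompleted)
                break;
        }

        // Flag the reading side as complete before ending.
        reader.Complete();
    }

    static void ProcessMessage(ReadOnlySequence<byte> message) { }
}
```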

Once both tasks exit, the orchestrating method will detect that the tasks have finished and it will exit as well.

Summary

If you are using .NET Core 2.2 or greater and you are still doing manual buffer handling, switch to Pipelines immediately. It will simplify the code, and performance will almost always be better, especially if you are managing multiple threads at the same time. I noticed significantly better CPU utilization after switching to Pipelines.

A sample case – TKStar TK901

I have had to deal with several GPS protocols and similar devices over the last few years, and I find more cases every day. Newer devices, clones, cheaper options, and different kinds of applications, such as pet trackers, personal trackers, trucks, vehicles and even devices used to spy (however controversial that sounds), are implementing protocols to communicate with different services.

I chose the TKStar TK901 because it represents a typical scenario: the device is new, not very well known or tested, and it doesn’t fit any specific standard protocol (even when “standard” is a strong word in a sea of non-standard solutions).

This device is intended to be used as a portable tracker, and it is also advertised as anti-theft, which is at least arguable in terms of effectiveness. I found it useful as a personal tracker, but it has to be very exposed (in a thin-layered backpack or so) to acquire a signal properly. I guess the size of the device makes its antenna not so strong, or it is the battery consumption, which is very low (most devices do not last more than a few hours and this one can work for days).


Protocol and Control

Most GPS protocols have commonalities regardless of how the messages are implemented. The biggest differences are always related to device quality: better quality almost always leads to richer protocols. Besides, richer protocols are required in better devices to represent events or extended information.

For example, truck devices come with several possible integrations such as a panic button, RFID recognition to detect tags associated with a driver, fuel alarms, door alarms and so on. Another common extension in protocols is the ability to control configuration and behavior by sending messages to the device.

The majority of the simplest devices allow configuration via SMS commands (normal text messages), giving the ability to change the server and port they report to, tune behavior such as speed alarms or sending an SMS on certain events, switch from GPRS (2G) to 3G, control which phone number is allowed to administer the equipment, and several other combinations.

At the minimum shared level, all of them will report at least the position. In general they will also report 2G/3G network information, such as base stations, area codes and signal strength, plus satellite positioning information (whose minimum level is whether or not the device has a fix).

The TK901 protocol is very simple. First, it can report a heartbeat/login message. Heartbeats are very common; they normally consist of a very small packet with minimal information, intended to keep the socket open (GPS devices almost always stay connected, but servers usually close communications after some period of inactivity). A TK901 heartbeat looks like the following example:

[3G*9010008512*000A*LK,0,0,100]

As a general rule, which is obviously not written in stone, most protocols contain a start/end marker and a command. The base format of TK901 messages is

[3G*device_id*flags*command,....]

In this case, LK is a heartbeat message and it reports only a few things. The most important property is the number 100 at the end: the battery level, very common among GPS devices with an internal battery (some of them must be connected to a car battery, for example, and don’t have an internal battery).
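Since there is no official documentation for this protocol, a parsing sketch can be built directly from the frame shape above; the field meanings here are my own inferences:

```csharp
using System;

public static class Tk901Parser
{
    // Splits a frame shaped like [3G*device_id*flags*command,arg1,arg2,...]
    // into its parts. Field meanings are inferred; there is no official doc.
    public static (string DeviceId, string Command, string[] Args) Parse(string frame)
    {
        // Strip the start/end markers.
        string body = frame.Trim('[', ']');
        string[] parts = body.Split('*');
        if (parts.Length < 4)
            throw new FormatException("Unexpected TK901 frame");

        // The last segment carries the command plus comma-separated arguments.
        string[] payload = parts[3].Split(',');
        string command = payload[0];
        string[] args = new string[payload.Length - 1];
        Array.Copy(payload, 1, args, 0, args.Length);

        return (parts[1], command, args);
    }
}
```

Running it against the heartbeat above yields device id 9010008512, command LK, and the battery level as the last argument.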

Another common message found among similar devices is cellular network information. This device follows a common pattern:

GS is the command for network information; it reports the LAC, MCC, MNC, station number and signal strength of all the stations the GPS can detect. In that sense it works exactly like a cellular phone (well, nowadays all of them are smartphones!). This information is important for analyzing communication problems, blind areas, weak signal areas and so forth.

The third message I will mention here is shared across pretty much all GPS types: position information.

Position messages

GPS devices try to report as much as possible in positioning messages. From my experience, you will very often find the following properties:

Date and time the message was sampled on the GPS. It is possible to receive messages from many days ago if the GPS was accumulating packets while it had no signal. It is also configurable, so it might be wrongly configured.

Positioned or not. This means the device has GPS signal. Some devices are able to report that they have no satellite fix but were able to approximate the position using cellular antennas.

Latitude and longitude. It is very important to note that the format can differ. A popular format is NMEA, which I strongly suggest reading about before dealing with these kinds of devices. Some devices, like the one I am describing here, use decimal coordinates.

N/S and E/W indicators for latitude and longitude. Coordinates are always positive; the indicator is fundamental to constructing a geographic point to process or store. This is part of the NMEA specification, but even when coordinates are decimal, the indicators are reported too.

Speed. From my experience, I have found knots (the NMEA convention, nautical miles per hour), ground miles and the metric system (kilometers).

Direction angle (0–360): the direction the GPS is moving in.

Event. Depending on the GPS type, it is possible to receive one or more events, such as a shock alarm, speed alarm, etc. In fact, events tend to trigger a position report as well.
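For devices that do use the NMEA coordinate style mentioned above (ddmm.mmmm for latitude, dddmm.mmmm for longitude, always positive, with a hemisphere indicator), a conversion sketch to signed decimal degrees could be (the helper name is mine):

```csharp
using System;
using System.Globalization;

public static class Nmea
{
    // NMEA coordinates encode whole degrees followed by decimal minutes,
    // always positive; the hemisphere indicator provides the sign.
    public static double ToDecimalDegrees(string nmeaValue, char hemisphere)
    {
        double raw = double.Parse(nmeaValue, CultureInfo.InvariantCulture);
        double degrees = Math.Floor(raw / 100);   // dd or ddd part
        double minutes = raw - degrees * 100;     // mm.mmmm part
        double result = degrees + minutes / 60.0;

        // South and West are negative in a signed decimal representation.
        return (hemisphere == 'S' || hemisphere == 'W') ? -result : result;
    }
}
```

For example, the NMEA latitude 4807.038 with indicator N is 48 degrees plus 7.038 minutes, about 48.1173 in decimal degrees.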

This particular device is very simple and there is no protocol documentation at the time of writing. Analyzing packets and correlating them to known protocols is a very fast way to understand and implement a protocol that lacks documentation.

Final words

Take this article as an introduction to a very extensive world of similar devices. It is very common to find similar types of communication in all sorts of IoT devices as well.

Stream and Framing

Sockets implement an abstraction over NetworkStream, which is a Stream, and a stream is always a sequence of bytes. We read and write a stream of bytes; that is very relevant to understanding what we do. Sockets know nothing about the data itself. Applications must know what to do with the byte stream and how to convert it into something meaningful.

The string “Hello my friends!” means nothing to a network communication. The string needs to be encoded and represented as a byte array. The application receiving that data needs to decode it properly from the stream in order to translate it into something understandable. In this context, “understandable” only makes sense to the two parties of the communication. The transmitted bytes can represent an image, text, encrypted data or any other kind of format.

In general, simple high-level protocols deal only with ASCII strings or byte arrays, but this is not a rule; it is just an observation.

For example, a valid representation of the phrase “Hello my friends!” can be encoded using the following C# code:

var myMessage = Encoding.ASCII.GetBytes("Hello my friends!");

Which yields a byte array containing the following bytes:

72, 101, 108, 108, 111, 32, 109, 121, 32, 102, 114, 105, 101, 110, 100, 115, 33

Of course, the message must be decoded using the same method on the destination to make sense, using the following C# code:

var decoded = Encoding.ASCII.GetString(myMessage);

Let’s remark again: the byte stream doesn’t mean anything to the socket or the network transmission. Only the high-level applications writing to and reading from the socket can translate it into meaningful messages or entities such as images, documents and so forth.

A very common application of sockets is high-level protocols. Nowadays many small devices, such as GPS trackers, IoT and similar devices, implement socket communication to a server, and can be configured using simple packets with commands, instructions, information, monitoring data, etc.

These kinds of high-level protocols are mostly byte or ASCII based. The majority of them are TCP/IP based, with some unusual UDP/IP implementations. TCP is (mostly) reliable in terms of packet delivery, ordering, duplication and retries. Under certain circumstances, such as bad connectivity, UDP will lose a high percentage of packets, and TCP might be better due to its retry mechanisms. Our applications need to be aware of the fundamentals of these protocols to expect appropriate output.

High-level interpretation leads to another common issue, which is part of this post’s goal: framing.

Framing

In a nutshell, framing is the process of interpreting messages, which requires identifying message boundaries in a stream context.

The term “boundaries” is just a definition that corresponds to whatever decisions were taken when defining a high-level protocol. In that sense there is no restriction; I can define any kind of rule that is doable. For example, I could define my own chat protocol where messages always start with @@ and finish with $$. Messages would look like the following examples:

@@ey! this is a message in the protocol, pretty cool, isn't it?$$
@@sure, let's keep chatting this way$$

Boundaries were defined by my “protocol” and they only make sense in the context of my potential application. In theory, receiving those messages should be a trivial task: I just need to read from a socket and pass what I read to a method, class or function that handles it. In reality, there is no way to be certain of receiving those messages with boundaries preserved (in one read). So the first issue is dealing with reception. A possible reception of the messages in three different reads could be the following:
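For illustration (the split points here are arbitrary, chosen by me), the two messages above could arrive fragmented like this:

```
Read 1: @@ey! this is a message in the pro
Read 2: tocol, pretty cool, isn't it?$$@@sure, l
Read 3: et's keep chatting this way$$
```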

Comment: chat protocols might be implemented using UDP, which is less reliable but at the same time simpler to implement. UDP preserves boundaries; messages are transmitted in one piece, so we don’t have to deal with framing.

The data read from the stream is copied into a buffer, decoded as ASCII, and passed to a string manipulation function that looks for the start/end boundaries. I will show all the code a bit further on. For the sake of simplicity I didn’t define some of the variables, but it is worth explaining a few of them.

The buffer size is related to how much data I am trying to read at once; this process is required to read from the communication queue and clean up that data downstream. In general socket flags are not used; in fact, they require a different level of complexity and in some cases are hacks not entirely recommended unless you are dealing with unusual communications. When I say “I am trying to read” I mean exactly that: there is no guarantee I will read a specific number of bytes, I just indicate the maximum size I am trying to read.
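A sketch of the string-based boundary search described above (the names are mine, not the original code), assuming the accumulated text is kept between reads:

```csharp
using System;
using System.Collections.Generic;

public static class ChatFraming
{
    // Accumulates incoming text and extracts every complete @@...$$ message.
    // Partial data stays in the accumulator until its end marker arrives.
    public static List<string> Extract(ref string accumulator, string incoming)
    {
        accumulator += incoming;
        var messages = new List<string>();

        while (true)
        {
            int start = accumulator.IndexOf("@@", StringComparison.Ordinal);
            int end = accumulator.IndexOf("$$", StringComparison.Ordinal);
            if (start < 0 || end < 0 || end < start)
                break; // no complete message yet

            messages.Add(accumulator.Substring(start + 2, end - start - 2));
            accumulator = accumulator.Substring(end + 2); // trim what we consumed
        }

        return messages;
    }
}
```

Feeding it the fragmented reads one at a time yields zero messages until an end marker shows up, then all the completed messages at once.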

What is the main problem with this algorithm? Even though string manipulation is not complex, a high workload could lead to heavy delays. Delays in the reading process cause something called flow control: the underlying layers will force a transfer interruption. If the interruption happens very often, it will cause many other issues such as high CPU usage or data loss from timeouts.

Read about flow control in TCP/IP connections to get a deeper understanding.

The problem gets much worse when protocols are binary. By “binary protocol”, which is not entirely accurate, I refer to those protocols that are not encoded in a human-readable format. They are very common in small devices where memory and buffer size are critical. In a typical ASCII-encoded format I could send the following flags in a readable ASCII message:

1,0,1,0,1,1

Which is acceptable, but when memory and buffers are critical, those flags take a big chunk of them. Of course I exaggerated the representation, but it’s not unusual: this message takes 11 bytes to send 6 flags. In a compact binary protocol I could represent it in one byte, where 6 bits hold the flags and I still have room for 2 more. The byte transmitted could be:

43 = 0x2B (binary 101011)

One byte replaces all eleven previous bytes if I change the protocol.
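A quick packing sketch for those six flags (the helper name is mine): reading 1,0,1,0,1,1 as bits, most significant first, gives 101011 in binary, which is 0x2B:

```csharp
public static class FlagPacking
{
    // Packs up to 8 boolean flags into a single byte,
    // most significant flag first.
    public static byte Pack(bool[] flags)
    {
        byte packed = 0;
        foreach (bool flag in flags)
        {
            packed <<= 1;          // make room for the next flag
            if (flag) packed |= 1; // set the lowest bit when the flag is on
        }
        return packed;
    }
}
```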

I did mention it makes things worse. Why? It is simple: once we deal with arrays of bytes instead of string operations, we need to create arrays and use array-copy methods to move those chains of bytes around. String operations can be made more efficient using different mechanisms, but byte array operations are heavy; “extending” an array requires creating a new one and copying the two previous arrays into it, which is a very inefficient operation (I know, string concatenation does the same, but we have alternatives such as StringBuilder).

Under heavy workload, binary protocols cause a lot of trouble, mainly high CPU consumption when many processes or threads deal with devices at the same time. The problem doesn’t arise in the message reception; it is entirely in the message interpretation.

Let’s imagine a binary protocol that marks messages with 1 or 2 start bytes and 1 or 2 end bytes (I say 1 or 2 because it’s unlikely to find binary protocols that use more than that, for obvious reasons). Once packets arrive we need to concatenate binary data, which might look like the following example (in hexadecimal for reading clarity):

There is no guarantee we receive it all at once. What is the big issue? The quantity of array operations we will need for a very simple case. Let’s lay it out as a series of steps before digging into the code:

The first message arrives. It is copied into a buffer and passed to a message analysis function. Boundaries are not present, so the function just copies the data into the buffer and does nothing.

The second message arrives. The message analysis function needs to create a new buffer that holds messages 1 and 2 and copy both into it. Because message boundaries are now present, it takes the first packet, sends it for processing, and trims the remaining buffer again. Because there is no other complete message, it returns.

The third buffer arrives: same as step 2, a lot of array creations and copies.

CPU consumption of the process skyrockets from that point onward; binary processing is the worst case. Very basic processing functions will look like the following code:
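The original listing is not reproduced here, so below is a hedged sketch of such a naive function (the names and single-byte markers are mine); every arrival forces a new allocation plus copies, which is exactly where the CPU cost comes from:

```csharp
using System;
using System.Collections.Generic;

public static class NaiveBinaryFraming
{
    // Naive approach described in the steps above: grow-by-copy on every
    // arrival, then more allocations and copies per extracted message.
    public static List<byte[]> Append(ref byte[] accumulator, byte[] incoming,
                                      byte startMarker, byte endMarker)
    {
        // 1) Grow the accumulator: a new array plus two copies, every time.
        var combined = new byte[accumulator.Length + incoming.Length];
        Array.Copy(accumulator, 0, combined, 0, accumulator.Length);
        Array.Copy(incoming, 0, combined, accumulator.Length, incoming.Length);
        accumulator = combined;

        var messages = new List<byte[]>();

        // 2) Extract every complete message between the markers.
        int start, end;
        while ((start = Array.IndexOf(accumulator, startMarker)) >= 0 &&
               (end = Array.IndexOf(accumulator, endMarker, start + 1)) >= 0)
        {
            var message = new byte[end - start - 1];
            Array.Copy(accumulator, start + 1, message, 0, message.Length);
            messages.Add(message);

            // 3) Trim the consumed bytes: yet another array plus a copy.
            var remaining = new byte[accumulator.Length - (end + 1)];
            Array.Copy(accumulator, end + 1, remaining, 0, remaining.Length);
            accumulator = remaining;
        }

        return messages;
    }
}
```

Count the allocations: one per arrival to grow, plus two per extracted message. Under load, with many connections doing this at once, this churn is what makes CPU consumption skyrocket.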

There is a common method to enhance binary copying, but it is something to be careful with: Buffer.BlockCopy. In theory we can improve our code using block copy operations, but it is not that simple; it requires a deeper understanding of the underlying implementation. In some critical cases, block copy operations are mandatory to achieve the expected output. Some improvements to the code I show here might also work, such as keeping fixed-size buffers (avoiding constant creation), but the implementation becomes more complex.

In the next part we will explore a new mechanism available in .NET Core 2.2 and above that solves many of the framing issues in a more efficient way. It is called Pipelines, and it was not available until the mentioned version. Life is easier when implementations of common techniques are standardized!

Fundamentals

A couple of decades ago, it was very normal to need to write all sorts of socket clients and servers. At some point the abstractions in some languages made that irrelevant, or to be more accurate, less common than ever, keeping sockets within the scope of college courses with the typical chat servers or similar exercises. Those examples from classic university literature were enough to understand the foundations, but very far from reality, where a lot of different challenges exist.

History is cyclic, as we have learned; new applications such as the Internet of Things, GPS trackers and similar devices require writing sockets again, at least when you are implementing those solutions from scratch. In some cases, the abstractions were not enough to cope with the fine-tuning needed to deal with heavy workloads or implementation nuances. Microsoft itself provides abstractions such as TcpClient that are less complex but hide all sorts of details.

To make it worse, C# is not a language that was born with concurrency at its core. It is possible to deal with concurrency, but it’s far from languages written with that purpose. If you require extremely high concurrency, performance and practically zero downtime, it is more appropriate to go with another language such as Erlang, which is designed specifically for that.

Principles

Socket principles are well known and there is a lot of literature about them. However, it is not easy to find real online cases of blocking, heavy workload, how to deal with concurrency and the worst nightmare, framing. The reason might be that most socket applications do not need such complexity; connecting 10 or 20 clients to a small chat server does not require more than a simple implementation that can handle a few threads and minor locking techniques. A mutex will almost always do the work.

However, I want to remark that I have noticed a lack of understanding of how the underlying layers work, which leads to incorrect implementations. That is the area I want to focus on first.

A world before new Socket()

Our applications sit on top of lower-level implementations that make the communication possible. Even though it is not necessary to understand everything about those low-level implementations, it is necessary to know some of them.

On Windows machines, our sockets always use (as of this writing) the Winsock implementation, which is based on the original Berkeley (BSD) implementation. Even though .NET Core is written to run on other operating systems, the good news is that Winsock was entirely derived from the Unix implementation, so the two look a lot alike and behave in nearly the same way. So when we write sockets we are indirectly using the old and reliable Winsock (BSD-based) concept, updated and enhanced through Windows versions.

As you probably know, we can use UDP or TCP over IP (Internet Protocol). IP provides a datagram service, and it is important to note that it is a "best effort" protocol: it will try to deliver packets to the destination, but it may (and eventually will) fail. You might ask how communication can then be reliable, since packets can be lost, reordered, or duplicated. The answer is that TCP manages that: it is designed to recover from losses, duplication, errors, and so on. UDP does not care; packets might be received N times, get lost, or arrive reordered from how they were sent. Implementations on top of UDP must account for that themselves.

Below IP sits the real communication layer, which can be all sorts of transports: old modems, Ethernet networks, satellite routers, microwave links, etc. Why is that important? Mainly because it leads to the following principle:

No matter what we push onto the socket stream at the top of the stack, everything will be buffered, pushed downstream, converted to a series of binary datagrams, and eventually transmitted in chunks to the destination, which must reverse the entire process to reach the upper layer.

Socket in .NET is the upper-layer abstraction. Our messages flow downstream along this path:

Socket buffer

WinSock implementation

TCP / UDP

IP

Network

On reception, data flows the same path backwards.

Another important difference between TCP and UDP is the way packets are sent. Because UDP is a connectionless protocol, every packet must contain the destination address, and each is always sent in one piece that cannot exceed 64 KB (technically a bit less, but that is not relevant for this article).

TCP is a connection-oriented protocol: the connection must be established before sending packets. Messages are transformed by buffers, queues, and network protocols on their way to the destination, where TCP rebuilds the original stream from the packets and pushes it upstream. The original message might be reconstructed in smaller chunks.

This last principle can be summarized as message boundaries: UDP preserves them, TCP does not.
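Because TCP does not preserve boundaries, real implementations usually add their own framing on top of the stream. A minimal sketch of the common length-prefix scheme (the SendFramed helper is a hypothetical name, not from the original post) could look like:

```csharp
using System;
using System.Net.Sockets;

static class Framing
{
    // Hypothetical helper: prefixes every message with its length so
    // the receiver can rebuild the boundaries TCP does not preserve.
    public static void SendFramed(Socket socket, byte[] message)
    {
        byte[] lengthPrefix = BitConverter.GetBytes(message.Length);
        socket.Send(lengthPrefix); // 4 bytes: how long the body is
        socket.Send(message);      // the body itself
    }
}
```

The receiver first reads exactly 4 bytes, decodes the length, and then keeps reading until that many body bytes have arrived, regardless of how TCP chunked them.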

.Net implementation

As we know, the Socket itself is just an abstraction for sending and receiving information over the network.

The base implementation in .NET is the Socket class, which allows low-level access that the higher-level implementations do not. On Windows it is possible to see how much of the Winsock implementation is still visible: socket exceptions populate an ErrorCode property, which is the original Winsock code. For better compatibility with other operating systems, .NET Core also adds data such as the NativeErrorCode property, which is not tied to a particular operating system.

The Socket class is more powerful but a bit more complex; for that reason, Microsoft created three higher-level implementations:

TcpClient

TcpListener

UdpClient

All of them are implemented on top of Socket, hiding some complexity. Again, keep in mind that the Socket class is still an abstraction over Winsock (on Windows) and BSD sockets (on Linux).

Threads and blocking operations

Another aspect of sockets is that most of the time we have to deal with threads when implementing real-world solutions. A socket can only serve one connected client at a time (at least in TCP). The classic chat-room example has anywhere from a few dozen to a few hundred clients connected to our virtual room from different computers. Every client is an incoming connection with a specific IP address and port number (the server port is known; client ports are assigned dynamically when the connection is created).

In theory we could create a list of connections and service them in a loop. Imagine the following scenario:

Connection 1 is received on port 9999; we add it to the list of open sockets

Start looping over that single connection, checking for incoming data

Connection 2 is received on port 9999; we add it to the list of open sockets

Connections 3, 4, 5… 20 are received on port 9999; we add them to the list of open sockets

Keep looping over the 20 connections, checking for incoming data

This approach is possible but inefficient, and it will not scale. Socket loops have several problems; one of them is flow control, which I will briefly explain in the framing post. Another important issue is that most basic socket operations block. What does that mean? The program stops completely at that instruction, waiting for something (a read operation, a socket connection, etc.). If we accept connections in blocking mode, the loop described above will never run until a new connection unblocks the program.

Take a look at the code: it is a basic example of a socket server bound to 10.0.0.1:9999, and I indicated where it blocks. The Console.WriteLine instruction will not execute until the server accepts a connection on that address and port.
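The original listing did not survive editing, so here is a minimal sketch of such a blocking server. The 10.0.0.1:9999 binding comes from the text; the AcceptOne helper name is hypothetical:

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class BlockingServer
{
    // Creates a listening TCP socket and blocks until one client connects
    public static Socket AcceptOne(IPAddress address, int port)
    {
        var server = new Socket(AddressFamily.InterNetwork,
                                SocketType.Stream, ProtocolType.Tcp);
        server.Bind(new IPEndPoint(address, port));
        server.Listen(10);

        // Accept blocks here: nothing after this line runs
        // until a client actually connects
        Socket client = server.Accept();

        // Only reached after a connection is accepted
        Console.WriteLine($"Connection accepted from {client.RemoteEndPoint}");
        return client;
    }
}
```

Calling `BlockingServer.AcceptOne(IPAddress.Parse("10.0.0.1"), 9999)` freezes the calling thread at Accept until the first client arrives.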

A common technique is polling, which means checking for a pending connection for a certain amount of time and then continuing to run other code. In general, though, the accept portion of the code, except in very particular scenarios, is implemented as a blocking section. How can we execute more code if the server is almost always blocked waiting for a connection? The answer is threading: we spawn a thread for every accepted connection and keep blocking (not necessarily always) in the main thread. Updating the previous code to loop accepting connections and spawn a thread would look like the following example:
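Again the original listing is missing, so this is a reconstruction under the assumptions stated in the text (a ServerStart method that loops on Accept and spawns a thread per client). The maxConnections parameter and HandleClient body are illustrative additions:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class ThreadedServer
{
    private readonly Socket _server;

    public ThreadedServer(IPAddress address, int port)
    {
        _server = new Socket(AddressFamily.InterNetwork,
                             SocketType.Stream, ProtocolType.Tcp);
        _server.Bind(new IPEndPoint(address, port));
        _server.Listen(100);
    }

    public EndPoint LocalEndPoint => _server.LocalEndPoint;

    // maxConnections exists only so the loop can end in a test;
    // a real server would loop forever
    public void ServerStart(int maxConnections = int.MaxValue)
    {
        for (int i = 0; i < maxConnections; i++)
        {
            // Accept blocks the main thread until a client connects
            Socket client = _server.Accept();

            // Hand the connection over to a new thread so this loop
            // can immediately go back to accepting connections
            new Thread(() => HandleClient(client)) { IsBackground = true }.Start();
        }
    }

    private void HandleClient(Socket client)
    {
        Console.WriteLine($"Client connected: {client.RemoteEndPoint}");
        // read/write on this connection independently here
    }
}
```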

The ServerStart method loops accepting connections. Because Accept blocks, it waits until a new connection is accepted, then hands the connection over by passing the socket object to a newly spawned thread that keeps running independently.

Reading data, blocking and unblocking methods

Most examples online show basic data-reading techniques, which can be useful in many contexts. Reading is not complex, but it is important to understand there is only one kind of reading. Sockets manage streams of bytes; no matter what high-level protocol we use or invent, we are always reading bytes.

A later section about the stream focuses on that.

Different parameters can be passed to the Receive method, but they basically do the same thing: they read from the stream into a buffer. The stream is, in fact, data ready to be processed that has flowed all the way to the client (or server; at this point reading at each endpoint is exactly the same) through the network buffer, the IP implementation, and the operating system's socket implementation, and sits buffered waiting to be consumed by our application.

As an example I used a socket passed to a method; that is not mandatory, it is just for this particular code.
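The referenced listing is gone, so here is a sketch of the reading pattern the next paragraphs describe (Receive into a fixed buffer, then Array.Copy the bytes actually read). The ReadChunk helper name is hypothetical:

```csharp
using System;
using System.Net.Sockets;

static class SocketReader
{
    // Reads whatever is currently available on the socket and returns
    // an array sized to the bytes actually received
    public static byte[] ReadChunk(Socket socket)
    {
        var buffer = new byte[1024];

        // Receive blocks until data arrives or the peer disconnects;
        // it returns the number of bytes actually read (0 = closed)
        int bytesRead = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
        if (bytesRead == 0)
            return Array.Empty<byte>(); // peer closed the connection

        // Copy only the bytes actually read into an exact-size array
        var data = new byte[bytesRead];
        Array.Copy(buffer, data, bytesRead);
        return data;
    }
}
```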

When we receive data we need to pass a buffer and its length. That does not mean that:

We will actually receive any data; that is why the code checks whether any bytes were read (> 0)

We will receive exactly the buffer size; the buffer might be full, but we need to assign the return value to a variable to determine how many bytes we actually read.

The Array.Copy instruction copies the portion of the buffer that was read into a new byte array of exactly the right size. There are potential optimizations, but the principles are the same.

As I mentioned before, this operation also blocks. Receive will block until data is read or the socket is disconnected (0 bytes means there is nothing left to read and the connection has ended). In general we need to execute more code, such as processing that data, which might take longer, or other kinds of tasks. Spawning a new thread just to read the data is complex and unnecessary.

There is a simple solution to this: polling. Polling checks whether there is something buffered to be read; if not, we can continue execution. The mechanism is simple: it checks the underlying buffer for data for a period of time (passed as a parameter) and returns true if data is present. It avoids blocking; technically, it avoids blocking forever, since we still wait up to a certain period of time.
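A sketch of that polling mechanism, built on Socket.Poll (the TryRead wrapper and its timeout default are illustrative, not from the original post):

```csharp
using System;
using System.Net.Sockets;

static class PollingReader
{
    // Waits up to the given number of microseconds for readable data;
    // returns false after the timeout if nothing arrived, so the
    // caller can keep doing other work instead of blocking forever
    public static bool TryRead(Socket socket, out byte[] data,
                               int microseconds = 100_000)
    {
        data = Array.Empty<byte>();

        // Poll checks the underlying buffer instead of blocking on Receive
        if (!socket.Poll(microseconds, SelectMode.SelectRead))
            return false; // nothing to read yet

        var buffer = new byte[1024];
        int bytesRead = socket.Receive(buffer);
        if (bytesRead == 0)
            return false; // connection closed by the peer

        data = new byte[bytesRead];
        Array.Copy(buffer, data, bytesRead);
        return true;
    }
}
```

Note that Poll also reports "readable" when the peer has closed the connection, which is why the subsequent Receive result still has to be checked.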

What is REST?

REST is a client-server architectural style, and it is not necessarily the web, even though its most common embodiment is an HTTP service serving web pages. I don't want to go into detail about REST because it is very easy to find all sorts of literature, blog posts, wiki pages, and so forth about it. I will only emphasize the principles:

Stateless

Cacheable

Uniform interface

Layered system

Code on demand (my least favorite principle by far)

RESTful

A RESTful interface sits on top of REST, using the same principles but implementing verbs that represent operations on resources. An operation on a RESTful interface executes a verb on a specific resource and returns a response code and possibly a body. Everything is executed over HTTP.
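To make the verb-on-resource idea concrete, here is a sketch using HttpRequestMessage. The /products resource and the identifier 42 are hypothetical; the point is that every operation pairs one verb with one addressable resource:

```csharp
using System;
using System.Net.Http;

class VerbMapping
{
    static void Main()
    {
        // Each request is verb + resource URI; the server answers with
        // a status code (200, 201, 204, 404...) and possibly a body
        var listProducts   = new HttpRequestMessage(HttpMethod.Get,    "/products");
        var fetchProduct   = new HttpRequestMessage(HttpMethod.Get,    "/products/42");
        var createProduct  = new HttpRequestMessage(HttpMethod.Post,   "/products");
        var replaceProduct = new HttpRequestMessage(HttpMethod.Put,    "/products/42");
        var deleteProduct  = new HttpRequestMessage(HttpMethod.Delete, "/products/42");

        Console.WriteLine($"{deleteProduct.Method} {deleteProduct.RequestUri}");
    }
}
```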

Resources become more important in a RESTful interface: because we execute operations on them, each resource needs to be addressable and unique.

ROA = Resource Oriented Architecture

A well-designed RESTful interface sticks to the ROA principles. This is the point where all these principles combine to constitute an architecture. ROA defines much better what a resource can be, how it can be addressed, its uniqueness, principles, designs, and so forth. RESTful is more ambiguous than ROA because it is a set of principles, not an architecture.

ROA makes a very clear distinction between the sorts of things that can constitute a resource, such as:

A document

A version of the document

A language version of the document

The last version of the document

A search result

The list of items I normally buy in a store

My favorite products

My preferences

I emphasized these examples to show that an addressable resource is not always the same content. For example, "the last version of the document" can yield different results every day I request it; the document can be modified and return different output, but it is still addressable and unique.

Another important principle of ROA is that resources must be self-descriptive: a human should be able to easily infer what the resource is about. There are several notations and design concepts, such as entity hierarchies; for example, the preferences of a particular user could be addressed as:

/user/{identifier}/preferences

Those principles might change based on system needs, but it is a good idea to read about them, at least to form an opinion.

Limitations

The biggest limitation I have always found is practicality. A common example is search features. Following the principles, our search should be addressed with something like:

GET /search/products?param1=value1&param2=value2....

This is great from a formal standpoint. But imagine a search over a particular entity with 50 parameters: readability suffers, and in some cases browsers might not be able to represent the URL. Those limitations are fading away, but in some cases it is fairly hard to stick to the principles while keeping the system maintainable. We are always tempted (and it often happens) to use POST /search/products with a JSON/XML body. In theory, that violates addressability and RESTful principles (POST should only create entities and return responses accordingly).

Conclusion

It is very important to differentiate principles, practices, and architecture. In some cases we will only find guidelines or best practices. I cannot remember how many times developers have argued about particular design guidelines that are only guidelines. As a general rule, it is important to know the theory but apply a pragmatic approach to design.

Design is very important anyway; there are a lot of poorly designed APIs that emulate old RPC calls. RESTful is a good design; know the basics before starting your development.

At present, Enterprise Architecture and the patterns related to it have evolved into a very solid set of principles. It is easier than ever to recognize enterprise architecture patterns, thanks to the knowledge we have accumulated over the last decade or so. If it has a beak, quacks, and I can spot feathers, it might be a duck after all.

Microservices are the natural evolution of SOA (Service-Oriented Architecture). SOA is probably the most infamous and ambiguous denomination in terms of what it actually is. It was a very good attempt at breaking down old monolithic systems into something more manageable, but, as usually happens, it was co-opted by vendors, who branded every single framework, tool, or application as a modern piece of SOA.

The biggest problem with SOA was the lack of boundaries and definitions of what a service should be. Instead of a three-layered (or more) system, many enterprise applications ended up as several services without clear boundaries and with a lot of dependencies. A very common case was a shared library that every single service knew about, which tied the whole system to a single piece of code that could break the entire application. Another common issue was a service that did too many things, yielding a smaller layer that was still pretty big and hard to maintain.

I don't want to go further into the issues I found in SOA during my career, but I will say it started with good intentions; the result was not so great.

Microservices to the rescue

As I mentioned before, microservices are a natural evolution of SOA. At the beginning it was fairly simple to split layered systems (classic view, business logic, database access) into services. Once systems grew larger and larger, the SOA pattern became a nightmare: pretty big services with a lot of functionality and, most of the time, heavy dependencies on each other. I remember seeing, at a company I worked for, a Catalog Service (part of an evolution to SOA) built 8-10 years before I joined; it required an entire server that kept growing in memory and CPUs. Simply put, the catalog service did everything related to the company's products, to the point that it was an entire system in itself, built into one particular service.

Also, systems are larger than ever. When I started working in software more than 25 years ago, it was possible for a single developer to write an entire application from scratch. Five years later it was pretty normal to have teams of 3-4 people. By the time I had worked for a decade, it was very normal for enterprise applications to require dedicated teams of 10-20 people. Nowadays the amount of functionality, the size of applications, the data volume, and the complexity have skyrocketed, and it is very normal to have a code base that requires several teams.

In that context, microservices came into the world to solve those issues.

What is a microservice?

I have found many definitions of a microservice, but I prefer to stick to the simplest one: they are small (important!), autonomous (more important!) services that work together. The next question is, how small? Another simple definition I found is:

"So small that it can be rewritten in most languages in around two weeks or less"

"My service can manage catalogs and products." That is incorrect: too many responsibilities. If you have a very large company that manages tons of products, even a single "products service" simply won't work; it will still have too many responsibilities.

Autonomy is the key part of the definition. Boundaries must be very clear and well defined. Microservices can change and be deployed independently, and communication between services is forced to happen via network calls. This makes the next step natural: services must expose an API.

APIs

An application programming interface (API) is core to microservices. To be fair, APIs have existed for decades in different forms, such as RPC (remote procedure calls), SOAP interfaces, Java beans, custom-tailored communications, and so forth. It is common to think of an API as a RESTful API, but that is not entirely true. APIs nowadays are almost always REST APIs (I deliberately dropped the "ful" because many of them are not).

A microservice must expose an API, which constitutes the contract with that service. Every single communication with the service must go through the API. For that reason, a microservice is not tied to a framework, programming language, or specific implementation: as long as it can expose the API, the service can be coded in any possible way.

As a general rule it is fairly common to use HTTP APIs: they are widely used, and most languages have strong support for building them quickly. They also scale well: if the microservice is well designed and implemented over HTTP, it is very simple to split the workload by adding a load balancer on top and replicating the service on different machines (and "machine" is a blurry concept nowadays!)

What about queues?

Going back to the title, you can tell I mentioned queues, and you may be asking why.

There is common confusion between microservices architecture, queue-based systems, and the use of queues inside microservices.

Microservices in many large systems tend to have heavy workloads, and a normal way to cope is to use queues to defer execution. Imagine a microservice that receives orders on a very popular e-commerce site: the service needs to keep up with requests at a fast pace, so it may be better to defer execution by placing a payload in a queue to be processed as soon as possible, rather than executing it immediately.
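A minimal sketch of that deferral idea, using an in-process ConcurrentQueue as a stand-in for a real message broker (OrderService and its member names are hypothetical):

```csharp
using System.Collections.Concurrent;

// The service acknowledges fast on the hot path; a background worker
// drains the queue at its own pace. In production the queue would be
// an external broker, not an in-memory collection.
class OrderService
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();

    // Hot path: enqueue the payload and return immediately
    public void AcceptOrder(string orderPayload) => _queue.Enqueue(orderPayload);

    // Background worker: process deferred payloads when possible
    public bool TryProcessNext(out string orderPayload) =>
        _queue.TryDequeue(out orderPayload);
}
```

Note the queue here stays internal to the service: callers still talk to it through its API, which is exactly the boundary the next paragraphs argue for.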

The confusion I mentioned is that some microservice designs use the queue as a communication backbone instead of an API. Even though this is not entirely incorrect, it is pretty dangerous: placing a message in a queue transfers the contract to another system, and there is no restriction preventing any other service from consuming that message. Boundaries get blurry.

To make it worse, not all queue software works the same way. We might need to rely on a particular vendor that implements what our design needs, coupling a loosely coupled design to a particular vendor.

Part of the confusion comes from a different pattern. There is one case where relying entirely on queues makes sense: systems designed as QBS (Queue-Based Systems). Queue-based systems are very particular: every service consumes a payload that acts as a unit of work, and external interfaces are not required (the service does not need to expose an API). They share similarities with microservices, but those services cannot be considered microservices.

Conclusion

Microservices architecture is the subject of entire books; however, some principles are simple and easy to follow. The goal of this article is to clarify some design rules and to warn about pitfalls and common mistakes. I will dig into those design principles in subsequent articles.