Highlights from DevTalks—Microservices and Advanced SQL

May 23rd saw the second edition of DevTalks. The event brought together 100 hard-boiled software developers from the Helsinki tech scene for an evening of learning and sharing knowledge. This blog post features highlights of the talks by our amazing guest speakers Markus Winand, Denis Rosa, and Frans van Buul.

SQL: Evolution of a Dinosaur by Markus Winand

Markus Winand, an author, coach and trainer on all things SQL, gave the first talk about the evolution of SQL since the popular SQL-92 standard.

Markus Winand began his talk by asking the audience how many people were still using Windows 3.1 and followed up this question by asking how many still restrict themselves to the SQL-92 standard that came out the same year.

Obviously, no one is using Windows 3.1 anymore, but far too many are still stuck using only the SQL features of the SQL-92 standard. The newer features can make complex SQL queries more readable while also improving their performance.

Winand emphasized how SQL-92 is composable, and that building complex queries with it was like combining Lego blocks. In practice, these nested queries become hard to read, as the important parts are buried several levels deep in the statement.

Common table expressions (CTEs), which came with the SQL 1999 standard, make it possible to write longer queries that read more naturally from top to bottom. A CTE defines a named temporary result set that you can reference within the query. It is scoped to the statement and can also be used in recursive queries, for example when traversing nested data structures. The addition of recursive CTEs was also the point at which SQL moved beyond its purely relational model.
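To make the recursive case concrete, here is a minimal sketch that runs a recursive CTE against an in-memory SQLite database from Python. The `employees` table, its columns and the sample names are assumptions made up for this illustration; they are not from Winand's talk.

```python
import sqlite3

# Hypothetical org chart: each employee points at their manager.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Ada', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 2),
        (4, 'Dee', 1);
""")

# The CTE "reports" only exists for this one statement. The first SELECT is
# the starting row (the person without a manager); the second SELECT refers
# back to "reports" and walks one level further down on each iteration.
QUERY = """
    WITH RECURSIVE reports (id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, r.depth + 1
        FROM employees AS e
        JOIN reports AS r ON e.manager_id = r.id
    )
    SELECT name, depth FROM reports ORDER BY depth, name
"""

rows = conn.execute(QUERY).fetchall()
for name, depth in rows:
    print("  " * depth + name)  # indent each name by its level in the tree
```

The final SELECT reads like a query against an ordinary table, which is the readability gain described above: the traversal logic sits in the named block at the top instead of being nested inside the main query.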

OVER (PARTITION BY) and NULLS FIRST/LAST came with the SQL 2003 standard. OVER when used together with PARTITION BY is useful for getting aggregates (COUNT, SUM, AVG, MIN, MAX) for related rows, e.g. calculating the total employee salaries for each department of an organization:

SELECT department,
       salary,
       SUM(salary) OVER (PARTITION BY department)
FROM employees

You could also achieve the same using self-joins, but that will again lead to nested queries and make the query less readable.

The problem that NULLS FIRST/LAST solves is that the sort position of NULL differs between databases, which treat it as either a very large or a very small value. With NULLS FIRST/LAST you can state explicitly which outcome you want:

SELECT ...
FROM ...
ORDER BY nullable NULLS FIRST

Other newer SQL features worth a look that Winand covered in his talk were GROUPING SETS from SQL 1999; FILTER, BOOLEAN aggregates and BOOLEAN tests from SQL 2003; XMLTABLE from SQL 2006; OFFSET (which is best avoided in most cases) from SQL 2011; and JSON, LISTAGG, row pattern matching, date formatting and polymorphic table functions from SQL 2016.

Winand’s talk was particularly captivating for us at Smartly.io, as we rely heavily on many of the newer SQL features in our reporting backend. Our queries build a base dataset using CTEs, and the final queries are run against those abstractions. NULLS FIRST/LAST is useful for ordering joined data that contains NULLs, OVER is used for aggregated counts, and we use the JSON features of PostgreSQL to store the more fine-grained data.

Denis Rosa, EMEA Developer Advocate at Couchbase, took the stage next and described microservices: when they make sense and how to build them into more robust systems.

Denis Rosa has been involved in multiple projects splitting monoliths into microservices, and he gave tips on how to drive a successful microservice project.

He quoted Jonas Bonér on microservices being more a necessary evil than something that everyone should aim for and that, in fact, a monolithic architecture is a lot simpler, at least in the beginning. Splitting the system into microservices does add complexity to the infrastructure, but the benefits become visible when the development organization grows bigger and the software more complex.

According to Rosa, microservices should be built to be autonomous. Autonomy reduces the complexity of the whole system, isolates possible failures in the service state, allows scaling the services more easily, and makes them more resilient and elastic. Rosa also mentioned caching as one way to help isolate services and build up fault tolerance for services that operate on data owned by another service.
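The caching idea can be sketched roughly as follows. This is only an illustration of the general pattern, not code from Rosa's talk: the `CachedClient` class, the `flaky_customer_service` function and the customer data are all hypothetical, and a real service would call its neighbour over the network rather than through a Python function.

```python
import time
from typing import Callable, Dict, Tuple


class CachedClient:
    """Wraps calls to another service and remembers the last good answer."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 60.0):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        cached = self._cache.get(key)
        if cached is not None and time.monotonic() - cached[0] < self._ttl:
            return cached[1]  # fresh enough, skip the remote call
        try:
            value = self._fetch(key)
        except ConnectionError:
            if cached is not None:
                return cached[1]  # owner is down: serve the stale copy
            raise  # nothing cached, so the failure has to surface
        self._cache[key] = (time.monotonic(), value)
        return value


# Hypothetical stand-in for the service that owns the data:
# it answers once and is then "down".
calls = {"count": 0}


def flaky_customer_service(customer_id: str) -> dict:
    calls["count"] += 1
    if calls["count"] > 1:
        raise ConnectionError("customer service is down")
    return {"id": customer_id, "name": "Ada"}


# A TTL of zero forces a remote attempt on every call, so the second call
# exercises the fallback path.
client = CachedClient(flaky_customer_service, ttl_seconds=0.0)
first = client.get("42")   # fetched from the owning service
second = client.get("42")  # owner is down, served from the cache
```

The trade-off is that the dependent service may briefly work with stale data, which is usually acceptable when the alternative is failing outright.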

Rosa also highlighted that asynchronous communication should be the default, as it makes the microservices more resilient to downtime of other services, and unties dependencies between instances compared to using streams or synchronous communication.

Kubernetes (K8s) can be a big help in making microservices more elastic, which means enabling them to be scaled out more easily. It enables easy configuration for defining pods and adding load balancing, and makes a separate service discovery layer unnecessary. K8s also helps in making the services cloud agnostic, reducing the risk of vendor lock-in (check Rosa's Kubernetes demo here).

Microservice architecture enables polyglot programming, that is, choosing the best-fitting programming language for each service, but it also opens up new possibilities with polyglot persistence: you can choose the best-fitting database for each service. One service might benefit from the fixed schema of SQL databases, whereas another service needs the high availability and easy scalability of Couchbase.

DDD, CQRS and Event Sourcing by Frans van Buul

Frans van Buul, evangelist at AxonIQ, finished the evening by explaining design patterns that are helpful when splitting the monolith.

In his talk at DevTalks, Frans van Buul presented three design patterns, Domain-Driven Design (DDD), Command Query Responsibility Segregation (CQRS) and Event Sourcing (ES), which he has found useful in splitting a monolithic application to microservices. They help to structure the monolith so that it’s easy to split into different services when the time is right. DDD, CQRS and ES are old concepts, which have regained popularity with the microservice movement.

Domain-Driven Design is used to divide the domain into bounded contexts that allow splitting the microservices more sensibly along the borders of business domains rather than technical layers of the code. Van Buul demonstrated bounded contexts with an example from the aviation industry. The concept of a flight is shared by both the passengers and the crew that prepares the plane between flights.

Both of these groups share the concepts of arrival time and departure time, but how they use those concepts differs: for passengers, the departure time is always before the arrival time, whereas for the gate crew the work starts when the plane arrives and ends when the plane departs. Therefore it makes sense to split the flights into two bounded contexts, one for the passengers and one for the gate crew. In this example, a possible split for microservices could be to build one for each of the bounded contexts.
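A small sketch can show why one shared model does not fit both groups. It assumes that a passenger's flight departs and then arrives, while the crew at the gate sees the plane arrive and then depart. The class and field names (`PassengerFlight`, `Turnaround`, `gate`) are invented for this illustration and are not taken from Van Buul's talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PassengerFlight:
    """Passenger context: the flight departs first and arrives later."""
    flight_number: str
    departure: datetime
    arrival: datetime

    def __post_init__(self) -> None:
        if self.departure >= self.arrival:
            raise ValueError("a passenger flight must depart before it arrives")

    @property
    def travel_time(self) -> timedelta:
        return self.arrival - self.departure


@dataclass(frozen=True)
class Turnaround:
    """Gate crew context: the plane arrives first and departs later."""
    gate: str
    arrival: datetime
    departure: datetime

    def __post_init__(self) -> None:
        if self.arrival >= self.departure:
            raise ValueError("a turnaround must start before the plane departs")

    @property
    def time_at_gate(self) -> timedelta:
        return self.departure - self.arrival


# The same two words, "arrival" and "departure", with opposite ordering rules.
trip = PassengerFlight("AY123", datetime(2018, 5, 23, 9, 0), datetime(2018, 5, 23, 11, 30))
stop = Turnaround("B12", datetime(2018, 5, 23, 11, 30), datetime(2018, 5, 23, 12, 15))
```

Each model can validate its own rule without special cases, which is what makes the bounded context a natural seam for a service boundary.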

CQRS refers to splitting the object model into separate models for read and write operations. It is also a way to split those concerns into different microservices, as the two sides are coupled only by message exchange. Shifting from a more traditional all-in-one CRUD object model to CQRS with separated read and write models makes sense when the queries and updates done to the data models become complex.

Separating the read and write models to different microservices enables scaling them individually based on the load, or choosing the best fitting database that optimizes either for reads or writes. With the read concerns separated to their own service, it’s also easy to create use case specific view models for reading the data. They can be very context-specific and easily disposed of when they aren’t needed anymore.
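As a rough sketch of that separation, the example below keeps a write model and a read model that share nothing except the events passed between them. Everything here is hypothetical: the inventory domain, the `StockAdjusted` event and the class names are made up, and the direct function call used for publishing stands in for whatever message broker a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class StockAdjusted:
    """Message exchanged between the two sides (hypothetical event)."""
    sku: str
    delta: int


class Inventory:
    """Write model: handles commands, enforces rules, publishes events."""

    def __init__(self, publish: Callable[[StockAdjusted], None]):
        self._publish = publish
        self._levels: Dict[str, int] = {}

    def adjust(self, sku: str, delta: int) -> None:
        new_level = self._levels.get(sku, 0) + delta
        if new_level < 0:
            raise ValueError("stock cannot go below zero")
        self._levels[sku] = new_level
        self._publish(StockAdjusted(sku, delta))


class StockReport:
    """Read model: a disposable view shaped for one specific query."""

    def __init__(self) -> None:
        self.on_hand: Dict[str, int] = {}
        self.total_units = 0

    def on_event(self, event: StockAdjusted) -> None:
        self.on_hand[event.sku] = self.on_hand.get(event.sku, 0) + event.delta
        self.total_units += event.delta


report = StockReport()
inventory = Inventory(publish=report.on_event)
inventory.adjust("widget", 10)
inventory.adjust("gadget", 4)
inventory.adjust("widget", -3)
```

Because `StockReport` only consumes events, it could be dropped and replaced by a differently shaped view without touching the write side.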

Event Sourcing is a pattern that brings immutability to the application state. Instead of only storing the current state and mutating it with CRUD operations, we store the full sequence of events that change the state. This way we keep the full historical data, which can be valuable when dealing with audits, security incidents, customer support, debugging or for business reasons that require querying the changes of data.
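In code, the core of the pattern is small. The sketch below uses a made-up bank account with hypothetical `deposited` and `withdrawn` events; it only illustrates the idea of deriving state from an append-only log and is not tied to any particular event sourcing framework.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply_event(balance: int, event: Event) -> int:
    """Return the new balance after one event; stored events never change."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: List[Event], upto: Optional[int] = None) -> int:
    """Rebuild the balance from the first `upto` events (all if None)."""
    balance = 0
    for event in events[:upto]:
        balance = apply_event(balance, event)
    return balance


# The log is only ever appended to; the current state is derived from it.
log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]

current_balance = replay(log)            # state after every event
balance_after_two = replay(log, upto=2)  # state as it was earlier in history
```

Being able to ask for the state at any earlier point is what makes the log useful for audits and debugging.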

Stepping outside Van Buul’s talk for a moment, the ES pattern is already familiar to many frontend developers who use Redux to store application frontend state. Yet probably the most popular system that uses the event sourcing pattern is the Git version control system, as Martin Fowler pointed out in his talk “The Many Meanings of Event-Driven Architecture” at the GOTO 2017 conference.

With the new European GDPR legislation, users of an application have, among other things, the right to be forgotten, that is, to have their data removed. An interesting question from the DevTalks audience was how user data should be removed when using the event sourcing pattern. It turned out not to be a trivial task, and Van Buul mentioned that various approaches are being used.

One option is to delete or modify events anyway, even though that’s not what you should do in event sourcing. Another option is to split the application into a part that contains personally identifiable information (PII) and a part that does not, and to use ES only for the non-PII domain. There are also ready-made solutions that encrypt the events and store the encryption keys for those events separately. If a user requests to have their data removed, one can simply discard the encryption keys, which is effectively equivalent to deleting the data.
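The key-discarding approach can be sketched as below. This is a toy model under loud assumptions: the `ShreddableEventStore` class is hypothetical, and a random one-time pad per event stands in for real encryption so that the example needs only the standard library. A production system would use a vetted cipher from a cryptography library and a proper key store.

```python
import secrets
from typing import Dict, List, Optional, Tuple


class ShreddableEventStore:
    """Append-only events, with the key material kept in a separate place."""

    def __init__(self) -> None:
        self._events: List[Tuple[str, bytes]] = []  # (user_id, ciphertext)
        self._pads: Dict[str, List[bytes]] = {}     # per-user keys, stored apart

    def append(self, user_id: str, payload: str) -> None:
        data = payload.encode("utf-8")
        pad = secrets.token_bytes(len(data))  # stand-in for a real cipher key
        ciphertext = bytes(a ^ b for a, b in zip(data, pad))
        self._pads.setdefault(user_id, []).append(pad)
        self._events.append((user_id, ciphertext))

    def read(self, user_id: str) -> List[Optional[str]]:
        """Decrypt a user's events, or return None for each if keys are gone."""
        ciphertexts = [c for uid, c in self._events if uid == user_id]
        pads = self._pads.get(user_id)
        if pads is None:
            return [None] * len(ciphertexts)
        return [
            bytes(a ^ b for a, b in zip(c, p)).decode("utf-8")
            for c, p in zip(ciphertexts, pads)
        ]

    def forget(self, user_id: str) -> None:
        """Discard the keys only; the event log itself is never rewritten."""
        self._pads.pop(user_id, None)

    def event_count(self) -> int:
        return len(self._events)


store = ShreddableEventStore()
store.append("user-1", "email changed to ada@example.com")
before = store.read("user-1")
store.forget("user-1")
after = store.read("user-1")
```

The log keeps its length and order after `forget`, so the event sourcing guarantees hold, while the personal data in those events can no longer be read.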

Microservices at Smartly.io

Like most applications, Smartly.io was initially built as a monolith, which allowed us to bootstrap the application quickly and to react to changes fast with a small development team. Since then, our code base and the development team building it have grown. New features were added on top of the old code, which added to the complexity of the code base.

Eventually, it was time to start shaving microservices off the monolith. Scoping the services has been done using bounded contexts. All bigger new features are nowadays implemented as separate services. The microservice architecture allows us to choose the tech stack that best fits each project.

Today, apart from our monolith written in PHP, our microservices are implemented for example in Ruby on Rails and Node.js with TypeScript. The databases vary from MongoDB to PostgreSQL and Cassandra.

We currently run 22 individual services. Next, the team I work in will start splitting yet another core feature into a separate service. I’m sure the lessons learned from the speakers at DevTalks will come in handy.

DevTalks returns in November

DevTalks will be back in November 2018, featuring Sam Newman, the author of one of the most popular microservice books out there, Building Microservices: Designing Fine-Grained Systems. Stay tuned!