Focus on the Process, Not on Individual Microservices

The key to success when working with a microservices-based distributed system is to focus on the distributed process as a whole, not on the microservices themselves. The services are the least important part, Eric Ess claimed at the recent Microservices Conference in London, in his presentation on how to monitor distributed processes at jet.com.

At Jet, a process initiated by a user needs at least a few microservices to complete and is called a distributed process. Ess, Director of Engineering, notes that this is a key term for them when looking at how their system executes user requests.

Jet has about 800 microservices in production today, giving a very complex communication topology. Because of this complexity, it’s infeasible for any team to know what’s happening outside its own scope, and impossible for any individual to fully understand the system architecture. Despite this, when problems occur in production, it’s essential to know exactly what the root cause is and in which service it originates.

To overcome this challenge, there are two key things they want to accomplish:

Know how a single process is behaving, which microservices it has passed through and what it’s currently doing; basically, be able to follow different types of processes as they move through the system via microservices interacting with each other.

Validate processes by defining the expected workflow for a given process and then validating that it follows that path when executed. Ess notes that even though a process is not generating any errors, it can still be behaving incorrectly. One example is a bug in A/B testing that routes a process the wrong way, causing a flaw in the testing data.
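The two goals above can be illustrated with a minimal Python sketch. It is not Jet's implementation (their tooling is written in F# and its details are not given here); the step names, the `EXPECTED_FLOW` table, and the helper functions are all hypothetical. The sketch groups observed events by process and checks each path against a declared workflow, so a process that raises no errors but takes the wrong route is still flagged.

```python
from collections import defaultdict

# Hypothetical expected workflow: each step maps to the steps allowed next.
EXPECTED_FLOW = {
    "order-received": {"payment-authorized"},
    "payment-authorized": {"inventory-reserved"},
    "inventory-reserved": {"order-shipped"},
    "order-shipped": set(),  # terminal step
}
START_STEP = "order-received"


def group_by_process(events):
    """Group (process_id, step) events into an ordered path per process."""
    paths = defaultdict(list)
    for process_id, step in events:
        paths[process_id].append(step)
    return dict(paths)


def validate_path(path, flow=EXPECTED_FLOW, start=START_STEP):
    """Return a list of deviations from the expected workflow (empty if none)."""
    if not path:
        return ["no steps observed"]
    problems = []
    if path[0] != start:
        problems.append(f"started at {path[0]!r}, expected {start!r}")
    # Check every consecutive pair of steps against the allowed transitions.
    for current, following in zip(path, path[1:]):
        if following not in flow.get(current, set()):
            problems.append(f"unexpected transition {current!r} -> {following!r}")
    return problems


# Illustrative events, interleaved as they might arrive from several services.
events = [
    ("p1", "order-received"),
    ("p2", "order-received"),
    ("p1", "payment-authorized"),
    ("p2", "inventory-reserved"),  # skips payment: no error, but the wrong path
    ("p1", "inventory-reserved"),
    ("p1", "order-shipped"),
]
paths = group_by_process(events)
report = {pid: validate_path(path) for pid, path in paths.items()}
```

Here `report["p1"]` is empty, while `report["p2"]` records the skipped payment step even though no service reported a failure.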

Ess notes that by focusing on the distributed process as a whole, not on the microservices themselves, they can ignore the services, which are merely a means for moving the process to the next service and a step towards process completion. The current state of the process and what is happening to it is what they care about.

This requires an altered mindset, with engineers focusing on the behaviour of a process within the system, not on a microservice and how it should behave when receiving a message. A team is not building individual microservices, but microservices that interact with the other services around them.

There are a lot of tools available to evaluate microservices or a system, but not to evaluate the process, or the behaviour of the process, as it’s being executed. In addition, Jet is using F#, and since it’s hard to find suitable tools targeting F#, they have created their own toolbox.

To provide a view of the running system and its processes, they have created a communication protocol (Dr Orpheus) which provides a set of header metadata that goes into every message, along with some rules for what a microservice must do when receiving a message with metadata. They are also building a telemetry processing / data streaming engine (XRay) that performs some basic complex event processing (CEP), collecting data that every microservice emits as it processes messages. Engineers and business people can now supervise all processes and react when they are misbehaving in any way: not following the predefined flow, progressing too slowly, or blocking in some service.
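As a rough illustration of the idea, the Python sketch below shows how header metadata can tie messages to a process and how emitted telemetry can reveal a process that has stopped progressing. The actual fields and rules of Dr Orpheus and the inner workings of XRay are not described here, so the `Header` fields, `next_header`, and `stalled_processes` are assumptions made purely for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Header:
    """Hypothetical metadata carried by every message."""
    process_id: str            # identifies the whole distributed process
    message_id: str            # identifies this particular message
    caused_by: Optional[str]   # message_id of the message that triggered this one
    service: str               # service that emitted the message


def start_process(service):
    """Create the header for the first message of a new process."""
    return Header(str(uuid.uuid4()), str(uuid.uuid4()), None, service)


def next_header(incoming, service):
    """Assumed rule for a receiving service: keep the process id and
    link the outgoing message to the one that caused it."""
    return Header(incoming.process_id, str(uuid.uuid4()),
                  incoming.message_id, service)


def stalled_processes(telemetry, now, max_idle_seconds):
    """Return ids of unfinished processes with no activity for too long.

    telemetry is an iterable of (header, timestamp, is_final) tuples.
    """
    last_seen = {}
    finished = set()
    for header, timestamp, is_final in telemetry:
        previous = last_seen.get(header.process_id, timestamp)
        last_seen[header.process_id] = max(previous, timestamp)
        if is_final:
            finished.add(header.process_id)
    return sorted(pid for pid, seen in last_seen.items()
                  if pid not in finished and now - seen > max_idle_seconds)


# Two processes: "a" stops after the payment step, "b" completes.
a1 = start_process("cart")
a2 = next_header(a1, "payment")
b1 = start_process("cart")
b2 = next_header(b1, "payment")
telemetry = [(a1, 0.0, False), (a2, 5.0, False),
             (b1, 1.0, False), (b2, 3.0, True)]
stalled = stalled_processes(telemetry, now=100.0, max_idle_seconds=60.0)
```

Because every message carries the same `process_id`, the telemetry consumer needs no knowledge of individual services to notice that process "a" has been idle since the payment step.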