Microservice Threading Models and their Tradeoffs

Architects designing micro-service architectures typically focus on patterns, topology, and granularity, but one of the most fundamental decisions to make is the choice of threading model. With the proliferation of so many viable open source tools, programming languages, and technology stacks, software architects have more choices to make now than ever before.

It is very easy to get lost in the details of nuanced language and/or library differences and lose sight of what is important.

Choosing the right threading model for your micro-services and how it relates to database connectivity can mean the difference between a solution that’s good enough and a product that’s amazing.

Paying attention to the threading model is an effective way to focus the architect on considering the trade-offs between efficiency and code complexity. As a service is decomposed into parallel operations with shared resources, the application will become more efficient and its responses will exhibit less latency (within limits, see Amdahl’s Law). Parallelizing operations and safely sharing resources introduces more complexity into the code.

However, the more complex the code is, the harder it is for engineers to fully comprehend, which means developers are more likely to introduce new bugs with every change.

One of the most important responsibilities of the architect is to find a good balance between efficiency and code complexity.

Single Threaded, Single Process Threading Model

The most basic threading model is the single threaded, single process model. This is the simplest way to write code.

A single threaded, single process service cannot execute on more than one core at a time, so it will never utilize more than one core of a modern bare-metal server, which typically has up to 24. The throughput of such a service will not increase with additional load, and its CPU utilization cannot rise above a single-digit percentage. With so much of the server underutilized, the usual compensating tactic is to run larger server pools in order to handle the load.

This approach works, but is wasteful and ultimately expensive. The most popular cloud computing vendors offer single virtual core instances fairly cheaply in order to facilitate this approach’s more granular scaling needs.
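The serial nature of this model can be sketched in a few lines of Python. The handler and its workload here are hypothetical, purely for illustration; the point is that requests are processed strictly one after another, so only one core is ever busy:

```python
# Minimal sketch of a single threaded, single process service.
# handle_request and its workload are illustrative, not a real API.
def handle_request(payload):
    # Simulate CPU-bound work; a single threaded service does this serially.
    return sum(i * i for i in range(payload))

def serve(requests):
    # No matter how many cores the host has, these run one at a time.
    return [handle_request(r) for r in requests]

results = serve([10, 100, 1000])
```

Adding more requests only lengthens the queue; it never recruits another core, which is why throughput plateaus under load.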

Single Threaded, Multi-Process (New Process per Request) Threading Model

The next step up in both complexity and efficiency is the single threaded, multi-process threading model, where a new process gets created for each request. Code for this type of micro-service is relatively simple, but it does contain more complexity than the previous model.

The overhead of process creation, plus constantly creating and destroying database connections, can steal processor time and thereby increase latency across all colocated services. This threading model creates more database connections because a connection belongs to a single process and cannot be shared across process boundaries; since the process lives only as long as the request, every request has to reconnect to each database it uses.

Micro-services that run in this threading model should delay connecting to databases until the connections are needed. There is no reason to incur the cost of a database connection if the code path does not require it. While database connections cannot be cached across processes, some environments support a cross-process opcode cache where you can store your service’s configuration data, such as the host IP and credentials for connecting to a database; two popular examples of opcode caches are Zend OPcache and APC.

Single Threaded, Multi-Process (Reused Worker Processes) Threading Model

The next step up in both code complexity and efficiency is a threading model which is single threaded and multi-process, but in which new requests reuse existing worker processes. This differs from the previous threading model, which always created a new process for each request: here, once a worker process has been provisioned, it is not torn down and recreated for each request it serves.

The service code itself remains relatively simple, but extra orchestration code must be involved to manage the worker process life-cycle. Code must also correctly re-initialize itself with each request. For example, programmers might maintain static variables instead of passing around a lot of extra data as parameters. That makes for simpler code and is fine as long as those static variables are reset with each new request; if the code doesn’t reset them, then behavior will be based on previous requests instead of the current one. The last bit of additional code complexity is logic for recovering from stale database connections, which must be included because a connection can go stale when the database disconnects it, most likely due to inactivity.

Because each process can service multiple requests, there is no need to reconnect to each database with each request; database connections get reused which reduces latency by avoiding connection costs. But each process still has to create and manage its own database connections. Because processes cannot share database connections, shared databases maintain more open connections. Excessive open connections can degrade database performance. That is because database connections are stateful so the database application has to allocate resources in its own process for each connection.

Multi-Threaded, Single Process Threading Models

There is a way to better protect the databases with a configurable number of connections: use connection pooling in the multi-threaded, single long-lived process model. Although a database connection cannot be shared across multiple processes, it can be shared across multiple threads in the same process.

Here is an example: if you have 100 single threaded processes on each of 10 servers, then the database will see 100 × 10 = 1,000 connections. If instead you have one process with 100 threads on each of 10 servers, and each process has 10 connections in its connection pool, then the database will see only 10 × 10 = 100 connections while the service can still achieve high throughput. Cross-thread connection pooling is very efficient for both the service and the database.

This connection pooling technique achieves high throughput while protecting the databases, but it comes at the cost of extra code complexity. Because threads must share stateful database connections, developers must be able to identify and fix concurrency bugs such as deadlock, livelock, thread starvation, and race conditions. One way to address these bugs is to serialize access, but serializing too much reduces parallelism. These kinds of bugs can be difficult for junior developers to identify and correct.
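A minimal thread-safe connection pool can be built on a synchronized queue, which handles the blocking and hand-off between threads without explicit locks. The Connection class and the pool API below are hypothetical sketches, not a real driver:

```python
import queue
import threading

# Minimal cross-thread connection pool sketch. Connection is a stand-in
# for a real, stateful database connection.
class Connection:
    _counter = 0  # counts how many real connections were ever opened
    def __init__(self):
        Connection._counter += 1
        self.id = Connection._counter

class ConnectionPool:
    def __init__(self, size):
        # queue.Queue is already synchronized, so borrowing and returning
        # connections is safe across threads without extra locking.
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(Connection())

    def acquire(self):
        return self._pool.get()  # blocks if every connection is in use

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=2)

def handle_request(n):
    conn = pool.acquire()
    try:
        return (n, conn.id)  # pretend to run a query on the borrowed conn
    finally:
        pool.release(conn)   # always return the connection to the pool

threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight concurrent requests complete while the database ever sees only two connections, which is exactly the protection the arithmetic above describes.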

Multi-threaded, single long-lived process models come in two flavors: a dedicated thread per request, or a single shared thread (an event loop) for all requests. In the former, an extra thread is tied up with each request, which limits the number of requests being processed in parallel; too many threads can lead to inefficiencies due to excessive task switching in the operating system's CPU scheduler.

In the latter threading model, there is no need to have an extra thread for each request but I/O bound tasks must run in a separate thread pool in order to prevent the entire service from hanging on the first slow operation that it encounters. If the results must be returned to the caller, then the request handler must wait for the results from the thread pool to finish.

With the no-dedicated-thread-per-request approach, expect high throughput and low latency for asynchronous operations, but no real performance gains over the dedicated-thread-per-request approach for synchronous operations.
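The event-loop flavor, with blocking I/O offloaded to a thread pool, can be sketched with Python's asyncio. The blocking_io function is a hypothetical stand-in for a slow database or network call:

```python
import asyncio
import time

# Sketch of the single-event-loop model: blocking I/O is pushed to a
# thread pool so the loop itself never hangs on a slow operation.
def blocking_io(n):
    time.sleep(0.05)  # stand-in for a slow database or network call
    return n * 2

async def handle_request(n):
    loop = asyncio.get_running_loop()
    # Offload to the default thread pool and await the result; the event
    # loop stays free to service other requests in the meantime.
    return await loop.run_in_executor(None, blocking_io, n)

async def main():
    start = time.monotonic()
    # Four requests run concurrently even though each blocks for 50 ms.
    results = await asyncio.gather(*(handle_request(i) for i in range(4)))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
```

Because the four 50 ms waits overlap in the executor's threads, total wall time stays close to a single wait rather than four in a row, while the request handlers still receive their results in order.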

Summary

Threading model: single threaded, single process
Efficiency concerns: The service will not be able to fully utilize server cores. Expect throughput not to increase with additional load, and CPU utilization to stay in single digits.
Code complexity issues: The simplest and easiest approach to understand.

Threading model: single threaded, multi-process, new process for each request
Efficiency concerns: The overhead of process creation, and of constantly creating and destroying database connections, can raise latency.
Code complexity issues: Database connections should be lazily loaded. Consider using an opcode cache.

Threading model: single threaded, multi-process, requests reuse worker processes
Efficiency concerns: The databases see more open connections because connections cannot be shared across process boundaries. Excessive open connections can degrade database performance.
Code complexity issues: Extra code must be present to manage the worker process lifecycle. The code must be able to recover from stale connections. Static variables should be reset with each request.

Threading model: multi-threaded, single long-lived process, dedicated thread per request
Efficiency concerns: Cross-thread connection pooling is very efficient for both the service and the database, but an extra thread is tied up with each request, which limits the number of requests processed in parallel.
Code complexity issues: Because threads must share stateful database connections, developers must be able to identify and fix concurrency bugs such as deadlock, livelock, thread starvation, and race conditions.

Threading model: multi-threaded, single long-lived process, no dedicated thread per request
Efficiency concerns: Cross-thread connection pooling is very efficient for both the service and the database. Expect high throughput for asynchronous operations.
Code complexity issues: I/O bound tasks must run in a separate thread pool. If results must be returned to the caller, the request handler must wait for the thread pool results to finish.

Conclusion

Before thinking about libraries and languages, software architects should reflect on the choice of threading model most appropriate to their engineering culture and competency. Striking the right balance between code complexity and efficiency will help sort out the confusion and give direction in choosing between the various technology stacks available. Because each micro-service has less scope than a monolithic application, consider leaning a little more towards code complexity in order to achieve higher efficiency.

About the Author

Glenn Engstrand is the Technical Lead for the Architecture Team at Zoosk. His focus is server side application architectures that need to run at B2C web scale with manageable operational and deployment costs. Glenn was a breakout speaker at the 2012 Lucene Revolution conference in Boston. He specializes in breaking monolithic applications up into micro-services and in deep integration with Real-Time Communications infrastructure.

Re: Threading is not specific for microservices, but ...

I count five points about threading models in micro-services here in this article. It is true that the same guidance provided by computer science applies equally to both micro-services and to other forms of application development.

Did you read all the way to the end? There is a recommendation to lean more towards complexity in micro-services due to their limited scope.

Re: Threading is not specific for microservices, but ...

Sorry if I sounded like a troll.

I think the topic is very important and interesting to me. And it is also a hard topic to discuss, because threading itself is already complicated.

I would like to see more discussion about how the design decisions of microservices impact what we already know about threading. For example, what if a microservice owns its data storage, and how does that compare to the situation where a number of microservices share one storage service?

Re: Threading is not specific for microservices, but ...

Hi Dong,

I believe you would be interested in taking a look at the single-thread single-owner data model of Baratine (doc.baratine.io/v0.11/architecture/service-arch... ) . A service has a single inbox, where requests are queued and answered by a single thread on an event loop. In this model, requests are nonblocking and the encapsulation of data being accessed only by a single thread prevents possible concurrency issues. In short, you no longer need huge synchronization blocks when accessing data due to an improved encapsulation model encompassing the thread + data.

How about in case of PaaS(Cloud) model?

Hi Glenn, thank you for a nice article. I am wondering what your choice of threading model would be for a microservices architecture on a Platform as a Service hosting model (Heroku, Cloud Foundry, Azure Service Fabric, etc.)?

12factor.net (the concurrency factor) suggests that we should choose the single threaded, multi-process threading model over others because scaling out is simple and reliable. Do you agree with their opinion, or do you have a different perspective?

Re: How about in case of PaaS(Cloud) model?

Thanks for asking these questions, Praneeth. I believe that the concurrency factor is about scaling out with multiple processes. The Heroku folks are actually quite neutral when it comes to threading models. The point that they are trying to make is that a single process won't scale no matter how many threads you use. My article here on InfoQ focuses on threading models. It was not my intention to advocate for systems where everything runs in a single process. I completely agree with the concurrency factor. Design systems where you can scale out on multiple, horizontally partitionable, share nothing processes. In any single process, you still need to decide on which threading model is best for your situation.

I find a lot of sentiment on the web that single threaded models are superior. In this article, I summarize that there is a complexity vs efficiency trade off. Single threaded apps are simpler but less efficient. You may be wondering if efficiency is all that relevant anymore since you can always scale out on more instances in the cloud. That is true but instances cost money. When you are staring at a six figure monthly AWS bill, you suddenly realize just how important efficiency still is.

Is there any concrete evidence that can back up my claims? At the beginning of this year, I conducted some research into this very question. I ran load tests on AWS against two functionally identical micro-services. One micro-service was written in Java using the DropWizard framework and the other was written in JavaScript on Node.js. You might be interested in my findings.