Friday, February 3, 2017

Microflows

Introduction

In this post, I introduce the concept of Microflows. A Microflow is the combination of a Microservice and a well-defined transaction integration flow implemented with an Integration Framework Library.

The concept of Microservices has been very trendy over the last couple of years. From my point of view, Microservices are a way to streamline some of the principles of the Service-Oriented Architecture (SOA) paradigm: service loose coupling, service abstraction, service reusability, service autonomy, service statelessness, and service composability. On the other hand, governance complexity increases, since you end up with many software assets that you need to maintain, version, and standardize. This is why the service discoverability and standardized service contract principles are extremely important in order to achieve a good level of service governance.

For many programmers and software architects, SOA is synonymous with the ESB (enterprise service bus). An ESB tool is essentially a compound of SOA patterns that can be used together to integrate services and monolithic applications that usually do not implement the same communication protocols and that define different data models. These differences usually push the burden of endpoint call orchestration, data model and data format transformations, protocol bridging, and service transaction management [2] entirely onto the ESB layer. Mule and Apache Camel are good lightweight ESB and integration library options to avoid the high cost of big and fat ESB platforms.

Common implementation problems in system integrations

Bad transaction design: in many cases, especially in implementations with ESB platforms, the entire integration flow has no proper transaction design: the flow is composed of many activities that neither share the ACID transaction properties nor run compensating routines when an activity fails. This problem introduces data inconsistency across the applications that receive information handled by the flow, adding a lot of complexity to compensating for system failures.

Weak or missing fault tolerance controls: a bad transaction design usually lacks fault tolerance controls. Sometimes, even well-designed transactions implement weak fault tolerance mechanisms, leaving all the responsibility for guaranteeing data consistency and durability across applications to the ACID properties, especially in distributed systems. A good example is when one of the transaction activities needs to persist data into a system or application that cannot enlist in the ambient transaction, as in the two-phase commit protocol. Another common example is when the integration transaction rolls back without providing a way to automatically retry the execution, notify a compensating process for a later retry, or alert a person via email, a log system, or a dashboard.

In this integration flow, the trigger email activity executed successfully while the archive file activity failed; the lack of fault tolerance controls, like a retry, combined with a bad transaction design leaves the data in an inconsistent state.
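The missing retry control described above can be sketched as a simple retry-with-backoff wrapper around a flow activity. This is a minimal Python illustration, not a framework feature; the `archive_file` activity is a hypothetical stand-in that fails twice before succeeding.

```python
import time

def with_retry(operation, attempts=3, base_delay=1.0):
    """Run `operation`, retrying with exponential backoff before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # escalate: alert, dead letter queue, or compensation
            time.sleep(base_delay * 2 ** (attempt - 1))

# Hypothetical flaky activity: fails twice, then succeeds.
calls = {"n": 0}

def archive_file():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("archive server unavailable")
    return "archived"

result = with_retry(archive_file, attempts=3, base_delay=0)
```

When the final attempt also fails, the exception escapes the wrapper, which is where a dead letter queue or compensating process should take over.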

Lack of metrics definitions for alerting controls: logging process activity is a common practice in integrations to record successful or unsuccessful executions. In many cases, the logs are not further processed or analyzed automatically by other applications to raise alerts when an unexpected result or a severe problem is found. It is common for the logs to be consulted only as a reactive measure (several hours or days later) to find information that can help identify the root cause of a problem. Defining the right metrics to log is not a trivial task; it is important to analyze and classify what the data points are and how they can provide the right metrics for an application to log. Good metrics will facilitate automating alerts to either prevent a problem or react to it rapidly.

Shared or global configuration dependencies: this recurrent practice is found when several applications share the same application host. One example is when more than one application exposes web services via the common ports 80 for HTTP or 443 for HTTPS. In this case, a single component is configured globally at the server level to be shared by all the web service apps. This convention breaks the autonomy of the applications: if the configuration needs to change to fulfill the requirements of any one application, it might affect all the rest, forcing teams to retest every other application that could be impacted. Test automation is a good way to mitigate this risk, something that is usually desired but rarely carried out.

Monolithic scalability: this is found when several integration flows are implemented in one single application where the flows execute concurrently. If one of the flows needs to scale due to increased load or usage, the entire application must be deployed again to satisfy the demand. This can introduce several problems if not all the flows were designed to run with several instances in parallel; a few examples of unwanted results are data inconsistency, excessive consumption of computing resources, and random errors that are difficult to trace.

Despite being very common in system integrations, the previous problems are not unique to integration flows; in fact, there are many best practices and guidelines to avoid them in application development, like the SOA principles and Microservices.

What is a Microflow?

A Microflow is the combination of a Microservice and a well-defined transaction integration flow implemented with an Integration Framework Library.

The main reason for using an Integration Framework Library like Mule or Apache Camel is to take advantage of the 65 enterprise integration patterns [3] they implement. The pattern implementations in these libraries are proven, tested, and maintained, so we do not need to reinvent the wheel with custom implementations when we want a Microservices approach instead of an ESB.
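To make the value of these pattern libraries concrete, here is a hand-rolled sketch of just one of those 65 patterns, the content-based router, in plain Python. Frameworks like Camel and Mule ship tested implementations of this pattern and dozens more, which is precisely the work we avoid redoing; the message shape and queue names below are illustrative assumptions.

```python
def content_based_router(message, routes, default=None):
    """Route a message to the first destination whose predicate matches it."""
    for predicate, destination in routes:
        if predicate(message):
            return destination
    return default  # no rule matched; fall back to a default channel

# Hypothetical routing table: dispatch by message type.
routes = [
    (lambda m: m.get("type") == "order", "orders-queue"),
    (lambda m: m.get("type") == "invoice", "invoices-queue"),
]

dest = content_based_router({"type": "invoice", "id": 7}, routes, default="dead-letter")
unmatched = content_based_router({"type": "refund"}, routes, default="dead-letter")
```

Each of the 65 patterns carries similar hidden details (thread safety, error channels, monitoring hooks) that the libraries have already worked out.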

Anatomy of a Microflow

Microflows promote service autonomy and abstraction: a Microflow exposes only a well-defined, fine-grained public interface to communicate with external applications, like other Microflows. As with Microservices, setting a good application boundary for our Microflows is key. Containers are a good artifact to help us achieve the application division we are looking for.

Docker is very popular in the containerization world, and we can find images for almost all application runtimes and servers. It is therefore very tempting to utilize an app server like Mule Server or Apache Server to host our integration applications. I don't recommend this practice since it can break the desired application independence: if we host more than one app in the Docker app server image, we are practically taking the common integration problems to the next level. Instead, we should host a lightweight application process in our containers, where the main dependency is the application runtime, like the JRE or .NET; Docker OpenJDK [4] is a good image to use for our purpose.
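As a sketch of this one-app-per-container approach, a minimal Dockerfile along these lines packages a single integration flow on top of the OpenJDK image; the image tag and JAR name are illustrative placeholders, not a prescribed setup.

```dockerfile
# One Microflow per container: only the runtime plus one application package.
FROM openjdk:8-jre-alpine
COPY target/my-microflow.jar /app/my-microflow.jar
CMD ["java", "-jar", "/app/my-microflow.jar"]
```

Note there is no app server layer here: the container's only responsibility is to run one application process.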

The principal components of a Microflow

Integration Framework Library: This library must provide a good set of enterprise integration patterns. Integration flows should use these patterns for integration solutions, "ranging from connecting applications to a messaging system, routing messages to the proper destination, and monitoring the health of the messaging system." [3]

Application Package: This is the package of code libraries, compiled or interpreted, that will be executed by the runtime system.

Container Image: A container image is the basis of containers; containers are instances of these images. Docker's glossary defines an image as "the ordered collection of root file-system changes and the corresponding execution parameters for use within a container run-time. An image typically contains a union of layered file-systems stacked on top of each other" [5].

This image shows a blueprint of the anatomy of a Microflow. The integration flow is implemented with an Integration Framework Library like Mule or Apache Camel, wrapped in a JAR file, and hosted in a Docker Java image.

Inter-flow communication

How to achieve inter-service communication is a common discussion when designing Microservices, and it should not be foreign to Microflows either. Many purists may say that the best way for services to communicate is via HTTP with RESTful interfaces. When it comes to system integrations, especially when uptime is not a quality that all systems, applications, and third parties share, there is a need to guarantee successful delivery of the data exchanged. While in Microservices the arguments focus mainly on sync vs. async communication, in Microflows they are framed in terms of system availability and SLAs. Messaging patterns fit most system integration needs very well, regardless of the communication protocol used.

To make our integrations more resilient, we need a buffer between our services when transmitting data. This buffer serves as transient storage for messages that cannot yet be processed by the destination application. Message queues and event streams are good examples of technologies that can be used as this transient storage. The Enterprise Integration Patterns language defines several mechanisms that we can implement to guarantee message delivery and to set up fault tolerance techniques in case a message cannot reach its destination.
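The buffering idea can be sketched with Python's standard `queue` module standing in for a real message broker; in production this transient storage would be a broker such as RabbitMQ or a JMS provider, and the rejecting destination below is a hypothetical stand-in for an unavailable third party.

```python
import queue

broker = queue.Queue()        # transient storage between producer and destination
dead_letters = queue.Queue()  # parking lot for undeliverable messages

def produce(message):
    broker.put(message)

def consume(deliver):
    """Drain the buffer, moving messages that cannot be delivered to the dead-letter queue."""
    while not broker.empty():
        message = broker.get()
        try:
            deliver(message)
        except Exception:
            dead_letters.put(message)

delivered = []

def deliver(message):
    # Hypothetical destination that rejects message 2, e.g. a third party being down.
    if message["id"] == 2:
        raise IOError("destination unavailable")
    delivered.append(message)

produce({"id": 1})
produce({"id": 2})
consume(deliver)
```

The key property is that a delivery failure loses no data: the message stays available, in the dead-letter queue, for retry or alerting.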

A Microflow should not be limited to one single message exchange channel; in many cases, we need to expose different channels for integration, leaving to the consumer the decision of which message channel best fits its integration use case. In fact, I recommend that every Microflow expose both an HTTP/S endpoint and a message queue listener as the inbound entry components of the flow.

Migrating a legacy integration flow to Microflows

To migrate legacy implementations of integration flows to Microflows, it is necessary to have a good understanding of transaction processing [6] and, better yet, experience with it. Transaction processing helps us identify indivisible operations that must succeed or fail as a complete unit; any other behavior results in data inconsistency across the integrating systems. These identified indivisible operations are the transactions that we will separate to start crafting our Microflows. Each transaction must fulfill the ACID properties [7] to provide reliable executions. There are design patterns that can facilitate the transaction design and implementation, like the Unit of Work [8].
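A bare-bones Unit of Work along the lines of [8] can be sketched as: register each activity together with its undo action, then commit them all or roll back the ones already applied. The activities here are hypothetical in-memory operations, not a specific library's API.

```python
class UnitOfWork:
    """Run registered operations as one indivisible unit:
    commit all of them, or undo the ones already applied."""

    def __init__(self):
        self.operations = []  # list of (apply, undo) pairs

    def register(self, apply, undo):
        self.operations.append((apply, undo))

    def commit(self):
        done = []
        try:
            for apply, undo in self.operations:
                apply()
                done.append(undo)
        except Exception:
            # Roll back already-applied operations in reverse order.
            for undo in reversed(done):
                undo()
            raise

state = []

def fail():
    raise RuntimeError("second activity fails")

uow = UnitOfWork()
uow.register(lambda: state.append("A"), lambda: state.remove("A"))
uow.register(fail, lambda: None)
try:
    uow.commit()
except RuntimeError:
    rolled_back = True
```

Because the second activity fails, the first one is undone and `state` ends up unchanged, which is exactly the all-or-nothing behavior an indivisible transaction requires.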

System integrations commonly exchange data among applications distributed across different servers and locations, where no single node is responsible for all the data affecting a transaction. Guaranteeing the ACID properties in this type of distributed transaction is not a trivial task. The two-phase commit protocol [9] is a good example of an algorithm that ensures the correct completion of a distributed transaction. One main goal when designing Microflows is that one Microflow's implementation handles one single distributed transaction as a whole.
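The core of the two-phase commit protocol [9] can be sketched as: first ask every participant to prepare (vote), and only if all vote yes tell them all to commit; otherwise abort everywhere. The participants below are hypothetical in-memory stand-ins for real resource managers such as databases or message brokers.

```python
class Participant:
    """Hypothetical in-memory resource that can vote on and apply a transaction."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote on whether this participant can commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2: make the changes permanent.
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Commit only if every participant votes yes in the prepare phase."""
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False

good = [Participant(), Participant()]
bad = [Participant(), Participant(can_commit=False)]
ok = two_phase_commit(good)
failed = two_phase_commit(bad)
```

A real coordinator also has to log decisions durably and handle crashes between the phases, which is why delegating this to a database or broker that already speaks the protocol is preferable.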

Database management systems and message brokers are technologies that normally provide the mechanisms to participate in distributed transactions. We should take advantage of this and always be diligent in investigating which integrating systems or components can enlist in our Microflow's transaction scope. File systems and FTP servers are commonly not transaction friendly; for these, we need to use a compensating transaction [10] to undo a failed execution and bring the system back to its initial state. We also need to consider what our integration flow must do in case the compensating transaction fails too. Fault tolerance techniques are key to maintaining data consistency in these corner cases. Dead letter queues and retry mechanisms are artifacts that we should always consider to improve the fault tolerance of our transaction processing. If we are creating Web APIs, our APIs must provide operations that consumers can use to undo transactions.

In summary, these are the steps to follow when migrating a legacy integration flow app to Microflows. The steps are not limited to migration, since they can also be used to design Microflows integrations from a green field:

1. Identify all the indivisible transactions in the implementation

2. Separate each transaction into its own flow

3. Promote each transaction to a Microflow

4. Identify which activities and integrating components can enlist in a distributed transaction

5. Define a compensating transaction for each integrating component that cannot enlist in a distributed transaction

Addressing common implementation problems in system integrations with Microflows

Bad transaction design

To address this problem, we carry out steps 1 through 3 of the Microflows migration. First, we need to identify all the indivisible transactions; to achieve this, we can leverage design techniques like state machine diagrams. Each state usually represents one activity that needs to execute integrally to meet the post-conditions required to move to the next state. If any of the conditions are not met, the integration flow must undo any partial execution and move back to the original state. Second, we separate each indivisible transaction into its own flow, which facilitates working on the integrating activity in isolation and supports good practices like unit testing and user acceptance testing. Finally, we promote each transaction to a Microflow to deploy in our solution environments. This helps us treat each Microflow independently, making it easier to maintain and support.

Monolithic scalability

With Microflows, we do not need to redundantly deploy our whole integration blueprint to handle load peaks or to provide high availability to the application consumers. Microflows support high availability since they can scale horizontally, and we can cherry-pick the strategy to scale each one independently: a Microflow with a synchronous web service interface can be set up in a cluster with a minimum of X instances running for availability purposes, whereas a Microflow that listens to a message queue can scale based on computing resource usage or queue length.

Weak or missing fault tolerance controls

The intrinsic transactional design promoted by Microflows helps substantially with improving fault tolerance, making our integrations more resilient in error recovery thanks to the ACID properties. In many cases this is not enough, and we need to put other mechanisms in place to ensure that our transaction will execute successfully. Some examples of fault tolerance mechanisms and patterns are redundancy, error escalation to dead letter queues and poison queues, and compensating activities, among others.

One major advantage of using an integration framework library for the core development of Microservices is that a big subset of the 65 implemented enterprise integration patterns facilitates (if it does not already implement entirely) the correct application of many fault tolerance controls.

Lack of metrics definitions for alerting controls

Another advantage of using queues as a mechanism for communicating between Microflows is that we can easily set up monitoring and alert controls on the queue itself. Alerts can be set up based on message longevity (e.g., alerting when a message has been in the queue for more than X hours), queue length (e.g., alerting when there are more than Y messages in the queue), etc. These alerts will tell us when something is wrong in our integrations, like when a third-party system is not working. Dead letter queues come in very handy for this purpose: we can trigger alerts as soon as a DLQ contains one or more messages. Many monitoring tools offer plug-ins to set up alerts on the integration components based on resource usage limits.
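The queue-based alerting rules above can be sketched as simple threshold checks over a queue snapshot; the thresholds, message shape, and function name here are illustrative assumptions, since real brokers and monitoring tools expose these metrics through their own APIs.

```python
import time

def queue_alerts(messages, max_length=100, max_age_seconds=3600, now=None):
    """Return alert strings for a queue snapshot based on length and message longevity."""
    now = time.time() if now is None else now
    alerts = []
    # Queue length rule: more than Y messages waiting.
    if len(messages) > max_length:
        alerts.append("queue length %d exceeds %d" % (len(messages), max_length))
    # Longevity rule: any message waiting longer than X seconds.
    for m in messages:
        if now - m["enqueued_at"] > max_age_seconds:
            alerts.append("message %s older than %ds" % (m["id"], max_age_seconds))
    return alerts

# Hypothetical snapshot: message "a" has been waiting far too long.
snapshot = [{"id": "a", "enqueued_at": 0}, {"id": "b", "enqueued_at": 9000}]
alerts = queue_alerts(snapshot, max_length=100, max_age_seconds=3600, now=10000)
```

Feeding such checks from the broker's own metrics turns the logs from a reactive forensic tool into a proactive alerting source.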

Business-based alerts must not be forgotten either; we should be able to send notifications to stakeholders when a transaction presents a problem based on business value conditions. The design principles of Microflows facilitate the implementation of business alerts since we can focus, in isolation and based on the use case that a given Microflow implements, on what notifications need to be sent for that integration transaction.

Shared or global configuration dependencies

Microflows promote autonomy in process and resource configuration and access. Each Microflow instance is responsible for accessing computing resources as needed to achieve the successful execution of its integration transaction. One Microflow may be polling an FTP server at a higher frequency than the rest; this is a good example of why the practice of creating global configurations for computer host consumption is not recommended, since otherwise we might be forced to share a global configuration that is not optimal for the transaction's needs. Microflows can be tuned and maintained in isolation, without a significant impact on each other.

Summary

Microflows are the result of applying Microservices design principles to system integration flows implemented with an Integration Framework Library. The practice of Microflows moves system integrations away from a centralized ESB orchestration approach toward a more distributed and decentralized choreography of independent transactions that together form a cohesive system integration solution. Several common integration problems were discussed and then addressed using Microflows principles. Specific examples of Microflows implementations targeting these integration problems will be covered in future articles.

7 comments:

Based on my experience with exactly the same architecture, there are some drawbacks: an integration service (i.e., a Microflow) is significantly smaller than a micro-service (therefore, I prefer calling them nano-services). Having each flow running as a separate micro-service, possibly within a Docker container, results in significant overhead both in footprint and in overall complexity. That usually induces developers to create one gigantic micro-service running all the flows, which, in turn, significantly reduces the benefits of having micro-services for integrations, as there will be slightly less or no autonomy of the integration flows. Could you, please, share your views or observations on the above? Thanks

Usually I recommend limiting the scope of a Microflow to a single use case transaction, and this does not necessarily mean that the integration service is smaller than a micro-service. A use case commonly defines the happy path, alternative paths, and error handling of a complete transaction, and the idea is that all of this is implemented entirely in one Microflow. The overhead of micro-services does not fall on the technical aspect but on service governance; good governance is always needed regardless of the architecture approach anyway. In my experience, governance is even worse in huge monolithic applications, where one change in code usually affects many other functional areas of the application, requiring full regression testing in many cases. The main purpose of my post is to help address some common limitations that we face when building gigantic integration services like the one you described. If a developer decides to go back to the practice of monolithic integrations, it may be that he has not faced those limitations and therefore does not see the benefit of using a micro-services-based approach.

If we combine this approach with an agile methodology, it should be a powerful framework. Of course, I assume that, as you mentioned, the appropriate level of granularity of a microflow is provided via a governance process.

Good argument, Veniamin. I would say it is a matter of perspective. It all comes down to what unit of work we are targeting as part of the micro-service and how the concept of a microflow would fit in to carry out and integrate the tasks from these micro-services, as Javier tried to explain.

Even in the micro-services arena today, there is no clarity or standard framework for defining the unit of work. The rule of thumb we have today is that a micro-service needs to be autonomous and should have enough functionality within it to fulfill a task. Now, what makes a task is subjective, which in turn calls for a good governance process.

I agree with you that governance might help here, as long as the governance process itself has the right tools and guidelines on how to govern. In my opinion, the granularity should be decided considering the cohesion of the services/flows within one micro-service, both functionally and from the lifecycle perspective; after all, one of the reasons to have micro-services is to be able to maintain them separately, both in development and at runtime. This is pretty much the same principle as in SOA service design, where, in the case of web services, the WSDL should maintain cohesion between operations and be single-business-context-centric.

To the point I raised earlier, I believe the number of resulting integration flows is not generic and depends on the particular architectural context. In many cases, a lightweight integration runtime is still a better tradeoff than an integration-only micro-service (i.e., a cohesive set of flows/operations).

The business cohesion is defined in the use cases or in well-structured user stories. This is why I propose to set granularity at this level.

The architectural approach is commonly defined based on the project, team, and tool constraints. If your team is not mature enough to implement micro-services properly, and you don't have the time to overcome the learning curve given the project timelines, then of course it would be a bad decision to force a micro-services approach; a better tradeoff may be a tool well known to your team, like an ESB. Forcing a pattern just for the sake of using it results in an anti-pattern after all. Identifying the driving forces of our project will help us select the architectural approach that better suits our needs and constraints.