unclouding the cloud

Azure Service Bus is a great fit for B2B integrations in Azure. For B2B communication, Queues and Topics in particular deliver an extremely cost-effective, reliable and secure messaging-as-a-service platform. In this post I will share some best practices for setting it up, its benefits, and what to consider. I will also briefly mention the Relay Services, because they can also be part of your integration strategy and are included in the typical B2B scenarios for Service Bus. For the more architectural build-up to this post, please read the previous entry, B2B Integrations using Azure (focus Azure Service Bus) – Part 1 Integration Architecture.

Service Bus Queues

Azure Service Bus queues offer asynchronous messaging as a (PaaS) service. This means that someone can put messages on a queue and someone else can consume them. There wouldn’t be much value in the service otherwise, but it is important to understand that we are only talking about temporary message storage. For permanent storage you should not use queues. Queues can handle great scale compared to traditional synchronous requests. For instance, you do not need a machine park sized for the maximum number of simultaneous requests; you can let messages stack up in the queue and process them over time, as long as you can keep up in the long run and the queue does not fill up.

Cross platform (almost)

As I mentioned in the “build-up post”, Service Bus queues are virtually cross-platform because there are APIs/SDKs that simplify development on several platforms such as .NET, Java, JavaScript, C and Python. In addition, Service Bus supports the AMQP standard, which lets it talk to any client that uses AMQP to post or read messages. And as a fallback it has a very simple REST API that virtually any language capable of making web requests should be able to use.

Messages are unstructured

One of the most difficult things for people to comprehend about Service Bus messages is that they are not forced into any structure. If I have access, I can post messages with nonsense data and I will succeed, as there is no XSD validation or similar when you post a message.

We are all valid bodies!

It is therefore important to understand that you may run into data-quality issues when you try to interpret the messages.

Dealing with corrupt messages

As described above, the queueing components (like many of Microsoft’s Azure services) do not support schema-bound messages. This means that you cannot enforce that all messages in a queue follow a certain format. You have to deal with this when parsing the messages. There is built-in functionality for this: after (by default) 10 attempts to consume a message, the message is automatically moved to the dead-letter queue. Note: This value can be changed both in the portal and through code.

You can, however, move a message to the dead-letter queue yourself as soon as you know that it is corrupt (so you avoid the remaining attempts). In the C# SDK this is a single DeadLetter call on the received message.
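Independent of any SDK, the decision of whether to dead-letter at once or let the broker retry can be sketched in Python as below. This is an illustration only and does not call Service Bus: REQUIRED_FIELDS, decide and the process callback are hypothetical names, and the returned strings merely stand for the corresponding settle operations.

```python
import json

REQUIRED_FIELDS = {"orderId", "amount"}  # hypothetical message contract


def decide(body: str, process) -> str:
    """Return which settle action to take for one received message body.

    'dead-letter': the body can never be processed, so skip the remaining retries.
    'abandon':     a transient failure, let the broker redeliver the message.
    'complete':    processed successfully, remove the message from the queue.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return "dead-letter"  # not even valid JSON
    if not isinstance(data, dict) or not REQUIRED_FIELDS.issubset(data):
        return "dead-letter"  # valid JSON, but not the agreed format
    try:
        process(data)
    except Exception:
        return "abandon"  # may succeed on a later delivery attempt
    return "complete"
```

The point of the split is that only failures that might succeed on a later attempt are worth spending delivery attempts on; a body that fails validation will fail the same way every time.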

Security/Shared Access Signature (SAS) Token

The security is based on the caller creating a signature: an HMAC-SHA256 hash computed over the resource address and an expiry time, using the access secret (like a password) as the signing key. The resulting token, which also carries the name of the access key, is what you supply to the service bus to perform actions. As you never send the secret itself, the requests are safe, not least because the tokens expire. To exploit an intercepted message you first have to crack the transport encryption, and even then you can only reuse the intercepted token for a short period of time, after which it expires and becomes invalid. The access secret is never transmitted, so you cannot create a new token based on an intercepted one. Each access key is also assigned specific rights, so you can limit a Service Bus access key to Send (post a message, call a Relay Service), Listen (consume a message, register a Service Bus relay) or Manage (full rights). So even if someone manages to crack a Listen access secret, they are still limited in what they can do.
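As a rough illustration, a SAS token can be built with nothing but the Python standard library. The signature is an HMAC-SHA256 over the URL-encoded resource URI and the expiry time, keyed with the access secret. The namespace, queue and key names in the usage note are made up, and the now parameter is an addition that keeps the example deterministic.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
from typing import Optional


def create_sas_token(resource_uri: str, key_name: str, key: str,
                     ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Build a Shared Access Signature token for a Service Bus resource."""
    issued = time.time() if now is None else now
    expiry = str(int(issued + ttl_seconds))  # seconds since the Unix epoch
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    # The signature covers the URL-encoded resource URI and the expiry time.
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest))
    return (f"SharedAccessSignature sr={encoded_uri}"
            f"&sig={signature}&se={expiry}&skn={key_name}")
```

A caller would pass the result in the Authorization header of a REST request, for example create_sas_token("https://contoso.servicebus.windows.net/orders", "partnerA", secret). Only the key name, the expiry and the signature travel over the wire; the secret never does.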

TCP vs HTTP Binding vs AMQP

When using the .NET API you can choose to communicate through the HTTP(S) ports or the TCP binding. By default the API is set to automatic, meaning that it tries TCP first and fails over to HTTP(S) if a TCP connection cannot be established. TCP may give you outgoing firewall issues in some cases (if your firewall rules are very restrictive), while HTTP(S) will most likely not give you any problems (as outgoing port 443 is usually allowed). You can read another of my posts that goes into detail about Azure Service Bus, ports and firewalls. A hint: set the communication mode to HTTPS directly if you know that the TCP ports are blocked by a corporate firewall. Otherwise TCP is preferable because it is faster.

Note that you can force a particular protocol by running any of the following lines of code (when using the service bus API)

Ports, usage and description:

HTTP/HTTPS (outgoing port 443 for HTTPS): REST API, or .NET API with ConnectivityMode.Http, ConnectivityMode.Https or ConnectivityMode.AutoDetect (default) when the TCP ports below are not open. Connects to the service bus through HTTPS/HTTP.

9350-9354: Relayed services and brokered messaging (queues) with ConnectivityMode.Tcp, or the fastest connection speed with ConnectivityMode.AutoDetect. Connects to the service bus through the fastest (TCP) communication protocol.

5671-5672: AMQP clients only. These ports are used when communicating over the AMQP protocol.
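To find out in advance which transport a given network allows, a small probe along these lines can help. It is a sketch under assumptions: choose_transport and port_reachable are hypothetical helpers, requiring all of ports 9350-9354 is a deliberately strict choice, and this is not how the SDK implements its automatic detection.

```python
import socket

TCP_PORTS = range(9350, 9355)  # 9350-9354, the TCP ports listed above
HTTPS_PORT = 443


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_transport(host: str, probe=port_reachable) -> str:
    """Pick 'tcp' if every listed TCP port is open, else 'https', else 'blocked'."""
    if all(probe(host, port) for port in TCP_PORTS):
        return "tcp"
    if probe(host, HTTPS_PORT):
        return "https"
    return "blocked"
```

Running choose_transport against your own namespace host from inside the corporate network indicates whether it is worth forcing the HTTPS mode up front.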

Formats/Serialization

You should format the body of your messages as JSON. The first reason is that JSON is quickly gaining ground over XML as the default object serialization format, due to its smaller payload and its support in many tools and programming languages. Secondly, some parts of the service bus are already JSON: the message header properties are a collection of JSON-serialized headers, and to use them you need to deserialize JSON anyway, so why mix? Integration platforms such as BizTalk use XML internally but have no problem deserializing messages from JSON and serializing responses to JSON (at least if you are using a reasonably modern release of the platform).

So when should you serialize as XML instead?

You have an ancient version of your integration platform and don’t plan to upgrade. Mind the expiration of support, though: extended support is expensive, and Microsoft is communicating more and more clearly that they support the current version and one older version of each of their platforms. The trend is also towards automatic upgrades to the latest version so that people run on updated software, so it might be time to upgrade. Don’t get too far behind in your system software.

The posting application is older and cannot serialize JSON. But can it, in that case, create a SAS token?

Don’t put a bottleneck in front of the service bus

In some cases it may be tempting to put a web app with a web service (perhaps using SOAP) in front of the service bus to make it simpler for older applications to work with its messages. You should typically avoid this. To push the same number of messages through the web app as the service bus can handle, you need a really powerful web app, most likely with many instances, to do something that the service bus already does for you as part of its ridiculously low monthly fee. On top of that, you would be paying for two components to get/post messages instead of the only one really needed (the service bus). Use the APIs to access the service bus directly instead; the multiple SDKs and the REST API should enable almost any client to process messages directly.

Don’t bottleneck the service bus

As an alternative to putting an extra layer in front of the service bus, you could add some kind of local API component close to the sender/receiver, instead of fronting the service bus with one common front end. You could also use your integration engine, such as BizTalk, to send/receive messages to/from the service bus on behalf of a source/destination that cannot talk to the service bus directly.

What to think about?

Shared access keys

I mentioned earlier that the shared access tokens are really safe: unless you have the secret key you cannot exploit the service bus. However, if you do have the key, you can do whatever the access key has been configured to do. This dictates two important security measures that make the service bus more difficult to exploit:

Assign minimal rights to resources (read more about this in the section Set up Service Bus for Minimum Access).

Rotate the access keys. This means that you should periodically change the access key for the partner. Read up on how to do this in the section Shared access key rotation.

Set up Service Bus for Minimum Access

First of all, I would recommend removing the default RootManageSharedAccessKey on the namespace and creating a new account for managing your service bus. With the default in place, an offender already knows the name of the shared access key and only has to crack your secret, assuming he/she knows your namespace; removing it makes it more difficult to gain access to your namespace. So set up another manage account (with a different key name) for managing the service bus. Maybe I am a bit paranoid here, but it is a bit like not naming your computer’s administrator account Administrator.

Remove the default root shared access

Set up access on resources rather than on the namespace. In a common send/receive scenario this means that you assign the partner Send rights on a request queue (incoming requests) and Listen rights on the response/receive queue. By default this forces the partner to create different tokens for sending and receiving messages (or to use different connection strings, as you would probably call them when using the .NET API). If the partner has access to many queues and you have set up key rotation, you can end up in an administrative nightmare and cause the partner some headaches in matching the right key to the right queue. I will discuss how I handled this problem later in the post, in the section on handling rights on multiple resources.

Set up separate shared access keys for each partner. This enables you to remove access for a certain partner without affecting the others. A partner does not have to be a company; it may very well be a company/actor combination (like a system). You can also monitor activity better. For example, if multiple partners are to send messages to the same queue, don’t give them the same key/secret for posting messages.

If you have an integration engine that is going to consume messages, like BizTalk, you could perhaps use a key/secret placed on the namespace that BizTalk should use to simplify management. But that is basically the only scenario that I recommend that you add keys and secrets on the entire namespace.

Minimal permissions

Function: Permission
Send message to queue: Send
Receive message from queue: Listen
Call Relay Service: Send
Register and serve relay service calls from WCF Service: Listen
Manage namespace: Manage

Shared access key rotation

As previously mentioned you should rotate your access keys. Microsoft did think about this when designing the product so each access key has two allowed secrets (one primary and one secondary).

A good approach would be to (at regular intervals) move the primary secret to the secondary slot, generate a new primary secret, and then communicate the new secret to the partner in a secure way. This allows the partner to switch to the new primary secret at their own pace and in their own release windows/low-usage hours, as they can still access resources with the old one, which is now the secondary.

How to change access key

In code that could look like this… (download the source code if you want a full running sample). The sample below accepts the name of a Shared Access Policy and swaps the keys. It also does this for all Shared Access Policies with the same name (in my case the partner name), so if you have a partner sending and receiving on different queues, this sample will change both to the same key.

Note that in a big scenario this would probably be done by a scheduled WebJob, Logic App or Azure Function running at predefined intervals. With Logic Apps one could easily also imagine sending the new access secrets directly to a customer through SMS or a secure email service. Future post, perhaps…
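The rotation logic itself can be sketched in Python as follows. This is an in-memory illustration rather than the downloadable sample: AccessPolicy, generate_key and rotate_keys are hypothetical names, no Service Bus API is called, and the 32-byte key length is an assumption.

```python
import base64
import secrets
from dataclasses import dataclass


@dataclass
class AccessPolicy:
    """Hypothetical stand-in for a Shared Access Policy on one queue."""
    entity: str
    key_name: str
    primary_key: str
    secondary_key: str


def generate_key() -> str:
    """Create a random base64-encoded secret (32 bytes is an assumed length)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def rotate_keys(policies, key_name: str) -> str:
    """Demote the primary secret to secondary on every policy with this name,
    install one shared new primary, and return it for secure distribution."""
    new_key = generate_key()
    for policy in policies:
        if policy.key_name == key_name:
            policy.secondary_key = policy.primary_key  # old secret keeps working
            policy.primary_key = new_key
    return new_key
```

Because the previous primary survives as the secondary, a partner that has not yet switched keeps working until the next rotation.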

At least once vs at most once – Message consumption pattern

Another thing you should consider is your consumption pattern. The following scenario can absolutely happen: you lock and read a message from a queue, but before you can commit the message (DELETE in REST) it is returned to the queue (by error or by design) and another thread picks it up again. The second processor does not know the state of the first one: how far it got, and whether its changes were rolled back or committed. While there are ways to reduce this risk, the simplest being a longer lock time or renewing the lock in your code, you cannot be 100% sure that your message is not consumed more than once in this scenario. And while WebJobs automatically renew the lock as the program runs, the program may still fail or hang, and if you are unlucky it fails after you have committed your changes (even if you use transactions). So you should consider which of the following two alternatives is worse.

The message is consumed more than once: not good in your accounting or banking system, for example. Here you may want to process the message at most once, but write to a log if processing fails so that you can repair from the logs. If you catch an error you may want to move the message to another queue or the dead-letter queue instead of letting it be consumed again, so that you can take action on it. Risking crediting a bank account multiple times is not OK; it is better to raise an exception.

It does not matter much if the message is consumed multiple times. While duplicate posts on your Facebook timeline are no fun, they may not matter much in the long run. You could remove the extra post quite easily, or possibly even detect it by looking for identical posts from the last 15 minutes. This is also an OK scenario if you have some sort of logging, GPS tracking or fire alarm detection system where the potential loss of data is worse than duplicates.

So depending on your desired consumption pattern you would process messages differently.

At-most once
API

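// The receive mode is chosen when the client is created (connectionString and queueName are placeholders).
var queueClient = QueueClient.CreateFromConnectionString(connectionString, queueName, ReceiveMode.ReceiveAndDelete);
var message = queueClient.Receive(); // Directly removes the message from the queue
// Do your stuff.
// If it fails you must handle it yourself if you want to retry, for example by putting the message somewhere else.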
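For the at-least-once side, where a message may be delivered again after a lost lock, one common way to make duplicates harmless is to remember which message ids have already been handled. The Python sketch below shows the idea in memory only; IdempotentConsumer is a hypothetical helper, and a real system would keep the seen ids in durable storage, ideally updated in the same transaction as the business change.

```python
class IdempotentConsumer:
    """Processes each message id at most once, even if it is delivered repeatedly."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # in production this would be durable storage

    def on_message(self, message_id: str, body) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False  # duplicate delivery: acknowledge without reprocessing
        # If the handler raises, the id is not recorded, so a redelivery retries it.
        self._handler(body)
        self._seen.add(message_id)
        return True
```

With this in place the receiver can safely use the lock-and-commit style of receiving: a redelivered message is acknowledged but not applied a second time.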

Handling rights on multiple resources

Everything in the service bus can be done programmatically, including things such as creating queues and access keys, changing secrets etc. As you probably don’t want someone going into the portal to add 10 separate queues and assign a new partner access to existing resources, you probably want to build an administrative user interface. And I am not talking about the Service Bus Explorer but about a more customized solution. I quickly discovered the need for this when implementing Azure Service Bus at one of my customers.

But back to the problem first. So you have a partner that will post changes to 10 of your queues and will receive messages from 15 (response) Service Bus queues. You don’t want them to have 25 access secrets, one for each queue, but you still want to apply minimal access rights.

What I did was to add the same access key to several resources using a little web app (similar code is available in the download) where I could add new partners, select which of the existing queues they should access and with what rights, and add new queues.

SB Admin Partner Tool

If you add Shared Access Policies through the portal, they get separate access secrets on each queue, which is very secure but can be difficult to maintain; the tool instead assigns the same secret to all the queues that the partner should be able to access. With a tool like this you can also set up your other queue settings the way you want them without being an accustomed and trained user of the portal. You can download a working sample of this related to this post (see beginning of post). Still, separate access keys offer better security, at the cost of more problems when the keys need to be updated.
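The core of such a tool, one key spread over many queues with queue-specific rights, can be sketched in a few lines of Python. The data shapes and the name build_rules are assumptions made for illustration; the resulting rules would still have to be written to each queue through the management API.

```python
VALID_RIGHTS = {"Send", "Listen", "Manage"}


def build_rules(partner: str, key: str, grants: dict) -> dict:
    """Map each queue name to an access rule that shares one key name and secret.

    grants maps a queue name to the set of rights the partner needs on it.
    """
    rules = {}
    for queue_name, rights in grants.items():
        unknown = set(rights) - VALID_RIGHTS
        if unknown:
            raise ValueError(f"unknown rights for {queue_name}: {sorted(unknown)}")
        rules[queue_name] = {
            "keyName": partner,
            "primaryKey": key,
            "rights": sorted(rights),  # minimal rights, decided per queue
        }
    return rules
```

The partner then handles a single secret, while each queue still only grants the rights that partner actually needs there.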

Send Multiple messages in one go (Batch Sends)

If you are going to send multiple messages that happen at the same time, for example CreateCustomer and CreateOrder, you can use the batch send functionality, which is more efficient than sending them individually. Once the service bus receives the batch it splits it into separate messages, so it is very useful if you can find a use case for it.

Note that Content-Type should be set to application/vnd.microsoft.servicebus.json

Important: As far as I can find out, the old limitation that a batch POST cannot be more than 256 KB in size is still valid (should any reader have other information, please leave a comment with a link). So you cannot send huge batches.
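To stay under that limit, a sender can split its messages into several batches before posting. The Python sketch below assumes that the batch body is a JSON array of message objects and that only the serialized body counts towards the 256 KB; both are assumptions to verify against the REST documentation, and split_into_batches is a hypothetical helper.

```python
import json

MAX_BATCH_BYTES = 256 * 1024  # the 256 KB limit mentioned above


def _size(batch) -> int:
    """Size in bytes of the JSON document that would be posted for this batch."""
    return len(json.dumps(batch).encode("utf-8"))


def split_into_batches(messages, max_bytes: int = MAX_BATCH_BYTES):
    """Group messages, in order, into batches whose JSON body fits in max_bytes."""
    batches, current = [], []
    for message in messages:
        if _size([message]) > max_bytes:
            raise ValueError("a single message exceeds the batch size limit")
        if current and _size(current + [message]) > max_bytes:
            batches.append(current)  # adding one more would overflow, so flush
            current = []
        current.append(message)
    if current:
        batches.append(current)
    return batches
```

Each returned batch is then posted separately with the content type given above. Re-serializing on every step is wasteful for very large inputs but keeps the sketch short.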

BizTalk / Partitions

If you intend to use BizTalk 2013 (or higher) as a consumer/provider of messages through the SB-Messaging adapter, you must make sure that the queues are NOT set up as partitioned queues. Read the entire post I wrote earlier about this here.

Sidenote: Partitioned queues simplify having multiple consumers of messages and give higher overall throughput. You can also set up rules for prioritized messages. Read more about partitions in the Azure Service Bus documentation.

Broadcasting changes with Topics

The above scenario works best for partner-to-partner communication, such as someone posting an update to be handled somewhere else, or a request for specific data where a response matching that request is fetched and put on another queue.

If you have another scenario, such as parties subscribing to changes in the customer or article databases, you probably want to use topics instead. In that scenario you put one message on a “queue” that can then be received by multiple receivers (for example systems). Each receiver consumes its own copy of the message, so there is no competition between subscribers. Another great feature of topics is that you can add logic to the subscriptions: one subscriber may only want Swedish orders, while another only wants “big” orders (over € 10000).
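The fan-out behaviour can be illustrated with a tiny in-memory model in Python. It only mimics the concept: in Service Bus itself the rules are declared as filters on each subscription (for example SQL-like expressions over message properties), not as Python functions, and the Topic class here is hypothetical.

```python
from collections import defaultdict


class Topic:
    """Toy model of a topic: every matching subscription gets its own copy."""

    def __init__(self):
        self._filters = {}
        self.inboxes = defaultdict(list)

    def subscribe(self, name, matches=lambda message: True):
        """Register a subscription with an optional filter predicate."""
        self._filters[name] = matches

    def publish(self, message: dict):
        """Deliver a copy of the message to every subscription whose filter matches."""
        for name, matches in self._filters.items():
            if matches(message):
                self.inboxes[name].append(dict(message))


topic = Topic()
topic.subscribe("swedish-orders", lambda m: m["country"] == "SE")
topic.subscribe("big-orders", lambda m: m["amount"] > 10000)
topic.subscribe("audit")  # no filter: receives everything
```

Publishing a Swedish order of 500 would reach swedish-orders and audit, while a German order of 20000 would reach big-orders and audit; neither subscriber takes a message away from the other.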

Synchronous integration with (most likely) on premise resources

The previously described scenarios (queues and topics) are all aimed to deliver asynchronous message-based communications. What if you need to access on premise resources synchronously?

Before continuing the discussion about Azure Service Bus: there are other alternatives for communication between Azure (the cloud) and on-premise resources, including Site-to-Site VNets, Point-to-Site VNets, the On-premises Data Gateway (I believe it uses Service Bus Relays internally, but I am not absolutely sure) and Logic Apps (some adapters can be set up to connect to on-premise databases in a few simple steps, and to SOAP services with some small adjustments). But this post is about the service bus, so I will discuss how to work with the service bus manually instead.

Service Bus relays enable you to create WCF services (REST/SOAP) that run on premise inside your corporate firewall and are therefore likely not subject to many restrictions on what they can access. They can access files, databases etc. (you name it) just like any other on-premise application. The big difference is that they also go out to Azure (through TCP or HTTP(S)) and register themselves in the service bus (if they have the correct rights set up, of course). You can then assign access secrets to B2B partners and allow them to call (through Service Bus authorization) right into your on-premise hosted services. The benefit, of course, is that the callers do not know the details of your on-premise implementation, only how to connect to the Service Bus endpoint.

You pay for the number of hours a relay service is online * the number of instances. So the more instances -> the more it will cost.

High availability for Service Bus Relays

You should set up your relays with failover that matches your needs. A B2B integration may be considered online as long as the systems can communicate with the Service Bus queue (for queue-based messaging), but for relays you can’t really say the same if you get a 404 because the services are unreachable. So when designing a B2B solution based on Service Bus Relays you should do the following:
1. Add retry functionality at the sender (caller). This is easily done if you use BizTalk or other integration software for calling the relay.
2. Have multiple instances (ideally 3 or more, unless performance dictates even more) of the relay service (the WCF service on premise). If you have two data centres with different internet routing, it could be wise to put one instance at the other location.

Have multiple listeners!
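For callers that do not sit behind an integration engine, the retry from step 1 can be as simple as the Python sketch below. The function name, the use of ConnectionError as the "relay unreachable" signal, and the doubling delay are illustrative choices, not anything prescribed by Service Bus.

```python
import time


def call_with_retry(call, attempts: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Invoke call(); on ConnectionError wait and retry with a doubling delay."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
```

The sleep function is injectable so the behaviour can be tested without waiting. With several listeners registered, a retry also gets the chance to land on a healthy instance.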

Avoid long-running relay services

As always when it comes to synchronous calls you should avoid long-running processes. For report creation and other time consuming services put the request on a queue handler (SB queue, MSMQ queue, Windows Service Bus) and have a background job creating the reports. You could then return “Report request submitted” directly and can have your service handling other incoming requests instead. This is not specific to Service Bus Relays but is a good architecture overall. Consider that the first version of Azure had Web Roles and Worker Roles. Web roles handled web requests and worker roles handled heavy/batch processing.
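The "accept now, work later" pattern can be sketched with Python's standard library, using an in-process queue as a stand-in for the Service Bus queue or MSMQ. All names are hypothetical, and a real relay service would hand the request to a durable queue rather than to a thread in the same process.

```python
import queue
import threading

report_requests = queue.Queue()  # stand-in for a durable queue
finished_reports = []


def submit_report_request(report_id: int) -> str:
    """Called by the synchronous service: enqueue the work and return at once."""
    report_requests.put(report_id)
    return "Report request submitted"


def _worker():
    """Background job: takes requests off the queue and does the slow work."""
    while True:
        report_id = report_requests.get()
        finished_reports.append(f"report-{report_id}")  # placeholder for report creation
        report_requests.task_done()


def start_worker() -> threading.Thread:
    """Start the background job as a daemon thread and return it."""
    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```

The service call returns as soon as the request is queued, so it is immediately free to handle other incoming requests while the report is created in the background.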