Case Study: How to Use SOA in Web Apps

Imagine you are a retailer who wants to write a web application that lists your products (books, DVDs, CDs, etc.) at the best possible price on a large e-commerce platform. But the competition is stiff, and competitors will often price the same products only a few cents lower than yours so that their offerings appear at the top of the search results. How do you cope with that?

Ingredients for SOA Web Apps

Let’s think about what exactly you need:

a list of products you would like to sell,

detailed information about these products,

a minimum price at which you want to sell these products,

package dimensions, weight, condition,

up-to-date information about the competition for each product,

a database of your offers.

A rough description of how your app should work for each product you want to sell might be:

gather information about the currently existing offers,

find the cheapest offer,

bump it down by a specified amount (say $0.01),

check that the new price isn’t below the specified threshold,

push the price update to the e-commerce platform.
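The pricing steps above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the method name `reprice` and its parameters are hypothetical, prices are kept as integer cents to avoid floating-point rounding, and falling back to the minimum price when the undercut would go below the threshold is one possible policy (the steps above only say the threshold must be checked).

```ruby
# Hypothetical helper: computes our new price for one product.
# All prices are integer cents. Returns nil when there is nothing to undercut.
def reprice(competitor_prices, min_price:, step: 1)
  cheapest = competitor_prices.min
  return nil if cheapest.nil?      # no competing offers were found

  candidate = cheapest - step      # bump the cheapest offer down by `step`
  [candidate, min_price].max       # assumed policy: never go below our threshold
end

puts reprice([1299, 1250, 1310], min_price: 1000)  # undercuts the cheapest offer
puts reprice([1005], min_price: 1010)              # clamped to our minimum price
```

Pushing the result to the e-commerce platform is left out on purpose, since that part depends entirely on the platform's API.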

Excellent! Let’s build the app!

Divide the app into smaller parts

This mission will require several databases, and some workers doing their own unique jobs. Overall, it sounds like there will be lots of background jobs and database work. The first thing to do is not to build the entire application, but to divide the app into small, specific parts, in keeping with the service-oriented architecture pattern.

The main advantages of the SOA pattern are:

small app size,

easy testing,

independence from other apps,

a simple workflow: do your job, send the result to some message broker (like redis or RabbitMQ) and don’t worry about the rest!

In our case, the application can be split into the following smaller apps:

products

exposes a simple API returning the full information of a specific product

competition

gathers information about competitors’ offers for each product we want to sell

detects changes in competitors’ prices and sends this information to the message broker

exposes a simple API returning the information about competitors’ offers

creator

stores our offers in a database

listens to the message broker for incoming changes from the competition app

uses information from the products and competition apps (using their exposed APIs) to find the best offer and re-price it properly

detects changes in our offers and sends this information to the message broker app

pusher

listens to the message broker for incoming changes in creator

pushes price updates to the e-commerce platform

Having such a structure allows us to allocate specific jobs to separate apps and to focus on developing or improving one of these, instead of thinking about an app as a whole.

Use cache for API calls

This may sound a little obvious, but it’s often forgotten. If your API is expected to be hit 45k times per minute, then in some cases it pays to simply cache the responses. Caching saves database reads, the execution of Ruby code needed to generate JSON responses, and so on.

The first software that comes to mind for caching is memcached. However, I would recommend using redis: it holds its own against memcached on caching performance, and the same server can double as the message broker used in the rest of this setup.
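A minimal sketch of response caching might look like the following. The names `MemoryStore` and `cached_product_json` are hypothetical, as are the key format and the 60-second TTL. `MemoryStore` is an in-memory stand-in that implements only `get` and `setex`, the two redis-rb client methods the sketch relies on, so the example runs without a redis server; in a real app you would pass a `Redis.new` client instead.

```ruby
require "json"

# In-memory stand-in for a redis client (only `get` and `setex`).
class MemoryStore
  def initialize(clock: -> { Time.now.to_f })
    @data = {}
    @clock = clock
  end

  def get(key)
    value, expires_at = @data[key]
    return nil if value.nil? || @clock.call >= expires_at

    value
  end

  def setex(key, ttl, value)
    @data[key] = [value, @clock.call + ttl]
    "OK"
  end
end

# Returns the cached JSON for a product; the block (the expensive part,
# e.g. database reads) runs only on a cache miss.
def cached_product_json(store, product_id, ttl: 60)
  key = "products:#{product_id}"
  cached = store.get(key)
  return cached if cached

  json = JSON.generate(yield(product_id))
  store.setex(key, ttl, json)
  json
end

store = MemoryStore.new
db_reads = 0
2.times do
  cached_product_json(store, 42) do |id|
    db_reads += 1 # stands in for the real database query
    { id: id, title: "Example book", min_price_cents: 1000 }
  end
end
puts "database reads: #{db_reads}"
```

The second request is served from the cache, so the block that simulates the database read runs only once. Choosing the TTL is a trade-off between load and how stale a response is allowed to be.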

Use redis to publish/subscribe and use queues to send information between apps

After the competition app detects changes in competitors’ offers, it needs to send this information to the creator app so that our prices can be adjusted accordingly. There are a couple of ways to do this:

call the creator API about the change

add a background job in the creator’s background worker

put information directly into the creator database

Unfortunately, none of these solutions will do the trick. Why?

Calling the creator API is the best option out of these three, but it’s still not perfect. It requires the competition app to have a reference to the creator app, breaking the SOA rules. Besides, what if another app would also like to know about the changes? We’d have to refactor the code to add another API call.

Adding a background job has a similar problem: the competition app needs a reference to the worker inside the creator app. Changing that worker would also require a matching change in the competition app.

Writing directly into the creator database is the worst idea. Exposing the database to other servers, granting extra privileges, and losing track of the database schema are just a few of the problems with this suggestion.

All these issues can be resolved by using redis publish/subscribe. In short: the competition app publishes information to a specific redis channel, and any other application can listen on that channel to receive it. This fits SOA well: the competition app just sends the information to the channel and doesn’t care what happens to it afterwards. It also allows multiple applications to listen to the same channel and perform their own jobs accordingly.

Using the redis pub/sub system is very easy. Doing it from the command line looks like this:

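Open up a terminal and execute redis-cli subscribe channel

You should see a confirmation of the subscription, after which redis-cli waits for incoming messages.

Now, open another terminal and execute redis-cli publish channel "a message"

You should see the message appear in the first terminal, together with the name of the channel it arrived on.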

This means that the subscriber has received the message “a message” on the “channel” channel. This way you can send any message to a channel, and every subscriber to that channel will receive it.

Doing this in Ruby is very simple. A listener worker might look like this:
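A minimal sketch of such a listener, assuming the redis gem (redis-rb) and its block-based `subscribe` / `on.message` interface. The `listen` helper and the `FakeSubscriber` class are hypothetical: the fake replays canned messages so the sketch runs without a redis server, whereas a real `Redis.new` client would block and wait for messages.

```ruby
# Subscribes to `channel` and hands every incoming message to `handler`.
# `redis` is expected to behave like a redis-rb client (Redis.new).
def listen(redis, channel, handler)
  redis.subscribe(channel) do |on|
    on.message do |_channel, message|
      handler.call(message)
    end
  end
end

# Hypothetical in-memory stand-in for the redis client. It mimics only the
# part of the subscribe interface used above and replays canned messages.
class FakeSubscriber
  Callbacks = Struct.new(:handler) do
    def message(&block)
      self.handler = block
    end
  end

  def initialize(messages)
    @messages = messages
  end

  def subscribe(channel)
    callbacks = Callbacks.new
    yield callbacks
    @messages.each { |m| callbacks.handler.call(channel, m) }
  end
end

received = []
listen(FakeSubscriber.new(["a message"]), "channel", ->(m) { received << m })
p received

# With a running redis server and the redis gem installed, the same helper
# would be used as: listen(Redis.new, "channel", ->(m) { puts m })
```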

And the publishing code is as simple as:
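A minimal sketch of the publishing side, assuming a redis-rb style client that responds to `publish(channel, message)`. The helper name, the channel name `competition.price_changes`, and the JSON payload shape are all assumptions made for illustration; `RecordingPublisher` is a hypothetical stand-in so the sketch runs without a redis server.

```ruby
require "json"

# Publishes a price-change event. `redis` is expected to respond to
# `publish(channel, message)` like a redis-rb client (Redis.new).
def publish_price_change(redis, product_id:, price_cents:)
  payload = JSON.generate({ product_id: product_id, price_cents: price_cents })
  redis.publish("competition.price_changes", payload)
end

# Hypothetical stand-in that records what would have been published.
class RecordingPublisher
  attr_reader :published

  def initialize
    @published = []
  end

  def publish(channel, message)
    @published << [channel, message]
    0 # a real server returns how many subscribers received the message
  end
end

publisher = RecordingPublisher.new
publish_price_change(publisher, product_id: 7, price_cents: 1249)
p publisher.published
```

Note that the publisher knows only the channel name, not who is listening, which is exactly the decoupling described above.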

I recommend using this method as it allows you to easily hook more applications into the same channel without refactoring.

Use proper indices and SSD drives with RAID on servers

Databases need a good foundation to store large amounts of data and to have quick access to it. It’s important to know how indices work and to add them for the columns that will be read the most.

Having too many indices is bad because every time you change something in the database (add, modify, or delete a record), every affected index has to be updated as well. Having too few indices makes retrieving data slow. There is no universal rule for striking this balance: every table is different, and you should design your database structure according to what you need.
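The trade-off can be illustrated with a toy model in plain Ruby. This is not how a real database implements indices (those typically use B-trees on disk); `TinyTable` is a hypothetical class that keeps one Hash per indexed column, which is enough to show that every insert pays once per index while lookups on an indexed column avoid a full scan.

```ruby
# Toy table: one Hash per indexed column, mapping a value to its rows.
class TinyTable
  attr_reader :index_writes

  def initialize(indexed_columns)
    @rows = []
    @indexes = indexed_columns.to_h { |col| [col, Hash.new { |h, k| h[k] = [] }] }
    @index_writes = 0
  end

  # Every insert has to update every index: the write cost of indexing.
  def insert(row)
    @rows << row
    @indexes.each do |col, index|
      index[row[col]] << row
      @index_writes += 1
    end
  end

  # Indexed columns are answered with a direct lookup, others with a scan.
  def where(col, value)
    if @indexes.key?(col)
      @indexes[col].fetch(value, [])
    else
      @rows.select { |row| row[col] == value }
    end
  end
end

offers = TinyTable.new(%i[product_id seller])
offers.insert({ product_id: 1, seller: "us", price_cents: 1249 })
offers.insert({ product_id: 1, seller: "rival", price_cents: 1250 })

puts offers.where(:product_id, 1).size      # indexed lookup
puts offers.where(:price_cents, 1250).size  # full scan, no index on this column
puts offers.index_writes                    # two inserts times two indexes
```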

However, using SSD drives on your machine will definitely help speed up your database. Databases like fast disks and SSDs are fast ;)

Use background jobs for ALL longer-running tasks

I talked about using the redis subscribe method to fetch data from other apps. Why not execute the necessary job instantly, as soon as the message is received? Because executing it takes time, and the listener would be blocked from handling other incoming messages in the meantime. It is always better for a listener task to only put the message into a queue (sidekiq or similar) and to have background job workers fetch the message from the queue and execute it.
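The hand-off can be sketched with Ruby's built-in thread-safe `Queue`, which here stands in for a real job queue such as sidekiq. The message names, the worker count, and the `sleep` that simulates slow work are assumptions made for illustration.

```ruby
jobs = Queue.new     # stands in for the background job queue
results = Queue.new  # collects finished work so it can be inspected

# The listener's only responsibility: enqueue and return immediately.
on_message = ->(message) { jobs << message }

# Background workers take jobs off the queue and do the slow part.
workers = 3.times.map do
  Thread.new do
    # pop returns nil once the queue has been closed and drained
    while (message = jobs.pop)
      sleep 0.01 # simulates slow repricing work
      results << "repriced #{message}"
    end
  end
end

%w[product-1 product-2 product-3 product-4].each { |m| on_message.call(m) }

jobs.close           # no more jobs; lets the workers finish
workers.each(&:join)
puts results.size
```

Because the listener only enqueues, it is free to accept the next message right away, no matter how long each job takes.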

You might think “Hey, but I can just run multiple listeners and execute jobs the same way as having background workers!” - which is not entirely true. If you run, say, 5 listener workers, then every one of them will receive each message from the redis channel, and they will all execute the same task. That’s probably not what you want to achieve, is it?
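The difference between the two delivery models can be shown with a small in-memory simulation. The `Channel` class is a hypothetical stand-in for a pub/sub channel (every subscriber gets its own copy of a message), and Ruby's built-in `Queue` stands in for a job queue (each job is taken by exactly one worker).

```ruby
# Pub/sub semantics: every subscriber receives every message.
class Channel
  def initialize
    @subscribers = []
  end

  def subscribe(&block)
    @subscribers << block
  end

  def publish(message)
    @subscribers.each { |subscriber| subscriber.call(message) }
  end
end

executed_by_listeners = []
channel = Channel.new
5.times { |i| channel.subscribe { |m| executed_by_listeners << [i, m] } }
channel.publish("reprice product 42")

# Queue semantics: each job is handed to exactly one worker.
queue = Queue.new
queue << "reprice product 42"
queue.close

executed_by_workers = Queue.new
workers = 5.times.map do |i|
  Thread.new do
    while (job = queue.pop)
      executed_by_workers << [i, job]
    end
  end
end
workers.each(&:join)

puts "listeners executed the task #{executed_by_listeners.size} times"
puts "workers executed the task #{executed_by_workers.size} time(s)"
```

Five listeners run the task five times, while five queue workers run it once, which is why a single listener feeding a queue is the safer arrangement.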

These tips should help you start building large, high-performance apps which are easy to extend. I hope you find them useful!