Focus: Robustness
Description: Set robustness goals, and consider what should happen, and what should not happen, in the face of likely (and less likely) errors.

In the last post, we increased robustness by redesigning an order import application: we separated the process of downloading orders from the process of importing them into the warehouse system (see previous posts for more details on the example we're working on). The redesign required some understanding of the existing design’s effect on the robustness of the system, and some experience to decide what to do about it.

It may be argued that we traded complexity for robustness: we split one component into two, and we need to add some glue to integrate them. However, each of the two new components is now conceptually simpler and smaller than the combined one it replaces, so it should be simpler to implement each part in a robust manner, as error handling for the download process no longer needs to be intermingled with error handling for the import process. Also, the glue needed in this case is more or less just a shared folder. Modularization and componentization of your software is therefore often a good thing when it comes to robustness.

Now let’s consider the implementation of each component. In this post we take a look at the downloading process, and we leave the importing process for a later post. Please note that I’m going to have to show simplified implementations, or they would become way too long to reason about in a single post. But I hope they will get the message across anyway.

We’ll start by setting some robustness goals for our processes.

1. Errors should not cause the processes to exit or stop processing.

2. A single order file should not be able to block further processing.

3. Orders should never be permanently lost.

4. All correct orders should be imported exactly once (not zero times, and not twice).

These goals will help us take appropriate action when handling errors.

It will be practically impossible to guarantee that we meet all of the goals, especially the first one, regardless of which errors occur. In the face of some errors, such as the complete loss of a necessary resource like disk space (or power), or an access denied failure that needs manual administrator intervention to correct, we may have no other option but to stop processing. However, we should hopefully be able to resume processing when the error condition is cleared, without needing a restart.

Also, if we suffer a hard drive crash after downloading orders but before importing them, some orders will probably be lost. There’s always a trade-off between the cost of failure and the cost of preventing failure.

The Order Downloading Process

How can we make the downloading robust? We need to consider the problems that could occur. Some likely errors are:

Network issues that could temporarily prevent access to the FTP server, affect downloads in progress, or make the file server unreachable.

Permission issues, such as not being able to delete a downloaded file from the FTP server, or even to log in to it.

In order to meet goal 1, we’ll simply add an outer catch-all exception handler (line 10). We also add a short pause to the outer loop (line 13), so that we don’t overload any services if there’s an error.

In order to meet goal 2, we will use a file mask (Order*.xml, say) to include only relevant files. Even if we have a folder on the FTP server dedicated to the order files, an irrelevant file might accidentally be placed there and this could potentially interfere with processing - say, a 12 GB file called ‘webshopdb.bak’, or thousands of image files, would likely cause issues and, if nothing else, consume time and download bandwidth unnecessarily. We’ll also add an inner catch-all exception handler (line 24) to catch errors while processing individual files.
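To make the file mask idea concrete, here is a minimal Python sketch (not the code from this series). It assumes the FTP folder listing is already available as a plain list of names; the function name select_order_files and the sample file names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Mask taken from the text; adjust it to your own naming scheme.
ORDER_FILE_MASK = "Order*.xml"


def select_order_files(remote_names, mask=ORDER_FILE_MASK):
    """Return only the names that match the mask, keeping the listing order."""
    return [name for name in remote_names if fnmatchcase(name, mask)]


# Hypothetical folder listing with a few files that do not belong there.
listing = ["Order1001.xml", "webshopdb.bak", "logo.png", "Order1002.xml"]
print(select_order_files(listing))  # ['Order1001.xml', 'Order1002.xml']
```

Note that fnmatchcase compares case-sensitively and must match the whole name, so a half-uploaded ‘Order1001.xml.tmp’ would be skipped as well.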

In order to meet goal 3, we will only delete files from the FTP once we are certain we’ve successfully stored a local copy (we’ll reach line 23 only if line 20 didn’t throw an error).
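As a rough illustration of how goals 1 to 3 could fit together, here is a minimal Python sketch. It is not the listing the line numbers above refer to, so those numbers do not apply to it. Everything in it is an assumption made to keep the example runnable: InMemoryFtp stands in for a real FTP client, a dictionary stands in for the import folder, and the names download_pass and run_download_loop are invented.

```python
import logging
import time
from fnmatch import fnmatchcase

log = logging.getLogger("order_download")


class InMemoryFtp:
    """Hypothetical stand-in for an FTP client; a value of None simulates a failing download."""

    def __init__(self, files):
        self.files = dict(files)

    def list_names(self):
        return sorted(self.files)

    def download(self, name):
        data = self.files[name]
        if data is None:
            raise IOError(f"download of {name} failed")
        return data

    def delete(self, name):
        del self.files[name]


def download_pass(ftp, import_folder, mask="Order*.xml"):
    """Run one pass over the FTP folder and return the names stored locally."""
    stored = []
    for name in ftp.list_names():
        if not fnmatchcase(name, mask):  # goal 2: skip irrelevant files
            continue
        try:
            data = ftp.download(name)
            import_folder[name] = data  # stands in for MoveToOrderImportFolder
            stored.append(name)
            ftp.delete(name)  # goal 3: delete only after the local copy is stored
        except Exception:  # goal 2: one bad file must not block the others
            log.exception("could not process %s, retrying on the next pass", name)
    return stored


def run_download_loop(ftp, import_folder, pause_seconds=5.0, sleep=time.sleep, max_passes=None):
    """Outer loop; max_passes exists only so that the sketch can be stopped in a test."""
    passes = 0
    while max_passes is None or passes < max_passes:
        try:
            download_pass(ftp, import_folder)
        except Exception:  # goal 1: an error must never end the process
            log.exception("download pass failed")
        sleep(pause_seconds)  # short pause so a persistent error does not overload any service
        passes += 1
```

A file that fails to download stays on the server and is simply retried on the next pass, while the remaining files are still processed.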

As the code is written now, if we cannot delete a file from the FTP once downloaded, we will download many copies of this order and place them in the import folder (and keep doing so until the file can be removed from the FTP). This means that the import process must be able to handle duplicate orders, or we’ll violate goal 4! If we instead tried deleting before calling MoveToOrderImportFolder, undeletable files would delay the orders instead of creating duplicates, so it’s a trade-off.
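The difference between the two orderings can be shown with a small Python simulation. This is only a sketch under assumptions: UndeletableFtp is a made-up stand-in for an FTP server on which deletes always fail, a list stands in for the import folder, and the function names are hypothetical.

```python
class UndeletableFtp:
    """Hypothetical FTP stand-in whose delete always fails, mimicking a permission issue."""

    def __init__(self, files):
        self.files = dict(files)

    def download(self, name):
        return self.files[name]

    def delete(self, name):
        raise PermissionError(f"cannot delete {name}")


def store_then_delete(ftp, name, import_folder):
    data = ftp.download(name)
    import_folder.append((name, data))  # stands in for MoveToOrderImportFolder
    ftp.delete(name)  # failing here leaves a copy that was already handed over


def delete_then_store(ftp, name, import_folder):
    data = ftp.download(name)  # the local copy exists before we delete
    ftp.delete(name)  # failing here stops us before the hand-over
    import_folder.append((name, data))


def copies_after(step, passes=3):
    """Run the given step for a number of passes and count the copies handed over."""
    ftp = UndeletableFtp({"Order1001.xml": b"<order/>"})
    import_folder = []
    for _ in range(passes):
        try:
            step(ftp, "Order1001.xml", import_folder)
        except Exception:
            pass  # plays the role of the inner catch-all; the file is retried next pass
    return len(import_folder)


print(copies_after(store_then_delete))  # 3: one duplicate per pass
print(copies_after(delete_then_store))  # 0: the order is delayed, not duplicated
```

In both variants the file stays on the FTP server, so the order is not lost; what differs is whether the import process sees duplicates or sees nothing until the permission problem is fixed.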

We’ll take a closer look at the order import process in the next post.