CodeSod

Tuesday, May 1, 2018

In this article we will be extending an ExecutorService implementation with monitoring capabilities. This will help us measure a number of pool parameters (e.g., active thread count, work queue size) in a live production environment. It will also enable us to measure task execution time, successful task count, and failed task count.

Monitoring Library

As our monitoring library we will be using Metrics. For the sake of simplicity we will be using a ConsoleReporter, which reports our metrics to the console. For production-grade applications we should use an advanced reporter (e.g., the Graphite reporter). If you are unfamiliar with Metrics, I recommend going through the getting started guide.

Let's get started.

Extending the ThreadPoolExecutor

We will be using ThreadPoolExecutor as the base class for our new type. Let's call it MonitoredThreadPoolExecutor. This class will accept a MetricRegistry as one of its constructor parameters -
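The class skeleton might look like the following. This is a sketch assuming the Dropwizard Metrics API (com.codahale.metrics); the class and field names are illustrative -

```java
import com.codahale.metrics.MetricRegistry;

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MonitoredThreadPoolExecutor extends ThreadPoolExecutor {

    private final MetricRegistry metricRegistry;

    public MonitoredThreadPoolExecutor(int corePoolSize,
                                       int maximumPoolSize,
                                       long keepAliveTime,
                                       TimeUnit unit,
                                       BlockingQueue<Runnable> workQueue,
                                       MetricRegistry metricRegistry) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
        this.metricRegistry = metricRegistry;
    }
}
```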

Registering Gauges to measure pool-specific parameters

A Gauge is an instantaneous measurement of a value. We will be using it to measure different pool parameters like the number of active threads, task queue size, etc.

Before we can register a Gauge, we need to decide how to calculate a metric name for our thread pool. Each metric, whether it's a Gauge, or a Timer, or simply a Meter, has a unique name. This name is used to identify the metric source. The convention here is to use a dotted string which is often constructed from the fully qualified name of the class being monitored.

For our thread pool, we will be using its fully qualified name as a prefix to our metric names. Additionally, we will add another constructor parameter called poolName, which clients will use to specify instance-specific identifiers.
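The gauge registration could then be sketched as follows, again assuming the Dropwizard Metrics API; the metric names and the poolName field are illustrative -

```java
private void registerGauges() {
    metricRegistry.register(MetricRegistry.name(getClass(), poolName, "corePoolSize"),
            (Gauge<Integer>) this::getCorePoolSize);
    metricRegistry.register(MetricRegistry.name(getClass(), poolName, "activeThreads"),
            (Gauge<Integer>) this::getActiveCount);
    metricRegistry.register(MetricRegistry.name(getClass(), poolName, "maxPoolSize"),
            (Gauge<Integer>) this::getMaximumPoolSize);
    metricRegistry.register(MetricRegistry.name(getClass(), poolName, "queueSize"),
            (Gauge<Integer>) () -> getQueue().size());
}
```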

For our example we are measuring core pool size, number of active threads, maximum pool size, and task queue size. Depending on monitoring requirements we can register more or fewer Gauges to measure different properties.

Measuring Task Execution Time

To measure the task execution time, we will override two life-cycle methods that ThreadPoolExecutor provides - beforeExecute and afterExecute.

As the name implies, the beforeExecute callback is invoked prior to executing a task, by the thread that will execute it. The default implementation of this callback does nothing.

Similarly, the afterExecute callback is invoked after each task is executed, by the thread that executed it. It is invoked even if the task throws an uncaught RuntimeException or Error. The default implementation of this callback also does nothing.

We will be starting a Timer in our beforeExecute override, which will then be used in our afterExecute override to get the total task execution time. To store a reference to the Timer we will introduce a new ThreadLocal field in our class.
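A sketch of the two overrides, assuming the Dropwizard Metrics Timer API; the timer name is illustrative -

```java
private final ThreadLocal<Timer.Context> taskExecutionTimer = new ThreadLocal<>();

@Override
protected void beforeExecute(Thread thread, Runnable task) {
    super.beforeExecute(thread, task);
    // start a timer for this task; the Context is stored per-thread
    Timer timer = metricRegistry.timer(MetricRegistry.name(getClass(), poolName, "task-execution"));
    taskExecutionTimer.set(timer.time());
}

@Override
protected void afterExecute(Runnable task, Throwable throwable) {
    // stop the timer started in beforeExecute, recording the elapsed time
    Timer.Context context = taskExecutionTimer.get();
    context.stop();
    taskExecutionTimer.remove();
    super.afterExecute(task, throwable);
}
```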

Recording number of failed tasks due to uncaught exceptions

The second parameter to the afterExecute callback is a Throwable. If non-null, this Throwable refers to the uncaught RuntimeException or Error that caused the execution to terminate. We can use this information to partially count the total number of tasks that were terminated abruptly due to uncaught exceptions.

To get the total number of failed tasks, we must consider another case. For tasks submitted using the execute method, any uncaught exception propagates and is available as the second argument to the afterExecute callback. However, for tasks submitted using the submit method, uncaught exceptions are captured by the returned Future and swallowed by the executor service. This is clearly explained in the JavaDoc (emphasis mine) -

Note: When actions are enclosed in tasks (such as FutureTask) either explicitly or via methods such as submit, these task objects catch and maintain computational exceptions, and so they do not cause abrupt termination, and the internal exceptions are not passed to this method. If you would like to trap both kinds of failures in this method, you can further probe for such cases, as in this sample subclass that prints either the direct cause or the underlying exception if a task has been aborted

Fortunately, the same doc also offers a solution for this, which is to examine the runnable to see if it's a Future, and then get the underlying exception.

Combining these approaches, we can modify our afterExecute method as follows -
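A sketch of the combined version, closely following the JavaDoc sample; the counter names are illustrative -

```java
@Override
protected void afterExecute(Runnable runnable, Throwable throwable) {
    Timer.Context context = taskExecutionTimer.get();
    context.stop();
    taskExecutionTimer.remove();

    // For submit()-ed tasks the runnable is a Future; probe it to surface
    // exceptions that the executor service would otherwise swallow
    if (throwable == null && runnable instanceof Future<?> && ((Future<?>) runnable).isDone()) {
        try {
            ((Future<?>) runnable).get();
        } catch (CancellationException ce) {
            throwable = ce;
        } catch (ExecutionException ee) {
            throwable = ee.getCause();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    if (throwable == null) {
        metricRegistry.counter(MetricRegistry.name(getClass(), poolName, "successful-tasks")).inc();
    } else {
        metricRegistry.counter(MetricRegistry.name(getClass(), poolName, "failed-tasks")).inc();
    }

    super.afterExecute(runnable, throwable);
}
```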

Conclusion

In this article we have looked at a few monitoring-friendly customizations to an ExecutorService implementation. As always, any suggestions, improvements, or bug fixes will be highly appreciated. The example source code has been uploaded to GitHub.

Sunday, April 22, 2018

Introduction

ORM frameworks like JPA simplify our development process by helping us avoid a lot of boilerplate code during object <-> relational data mapping. However, they also bring some additional problems to the table, and N + 1 is one of them. In this article we will take a short look at the problem, along with some ways to avoid it.

The Problem

As an example I will use a simplified version of an online book ordering application. In such an application I might create an entity like the one below to represent a Purchase Order -
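A simplified sketch of what such an entity might look like; the field names and mappings are illustrative -

```java
@Entity
public class PurchaseOrder {

    @Id
    @GeneratedValue
    private Long id;

    private String customerId;

    // eagerly fetching the children is what sets up the N + 1 behavior
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "order_id")
    private List<OrderItem> orderItems;

    // getters and setters omitted
}
```

Fetching all orders of a customer could then be done with a single query -

```java
List<PurchaseOrder> orders = entityManager
        .createQuery("SELECT po FROM PurchaseOrder po WHERE po.customerId = :customerId",
                PurchaseOrder.class)
        .setParameter("customerId", customerId)
        .getResultList();
```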

This one query will return all the purchase orders a customer has. However, in order to fetch the order items, JPA will issue a separate query for each individual order. If, for example, a customer has 5 orders, then JPA will issue 5 additional queries to fetch the order items included in those orders. This is known as the N + 1 problem - 1 query to fetch all N purchase orders, and N queries to fetch all order items.

This behavior creates a scalability problem for us when our data grows. Even a moderate number of orders and items can create significant performance issues.

The Solution

Avoiding Eager Fetching

This is the main reason behind the issue. We should get rid of all eager fetching in our mappings; it has almost no benefits that justify its use in a production-grade application. We should mark all relationships as lazy instead.

One important point to note - marking a relationship mapping as lazy does not guarantee that the underlying persistence provider will also treat it as such. The JPA specification does not guarantee lazy fetching; it's a hint to the persistence provider at best. However, I have never seen Hibernate do otherwise.

Fetching only the data that is actually needed

This is always recommended regardless of the decision to go for eager/lazy fetching.

I remember one N + 1 optimization that I did which improved the maximum response time of a REST endpoint from 17 minutes to 1.5 seconds. The endpoint was fetching a single entity based on some criteria, which for our current example would be something along the lines of -
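An illustrative version of that query; it fetches the whole entity even though only the id is used afterwards (the status criterion is hypothetical) -

```java
PurchaseOrder order = entityManager
        .createQuery("SELECT po FROM PurchaseOrder po "
                + "WHERE po.customerId = :customerId AND po.status = :status",
                PurchaseOrder.class)
        .setParameter("customerId", customerId)
        .setParameter("status", status)
        .getSingleResult();
Long id = order.getId();
```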

The id was the only data from the result needed for subsequent calculations.

There were a few customers who had more than a thousand orders. Each of the orders in turn had a few thousand additional children of a few different types. Needless to say, thousands of queries were being executed against the database whenever a request for those orders arrived at this endpoint.

The fix was to select only the needed columns into a DTO using a constructor expression, keeping two points in mind -

The target DTO must have a constructor whose parameter list matches the columns being selected

The fully qualified name of the DTO class must be specified
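Sketched with a hypothetical PurchaseOrderId DTO (the package name is illustrative) -

```java
public class PurchaseOrderId {

    private final Long id;

    public PurchaseOrderId(Long id) { // parameter list matches the selected columns
        this.id = id;
    }

    public Long getId() {
        return id;
    }
}
```

and the projection query, using the DTO's fully qualified name -

```java
Long id = entityManager
        .createQuery("SELECT NEW com.example.PurchaseOrderId(po.id) "
                + "FROM PurchaseOrder po WHERE po.customerId = :customerId",
                PurchaseOrderId.class)
        .setParameter("customerId", customerId)
        .getSingleResult()
        .getId();
```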

Using Join Fetch / Entity Graphs

We can use JOIN FETCH in our queries whenever we need to fetch an entity with all of its children at the same time. This results in much less database traffic, and thus improved performance.

The JPA 2.1 specification introduced Entity Graphs, which allow us to create static/dynamic query load plans. Thorben Janssen has written a couple of posts (here and here) detailing their usage which are worth checking out.
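For our example, a JOIN FETCH query might look like the following (illustrative) -

```java
// loads the orders together with their items in a single query
List<PurchaseOrder> orders = entityManager
        .createQuery("SELECT DISTINCT po FROM PurchaseOrder po "
                + "JOIN FETCH po.orderItems WHERE po.customerId = :customerId",
                PurchaseOrder.class)
        .setParameter("customerId", customerId)
        .getResultList();
```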

Wednesday, November 1, 2017

In my previous article I wrote about an input validation design which replaces hard-to-maintain-and-test if-else blocks. However, as some readers pointed out, it has a drawback - if the input data has more than one validation error, the user will have to submit the request multiple times to find all of them. From a usability perspective this is not a good design.

An alternative to throwing exceptions when we find a validation error is to return a Notification object containing the error(s). This will enable us to run all the validation rules on the user input and catch all violations at once. Martin Fowler wrote an article detailing the approach. I highly recommend giving it a read if you haven't done so already.

In this article I will refactor my previous implementation to use an ErrorNotification object to validate user inputs.

As a first step, I will create an ErrorNotification object which encapsulates my application errors -
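A minimal sketch of such a class; the exact shape (a list of messages, comma-joined output) is my assumption, not necessarily the original -

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorNotification {

    private final List<String> errors = new ArrayList<>();

    public void addError(String message) {
        errors.add(message);
    }

    public boolean hasErrors() {
        return !errors.isEmpty();
    }

    public String getAllErrors() {
        // joins all recorded errors into a single message
        return String.join(", ", errors);
    }
}
```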

and then change all the implementations to adapt to the new return type as well.

Initially, I will change all the implementations to return an empty error object, so that I can get rid of the compilation errors. For example, I will change the ItemDescriptionValidator in the following way -
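Sketched below; the validator and model names follow the earlier examples and may not match the original code exactly -

```java
public class ItemDescriptionValidator implements OrderItemValidator {

    @Override
    public ErrorNotification validate(OrderItem orderItem) {
        // temporary: return an empty notification just to make the code compile
        return new ErrorNotification();
    }
}
```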

After fixing the compilation errors, I will now start replacing the exceptions with notification messages in each validator. To do this, I will first modify the related tests to reflect my intent, and then modify the validators to pass the tests.

I am a bit uncomfortable with the use of the ifPresentOrElse method above. The main reason I am using it here is that Optional doesn't have something like an ifNotPresent method, which would have allowed me to take an action only when the value is absent (a request to my readers - if you know a better way to do this, please chime in!).
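A self-contained illustration of the pattern (not the article's actual validator; the parsing helper and the error message are hypothetical) -

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class IfPresentOrElseExample {

    static Optional<Double> tryParse(String value) {
        try {
            return Optional.of(Double.parseDouble(value));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        // ifPresentOrElse forces us to supply an action for both cases,
        // even though the "present" branch here is a no-op
        tryParse("not-a-number").ifPresentOrElse(
                value -> { },
                () -> errors.add("price is not a valid number"));
        System.out.println(errors); // prints [price is not a valid number]
    }
}
```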

After this refactoring, all tests in the ItemValidatorTest class pass with flying colors. Great!

Making this change causes one of the tests in OrderServiceIT to fail, as it was specifically looking for an exception with its cause set to NumberFormatException when the price is invalid. After our refactoring we can safely remove this check, as it is no longer relevant.

The full source code for this article has been pushed to GitHub (specific commit URL is here).

The validate method is not well written. It is very hard to test. Introducing a new validation rule in the future is also hard, and so is removing or modifying any of the existing ones. From my experience, most people write a few generic assertions for this type of validation check, typically in an integration test class, touching only one or two (or more, but not all) of the validation rules. As a result, future refactoring can only be done in Edit and Pray mode.

We can improve the code structure if we use Polymorphism to replace these conditionals. Let's create a common super type for representing a single validation rule -
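A sketch of the abstraction; the names are illustrative, and in this pre-notification version of the code a failed validation simply throws an exception -

```java
public interface OrderItemValidator {

    // throws IllegalArgumentException when the order item is invalid
    void validate(OrderItem orderItem);
}
```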

In order for this to work we will have to declare each of the validator implementations as a Spring Bean.

We could improve our abstraction even further. The OrderService now accepts a List of validators. However, we can change it to be aware of only the OrderItemValidator type, and nothing else. This gives us the flexibility of injecting either a single validator or any composition of validators in the future.

So now our goal is to change the order service to treat a composition of order item validators in the same way as a single validator. There is a well-known design pattern called Composite which lets us do exactly that.

Let's create a new implementation of the validator interface, which will be the composite -
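A sketch of the composite; the class name is illustrative -

```java
public class CompositeOrderItemValidator implements OrderItemValidator {

    private final List<OrderItemValidator> validators;

    public CompositeOrderItemValidator(List<OrderItemValidator> validators) {
        this.validators = validators;
    }

    @Override
    public void validate(OrderItem orderItem) {
        // runs every child validator in turn; the composite is itself a validator
        validators.forEach(validator -> validator.validate(orderItem));
    }
}
```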

The benefits of this approach are many. The whole validation logic has been completely abstracted away from the ordering service. Testing is easier. Future maintenance is easier. Clients only know about one validator type, and nothing else.

However, all of the above comes with some problems too. Sometimes people are not comfortable with this design. They may feel that this is just too much abstraction, or that they will not need this much flexibility or testability for future maintenance. I'd suggest adopting this approach based on the team culture. After all, there is no single right way of doing things in software development.

Note that for the sake of this article I have taken some shortcuts here as well. These include throwing a generic IllegalArgumentException when validation fails; you'd probably want a more specific custom exception in a production-grade application to distinguish between different scenarios. The decimal parsing is also done naively; you might want to settle on a specific format and then use DecimalFormat to parse it.

Sunday, March 5, 2017

A few days ago I ran into a problem while dealing with a LocalDateTime attribute in JPA. In this blog post I will try to create a sample problem to explain the issue, along with the solution that I used.

Consider the following entity, which models an Employee of a certain company -
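A sketch of the entity; the field names are illustrative -

```java
@Entity
public class Employee {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    // the java.time.LocalDateTime attribute that caused the trouble
    private LocalDateTime joiningDate;

    // getters and setters omitted
}
```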

It was evident that JPA was NOT treating the joiningDate attribute as a date or time, but as a VARBINARY type. This is why the comparison to an actual date was failing.

In my opinion this is not a very good design. Rather than throwing something like an UnsupportedAttributeException, it was silently trying to convert the value to something else, and thus failing the comparison seemingly at random. This type of bug is hard to find unless you have a strong suite of automated tests, which fortunately was the case for me.

Back to the problem now. The reason JPA was failing to convert LocalDateTime appropriately was very simple: the latest version of the JPA specification at the time (2.1) was released before Java 8, and as a result it cannot handle the new Date and Time API.

To solve the problem, I created a custom converter implementation which converts the LocalDateTime to java.sql.Timestamp before saving it to the database, and vice versa. That solved the problem -
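A sketch of such a converter using the JPA 2.1 javax.persistence API; the class name is illustrative -

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

import javax.persistence.AttributeConverter;
import javax.persistence.Converter;

@Converter(autoApply = true)
public class LocalDateTimeAttributeConverter
        implements AttributeConverter<LocalDateTime, Timestamp> {

    @Override
    public Timestamp convertToDatabaseColumn(LocalDateTime attribute) {
        return attribute == null ? null : Timestamp.valueOf(attribute);
    }

    @Override
    public LocalDateTime convertToEntityAttribute(Timestamp dbData) {
        return dbData == null ? null : dbData.toLocalDateTime();
    }
}
```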

The above converter will be applied automatically whenever I save a LocalDateTime attribute. Alternatively, I could explicitly mark only the attributes that I wanted to convert, using the javax.persistence.Convert annotation -
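For example, with a hypothetical LocalDateTimeAttributeConverter -

```java
// applies only to this attribute; autoApply would be turned off on the converter
@Convert(converter = LocalDateTimeAttributeConverter.class)
private LocalDateTime joiningDate;
```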

From the above example, it is clear that subtyping in Java generics works differently from the usual class-based subtyping. A list of numbers cannot directly point to a list of longs, even though Long is a subtype of Number. To get around this restriction, we will have to use an upper bounded wildcard -
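A quick sketch of the assignment rules -

```java
import java.util.ArrayList;
import java.util.List;

public class WildcardExample {

    public static void main(String[] args) {
        List<Long> longs = new ArrayList<>();

        // List<Number> numbers = longs;        // does not compile
        List<? extends Number> numbers = longs; // fine: Long is a subtype of Number
        numbers = new ArrayList<Float>();       // a list of floats works too

        System.out.println(numbers.isEmpty()); // prints true
    }
}
```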

which will allow us to refer to a list of floats as well.

A List<? extends Number> is then treated as something like a super type of both List<Long> and List<Number>. In fact, as long as a type X is a subtype of Number, List<? extends Number> will be able to refer to List<X> without any compilation errors.

Using an upper bounded wildcard makes our code much more flexible to future changes. Consider the following method which tries to find the sum of the longs -
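Something along these lines; the class and method names are illustrative -

```java
import java.util.List;

public class SumCalculator {

    // accepts only List<Long>; a List<Integer> cannot be passed in
    static long sum(List<Long> numbers) {
        long total = 0;
        for (long value : numbers) {
            total += value;
        }
        return total;
    }
}
```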

If we change the method signature to use upper bounded wildcard, then we can also pass a list of integers to it -
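A sketch of the more flexible version -

```java
import java.util.List;

public class FlexibleSumCalculator {

    // accepts a list of any subtype of Number: Long, Integer, Float, ...
    static long sum(List<? extends Number> numbers) {
        long total = 0;
        for (Number value : numbers) {
            total += value.longValue();
        }
        return total;
    }
}
```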

Without the wildcard we would have to first convert the integers to longs, and then pass the resulting list to the method.

An upper bounded wildcard brings its own set of restrictions though. We cannot add any new value to the list we are pointing to (except null); allowing such additions would again let us break type safety (see the first example). Also, retrieved values can only be treated as the upper bound type. Using an upper bounded wildcard thus effectively gives us a read-only view of the list.
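The lower bounded wildcard flips these restrictions; a quick sketch -

```java
import java.util.ArrayList;
import java.util.List;

public class LowerBoundExample {

    public static void main(String[] args) {
        List<? super Number> numbers = new ArrayList<Object>();

        numbers.add(1L);  // any subtype of Number can be stored
        numbers.add(2.5);

        Object first = numbers.get(0); // but retrieval only gives us Object
        System.out.println(first);     // prints 1
    }
}
```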

The above list will allow us to store any type which is a subtype of Number. However, we can only retrieve items from it as Object. Allowing the retrieval of any other type would have resulted in a ClassCastException at runtime, as we have no way of knowing exactly which subtype of Number was stored in the list.

Reference resolution also works the opposite way of the upper bound. A List<? super Number> can reference any list of type X, where X is a super type of Number.

To summarize, then, a List<? extends X> means -

We can use this reference to point to a list of type Y, where Y is a subtype of X.

We cannot store anything into the list other than null.

We can only refer to the retrieved items from this list as X.

whereas a List<? super X> means -

We can use this reference to point to a list of type Y, where Y is a super type of X.

We can store any value into it which is a subtype of X.

We can only refer to the retrieved items from this list as Object.

When I am trying to read/store values into these lists, I find it useful to read List<? extends X> as -

1. A list of items from which we get values of type X (when operating on it)

2. A variable which can point to a list of a subtype of X (during reference assignment)

Similarly, I read List<? super X> as -

1. A list of items to which we might add values of type X (when operating on it)

2. A variable which can point to a list of a supertype of X (during reference assignment)

This is the reason upper bounded wildcard references are sometimes called Producers, since we can only read from them in order to do something effective. Similarly, lower bounded wildcards are called Consumers. People sometimes use a small mnemonic for it, PECS, which basically translates to "Producer extends, Consumer super".