Discriminator Multitenancy with Spring and Hibernate (No @Filters)

2017-01-18 by

I'll try to keep this short so we can get right to the
code. Hibernate 4 onward has support for schema and database
multitenancy. If this sounds like Klingon, read this article
followed by the Hibernates
docs.

So what about discriminator based multitenancy, with all the data in a
single schema? This is pretty convenient for application maintenance.
It's also a great for startups, provided you have the flexibility to
change strategies as your business scales. Time and time again - from
StackOverflow questions to the open
Hibernate feature request - I see folks say "use Hibernate filters".

No, don't use Hibernate filters. Here's why:

Filters are activated and parameterized at runtime.
That means code changes, potential bugs, security holes, etc. Sure,
there are one
or two
ways to implement global filters. However, these approaches are
unwieldy and, as we discuss below, filters fall short in other areas.

Hibernate doesn't apply filters for CRUD operations.
I've heard folks say that this isn't important, since CRUD operations
work on identities. Perhaps the semantics have changed with the
addition of multitenancy. For instance, what's to stop a piece of
application code from providing a cross-tenant id to Session.get(...)?
If I spam your RESTful webservice with random id's, could I get
another tenants data?

JPA doesn't support filters. To activate a filter in
JPA you have to break encapsulation (i.e. EntityManager.unwrap(Session.class)).
I know ... this is a Hibernate article. But just saying.

Hibernate can't help you with native SQL. Separate
schema / database multitenancy works fine, because data is segmented
by connection. However, native SQL based packages - reporting tools,
ETL frameworks, even Spring JDBC - will bypass Hibernate and leak data
across tenants.

Legacy code. Again, this is a Hibernate article.
However, let's suppose your legacy app (non-Hibernate) already
implements schema / database multitenancy. Why not support
discriminators as well? The approach presented in this article makes
it trivial.

Enter the Tenant View Filter

The tenant view filter is a horizontal partitioning pattern,
implemented at the database level using views. It's described with
examples in the aforementioned
article. This approach addresses all the issues with Hibernate
filters. The tenant view filter is always active and our existing
application code (JDBC, Hibernate / JPA, external frameworks, etc) just
works.

Let's Get Started

Enough theory, let's work by example. Start by checking out the complete
code. The finished project is included (rjb-blog-multitenancy-complete),
so feel free to skip ahead.

In this article, we'll start with a simple single tenant application (rjb-blog-multitenancy-starter)
and retrofit it for database, schema, and discriminator based
multitenancy. The app defines a single entity - a Product - but this
approach works for any number of entities regardless of their
relationships. You should be familiar with the structure of the starter
app: it's a Spring / Hibernate project with JUnit tests over a mock
HSQLDB database. Just run ProductDaoTest and work your way
into the code. The only other classes of note are AbstractTest,
and AbstractHsqlDbTestEnvironment, which allow us to
initialize the database(s) before the Spring context initialization.

Tenant View Filter In Action

The first step is to implement the tenant view filter. Append the
following code to the database initialization script.

We've replaced the PRODUCT table with a filtered view. The
view filters records by matching the CURRENT_USER id on the tenant_id.
Records with a NULL tenant id are shared, a convention for global
data.

We've adjusted the SKU index to account for tenants i.e. SKU's
must be unique within a tenant, but not across tenants.

We've adjusted the user grants to protect T_PRODUCT from
native SQL.

Note

These database changes are trivial. In fact, you can
implement the tenant view filter pattern for an entire schema in one
shot, using a generic stored procedure. Perhaps I'll post one in the
future.

The CURRENT_USER function may differ across database vendors
(e.g. SUSER_SID() for SQL Server). If you can think of a vendor that
doesn't support it, please comment below.

That's it. Execute ProductDaoTest and everything passes. As
promised, the existing code can't tell the difference! We're now ready
to set up for multiple tenants.

DataSource Routing

This isn't anything earth shattering - Spring JDBC has supported
DataSource routing since 2.x via their AbstractRoutingDataSource
class. The examples
/ documentation on this is pretty extensive, so we won't rehash it
here. You can find tons of posts with implementations similar to this.

This method sets the tenant id for a test before a transaction is
started. In a secure production app, a TenantResolver
implementation would check a request or session scoped holder for the
current user. The Spring Security principle for example.
However, this is just a test, so we set it manually.

At this point, you can run ProductDaoTest again. Everything
will pass. That's because, although we're configured for DataSource
routing, we're defaulting to the original DataSource in our beforeTransaction
method.

Let's make the necessary adjustments to test two tenants at once!

A Second DataSource

If we want to support a second tenant, we need to connect them to
the database and create some mock data to test them against. First,
let's replace the original DataSource definition in our test
environment.

Note

This is a useful JUnit hack. Before a transaction is started,
we inspect the name of the current test. If it contains the text
"tenant_2", we set the corresponding tenant id. We default to tenant
1 so our original tests pass.

At this point, we could run ProductDaoTest again, and
everything would pass. But we're focused on multitenancy here! So let's
add some test data and unit tests for tenant 2. Append the following to
the end of the database setup script.

We can write similar "tenant_2" tests for getBySku, getByCategory
and the remaining CRUD operations. I'll leave that up to you - you can
start by copying the existing tests.

Finished? Wait We Forgot Hibernate!

You may be wondering why we haven't modified the Hibernate config. Not
to mention, why does Hibernate handle multitenancy without any config
changes? On the one hand, this is a testament to the tenant
view filter - it's so seamless, that Hibernate just works. But the
truth is, Hibernate is only working partially. Our test cases pass
because Hibernate executes statements against the MultiTenantDataSource.
The DataSource routing is hidden, but it's still occurring. However,
since Hibernate doesn't know about the multitenancy, we won't get the seamless
second-level cache support. Not good. To sort this out, we have to let
Hibernate in on the deal.

Updated Entity Mapping

Pretty trivial, but our entities need a new tenant id field, with
a corresponding entry in the Product Hibernate mapping.

Pretty simple right? That's because everything is already handled
under the hood of the DataSource. These Hibernate plugins are only
required to switch Hibernate into multitenancy mode. Let's add them to
the Hibernate config.

Running the test fails as expected - without an update guard,
tenant 2 can save records into tenant 1.

Implementing an Update Guard

To implement our guard, we can use database triggers or a Hibernate Interceptor.
Triggers have the advantage of protecting against native SQL updates.
So most legacy apps will require BEFORE INSERT / UPDATE / DELETE
triggers to check the CURRENT_USER against the inbound tenant_id,
and raise an error when they don't match. Triggers, however, have a
couple downsides: (A) triggers have to be added to each table (though
you can use a procedure to add them all at once), and (B) they slow
down low level batch operations.

If you're not concerned about native SQL updates, opt for the Interceptor.
A simple Interceptor can be configured as follows.

Finished? Yep.

At this point, I could wrap up the post with our seamless
discriminator multitenancy. The careful reader, however, may have noted
that I said we would "retrofit it for database, schema, and
discriminator based multitenancy" ...

The truth is, it's already done. Notice that we configured
Hibernate for DATABASE multitenancy. Every tenant has their own
DataSource, and the routing layer (Spring) ensures that Hibernate uses
the correct one. We've essentially just made discriminator multitenancy
work like database multitenancy (i.e. at the connection level).

In fact, we can use this single implementation for all 3
multitenancy methods at the same time. You could scale this app from a
single schema supporting multiple tenants, to a schema per tenant, to a
physical database instance per tenant. Perfect for maintaining your
small customers in a single schema, with larger tenants in a separate
schema, or in a separate database instance. All 3 approaches can be
supported simultaneously from the same code base.

You can test this yourself, but here's the approach:

Add another database definition (e.g.
"rjb-blog-mutlitenancy-2") in the getDataBaseNames method of
com.rjb.blog.multitenancy.test.config.TestEnvironmentImpl.java.

Create another database initialization script (e.g.
"setup-2.sql") and add it to the getDatabasePopulatorScripts
method of com.rjb.blog.multitenancy.test.config.TestEnvironmentImpl.java.
This script doesn't need to implement the tenant view filter - it's a
separate database.

Add another DataSource definition (referencing your new
database) in the getDataSources method of com.rjb.blog.multitenancy.test.config.TestEnvironmentImpl.java.
Use a unique JNDI name.

Add a new tenant id ⇒ JNDI name mapping in the getTenantJndiDataSourceMappings
method of com.rjb.blog.multitenancy.config.PropertiesConfig.java.

Edit the the beforeTransaction method of com.rjb.blog.multitenancy.test.AbstractTest.java,
adding logic to "tenant_3" in the test method name and set the tenant
id.