In Relation To Emmanuel Bernard

Red Hat has been exploring serverless (aka FaaS) through the Fabric8 Funktion project.
It is long overdue for us to get serious on the subject.
We are turning all of our libraries into services starting today.
This is going to be a multi year effort but we are committed to it.

We thought long and hard about which service to start with.
In our performance lab, we realized that the slowest and most expensive part of serverless functions was CPU branch misprediction.
In a serverless approach, you want to squeeze as many operations per CPU cycle as possible.
Any misprediction has huge consequences and kills all of the mechanical sympathy effort we put into libraries like vert.x.

In our experiments, we found that the optimal solution was to get rid of if branches in serverless functions.
We are proud to introduce IF as a Service or IFaaS (pronounced aye face).
Your code changes from:

We have been using it for a year now and forked the javac compiler to convert each if branch into a proper if service call.
But we also completely reshaped how we write code into pure linear code, replacing all if primitives with their service call equivalent.
This is great because it also fixed the tab vs space problem: we no longer have any indenting in our code.
Two stones in one bird!
I cannot emphasize enough how much development speed we gained in our team by fighting less on each pull request against these tab bastards.

The good thing about this external if service is that you can add a sidecar proxy to cache and simulate branch predictions on OpenShift and scale them horizontally ad nauseam.
This is a huge benefit compared to the hardcoded and embedded system that is a CPU.
We typically change the branch implementation 3 to 5 times a day depending on user needs.

FAQ

Is else a function too?

Else is not part of the MVP but we have it working in the labs.
We are currently struggling to implement for loops, more specifically nested ones.
We keep running into HTTP error 310 (too many redirects).

One thing you don’t hear enough about in the microservices world is data.
There is plenty of info on how your application should be stateless, cloud native, yadayadayada.
But at the end of the day, you need to deal with state and store it somewhere.

I can’t blame this blind spot.
Data is hard.
Data is even harder in an unstable universe where your containers will be killed randomly and eventually.
These problems are being tackled on many fronts, though, and we do our share.

But once you have dealt with the elasticity problem, you need to address a second problem: data evolution.
This is even more pernicious in a microservices universe where:

The data structure can and will evolve faster per microservice.
Remember this individual puppy is supposed to be small, manageable and as agile as a ballet dancer.

For your services to be useful, data must flow from one microservice to another without interlock.
So they share data structure directly or via copy, implicitly or via an explicit schema, etc.
Having to release microservices A, B and C together because they share a common data structure is a big no no.

My colleague Edson Yanaga has written a short but insightful book on exactly those problems.
How to deal with data in a zero downtime microservice universe.
How to evolve your data structure in a safe and incremental way.
How to do that with your good old legacy RDBMS.
And a few more subjects.

If you are interested in, or embarking on, a microservices journey, I recommend you read this focused book.
It will move your thinking forward.

Recently, the team has been discussing improvements around Hibernate (ORM) usage within cloud based apps and microservices.
In particular, we looked at the fundamental assumption that things will break regularly on these platforms and that services should be resilient to failures.

The problem

In microservices or cloud architectures, services are started in different orders (usually beyond your control).
It is possible for the app using Hibernate ORM to be started before the database.
At the moment, Hibernate ORM does not like that and will explicitly fail (exception) if it can’t connect to the database.

A related concern is what happens if the database stops for a while after the app running Hibernate ORM has started, then resumes working shortly after.

Solution 1: Hibernate waits and retries at boot time

Some users have asked us to delay and retry the connection process in case the database is not present at boot time.
That would work and solve the bootstrap problem.
It would not solve the case where the database goes away while the app is running, but there at least you have your transaction and the error propagation mechanism covering you.
Plus, getting rid of the boot time problem at development time would already be quite nice.
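
To make the wait and retry idea concrete, here is a minimal sketch in plain Java. It is not Hibernate ORM code: the `BootRetry` class, its `withRetry` method and the exponential backoff policy are assumptions made for illustration, and the action passed in stands for whatever opens the first database connection.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Hibernate ORM): bounded retries with exponential backoff. */
final class BootRetry {

    private BootRetry() {
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * The delay doubles after each failed attempt; the last failure is rethrown.
     */
    static <T> T withRetry(Callable<T> action, int maxAttempts, Duration initialDelay) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMillis = initialDelay.toMillis();
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Wait before the next attempt, then back off further.
                    Thread.sleep(delayMillis);
                    delayMillis *= 2;
                }
            }
        }
        throw last;
    }
}
```

An application could wrap its first connection attempt this way, for example `BootRetry.withRetry(() -> dataSource.getConnection(), 10, Duration.ofSeconds(1))`, where `dataSource` is whatever `javax.sql.DataSource` the app already has.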

I understand that this is probably a quick win to implement, but we had better be sure of the problem before adding that feature.
It feels to me that Hibernate ORM bootstrap is not the ideal area to fix that problem.
But at the end of the day if it helps enough, it would be worth it.

We are exploring that option and considering alternatives and that’s where we need your feedback.

Wait and retry vs platform notification

In this blog post, I mention the wait and retry approach.
It can be replaced by a notification from the cloud platform when a service is up / down.

This avoids the regular polling process at the cost of having to rely on various integrations from various cloud platforms.

Solution 1.b: The connection pool waits and retries

It would probably be better if the connection pool Hibernate ORM uses implemented that logic, but Hibernate supports more than one connection pool.
That’s a minor variation on solution 1.

Solution 2: Hibernate boots in non-functioning mode

If Hibernate ORM cannot connect to the database, it continues its bootstrap process.
If an EntityManager is asking for a connection while the database is still unavailable, a well defined exception is raised.
To avoid flooding the system, a wait and retry system for connection checking would be in place so that only a few attempts are made even when lots of EntityManager instances are requested.
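
Here is a rough sketch of what such a throttled check could look like, in plain Java. Everything in it is hypothetical: `DatabaseGate`, `DatabaseNotReadyException` and the injected probe and clock are names invented for this example, not Hibernate ORM APIs. The point is only that many callers can ask for a connection while the database is probed at most once per interval.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical gate: lets the app boot without a database and throttles connectivity probes. */
final class DatabaseGate {

    /** Hypothetical "well defined" exception raised while the database is unavailable. */
    static final class DatabaseNotReadyException extends RuntimeException {
        DatabaseNotReadyException(String message) {
            super(message);
        }
    }

    private final BooleanSupplier probe;    // returns true when a connection can be obtained
    private final LongSupplier clockMillis; // injectable clock, e.g. System::currentTimeMillis
    private final long minIntervalMillis;   // minimum delay between two probes
    private boolean ready;
    private boolean probedOnce;
    private long lastProbeAt;

    DatabaseGate(BooleanSupplier probe, LongSupplier clockMillis, long minIntervalMillis) {
        this.probe = probe;
        this.clockMillis = clockMillis;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Called whenever a connection is requested; probes at most once per interval. */
    synchronized void ensureReady() {
        if (ready) {
            return; // this sketch does not handle the database going away later
        }
        long now = clockMillis.getAsLong();
        if (!probedOnce || now - lastProbeAt >= minIntervalMillis) {
            probedOnce = true;
            lastProbeAt = now;
            ready = probe.getAsBoolean();
        }
        if (!ready) {
            throw new DatabaseNotReadyException("database not reachable yet, still retrying");
        }
    }

    /** Could back an explicit "not ready" status API or a health endpoint. */
    synchronized boolean isReady() {
        return ready;
    }
}
```

The clock is injected only to keep the sketch testable; the concurrency story is deliberately simplistic (a single lock) and says nothing about the dialect detection problem.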

There are some subtle difficulties here on concurrency and on the fact that we use info from the bootstrap connection to configure Hibernate ORM.
The most visible option guessed from the connection is the dialect to use.
On the other hand, stopping the app boot process while waiting and retrying like solution 1 proposes is probably not without its challenges.

The exception raised by Hibernate ORM upon DB inaccessibility needs to be treated properly by the application (framework) being used.
For example, a global try catch could move the application into degraded mode or propagate the execution error to the client (e.g. HTTP error 500).
It might even be helpful if Hibernate ORM exposed the not ready status via an explicit API.

This could be tied to a health check from the cloud platform.
The application would report the not ready but trying status via a /health endpoint that the orchestrator would use.

On database connection breaking

There are many reasons for failing to connect to a database:

1. Host unreachable

2. DB server denying access temporarily (e.g. load)

3. Incorrect port setting

4. Incorrect credentials

5. And many more

Should the system go into the wait and retry mode for cases 3, 4, 5?
Or should it refuse to deploy?
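
One way to frame that decision is to classify the JDBC failure. The sketch below is an illustration under assumptions, not a recommendation: `ConnectionFailures` and `Decision` are hypothetical names, and the policy (retry on transient or connection errors, refuse on authorization errors and on anything unknown) is just one possible choice. It relies on the standard `java.sql` exception hierarchy and on SQLSTATE classes 08 (connection exception) and 28 (invalid authorization specification). Note that a wrong port typically surfaces as a connection error too, so this approach cannot tell an incorrect port from an unreachable host.

```java
import java.sql.SQLException;
import java.sql.SQLInvalidAuthorizationSpecException;
import java.sql.SQLTransientException;

/** Hypothetical classifier mapping a JDBC connection failure to a boot-time decision. */
final class ConnectionFailures {

    enum Decision { WAIT_AND_RETRY, REFUSE_TO_DEPLOY }

    private ConnectionFailures() {
    }

    static Decision classify(SQLException e) {
        String state = e.getSQLState();
        // Bad credentials will not fix themselves: fail fast.
        if (e instanceof SQLInvalidAuthorizationSpecException
                || (state != null && state.startsWith("28"))) {
            return Decision.REFUSE_TO_DEPLOY;
        }
        // Transient errors and connection errors (SQLSTATE class 08) may go away.
        if (e instanceof SQLTransientException
                || (state != null && state.startsWith("08"))) {
            return Decision.WAIT_AND_RETRY;
        }
        // Assumption of this sketch: anything unknown is treated as permanent.
        return Decision.REFUSE_TO_DEPLOY;
    }
}
```

How reliable this is depends on the JDBC driver filling in the SQLSTATE or using the typed exception subclasses, which not all drivers do consistently.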

Solution 3: the smart app (framework)

Another solution is for the app to have a smart overall bootstrap logic.
It tries to eagerly start but if a Hibernate ORM connection error occurs, only the inbound request framework is started.
It then regularly retries the boot and, in the meantime, returns HTTP 500 errors or similar.

This requires an app framework that could handle that.
It embeds circuit breaker logic in the app and can better react to specific errors.
I wonder how common such frameworks are though.

This is in spirit the same solution as solution 2 except it is handled at the higher level of the app (framework) vs Hibernate ORM.

Solution 4: the cloud / MSA platform restarts the apps

An arguably better solution would be for the cloud platform to handle these cases and restart apps that fail to deploy in these situations.
It likely requires some kind of service dependency management and a bit of smartness from the cloud infra.
Upon a specific error code thrown at boot time, the infrastructure would trigger a wait and retry deployment logic.
There is also a risk of a dependency circularity leading to a never starting system.

I guess not all cloud infra offers this and we would need an alternative solution.
OpenShift lets you express dependencies to make sure a given service is started before another.
The user would have to declare that dependency of course.

Solution 5: proxy!

Another solution is to put proxies in front of the app's inbound requests and/or between the app and the database.
The proxy is the silver bullet that lots of cloud platforms use to solve world hunger in the digital universe.

How many proxies and routing logic does it take to serve a "Hello world!" in the cloud?
Who proxies the proxies?

:)

This approach has the advantage of not needing customized apps or libraries.
The drawback is more intermediary points between your client and the app or data.

If the proxy is in front of the application, then it needs a health check or feedback from the boot system to wait and retry the re-deployment of the application on a regular basis.
I’m again not certain cloud infrastructures offer all of this infrastructure.

If the proxy is between Hibernate ORM and the database (like HAProxy for MySQL),
you still face a timeout exception on the JDBC side,
which means the application will fail to boot.
But at least, the proxy could implement the wait and retry logic.

Some of the Hibernate team members are gathering in Paris next week.

If you are around, come join us for a Questions & Answers session at the ParisJUG.
It’s Tuesday December 2nd 2015 at 19:30.
We will discuss anything Hibernate, no slide, simply come with your questions on:

Every year at Red Hat, we organise a Red Hat Week to celebrate our culture.
And in good open source community fashion, each local office celebrates this event however it pleases.
This year, I proposed to do a Devoxx4Kids for the children of French Red Hatters.

Red Hat 4 Kids (aka a copy paste of Devoxx 4 Kids) introduces children aged 6 to 12+ to the notion of programming.
Sharing our knowledge to teach them what daddy or mummy does. Sounds cool.

I knew it was doable since the awesome Devoxx4Kids team has successfully rolled out these events around the world.
But my engineering spider-senses told me it would be quite a humongous task.
I was right but it’s one of those projects where you need to jump first and think later.

What did we do?

For the 6 to 10 year old boys and girls, we did a Scratch workshop.
Scratch is awesome: it has all the basics of programming (blocks, loops, conditions, events, event sharing, etc.).
Here, there is no need to prepare much: explain the basics and let the kids go (see below).

For the 10+ kids, we did the Arduino workshop: programming electronics for the win :)
We reused the Devoxx4Kids one verbatim.

We were also lucky to have the Aldebaran team with us.
So the kids moved up from the basics of programming to full Nao robot programming.
Nao is a serious guest star and actually easier to program than Scratch :)

What are the challenges?

You need to prepare everything material wise

We installed a fresh Fedora 22 on all laptops to get everything set up the same:
this really helped as we did not have to fight different environments.
To be safe, we used ethernet and not WiFi: some WiFi routers don’t enjoy too many laptops at once.

Don’t go too long

The 6 to 10 year olds started to slowly drift after one hour.
Don’t go over 1h30 per workshop and take breaks between them.
The 10+ kids actually went beyond our 1h30 and chose coding over cakes: success!

Limit the introduction and slides as much as possible

Developers don’t like slides.
It turns out kids disregard them after 4 minutes, tops.
I had to cut the presentation quickly and instead…​

Do customized assistance

Show them by pair-kid-programming how to do the basic things and
let them do what they want, helping them achieve their goal:
story, adventure, games, etc.
One grown-up for one to two laptops, two kids per laptop. Max.
They will be much more engaged.

Special thanks

It’s quite a special feeling to see a good chunk of the kids being that engaged,
asking tougher and tougher questions over time and preferring coding to cakes.

I have many people to thank for this project.
Hopefully I won’t forget too many of them:

the Devoxx4Kids team for putting their workshop in open source

Audrey and Arun from Devoxx4Kids for giving me customized advice and reassuring me along the way

the Red Hat French facilities team for saying yes to this project and putting up with all the material challenges (room size, power outlets, laptop hunt, mouse chasing, etc.)

the local Red Hat techies for gathering the hardware, installing the machines, testing everything and helping out during the workshops

Writing queries using complex types can be a bit surprising in Hibernate Search.
For these multi-field types, the key is to target each individual field in the query.
Let’s discuss how this works.

What’s a complex type?

Hibernate Search lets you write custom types that take a Java property and create Lucene fields in a document.
As long as there is a one property for one field relationship, you are good.
It becomes more subtle if your custom bridge stores the property in several Lucene fields.
Take, for example, an Amount type which has a numeric part and a currency part.

Unfortunately that query will always return 0 results.
Can you spot the problem?

It turns out that Hibernate Search does not know about these subfields
creationdate.year, creationdate.month and creationdate.day.
A FieldBridge is a bit of a black box for the Hibernate Search query DSL,
so it assumes that you index the data in the field name provided by the name parameter
(creationdate in this example).

We have plans to address that problem in a not so distant future version of Hibernate Search.
It will only require you to provide a bit of metadata when you write such an advanced custom field bridge.
But that’s the future, so what to do now?

Use a single field

I am cheating here but as much as you can, try and keep the one property = one field mapping.
Life will be much simpler for you.
In this specific JodaTime type example, this is extremely easy.
Use the custom bridge but instead of creating three fields (for year, month, day),
keep it as a single field in the form of yyyymmdd.

In this case, it would even be better to use a Lucene numeric format field.
They are more compact and more efficient at range queries.
Use luceneOptions.addNumericFieldToDocument( name, numericDate, document );.
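
As a sketch of that single field encoding, here is how a date can be packed into one sortable number. This is an illustration only: it uses `java.time.LocalDate` rather than the JodaTime type discussed here, and `DateEncoding` with its two methods is a hypothetical helper, not part of Hibernate Search. The resulting int is the kind of value a bridge could hand over as `numericDate`.

```java
import java.time.LocalDate;

/** Hypothetical helper: packs a date into a single yyyymmdd number and back. */
final class DateEncoding {

    private DateEncoding() {
    }

    /** 2015-06-03 becomes 20150603; the numeric order matches the chronological order (years 0 to 9999). */
    static int toNumeric(LocalDate date) {
        return date.getYear() * 10_000 + date.getMonthValue() * 100 + date.getDayOfMonth();
    }

    /** Inverse of toNumeric for values produced by it. */
    static LocalDate fromNumeric(int value) {
        return LocalDate.of(value / 10_000, (value / 100) % 100, value % 100);
    }
}
```

Because the encoding preserves ordering, a numeric range query between two encoded dates matches exactly the dates in between, which is what makes the single field approach work for range queries.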

The query above will work as expected now.

But my type must have multiple fields!

OK, OK.
I won’t avoid the question.
The solution is to disable the Hibernate Search query DSL magic
and target the fields directly.

Hibernate Search sends the indexing requests in the post transaction phase.
Until now.
The JMS backend can now send its indexing requests transactionally with the database changes.
Why is that useful? Read on.

A bit of context

When you change indexed entities,
Hibernate Search collects these changes during the database transaction.
It then waits for the transaction to be successful before pushing them to the backend.
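
The collect-then-push behavior can be pictured with a small sketch. It is plain Java, not Hibernate Search internals: `IndexWorkQueue`, its method names and the use of strings to represent index changes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical model of the flow: collect changes during the transaction, push them after it. */
final class IndexWorkQueue {

    private final Consumer<List<String>> backend; // stands for the Lucene or JMS backend
    private final List<String> pending = new ArrayList<>();

    IndexWorkQueue(Consumer<List<String>> backend) {
        this.backend = backend;
    }

    /** Called while the database transaction is in progress. */
    void collect(String indexChange) {
        pending.add(indexChange);
    }

    /** Called once the transaction has ended: push on commit, discard on rollback. */
    void afterTransaction(boolean committed) {
        try {
            if (committed && !pending.isEmpty()) {
                backend.accept(new ArrayList<>(pending));
            }
        } finally {
            pending.clear();
        }
    }
}
```

Nothing reaches the backend if the transaction rolls back, and a backend failure happens after the commit, so it cannot undo the database changes.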

Hibernate Search has a few backends:

lucene: this one uses Lucene to index the entities

JMS: this one sends a JMS message with the list of index changes.
This JMS queue is then read by a master which uses the Lucene backend.

and a few more that are not interesting here

Running the backend after the transaction (in the afterTransaction phase to be specific)
is generally what you want.
Just to name a few reasons:

you don’t want index changes to be executed if you end up rolling back the database transaction

you don’t necessarily want your database changes to fail because the indexing fails:
you can always rebuild the index from your database.

and most backends don’t support transactions anyway

Hibernate Search lets you enlist an error callback
so that you are notified of these indexing problems when they happen
and react the way you want (log, raise an exception, retry etc).

So why make the JMS backend join the transaction?

If you make the JMS backend join the transaction,
then either the database changes happen and the JMS messages are received by the queue,
or nothing happens (no database change and no JMS message).

The non transactional approach is still our recommended approach.
But there are a few reasons why you might want to go transactional.

No code to handle the message failure

It eliminates the need to write an error callback and handle this problematic case.

Simpler operations

It simplifies your operations processes.
You can focus on monitoring your JMS queue (rates of messages coming in, rates of messages coming out)
which will give you an accurate status of the health of Hibernate Search’s work.

Transactional mass indexing

When making changes to lots of indexed entities,
it is common to use the following pseudo pattern to avoid OutOfMemoryError:

Sanne is going to do a virtual JBoss User Group session Tuesday July 14th at 6PM BST / 5PM UTC / 1PM EDT / 10 AM PDT.
He is going to talk about Lucene in Java EE.

He will also describe some projects dear to our heart.
If you want to know what Hibernate Search and Infinispan bring to the Lucene table and how they use Lucene internally,
that’s the event to be at!

Apache Lucene is the de-facto standard open source library for Java developers to implement full-text-search capabilities.

While it’s thriving in its field, it is rarely mentioned in the scope of Java EE development.

In this talk we will see which features make many developers love Lucene, go through some concrete examples of common problems it elegantly solves, and see some best practices for using it in a Java EE stack.

Today let’s discuss the interaction between multitenancy and the current session feature.

Multitenancy lets you isolate Session operations between different tenants.
This is useful to create a single application isolating different customers from one another.

The current session feature returns the same session for a given context, typically a (JTA) transaction.
This facilitates the one session per view/transaction/conversation pattern and avoids the one session per operation anti-pattern.

Session session = sessionFactory.getCurrentSession();
// do some other work
[...]
// later in the same context (e.g. JTA Transaction)
Session session2 = sessionFactory.getCurrentSession();
// semantically we have
assert session == session2;

The two features work well together: simply implement CurrentTenantIdentifierResolver.
That will give Hibernate ORM the expected tenant id when the current session is created.

How current is current?

When discussing with Florian and the ToulouseJUG,
we discussed a small case where things might not work as you expect.
Hibernate ORM considers that, for a given context (e.g. transaction),
there can only be a single current Session.

But you have to make sure to close these sessions.
If you are used to CDI or Spring handling sessions for you,
or if you rely on the current session feature to propagate the session across your stack,
this might be annoying.

Implement a custom CurrentSessionContext

The alternative is to implement your own version of CurrentSessionContext,
avoid raising the TenantIdentifierMismatchException,
and keep Session instances per both context and tenant id.

In practice, current sessions are stored in a ConcurrentHashMap keyed by context identifier;
you just need to improve that part.
Start with JTASessionContext
and hack away!
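
To show the shape of the idea, here is a sketch of a registry that keeps one session per context and per tenant id. It does not implement Hibernate's CurrentSessionContext contract: `TenantAwareSessionRegistry`, its methods and the generic session type `S` are hypothetical, and the session opener is passed in as a plain function.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical registry: one session per (context, tenant id) pair instead of one per context. */
final class TenantAwareSessionRegistry<S> {

    // Outer key: context identifier (e.g. the transaction); inner key: tenant id.
    private final Map<Object, Map<String, S>> sessions = new ConcurrentHashMap<>();
    private final Function<String, S> sessionOpener; // opens a session for a given tenant id

    TenantAwareSessionRegistry(Function<String, S> sessionOpener) {
        this.sessionOpener = sessionOpener;
    }

    /** Returns the session bound to this context and tenant, opening it on first use. */
    S currentSession(Object context, String tenantId) {
        return sessions
                .computeIfAbsent(context, c -> new ConcurrentHashMap<>())
                .computeIfAbsent(tenantId, sessionOpener);
    }

    /** Unbinds the context and hands back its sessions so the caller can close them. */
    Map<String, S> release(Object context) {
        Map<String, S> removed = sessions.remove(context);
        return removed != null ? removed : Collections.<String, S>emptyMap();
    }
}
```

A real implementation would call `release` when the transaction completes and close every returned session, which is the part the built-in contexts already handle for the single session case.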

What now?

We are currently discussing whether the default CurrentSessionContext implementations
should partition by tenant id or raise the exception.
If you have an opinion, chime in!