Drools & jBPM

Monday, February 26, 2018

Overview

The purpose of the executable model is to provide a pure Java-based representation of a rule set, together with a convenient Java DSL to programmatically create such a model. The model is low level and designed for the user to provide all the information it needs, such as the lambdas used for index evaluation. This keeps it fast and avoids building too many assumptions into this level; it is expected that higher level, more end-user focused representations can be layered on top in the future. This work also strongly complements the rule unit work, which provides a Java-oriented way to provide data and control orchestration.

Details

This model is generic enough to be independent from Drools but can be compiled into a plain Drools knowledge base. For this reason the implementation of the executable model has been split into two subprojects:

drools-canonical-model is the canonical representation of a rule set model which is totally independent from Drools

drools-model-compiler compiles the canonical model into Drools internal data structures making it executable by the engine

The introduction of the executable model brings a set of benefits in different areas:

Compile time: in Drools 6 a kjar contained the drl files and other Drools artifacts defining the rule base, together with some pre-generated classes implementing the constraints and the consequences. Those drl files had to be parsed and compiled from scratch when the kjar was downloaded from the Maven repository and installed in a KieContainer, making this process quite slow, especially for large rule sets. Conversely, it is now possible to package inside the kjar the Java classes implementing the executable model of the project's rule base and recreate the KieContainer and its KieBases out of it in a much faster way. The kie-maven-plugin automatically generates the executable model sources from the drl files during the compilation process.

Runtime: in the executable model all constraints are defined as Java lambda expressions. The same lambdas are also used for constraint evaluation, which makes it possible to get rid of both mvel-based interpreted evaluation and the jitting process that transformed the mvel-based constraints into bytecode, a process that caused a slow warm-up.

Future research: the executable model will allow us to experiment with new features of the rule engine without having to encode them in the drl format and modify the drl parser to support them.

Executable Model DSLs

One goal while designing the first iteration of the DSL for the executable model was to get rid of the notion of a pattern and to consider a rule as a flow of expressions (constraints) and actions (consequences). For this reason we called it the Flow DSL. Some examples of this DSL are available here.
However, after having implemented the Flow DSL, it became clear that avoiding the explicit use of patterns obliged us to implement extra logic that had both a complexity and a performance cost: in order to properly recreate the data structures expected by the Drools compiler, it is necessary to reassemble the patterns out of those apparently unrelated expressions.
For this reason we decided to reintroduce patterns in a second DSL that we called the Pattern DSL. This allowed us to bypass the expression-grouping algorithm, which had to fill an artificial semantic gap and was also time consuming at runtime.
We believe that both DSLs are valid for different use cases, so we decided to keep and support both. In particular the Pattern DSL is safer and faster (even if more verbose), so it is the DSL that will be automatically generated when creating a kjar through the kie-maven-plugin. Conversely the Flow DSL is more succinct and closer to the way a user may want to programmatically define a rule in Java, and we plan to make it even less verbose by automatically generating, through a post-processor, the parts of the model defining indexing and property reactivity. In other words, we expect the Pattern DSL to be written by machines and the Flow DSL, eventually, by humans.
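
To give a concrete flavour of the difference, here is a minimal sketch of the same single-pattern rule written in both DSLs. This is only an approximation, assuming the entry points (declarationOf, rule, expr, pattern, on, execute) exposed by the org.drools.model DSL classes; exact names and signatures may differ between releases.

Variable<Person> adult = declarationOf( Person.class );

// Flow DSL: the rule is a flat list of expressions followed by a consequence
Rule flowRule = rule( "Adult" )
        .view( expr( "isAdult", adult, p -> p.getAge() > 17 ) )
        .then( on( adult ).execute( p -> System.out.println( p.getName() ) ) );

// Pattern DSL: the same constraint is explicitly attached to the pattern it belongs to
Rule patternRule = rule( "Adult" )
        .build( pattern( adult ).expr( "isAdult", p -> p.getAge() > 17 ),
                on( adult ).execute( p -> System.out.println( p.getName() ) ) );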

Programmatic Build

As evidenced by the test cases linked in the previous section, it is possible to programmatically define one or more rules in Java and then add them to a Model with a fluent API:

Model model = new ModelImpl().addRule( rule );

Once you have this model, which as explained is totally independent from Drools algorithms and data structures, it's possible to create a KieBase out of it as follows:

KieBase kieBase = KieBaseBuilder.createKieBaseFromModel( model );
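
From here the KieBase behaves like any other. As a minimal usage sketch, assuming a simple Person fact class:

// create a session, insert a fact and let the engine evaluate the rules
KieSession ksession = kieBase.newKieSession();
ksession.insert( new Person( "Mark", 37 ) );
ksession.fireAllRules();
ksession.dispose();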

Alternatively, it is also possible to create an executable model based kieproject by starting from plain drl files, adding them to a KieFileSystem as usual
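
A sketch of that flow, assuming a String variable drl holding the rule source (ExecutableModelProject is the project class provided by drools-model-compiler):

KieServices ks = KieServices.Factory.get();
KieFileSystem kfs = ks.newKieFileSystem()
                      .write( "src/main/resources/r1.drl", drl );
KieBuilder kieBuilder = ks.newKieBuilder( kfs );

// the buildAll overload taking a project class triggers the executable model generation
kieBuilder.buildAll( ExecutableModelProject.class );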

The KieSessions created from a kieproject built in this way will work with lambda-expression-based constraints as described in the first section of this document. In the same way it is also possible to generate the executable model from the Flow DSL by passing a different project class to the KieBuilder

kieBuilder.buildAll( ExecutableModelFlowProject.class );

but, for the reasons explained when discussing the two DSLs, it is better to use the pattern-based one for this purpose.

Kie Maven Plugin

In order to generate a kjar embedding the executable model using the kie-maven-plugin, it is necessary to add to the pom.xml file the dependencies related to the two previously mentioned projects implementing the model and its compiler:
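
(The snippet below is a sketch; the version property is a placeholder for the Drools release in use.)

<dependency>
  <groupId>org.drools</groupId>
  <artifactId>drools-canonical-model</artifactId>
  <version>${drools.version}</version>
</dependency>
<dependency>
  <groupId>org.drools</groupId>
  <artifactId>drools-model-compiler</artifactId>
  <version>${drools.version}</version>
</dependency>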

An example of a pom.xml file already prepared to generate the executable model is available here. By default the kie-maven-plugin still generates a drl-based kjar, so it is necessary to run the plugin with the following argument:

-DgenerateModel=<VALUE>

Where <VALUE> can be one of three values:

YES
NO
WITHDRL
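
For example, assuming a standard Maven build, a kjar containing only the executable model classes can be generated with:

mvn clean install -DgenerateModel=YES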

Both YES and WITHDRL will generate and add to the kjar the Java classes implementing the executable model corresponding to the drl files in the original project, with the difference that the first will exclude the drl files from the generated kjar, while the second will also include them. In this second case, however, the drl files play only a documentation role, since the KieBase will be built from the executable model regardless.

Future developments

As anticipated, one of the next goals is to make the DSLs, especially the Flow one, more user friendly, in particular by generating with a post-processor all the parts that can be automatically inferred, like the ones related to indexing and property reactivity.
Orthogonally to the executable model, we have improved the modularity and orchestration of rules, especially through the work done on rule units. This focus on pojo-ification complements this direction of research around pure Java DSLs, and we already have a few simple examples of how the executable model and rule units can be mixed for this purpose.

Wednesday, February 07, 2018

NOTE: The instructions below apply only to the old version of the gwt-maven-plugin

At some point in the past, IntelliJ released an update that made it impossible to run the Workbench using the GWT plugin. After exchanging ideas with people on the team and summing up solutions, some workarounds have emerged. This guide provides information on running any Errai-based application in the latest version of IntelliJ, along with other modules, to take advantage of IntelliJ's (unfortunately limited) live reloading capabilities to speed up the development workflow.

Table of contents

1. Running Errai-based apps in the latest IntelliJ
2. Importing other modules and using live reload for client side code
3. Advanced configurations
3.1. Configuring your project's pom.xml to download and unpack Wildfly for you
3.2. Alternative workaround for non-patched Wildfly distros

1. Running Errai-based apps in the latest IntelliJ

As Max Barkley described on #logicabyss a while ago, IntelliJ has decided to hardcode the gwt-dev classes on the classpath when launching Super Dev Mode in the GWT plugin. Since we're using the EmbeddedWildflyLauncher to deploy the Workbench apps, these dependencies are now deployed inside our Wildfly instance. Nothing too wrong with that, except that the gwt-dev jar depends on apache-jsp, which has a ServletContainerInitializer marker file that causes the deploy to fail.

To solve that issue, the code that looks for the ServletContainerInitializer file and causes the deploy to fail was removed in custom patched versions of Wildfly that are available in Maven Central under the org.jboss.errai group id.

The following steps provide a quick guide to running any Errai-based application on the latest version of IntelliJ.

1. Download a patched version of Wildfly and unpack it into any directory you like
- For Wildfly 11.0.0.Final go here

2. Import the module you want to work on (I tested with drools-wb)
- Open IntelliJ, go to File -> Open.., select the pom.xml file, hit Open, then choose Open as Project

3. Configure the GWT plugin execution like you normally would on previous versions of IntelliJ

2. Importing other modules and using live reload for client side code

After being able to run a single webapp inside the latest version of IntelliJ, it might be very useful to have some of its dependencies imported as well, so that after changing client code on a dependency you don't have to wait (way) too long for GWT to compile and bundle your application's JavaScript code again.

Simply go to File > New > Module from existing sources.. and choose the pom.xml of the module you want to import.
If you have kie-wb-common or appformer imported alongside another project, you'll most certainly have to apply a patch to the beans.xml file of your webapp.

For drools-wb you can download the patch here. For other projects such as jbpm-wb, optaplanner-wb or kie-wb-distributions, you'll have to essentially do the same thing, but changing the directories inside the .diff file.

If your webapp is up, hit the Stop button and then hit Play again. Now you should be able to re-compile any code changed inside IntelliJ much faster.

3. Advanced configurations

3.1. Configuring your project's pom.xml to download and unpack Wildfly for you

If you are used to a less manual workflow, you can use the maven-dependency-plugin to download and unpack a Wildfly instance of your choice to any directory you like.
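
A hedged sketch of such a configuration, using the unpack goal of the maven-dependency-plugin. The coordinates below assume the patched org.jboss.errai Wildfly distribution mentioned earlier; the version and output directory are placeholders to adapt:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-wildfly</id>
      <phase>process-resources</phase>
      <goals>
        <goal>unpack</goal>
      </goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <groupId>org.jboss.errai</groupId>
            <artifactId>wildfly-dist</artifactId>
            <version>11.0.0.Final</version>
            <type>zip</type>
            <outputDirectory>${project.build.directory}/wildfly</outputDirectory>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>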

3.2. Alternative workaround for non-patched Wildfly distros

If you want to try a different version of Wildfly, or if you simply don't want to depend on any patched versions, you can still use official distros and exclude the ServletContainerInitializer file from the apache-jsp jar in your M2_REPO folder.

If you're working on a Unix system, the following commands should do the job.
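
A sketch of the idea, using zip's delete option (the repository path and version are placeholders; locate the apache-jsp jar under your ~/.m2/repository first):

# delete the marker file in place from the jar in the local Maven repository
zip -d /path/to/.m2/repository/<...>/apache-jsp-<version>.jar \
    "META-INF/services/javax.servlet.ServletContainerInitializer"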

Because you exclude the file manually from the apache-jsp jar, Maven won't try to download the jar again after you remove it. That makes this workaround permanent as long as you don't erase your ~/.m2/ folder. Keep in mind that if you ever need the apache-jsp jar to have this file back, the best option is to delete the apache-jsp dependency directory and let Maven download it again.

New instructions for the new version of the gwt-maven-plugin are to come, stay tuned!

It was great to share our cumulative experience over the years building the Workbench and the web tooling for the Drools and jBPM platform, and both talks had great attendance (250+ people in the room).

In this series of posts, we’ll detail our “5 Pillars of a Successful Java Web Application”, trying to give you an overview of our research and also a taste of participating in a great event like Java One.

There are a lot of challenges related to building and architecting a web application, especially if you want to keep your codebase updated with modern techniques without throwing away a lot of your code every two years in favor of the latest trendy JS framework.

In our team we are able to successfully keep a 7+ year old Java application up-to-date, combining modern techniques with a legacy codebase of more than 1 million LOC, with an agile, sustainable, and evolutionary web approach.

More than just choosing and applying any web framework as the foundation of our web application, we based our web application architecture on 5 architectural pillars that proved crucial for our platform’s success. Let's talk about them:

1st Pillar: Large Scale Applications

The first pillar is that every web application architecture should be concerned with the potential of becoming a long-lived and mission-critical application, or in other words, a large-scale application. Even if your web application is not exactly big like ours (1M+ lines of web code, 150 sub-projects, 7+ years old), you should be concerned about the possibility that your small web app will become a big and important codebase for your business. What if your startup becomes an overnight success? What if your enterprise application needs to integrate with several external systems?

Every web application should be built as a large-scale application because it is part of a distributed system and it is hard to anticipate what will happen to your application and company in two to five years.

And for us, a critical tool for building these kinds of distributed and large-scale applications throughout the years has been static typing.

Static Typing

The debate of static vs. dynamic typing is very controversial. People who advocate in favor of dynamic typing usually argue that it makes the developer's job easier. This is true for certain problems.

However, static typing and a strong type system, among other advantages, simplify identifying errors that can generate failures in production and, especially for large-scale systems, make refactoring more effective.

Every application demands constant refactoring and cleaning. It’s a natural need. For large-scale ones, with codebases spread across multiple modules/projects, this task is even more complex. The confidence when refactoring is related to two factors: test coverage and the tooling that only a static type system is able to provide.

For instance, we need a static type system in order to find all usages of a method, in order to extract classes, and most importantly to figure out at compile time if we accidentally broke something.

But we are in web development and JavaScript is the language of the web. How can we have static typing in order to refactor effectively in the browser?

Using a transpiler

A transpiler is a type of compiler that takes the source code of a program written in one programming language as its input and produces equivalent source code in another programming language.

This is a well-known Computer Science problem and there are a lot of transpilers that output JavaScript. In a sense, JavaScript is the assembly of the web: the common ground across all the web ecosystems. We, as engineers, need to figure out what is the best approach to deal with JavaScript’s dynamic nature.

A Java transpiler, for instance, takes the Java code and transpiles it to JavaScript at compile time. So we have all the advantages of a statically-typed language, and its tooling, targeting the browser.

Java-to-JavaScript Transpilation

The transpiler that we use in our architecture is GWT. This choice is a bit controversial, especially because the GWT framework was launched in 2006, when the web was a very different place.

But keep in mind that every piece of technology has its own good parts and bad parts. For sure there are some bad parts in GWT (like the Swing Style Widgets, multiple permutations per browser/language), but keep in mind that for our architecture what we are trying to achieve is static typing on the web, and for this purpose the GWT compiler is amazing.

Our group is part of the GWT steering committee, and the next generation of GWT is all about JUST these good parts: basically removing or decoupling the early-2000s legacy and keeping only the good parts. In our opinion the best parts of GWT are:

Java to JavaScript transpiler: extreme JavaScript performance due to compiler optimizations, plus static typing on the web;

JS Interop: almost transparent interoperability between Java <-> JavaScript. This is a key aspect of the next generation of GWT and the Drools/jBPM platform: embrace and interop (two-way) with the JS ecosystem.
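
As a small illustrative sketch of that two-way interop (the Greeter class is a made-up example; the annotations are the standard jsinterop.annotations API, and exports require the -generateJsInteropExports compiler flag):

import jsinterop.annotations.JsMethod;
import jsinterop.annotations.JsType;

@JsType // exports this Java type so plain JavaScript can instantiate and call it
public class Greeter {

    public String greet( String name ) { // callable from JS as new Greeter().greet('web')
        return "Hello " + name;
    }

    // maps the browser's native console.log into Java
    @JsMethod( namespace = "console", name = "log" )
    public static native void log( Object message );
}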

Google is currently working on a new transpiler called J2CL (short for Java-to-Closure, using the Google Closure Compiler) that will be the compiler used in GWT 3, the next major GWT release. The J2CL transpiler has a different architecture and scope, allowing it to overcome many of the disadvantages of the previous GWT 2 compiler.

Whereas the GWT 2 compiler must load the entire AST of all sources (including dependencies), J2CL is not a monolithic compiler. Much like javac, it is able to individually compile source files, using class files to resolve external dependencies, leaving greater potential for incremental compilation.

These good parts are great and, in our opinion, you should really consider using GWT as a transpiler in your web applications. But keep in mind that the most important point here is that GWT is just our implementation of the first pillar. You can consider using other transpilers like TypeScript, Dart, Elm, ScalaJS, PureScript, or TeaVM.

The key point is that every web application should be handled as a large-scale application, and every large-scale application should be concerned about effective refactoring. The best way to achieve this is using statically-typed languages.

This is the first of three posts about our 5 pillars of successful web applications. Stay tuned for the next ones.

[I would like to thank Max Barkley and Alexandre Porcelli for kindly reviewing this article before publication, contributing to the final text, and providing great feedback.]

Wednesday, September 13, 2017

Red Hat is organizing a Drools, jBPM and Optaplanner Day in New York and Washington DC later this year to show how business experts and citizen developers can use business processes, decisions and other models to develop modern business applications.

Wednesday, September 06, 2017

With the renewed interest in AI, the same conversations are starting to come up again about what is or isn't AI. My recent discussion was on whether optimisation products, such as OptaPlanner, are considered AI, as some consider them more Operations Research (OR). For some background, OptaPlanner started out as a Tabu Search implementation, but has since added other techniques like Simulated Annealing.

Although I'd like to add that no single technique on its own is AI: they are all tools and techniques that are quite typically used together in a blended, hybrid or integrated AI solution. So it's about the right tool, or tools, for the job.

The answer is that optimisation is both an AI and an OR problem. It is a technique used and researched by both groups; the two disciplines tend to take different approaches to the problem, have differing use cases, and have historically used different techniques, with a lot of cross pollination from both sides.

I'll start with a consumer oriented answer to the question. StaffJoy has a nice blog article on the overlap of OR and AI, and I'll quote from it below:
https://blog.staffjoy.com/how-operations-research-and-artificial-intelligence-overlap-b128a3efee2e
"Startups are using OR techniques in products like OnFleet, Instacart, and Lyft Line. However, when similar techniques are being exposed externally as services, they are often described as AI — e.g. x.ai, Atomwise, and Sentient. Very few companies describe algorithms that they sell as optimization (with the exception of SigOpt) because the end goal of customers is automating decisions. With StaffJoy, we have found that customers better understand our product when we describe it as an “artificial intelligence” tool rather than an “optimization” or “operations” tool. We think that this is because customers care more about what a product achieves, rather than the means it uses to achieve it."

In short, consumers do not see the difference between OR and AI when applied to real world problems, and it is commonly marketed as AI.

I'll go a little more technical now, to further demonstrate it's more than just marketing - as that side is only touched on in the above blog post.

While the two groups (OR and AI) may have once been distinct, it's been well established that they overlap in this space and have collaborated for years. Glover (1986) describes them as "the recent remarriage of two disciplines that were once united, having issued from a common origin, but which became separated" - see the link to the full paper at the end.

A cursory google with the terms "operations research" and "artificial intelligence" will more than prove this. Some techniques, like Linear Programming, are strongly on the OR side; others, like Local Search (which OptaPlanner falls under), are shared. Optimisation, and local search along with other techniques, is a core fundamental taught in every AI course without fail, and is covered in every general AI book used in schools - such as "AI: A Modern Approach" (see chapter 4, page 120): http://aima.cs.berkeley.edu/contents.html

Lastly I'll quote directly from the original Tabu Search paper: "These developments may be usefully viewed as a synthesis of the perspectives of operations research and artificial intelligence... ... The foundation for this prediction derives, perhaps surprisingly, from the recent remarriage of two disciplines that were once united, having issued from a common origin, but which became separated and maintained only loose ties for several decades: operations research and artificial intelligence. This renewed reunion is highlighting limitations in the frameworks of each (as commonly applied, in contrast to advocated), and promises fertile elaborations to the strategies each has believed fruitful for approaching combinatorial complexity." Glover (1986). Note the paper is from "The Center of Applied Artificial Intelligence". http://leeds-faculty.colorado.edu/glover/TS%20-%20Future%20Paths%20for%20Integer%20Programming.pdf

Tuesday, August 08, 2017

Rule engines, like Drools, typically make use of a custom language to define a set of rules. For example, the Drools compiler translates a drl file into an internal representation (the KiePackages) that is subsequently used to generate the ReteOO/Phreak network that will perform the rule evaluation.

This internal representation was never really intended to be generated or consumed by end users. This makes it difficult to write rules programmatically, and the suggestion instead is to generate textual rules at the DRL level. This means that drl itself is currently the only practical formal notation to define a set of rules in Drools.

Drools internals were developed with several assumptions that were valid at the time but are no longer true or desirable. Prior to Java 8, perm gen was a concern, so various solutions were adopted to address this - such as MVEL for reflection-based evaluation. Java 8 now puts code on the heap, so this is no longer necessary. At the engine level Drools also inspected expressions and produced indexes, which tied it to the expressions produced by DRL - this makes polyglot support impractical. Lastly it liberally uses classloaders and reflection, which makes it difficult to transpile for execution on different environments.

An engine independent rule model

To overcome this limitation, and to offer the possibility of programmatically defining a set of rules in pure Java, we developed a model aimed at providing a canonical representation of a rule set, together with a fluent DSL to conveniently create an instance of this model. The model itself is totally independent from Drools and could in theory be re-used by other engines. It also introduces layers that fully separate the engine from having to be aware of any language: for example, it will not inspect constraints and generate indexes, but instead expects to be provided those indexes from the layers above. Another advantage is that Drools now has a developer-friendly view of what it's executing, as it's all just pure low-level pojo rules.

This model, other than giving all Java developers a clean way to write rules in plain Java, will also enable our team to experiment with new features faster, freeing us from the burden of also implementing the corresponding parser and compiler parts that integrate each new feature with the drl notation.
Let's see a practical example of the model of a rule defined using the aforementioned DSL.
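
The original listing is reproduced here only approximately: a rule matching every person older than Mark looks roughly like this, assuming the Flow DSL entry points (declarationOf, rule, expr, on) from org.drools.model, whose exact signatures may vary across versions.

Variable<Person> mark = declarationOf( Person.class, "mark" );
Variable<Person> older = declarationOf( Person.class, "older" );

Rule rule = rule( "olderThanMark" )
        .view(
            expr( "exprA", mark, p -> p.getName().equals( "Mark" ) ),
            expr( "exprB", older, mark, ( p1, p2 ) -> p1.getAge() > p2.getAge() )
        )
        .then(
            on( older, mark ).execute( ( p1, p2 ) ->
                System.out.println( p1.getName() + " is older than " + p2.getName() ) )
        );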

It is evident that the DSL version is much more verbose and requires specifying many more details that are conversely inferred by the Drools compiler when it parses a rule written in drl.
As anticipated, this has been done on purpose, because the model has to explicitly contain all the information required to generate the section of the ReteOO/Phreak network intended to evaluate that rule, without performing any bytecode inspection or using any other complex, brittle and non-portable introspection technique. In these simple examples the index contains the same logic as the expr, but in general DRL expressions can contain more than just the index. The index would be inferred and implicitly added by the upper language layers.
To clarify this aspect let's analyze a single LHS constraint in a bit more detail:
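
Again as a hedged reconstruction (the exact indexedBy and reactOn signatures are approximate):

expr( "exprA",                                  // [1] constraint label
      mark,                                     // [2] variable bound to the matched fact
      p -> p.getName().equals( "Mark" ) )       // [3] lambda evaluating the constraint
    .indexedBy( ConstraintType.EQUAL,           // [4] constraint type and how to index it
                Person::getName, "Mark" )
    .reactOn( "name" );                         // [5] properties this constraint is reactive to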

In this statement you can notice the following parts:
[1] This is a label for the constraint, used to determine its identity. Two identical constraints must have the same label and two different ones must have different labels, in order to let the engine properly implement node sharing where possible. The label can optionally be omitted, in which case it will be generated by automatic introspection of the bytecode of the lambda expression implementing the constraint. However, as anticipated, it's preferable to avoid any introspection mechanism, and therefore to explicitly provide a constraint label whenever possible.
[2] This is the Variable defined before creating the rule and used to bind an actual fact to the formal parameter of the lambda expression evaluating a condition on it. It is the equivalent of the variable declaration in a rule written with the drl notation.
[3] The lambda expression performing the constraint evaluation.
[4] The specification of the type of this constraint and how it has to be indexed by the engine.
[5] The names of the properties of the Person object for which this constraint has to be reactive.

Building and executing the model

You can programmatically add this and other rules of your rule base to a model:

Model model = new ModelImpl().addRule( rule );

Once you have this model, which again is totally independent from Drools algorithms and data structures, you can use a second project, this time dependent both on Drools and on the model itself, to create a KieBase out of it.

KieBase kieBase = KieBaseBuilder.createKieBaseFromModel( model );

What this builder internally does is recreate the KiePackages out of the model and then build them using the existing Drools compiler. The resulting KieBase is identical to the one you may obtain by compiling the corresponding drl, and you can use it in exactly the same way, creating a KieSession out of it and populating it with the facts of your domain. Here you can find more test cases showing the Drools features currently supported by the model and the DSL constructs that expose them, while here you can find a more complete example.

Embedding the executable model inside the kjar

At the moment a kjar contains some pre-generated classes implementing the constraints and the consequences. However, all of the drl files still need to be parsed and compiled from scratch, so the savings over building from source are limited. The executable model will also be useful to speed this process up: embedding the classes implementing the model of the rule base inside the kjar makes it possible to quickly recreate the KiePackages, instead of having to restart the whole compilation process from the drl files. A proof of concept of how this could work is available here.

DRL compiler to Executable Model

We are working on a new compiler that will take existing DRL and directly output the executable model, which is stored in the kjar. This will result in much faster loading and building times. We are also looking into other ways to speed this process up, by pre-caching more information in the kjar, which can be determined during the initial kjar build.

Pojo, Polyglot and DRLX Rules

The executable model is low level and, because you have to provide all the information it needs, a little verbose. This is by design, to support innovation at the language level. While it is technically still pojo rules, it is not desirable for everyday use. In the near future we will work on a less verbose pojo rules layer that will use an annotation processor to inject things like indexes and reactOn. We also hope that it will result in several rule languages, not just one - Scala rules, Clojure rules, mini-dsl rules etc. Nor is it necessary for a language to expose all of the engine capabilities; people can write mini-dsls exposing just a subset.

Finally we are also working on a next generation DRL language (DRLX), that will be a superset of Java. This is still in the design phases and will not be available for some time, but we will publish some of the proposals for this, once an early draft spec is ready.