Over the past couple of years Liferay has invested a considerable amount of effort to re-engineer Liferay Portal version 7.0 with modularity as the key focus. As part of this effort we've adopted OSGi as the modularity layer.

In September of 2013 we decided that in order to give ourselves the most direct access to OSGi expertise we should join the OSGi Alliance as a Contributing Member. In this way we would sit in the very room with the engineers who devised the standards, the engineers with the most experience in applying their work. It would prove to be a fruitful opportunity for Liferay to learn as well as bring our own experiences with enterprise application development to the table.

After a successful freshman year in the Alliance, Liferay decided it was the appropriate time to take a more significant role. In our second year (2014) we decided to graduate all the way to Strategic Member (the highest membership level within the Alliance). We would have the privilege (among many others) of nominating our own employees into roles as officers of the Alliance.

Finally, in December 2014 Liferay was elected a member of the Board of Directors of the OSGi Alliance, and at the board's first face-to-face meeting of 2015, on January 23, Liferay officially took its seat as a Board Member.

This is a great honour because the OSGi Alliance has worked extremely hard against an increasingly challenging and evolving software engineering industry. All the while, it has stayed relevant and effective, proving itself to measure up to its core values.

Liferay is eager to participate in a long-lasting exchange of experiences with, and support of, the OSGi Alliance.

I invite individuals and organizations (no matter how small) to bring their knowledge and expertise to join Liferay in this great organization.

We've been doing a lot of work to improve the modularity of Liferay for a while now. Liferay 6.2 has experimental support for OSGi, but there's not much in the way of integration points. You can certainly do some cool things, but it's not integrated to the point of making the developer's life much simpler than before.

However, in Liferay 7.0, one of the things we're working on is making it possible to reach many more integration points as OSGi services deployed into Liferay's module framework.

If you know anything about OSGi, if you've built Eclipse plugins, if you've worked on any OSGi framework at any time, you can use that knowledge to speed you down the path of writing Liferay plugins.

OSGi plugins are deployed into Liferay just like any other legacy plugins (by placing them into the deploy folder). The deployment framework will recognize if an artifact is an OSGi bundle and hand it over to the module framework.

What follows is a list of current integration points.

Each lists the Java interface as well as any service properties which need to be set in order for those to be properly connected with their target. If no properties are mentioned, then nothing is required beyond implementing the interface.

Before long there should be many more integration points, including every one previously available by other means, as well as many that were not previously reachable.

One significant observation to be made is that it's now possible to collaborate with other plugins without modifying their code.

For instance, suppose you have a third party portlet, let's call it 1_WAR_coolportlet, and you wish to collaborate with it by giving it a custom URLEncoder. Given its FQPN, you could create an OSGi service which has the service property javax.portlet.name=1_WAR_coolportlet, and that's it.

Compile the impl against the API, assemble it as an OSGi bundle (using whatever build technology suits you) and deploy it to the deploy folder.
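To make the targeting mechanism concrete, here is a minimal sketch of building those service properties. The class name is hypothetical; in a real bundle activator you would pass the dictionary, together with your URLEncoder impl, to bundleContext.registerService(...).

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Sketch: the service properties that tie a custom service to the
// third-party portlet. CustomURLEncoderProps is an illustrative name.
public class CustomURLEncoderProps {

	public static Dictionary<String, Object> serviceProperties() {
		Hashtable<String, Object> props = new Hashtable<>();

		// This single property is what connects the service to the
		// target portlet 1_WAR_coolportlet
		props.put("javax.portlet.name", "1_WAR_coolportlet");

		return props;
	}

	public static void main(String[] args) {
		// In a bundle activator you would do something like:
		// bundleContext.registerService(
		//     URLEncoder.class, new MyURLEncoder(), serviceProperties());
		System.out.println(serviceProperties().get("javax.portlet.name"));
	}
}
```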

As I said, there's still lots of work to be done. But more is on the way.

I've been working with JMH (Java Microbenchmark Harness) [1], written by the OpenJDK/Oracle compiler guys. JMH is a microbenchmark test harness which takes care of all the concerns that most developers can't really wrap their heads around and often neglect to take into consideration. It's discussed here [2] by its primary developer. And here [3] is a good write-up about it.

JMH is certainly a tool that you'll want to bring into your toolbox if you care at all about understanding the performance of your applications (particularly down at the algorithm and language level).

It's a little tricky getting JMH set up in a pure Ant environment, but I can talk about that in another post.
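Outside of Ant, the quickest way I know to get a JMH project going is the official Maven archetype published by the OpenJDK team (the group/artifact ids below are from the JMH docs; com.example and my-benchmarks are placeholders you'd replace):

```shell
# Generate a skeleton JMH benchmark project (requires Maven and network access)
mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-java-benchmark-archetype \
  -DgroupId=com.example \
  -DartifactId=my-benchmarks \
  -Dversion=1.0

# Build the self-contained benchmark uberjar and run it
cd my-benchmarks
mvn package
java -jar target/benchmarks.jar
```

From there you add @Benchmark-annotated methods and JMH handles warmup, iterations, and forking for you.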

Meanwhile, we've been working on an implementation of a generic "registry" library (a.k.a. liferay-registry) which is backed by the OSGi Service Registry.

The source of my interest in JMH has been to make sure that this new registry implementation comes close to being as fast as the one(s) currently in the portal. My goal was to reach at least 90% equivalent performance, given that the new registry has more features; those features should not impose a significant performance degradation.

In order to baseline the results, I compared all implementations (existing and new) with a plain old Java array and also a plain old ArrayList (list). The serviceTrackerCollection is the impl from liferay-registry, which is backed by the registry itself for tracking registered impls. Finally, two uses of EventProcessorUtil were tested:

when a list of classNames are passed (eventProcessorUtil_process_classNames)

when a list of impls are pre-registered (eventProcessorUtil_process_registered)

Here are the outcomes of the JMH "throughput" (max number of operations per iteration) benchmark over 200 iterations with a concurrency of 4 (multi-threaded):

We achieved a pretty significant improvement in performance over the original, thus ensuring that when we integrate the registry into the Liferay core shortly, it won't cause a performance degradation (and may actually bring a slight improvement).

Conclusion

JMH allowed us to deeply understand implementation details which were impacting execution and concurrency of our new implementation. It would have been extremely painful to try to achieve the same type of analysis without this type of tool. Thanks to the OpenJDK team for publishing it.

Finally, I've done all the heavy lifting necessary to integrate JMH in our build with the goal to continue to create more benchmarks and ensure we are providing the very best implementations we can to our community. So look for that to be introduced into the core in the coming weeks.

Notice that there is no longer a leading *, the suggested version matches the current version, and the warning is gone (replaced by -).

You should also notice that below the class change, the version change on the package is represented as removing version 6.2.0 and adding version 6.3.0:

- version 6.2.0
+ version 6.3.0

Nice! The package is no longer dirty.

If we had not resolved it properly the report would continue to indicate an issue for the package.

Silencing resolved semantic version reports

Once you get used to resolving issues, you may want the report to focus only on the dirty packages that come up while developing. To accomplish that there is another build property which you can set in your build.${user.name}.properties.

As mentioned in a previous post, you may have noticed the reports that have been spit out during portal builds over the last couple of weeks.

These reports show API changes based on a baseline API version which would have been established automatically the first time you built the portal after semver went in.

This is fine for getting a basic grasp of ongoing changes as I explained in that post.

However, in order to get a more concrete picture and be able to accurately declare version changes, we need to have a proper baseline.

Since 6.2.0 GA was recently released with the semantic versioning tools in place, we have the first opportunity to set a concrete baseline: the 6.2.0-ga1 git tag.

Here are the steps needed to set up a baseline based on the official 6.2.0 release (Windows users will have to convert these commands to whatever works on that platform).

1) Enter the root of the portal source tree

cd <portal_src_root>

2) Delete the current auto generated baseline repo

rm -rf .bnd

3) Checkout the 6.2.0-ga1 tag

git checkout -b 6.2.0-ga1 6.2.0-ga1

4) Rebuild the entire portal

ant all

5) Checkout master

git checkout master

6) Once again, rebuild the entire portal

ant all

The outcome of 6) should be lots of output for at least two of the portal jars: util-taglib.jar and portal-service.jar.

Not all the output is bad news, but most of it shows that we have not properly declared API changes. Some even require more than a simple version increase. We may have to make decisions on the usage scenario of a specific API in order to reduce the magnitude of the change from a MAJOR/breaking change to a MINOR/compatible change. Of course there are some for which we won't be able to do that, and those are changes which cannot be backported to a maintenance branch.

Recapping the build settings you should be most interested in: if you notice the report output but hate that you have to scroll back through thousands of build lines (if your shell even keeps them), then you may want to change the default from:

baseline.jar.report.level=diff

to

baseline.jar.report.level=persist

This setting persists the report output to files under <portal_src_root>/baseline-reports/*.

Now you can read the reports in your favorite text editor or IDE.

In the next post we'll dig into specific changes found in the reports and how to address each scenario.

MAJOR version when you make incompatible API changes, MINOR version when you add functionality in a backwards-compatible manner, and PATCH version when you make backwards-compatible bug fixes. Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

Liferay will be applying package level semantic versioning which will allow for very granular API versioning. This is the same versioning strategy prescribed in the OSGi Semantic Versioning whitepaper. This will help developers, our support, and our documentation teams easily track API evolution and increase the stability of our APIs.

Adding semantic versioning to such a large project will be a challenge. One of the greatest difficulties will be educating all contributing developers on how to properly handle it. Hopefully, we can begin to address that here.

Leaving semantic versioning in the care of humans can easily become a nightmare. So much so that there are even theories which state that semantic versioning is not worth the effort due to human limitations. The ideal scenario then is to not leave it up to humans at all. Or rather, to not leave it only up to humans. Machines are much more suited to dealing with this sort of thing.

However, a big problem was that until recently there were no good libraries which could produce, with reasonable heuristics, accurate reports about how APIs had changed. This is no longer the case. The open source bnd library ("the Swiss army knife of OSGi"), originally developed by OSGi grand master Peter Kriens (http://www.aqute.biz/Bnd/Bnd) and now maintained (still with Peter, but in a more community driven effort) under the umbrella project BndTools, http://bndtools.org (an impressive OSGi tooling suite for the Eclipse platform), is available for doing all the necessary work programmatically.

But humans don't like being told what to do by machines. So the compromise is to make the machine do the hard work and have it report its findings to humans, letting the humans decide how to react.

Let's outline the steps involved. (The implementation details discussed are specific to Liferay's tool built around bnd, but the process could be applied to any project were an equivalent build integration tool available.)

Step 1 - Setup

Let's assume that we're starting from a pristine working copy of our source code, checked out from the repository and positioned at a version tag we want to use as the foundation of our semantic versioning efforts. We'll call this the baseline version.

The next assumption is that we have enabled semantic version reporting in our build configuration. In the case of Liferay's reporting tool, this is configured by editing the build.${user.name}.properties file and choosing a value for the baseline.jar.report.level property:

diff = include in the standard output a granular, differential report of all changes in the package

persist = in addition to diff, persist the report to a file inside the directory identified by the property baseline.jar.reports.dir.name, which defaults to baseline-reports

Step 2 - The First Build

Before any code changes take place, we will perform the first build. During the build, Java source code is compiled and then packaged into jar files. With the baseline engine enabled, just before the jar file creation process completes, its contents are analyzed against a previously existing version of the jar. This previously existing version is located in what we call the baseline repository.

Since this is the first build two distinct operations will take place.

The baseline engine's workspace is initialized and the baseline repository is created. This takes place in the folder identified by the property baseline.jar.bnddir.name which is .bnd by default (note the directory is hidden). Note also that if you want to restart this entire process from scratch, the only step is to delete this directory.

Since there is no previously existing version of the jar in the baseline repository, the current jar is added. It effectively becomes the baseline version. Also recall that earlier we chose a well-defined version tag as the starting point. This represents the effective version against which all subsequent changes will be compared.

Finally, no reports should result from the first build.

Step 3 - Making Changes

When we make a change to the Java source code, such as developing new features or fixing bugs, we'll need to execute a build with our changes, execute tests, etc.

Execute the build. We'll refer to it as a subsequent build to distinguish it from the first build.

Step 4 - Reporting

While the subsequent build is executing, each new jar (assuming of course the jar does in fact have some change) is analyzed against the previously existing version of the jar (a.k.a. the old jar) obtained from the baseline repository (placed there during the first build).

The baseline engine performs a tree based comparison between the new jar and the old jar. The engine will then produce output based on the current reporting level.

If no API changes were detected there should be no new output. The build should look like it always has.

Reports

Now, let's assume that some change was in fact detected. For now, let's consider a very simple change. We'll add a new method to the concrete type:

com.liferay.portal.kernel.cal.Duration

In its current form this class, and in fact its package, are versioned based on the version of the jar in which they reside. Therefore, in the case of Liferay 6.2.0, the effective version is '6.2.0'.

Let's add a new method to this concrete class:

public void newMethod() {
	System.out.println("executing the new method");
}

Rebuilding the jar containing this class should produce the following report:

This indicates that the build is executing the baseline-jar task (a.k.a. the baseline engine) on portal-service.jar and that there are 4826 resources in the jar (err.. that's a large jar; it's also the first hint that we should probably break it into pieces to improve maintainability, but that's a topic for another time).

Line 2:

[Baseline Report] Mode: diff

This indicates the current report mode or level, diff.

Line 3:

[Baseline Warning] Bundle Version Change Recommended: 6.3.0

This indicates that the report suggests a version change should be applied to the jar in question.

These are header lines and are only there to clarify the details which follow. The report is broken down by package, and so the header reflects this.

The first column (unnamed, 1 character wide) indicates the dirty state of the package (* = dirty, empty = not dirty).

When dirty, the package requires the attention of the developer. It usually reflects that either the changes or the package version require review and certainly that the API has been subject to a non-trivial change and the version no longer represents it accurately.

The second column (PACKAGE_NAME) indicates the package in question. Note that each sub-package is treated uniquely, defining its own version and being reported on independently.

This is the overall package report. Following the column descriptions above, you can see that the package is in a dirty state. The package com.liferay.portal.kernel.cal has incurred a MINOR change. Considering the current version and the baseline version, the report suggests that an increase of the package version to 6.3.0 is required to resolve the dirty state.

Lines 7-8:

< class com.liferay.portal.kernel.cal.Duration
+ method newMethod()

Note: These lines will only be included in the report if the level is diff or persist.

Packages often contain more than one class. To the developer making the change it's simple enough to understand why the package might be dirty. However, as the number of changes increases, or to an outsider like a support or documentation engineer, understanding the specific nature of the change is often interesting, and quite often necessary. Therefore, when the level is diff or persist, for each package we show a differential view of every single change in the package. This is not a traditional diff output, but rather an API diff. For this reason the report takes a little bit of getting used to.

The first line indicates the class in question.

< class com.liferay.portal.kernel.cal.Duration

The leading character (in this case <) indicates the magnitude of change for the specific class. It's worth noting that each class may have a different degree of change, and the package will reflect only the highest degree of change.

Possible leading characters are:

+ = ADDED

~ = CHANGED

> = MAJOR

µ = MICRO

< = MINOR

- = REMOVED

Lines which follow indicate each change which occurred in the class, and may even show lower levels in cases where the change occurred on inner elements of the class. The depth increases with each inner element.

+ method newMethod()

In this case we can see that a method called newMethod was added.

This raises the question: "How do we resolve the dirty state of the package?"

Reacting to Reports

There are several ways to react to these reports, each of which is dependent on a couple of different factors and so it's important to have a good understanding of them so as not to be intimidated by their granular nature.

Considering the change we already made, the first reaction we could have might be to revert the change completely. Re-running the build would clear up the report. Obviously this also wouldn't result in any progress. If we are attempting to fix a bug, however, it's actually very bad practice to change a public API if it can at all be avoided. So the report might be an indication that we tried to solve the bug using an approach which was too aggressive toward the API, and we should reconsider the change.

However, it may happen that a developer finds it impossible to solve a bug without changing an API. Such a change should be a MINOR change at most (which is backward compatible), and should only be made when whoever is responsible for ongoing maintenance is in agreement. This type of change may require the preparation of documentation since the public API was changed, even if it was a minor change.

In this case, however, the most common reaction to such a report is simply to set the package version to the new version.

Package Versions (packageinfo)

Granularly managing package versions is achieved in one of two ways according to the OSGi Semantic Versioning whitepaper. For our purposes, we will follow the packageinfo text file approach.

In the directory of the package com.liferay.portal.kernel.cal (the one containing the Duration.java file), we will create a new text file called packageinfo (with no extension). This file should contain a single line following the pattern:

version <version>

where <version> is the actual version we want to assign to the package, following the syntax defined for semantic versioning (see the Summary above).

For instance:

version 6.3.0

and re-run the build.

Line 6 should change to:

com.liferay.portal.kernel.cal MINOR 6.3.0 6.2.0 6.3.0 -

Note that the package is no longer in a dirty state and the current version matches the recommended version. Lastly, there are no warnings (represented by -).

The remaining report lines reveal more information about the package changes:

As you can see, at the same level as the class, we see that the package version change is represented by the removal of the 6.2.0 version and the addition of the 6.3.0 version. We now see the sum of all changes since the baseline and also that all appropriate actions have been taken to resolve the dirty state.

A MAJOR change

Minor changes will and should occur relatively frequently. This is just common to pretty much all everyday development. So, let's make a change which will produce a MAJOR change to see how this affects our report.

At the package level, we can see that once again the package is dirty indicating that we have to take action with respect to versions. If we had any intention of putting this change in the hands of developers on maintenance releases we'd have some very angry developers since this is a breaking change that is NOT backward compatible.

Deleting a method from an established API is always a breaking change. So, unless this change is in fact targeted at a major release, we should not make it. At most, if we need to indicate that the method should no longer be used, we can and should deprecate it during maintenance releases and warn of its removal in some future major release.

Other challenges

Several other challenges exist in very large projects like ours.

Over-versioning

A rule of thumb in semantic versioning is that the jar version should reflect the highest version from among all the package versions found within the jar. We can extrapolate some insight from how quickly a package's version increases over time in relation to the other packages in the jar.

For instance, if some package within the jar increases by more than one major version while others do not, this is a good indication that this API is more volatile and should probably be extracted into its own jar, increasing maintainability and reducing its area of effect.

The flip side is that if a particular package within a jar rarely changes in relation to the other packages in the same jar, it's also an indication that it should be extracted from the rest. It's stable and safe code that should not be subjected to version increases like the rest. That's good code.

Split packages

Split packages occur when the same package is defined in different jars. This causes problems since package versions may not coincide across those jars. Secondly, controlling access to the API using OSGi package dependencies is more difficult.

Massive packages

When a package contains many, many classes, it's more likely than not that the classes no longer comprise a single API. The result is overly aggressive API versioning when changes occur to some classes within the package which are not related to other classes in the package.

Take for example a well known Liferay package:

com.liferay.portal.kernel.util

This package contains many utility classes most of which are not related to each other.

Any two of these classes may be completely unrelated. However, if any single class within the package is subject to a version change, the change must be reflected on all the classes in the package even though they are unrelated. This is a clear indication that they should have their own packages.

What's Next

Addressing all of the challenges above will be a considerable task. However, this tool provides a mechanism to progressively apply semantic versioning to our existing, largely un-versioned code base. It will also provide us with hints as to how we can continue to improve our APIs in ways other than semantic versioning, such as isolating APIs from each other to reduce over-versioning.

Our developers will learn about semantic versioning in an almost passive way which shouldn't greatly impede their day to day activities. Furthermore, they will learn to develop a responsibility for the degree of change they cause and hopefully together we will grow to be better developers.

Well, immediately it means very little, technically. However, what it means philosophically is that we really care about everyone who uses our products. It means we will continue to strive with every ounce of our being to make Liferay better.

What's the correlation between being a member of the OSGi Alliance and the irrational desire to make our products better?

The OSGi Alliance has been successfully solving difficult problems for a long time. Learning and sharing knowledge across these two fantastic communities of developers can only increase the likelihood of continuing that success.

As a member of the OSGi Alliance, Liferay's 60,000+ developers and its whole community get a voice in those solutions.

Ever since the Faceted Search API came out there have been a ton of great questions about how to go about creating very specific filters.

A recent one is: "find both journal-articles and only PDF-files from Documents and Media Library"

This kind of requirement is not suited to being implemented as Facets.

Facets are "metrics" calculated across an entire result set. As such, using Facet's ability to perform drill down as a means of "filtering" will likely lead to poor performance and overly complex facet configurations.

However, there is an API available for doing precisely this type of filtering.

Unfortunately, there isn't currently a way to configure this available from any UI (Marketplace opportunity??).

The com.liferay.portal.kernel.search.SearchContext class has a method:

public void setBooleanClauses(BooleanClause[] booleanClauses)

With this method you can pass an arbitrary number of filter criteria as an array of boolean clauses.

Here is an example which supports the requirement described above ("find both journal-articles and only PDF-files from Documents and Media Library"):
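A minimal sketch of such a filter, expressed as a Lucene-style query string. The field names (entryClassName, extension) and model class names are assumptions based on Liferay's index conventions; in a real plugin you would wrap this string in a BooleanClause and pass it to searchContext.setBooleanClauses().

```java
// Hypothetical helper that assembles the filter clause described above:
// match journal articles, OR Documents and Media entries that are PDFs.
public class JournalOrPdfFilter {

	public static String filterQuery() {
		// Match journal articles (FQCN assumed from Liferay 6.x)
		String journalArticles =
			"+entryClassName:com.liferay.portlet.journal.model.JournalArticle";

		// Match Documents and Media file entries that are PDF files
		String pdfFiles =
			"+entryClassName:com.liferay.portlet.documentlibrary.model.DLFileEntry" +
			" +extension:pdf";

		// The outer + requires that at least one sub-clause matches
		return "+((" + journalArticles + ") (" + pdfFiles + "))";
	}

	public static void main(String[] args) {
		System.out.println(filterQuery());
	}
}
```

The same string could be stored as the "filterQuery" configuration mentioned below, which is what makes a per-portlet-instance UI option so simple to imagine.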

Filtering implemented in this way is several times more efficient than anything done via the Facet API.

Another advantage of this API is support for things like exclusions "(-field:not_this_value)" which you can't do with Facets at all (you can only specify limited values to be included in facet calculations).

Lastly, I mentioned that this isn't available from the UI, but as you can see it would be extremely simple to add an advanced configuration option to the Search Portlet to store a string version of the filterQuery, enabling the filter to be set per search portlet instance.

With each year that a project grows more complex, the difficulties of managing backward compatibility grow as well.

The strain imposed on APIs as time passes is tremendous. Developers want APIs to be extended to extract every ounce of functionality. Bugs require fixing, often requiring APIs to be modified, or sometimes requiring complete removal or deprecation of a subset of the API. Sometimes the changes are to internal behavior, causing the original expectations to no longer hold as originally described.

This increased complexity causes strain between organizations designing and publishing these APIs and those consuming them. Every time a change occurs it affects consumers and may break their applications.

The old school methodology for dealing with API compatibility was founded on statically linking an API provider to an API consumer. Thus, there was never a conflict between different versions of an API since the consumer simply bundled the exact API version needed.

Later, dynamic linking came along and using metadata and version schemes, consumers could describe specific versions (even ranges) of APIs which would suitably satisfy the API dependency. Dynamic loading tricks allowed the "system" to match API providers with consumers. Often these tricks were as simple as file system path and file naming conventions. This is still a very common method to this day.

API version management complexity increased significantly when languages with dynamic loading, particularly Java, arrived on the scene. Naming conventions (and language features) were still in use for aggregating APIs into "packages" and further into jars (fancy libraries + metadata).

However, the notion of matching API by "version" was practically lost. The workaround or rather the argument for this loss was based on the idea that a given application should specify, at time of execution, which libraries would be available by what is known as the "classpath". Furthermore, at runtime, another Java feature called a "classloader" could further create a "context" within an application within which a new (different) set of libraries would be accessible. However, version matching is still missing.

What does it mean that version matching is missing from Java API management?

Without version matching, when APIs change as described earlier, it's virtually impossible for developers to account for those changes in a declarative way. Developers must largely resort to trial and error in order to identify and resolve API compatibility problems. If they are lucky, the changes are evident at build time due to compile errors (stemming from calls to some API which no longer match the API's definition); if not, they surface as runtime errors (those pesky, unpredictable errors which could end up costing you real money).

As API developers, it would be ideal to have an enforceable mechanism in place which would warn about backward compatibility problems and their severity at build time. The result of these warnings would be action by the developer to properly declare the API version and rationale, or even to re-engineer the change so the API is not affected.

The declarative portion of this type of change would allow for semantically rationalizing APIs between consumers and providers using version matching.

This notion of "semantic versioning" is not a new concept and has been defined and put to use by various organizations and technologies over the years. One major proponent of semantic versioning is the OSGi Allaince. The OSGi Allaince has defined their definition of semantic versioning in a technical white paper by the same name [1].

But what about software products which are not following the OSGi model? How could they leverage "build time" analysis of semantic versioning as proposed above?

The key is to rely on at least one aspect of the OSGi model: package versioning. By applying this very simple declarative practice, using some existing build tools, and adding a few relatively simple modifications, it's possible to build a basic model for any build system to deliver semantic version warnings to developers.

Following is an outline of the process and assumptions:

1) When a developer first obtains the sources of a project, all the source code is declared as some version. This version is either an aggregate version for the entire project, or it may be per package. Either way, from the point of view of the developer the version of the code is accurate (since it's from the origin repository).

2) The build tool used to build the project can deliver semantic version build warnings.

3) One build must occur in order to bootstrap semantic version checking (libraries are built and initial versions are deposited into a repository, called the "baseline repository").

4) The project now stands in a state in which the build can check against the baseline repository and give semantic version warnings (a process called "baselining"). At this stage there should be no warnings.

5) The developer makes some change. The build should be able to detect whether the change requires a version update.

6) The unit of change is the Java package, and the change is recorded in the package's "packageinfo" file. Correctly altering that file will silence the build warning. How the developer resolves the warning depends on its severity (MAJOR or MINOR) and on the type of work being done (e.g. bug fixing vs. new development).

7) Any time a warning causes the developer to make a version change, it is almost guaranteed that documentation needs to be updated. Exactly which documentation (Javadoc, developer docs) depends on the nature of the change.
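For reference, the "packageinfo" file from step 6 is, in bnd's convention, a one-line text file that sits in the package directory alongside the sources, e.g.:

```
version 1.2.0
```

Bumping the MINOR or MAJOR segment in this file is all it takes to acknowledge a compatible or breaking change for that package.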

A prototype is currently in development and can be found in the following GitHub repository [2] in the "semantic-versioning" branch. The code uses the bnd [3] library and extends the default Ant build task [4] into a self-initializing semantic version warning system.

I welcome anyone's thoughts on the subject!

[Updated Aug 8, 2013: 17:43]

I built Liferay 6.1.0-ga1 with the semantic version warning tools added. I then baselined liferay-portal:master@266cc47216 (+ semantic versioning) against it. You can see the outcome here [5]. You can see from lines 168-175, 236-380, 397-404, 496-797 where APIs changed.

What follows is not just a story about SharePoint. If you have an HTML site of any kind, you can gain advantages from using Liferay.

Liferay has this crazy simple mechanism called "widgets". Yup widgets!

The term has been used before, we know, but Liferay has seriously had this feature since at least version 5.1.0 (possibly earlier), and yet I NEVER hear anyone talking about it, or taking advantage of it.

Sure, you can use Liferay as a WSRP provider and/or use other thick integration technologies like Web Services. You could even build OpenSocial apps on Liferay as a full-fledged gadget server (and also a client on whatever site you want to integrate into).

However, "widgets" are dead simple! AND they require only an HTML client! Nothing else!

Q:"Where do I sign up???"

It's extremely simple. Let's build an example.

Imagine you have a Liferay portal running somewhere (this is of course required).

Now, suppose on that portal you have a portlet on a page, and let's say the page is:

http://host.my.tld/web/site/page

and on this page is portlet named: XYZ

It even happens to be instanceable, so the fully instanced portlet id is: XYZ_INSTANCE_1234

Wait! It also happens to be a plugin. It's not a native Liferay portlet; it's in a war called XYZ_portlet.war (so the context is XYZ_portlet), and the fully instanced, plugin portletId is: XYZ_portlet_WAR_XYZ_INSTANCE_1234

Ok, so given the page URL and the portletId above, we change the URL to:

If everything is working as it should, what you should see is an HTML page with ONLY the portlet in question: no headers, no footers. (You can even tune the portlet's "look and feel" to show it in borderless mode (Show Portlet Borders: uncheck, Save) so it integrates more smoothly with the host site.)

You can even navigate around the portlet, clicking links, performing actions, and again if everything is working as it should, the portlet should remain in widget mode throughout. Even non-native (or plugin) portlets!!!

Now, back to that SharePoint server you have, or the legacy website you can't, for whatever reason, stop using. You want to plug in the shiny new portlet you created (or an existing one you want to use 'cause it's already designed and doing what you need/want).

On this host site, all you need is an iframe! That's it! Just an iframe which references the widget url you created earlier.
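Concretely, the embed could be as small as the following (the src here is a placeholder; substitute the widget URL built earlier, and size the frame to taste):

```html
<!-- Hypothetical host page fragment: SharePoint, a legacy CMS,
     or any page that renders HTML. -->
<iframe src="https://host.my.tld/...widget-url..."
        width="100%" height="600" frameborder="0"
        title="Embedded Liferay portlet">
</iframe>
```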

(In fact, you could probably even get out of using an iframe with some clever js which loads the content of an ajax request to the same url into some HTML container on the page. Adding a url or button event listener to handle the clicks via ajax requests can round out a pretty darn smooth integration.)

Now, you want to have that portlet behave in context of the user logged in from this other site? Sure thing!

Just plug in Liferay to whatever SSO you are using on the host site and BAM! you have integration!

Liferay has been slowly integrating an OSGi framework into its core; not in the sense that Liferay will necessarily run "within" an OSGi runtime itself (there are arguments for doing that, and it's possible to a certain extent, but the gain just isn't there yet), but rather that Liferay will "host" an OSGi runtime within itself, one that Liferay will utterly control.

Why do this? Well, as I've stated many times before, Liferay is a complex beast with an intricate network of features; so many features, in fact, that the lines between them are occasionally so indistinct that finding where one feature ends and another begins can be difficult.

OSGi will hopefully play a part in: 1) providing us with a lot of re-use of features we had ourselves partially implemented, but for which there existed a fully specified implementation out in the larger Java ecosystem. The savings those existing features can bring is one of many benefits. 2) A specified lifecycle for "modules" (a.k.a. bundles) is also a huge benefit, one we can simply use. 3) Dependency resolution and management is another piece we had started to implement, but it was clear to me that this had already been done much more effectively than Liferay alone could manage, again as part of an OSGi specification.

The number of benefits is almost too great to list. However, one of the greatest advantages can't be discussed enough: Modularity.

Modularity is something that Liferay is now learning to embrace. Hopefully in the future we'll see many more examples like the one I'm about to illustrate.

One of the things I spend a lot of time doing is running and debugging code. There are a number of challenges to doing this with an application as large as Liferay. It takes time to get up and running, and the vast number of features sometimes makes it tricky to isolate just the right service calls in order to test some specific interaction. I'm most often running the portal in debug mode executed from my IDE (Eclipse). A favourite thing to do is to place a breakpoint and then use context evaluation to invoke some specific code (using the Display view). However, my main purpose for doing this is to fire some operation often unrelated to the actual breakpoint position. Effectively, I just want to execute some arbitrary code in the JVM.

What would be ideal is if I had a command line from which I could do the same thing.

The OSGi ecosystem provides exactly that: out there somewhere a bunch of smart people have specified behaviors and how to build properly interacting bits of code, and provided I have the appropriate runtime for those bits, I can interconnect them with my own and compose a greater feature.

Place these jars in a common, convenient location (it need not be within the Liferay install).

(Note: We could have even avoided this pre-download step, but due to a small logical bug in Liferay's integration, we'll have to download the bundles ahead of time. I'll be fixing this bug in the days to come.)

Step 2)

Enable Liferay's Module Framework

Place the following in your portal-ext.properties (or equivalent props file):

module.framework.enabled=true

Step 3)

Configure some initial bundles to start when the OSGi runtime starts in Liferay (still in portal-ext.properties):

Step 4)

Set the telnet port on which we'll access the shell (still in portal-ext.properties):

module.framework.properties.osgi.console=11311

Step 5)

Start Liferay

Step 6)

Connect to the telnet service from the command line (or from your favourite telnet client):

]$ telnet localhost 11311

You should see something like the following:

We have shell access!

But what can we do with this?

The first thing I wanted to do was investigate how the shell works and how to customize the configuration.

Passing System Properties to OSGi bundles

One thing I should add is that OSGi bundles often have system properties specific to them, and in order to isolate these from other OSGi runtimes potentially in the same JVM, Liferay provides a means of passing them as namespaced portal properties.

An example of this is passing a Gogo specific system property called gosh.home. And so I added that property to my portal-ext.properties (prefixed by module.framework.properties.):

e.g.

module.framework.properties.gosh.home=${module.framework.base.dir}

This places the Gogo shell's home directory within Liferay's OSGi directory inside ${liferay.home}.

Gogo is then trained to look inside an /etc directory within its gosh.home for configuration files.

Executing commands

The shell offers many built-in commands. Executing help will reveal many extra commands namespaced by providers. However, these are executable without the namespace (the namespace is used for conflict resolution when two namespaces provide commands of the same name).

A few choice commands are:

ss (short bundle status)

diag (diagnose a problem bundle)

However, it's possible to add commands dynamically as well as use closures. This is where the real power begins.

You may note that we added a custom set of commands based on the java.lang.System class in the gosh_profile:

addcommand system (loadclass java.lang.System)

This loads the class and assigns its public methods under the system prefix. Executing:

system:getproperties

outputs all the system properties. In a similar fashion any class can be added as a set of shell commands.
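For example, since java.lang.System also exposes getProperty(String), a single property can be fetched the same way (a sketch, not a captured session; the output will vary by JVM):

```
g! system:getproperty java.version
```

which prints the JVM's version string.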

Conclusion

Modularity via OSGi allowed me to easily add functionality not previously available in Liferay. I can now build out commands that let me quickly take actions which previously required roundabout workarounds.

Well, that's all for today. It was simple and fun to extend Liferay with shell access, and I solved my problem. It also demonstrates the power of OSGi, since we didn't have to write a single line of code in order to benefit from its well-defined modularity and lifecycle specifications.

I'll talk more about shell commands, closures, and how to add and use custom Liferay commands in more detail in a future blog post. I'll also start using this shell to show how to start building small Liferay extensions using OSGi.

Eventually, more of the Module Framework features (such as the web extender) that Miguel and I have been working on will reach master and we'll start to discuss those as well.

I had planned to speak at this year's North American Symposium about our ongoing work with OSGi. However, sadly, due to several different factors that didn't happen.

It's too bad because as it turns out at least two dozen people asked me:

What happened to the OSGi plans?

Was it dead?

When would it come?

Was there a talk on it disguised as something else?

or some variation thereof.

Likewise, I was asked:

What does OSGi and Liferay together even mean?

What is the relation?

What is the value?

Clearing the air

1. The plan

Liferay's OSGi plans are not dead. My current focus, as far as development is concerned, is almost 100% OSGi. A usable form of OSGi integration will arrive in 6.2. There was no talk on OSGi at NAS, so you didn't miss anything.

2. The reasoning (why OSGi?)

Liferay is a large, complex application which implements its own proprietary plugin mechanisms. While Liferay has managed well enough with those characteristics over its history, it has reached a point, on several fronts, where they are becoming a burden that seems to bleed into every aspect of Liferay: development, support, sales, marketing, etc.

(Disclaimer: I'm not part of support, sales or marketing. However, as a senior architect I see how each of those teams deals with the limitations imposed by the previously mentioned characteristics, how those limitations impact each team's effectiveness, and at times their level of frustration.)

"How can OSGi make such a broad impact?" you ask.

The impact doesn't actually come from OSGi at all. The impact comes from the outcomes resulting from designing a system using a modular approach.

"Modularity is the degree to which a system's components may be separated and recombined." - Wikipedia

Further to this general definition, whichever context you choose for obtaining a more specific one, it quickly becomes apparent that "modularity" is a benefit rather than an impediment.

However, when we do look at a specific definition in the context of software design we see how it clearly applies and might relate to the aspects above:

"In software design, modularity refers to a logical partitioning of the "software design" that allows complex software to be manageable for the purpose of implementation and maintenance. The logic of partitioning may be based on related functions, implementation considerations, data links, or other criteria." - Wikipedia

Plainly, a modular system allows the internal details of each module to be altered without affecting other modules in the system. Imagine if this were true for Liferay on a wider scale.

potentially smaller deployment footprint, due to the ability to limit functionality to desired modules

greater robustness and resilience, due to the higher degree of care required in designing for consumption and re-use

OSGi is simply the best existing implementation of a Java modularity framework. So many of the less obvious concerns which become inherent needs in modular systems are already implemented in OSGi. There are a number of very good implementations. They are widely adopted. They are heavily supported. The number of supported libraries grows daily, while even unsupported libraries are easily fixed.

In short, the benefits far outweigh the costs. There will surely be difficulties along the way, the largest being the fact that learning to implement in a modular way is often counter to how many of us have worked for so long.

I'd be very interested in any feedback that people have about this topic, so please let yourselves be heard.

The same example code also demonstrates how you might tee off the indexed result of these documents to store a copy in a separate location in HDFS so that it might be consumed as input for some MapReduce calculation in order to extract insight from it.

It demonstrates how to use the MapReduce API from a Java client, all the way from creating the job, to submitting it for processing, and then (basically) monitoring its progress. The actual logic applied in the example is trivial; the most important part is showing how you could use the APIs to make Liferay and Hadoop talk.

The code is on github as a standalone Liferay SDK (based on trunk but easily adaptable to earlier versions):

[update] I should also add a few links to resources I used in setting up Hadoop in hybrid mode (a single cluster node with access from a remote client, where my Liferay code is assumed to be a remote client):

I'm almost back home from San Francisco after yesterday's completion of Liferay's 2012 North American Symposium. I'm quite exhausted, but what a great event to be a part of. For this small-town Canadian boy, travelling to historic big cities to meet with old friends I rarely get to see face to face is an experience I will never take for granted. Every turn through the crowd seemed to result in either a heartfelt handshake and a great "Nice to see you again!" with a community member or client I'd met once or twice, or even, dare I say, a big hug from someone I've been working with for years but have really only met in person a dozen times, perhaps less. And the pride that comes from seeing what people have done and dream to do with the humble project I began participating in, and was later allowed to earn a living from, those many years ago is sometimes... overwhelming. But you can't help but stay grounded when you see the deep intensity and passion people bring. You want to make sure that doesn't get lost, and between the two there seems to be little room for pretentiousness. I'm honoured to be given the opportunity to do the work I do and I hope to keep doing it for a long time. Once again it was great to see you all. To those I already knew, I wish you continued good health and happiness. To those I met for the first time, I wish you luck in your endeavors and hope to see you again at other Liferay events or even elsewhere. Thanks for having me.

In order to maintain its extensive deployment matrix, Liferay up to 6.1 has been locked to Java 5 APIs.

However, Java 5 began its EOL cycle on April 8th, 2007 and completed it on October 8th, 2009.

Furthermore, the last app server that still required Java 5 was itself finally EOL'd at the end of 2011.

This means that Liferay 6.2 is now free to adopt Java 6.

It's true that Java 6 itself is almost EOL (extended through February 2013), but the transition period lasts an estimated 2 years for clients with support contracts. That means many enterprise applications will remain Java 6 dependent until the very end of the transition period, which in turn demands that Liferay remain compliant if it wants to retain those applications as part of the extensive deployment matrix it currently enjoys.

This of course in no way prevents 3rd parties from developing plugins/extensions against Java 7 or later as long as they accept the limitation this will have on deployment.

It was also suggested that the community should be informed (as the documents have not yet been updated) that building Liferay 6.2+ requires Ant 1.8 or greater. This is a minor point but can cause frustration if one is not aware.

It's been so long since I've written anything that I've been feeling rather guilty. Luckily, we've recently been undertaking a huge effort to document features of Liferay, old and new.

One interesting but highly understated feature of Liferay 6.1 is the new Faceted Search support that I was lucky enough to get to work on. As I finished the first round of documentation (for my more eloquent peers to turn into a more polished and finished product), I thought this would be a great bunch of info to place here for comment. It's a little more formal than a blog post should be, and probably much longer as well (there is even a toc!!!)... but what the hey!

Definitions

Before going through the features, let us outline a set of definitions that are commonly used in discussing Faceted Search (or search in general).

indexed field: When we store documents in a search engine, we classify aspects of the document into fields. These fields represent the metadata about each document. Some typical fields are: name, creation date, author, type, tags, content, etc.

term: A term is a single value that can be searched and which does not contain any whitespace characters. Terms may appear more than once in a document or appear in several documents, and are typically considered atomic units of search. Within the search engine, each indexed field (for example name) will have a list of known terms found within all the documents having that particular indexed field.

phrase: A phrase is a series of terms separated by spaces. The only way to use a phrase as a term in a search is to surround it with double quotes (").

multi value field: Some fields store more than one term at a time. For instance the "content" field may in fact contain hundreds of unique terms. Such fields are often referred to as "text" or "array" fields.

single value field: In contrast to multi-value fields, we logically assume there is such a thing as a single value field. Such fields contain only a single term. These fields are often referred to as "token" or "string" fields.

frequency: The frequency value indicates how many times a term appears within a set of documents.

facet: A facet is a combination of the information about a specific indexed field, its terms, and their frequency. Facets are typically named by the field in question.

term result list: When a facet displays its data, we call this the term result list.

frequency threshold: Some facets have a property called frequency threshold. This value indicates the minimum frequency required for a term to be shown. If the frequency threshold of a facet is set to 1, a term appearing 1 or more times will appear in the term result list.

max terms: Some facets have a property called max terms. This value indicates the maximum number of terms that will be included in the term result list, regardless of how many matching terms are found for the facet. This keeps the user interface under control and avoids overwhelming the user with too much information.

order: The order property determines the default ordering used for the term result list. There are two possible modes: Order Hits Descending, or Order Value Ascending. The first, Order Hits Descending, means that results will be ordered by frequency in descending order. The second, Order Value Ascending, means that results will be ordered by value (i.e. "term") in ascending order. Each mode falls back to the other as a secondary sort order when there are duplicates (e.g. terms with the same frequency will always be sorted by "value").
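The three properties above (frequency threshold, max terms, order) compose in a straightforward way. Here's a hypothetical sketch of how a term result list could be derived from raw term frequencies; this is illustrative, not Liferay's actual implementation:

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch: derive a facet's term result list from term frequencies,
// applying frequencyThreshold, maxTerms, and "Order Hits Descending"
// (with value ascending as the tiebreak), as described above.
public class TermResultList {

    static List<String> build(Map<String, Integer> frequencies,
            int frequencyThreshold, int maxTerms) {
        return frequencies.entrySet().stream()
            // drop terms below the frequency threshold
            .filter(e -> e.getValue() >= frequencyThreshold)
            // Order Hits Descending, falling back to term value ascending
            .sorted(Comparator
                .comparing((Map.Entry<String, Integer> e) -> -e.getValue())
                .thenComparing(Map.Entry::getKey))
            // cap the list at maxTerms
            .limit(maxTerms)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs = new HashMap<>();
        freqs.put("liferay", 12);
        freqs.put("osgi", 12);
        freqs.put("portal", 7);
        freqs.put("rare", 1);

        // threshold 2 drops "rare"; the liferay/osgi tie is broken by value
        System.out.println(build(freqs, 2, 3)); // [liferay, osgi, portal]
    }
}
```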

range: A range defines an interval within which all the matching terms' frequencies are summed. This means that if a facet defines a term range for the "creation time" field between the year 2008 to 2010, and another for 2011 to 2012, all matching documents having a creation time within one of these specified ranges will be returned as a sum for that range. Thus you may find 7 documents in the first range, and 18 documents in the second range. Ranges cannot be used with multi-value fields.

For End Users and Portal Administrators

Faceted search is a new feature in Liferay 6.1 (although some of the APIs were first introduced in Liferay 6.0 EE sp2). As such there is little relevance with previous versions other than in direct comparison with the old implementation of the Search Portlet which had no facet capabilities of any kind.

Although the new Faceted Search APIs are used transparently throughout the portal, primary exposure is surfaced through the new Search Portlet implementation.

What follows is a list of features provided by Faceted Search via the Search Portlet.

Aggregation of assets into a single result set: Results from all portlets are returned as a single set and the relevance is normalized among the entire set, regardless of type (i.e. the best results among all types will be at the top). Searching has a more linear cost due to the fact that only a single query is performed. Searching is therefore faster, more intuitive, and more relevant.

In previous versions of the portal, each portlet implemented its own search and returned a separate result set, which resulted in several issues:

Each portlet invoked its own query, and each portlet was called in turn, so a single portal request generated potentially N queries to the index, each with its own processing time. This led to increased time to produce the final view.

Depending on the order in which portlet searches were called, the results near the bottom might be the most relevant, yet due to positioning could appear to have less value than those of portlets placed physically higher up on the page. i.e. the relevance of results was not normalized across the total results of all portlets.

Default facets: Asset Type, Asset Tags, Asset Categories, and Modified Time range facets are provided by default. These defaults make finding content on the most common facets simple and powerful. Facets details are displayed in the left column of the search portlet and provide information in context of the current search.

Asset Type: Performing a search for the term "htc" may return Asset Type facet details which appear as follows:

The value in parentheses is the frequency of the term appearing on the left.

You may notice that as you perform different searches, the Asset Type terms may disappear and re-appear. When a term does not appear it means: a) it was not found among the results, b) it did not meet the frequency threshold property, or c) it was beyond the maxTerms property (these properties will be discussed more later).

Asset Tags: If tags have been applied to any document which appear in the result set, they may appear in the Asset Tag facet:

Note: Not all tags may appear. In the example above, there are many more than the 10 tags listed, but the default configuration for this facet is to show the top 10 most frequently occurring terms, as set by its maxTerms property.

Asset Categories: If categories have been applied to any document which appear in the result set, they may appear in the Asset Categories facet:

Note: Not all categories may appear. In the example above, there are many more than the 10 categories listed, but the default configuration for this facet is to show the top 10 most frequently occurring terms, as set by its maxTerms property.

Modified Time: All documents appearing in the result set should have an indexed field called "modified" which indicates when the document was created (or updated). The Modified Time facet is a range facet which provides several pre-configured ranges as well as an option for the user to specify a range. All results in the subsequent query should then fall within this range.

Drill down: The next feature allows refining results by selecting terms from each facet thereby adding more criteria to the search to narrow results (referred to as "drilling down" into the results).

Clicking on terms adds them to the search criteria (currently only one term per facet). They are then listed in what is known as "token style" just below the search input box for convenience and clarity. Clicking any token's X removes it from the currently selected criteria.

e.g. selecting the tag "liferay":

e.g. additionally selecting the type "Web Content":

Advanced operations: These are supported directly in the search input box. Most of the advanced operations supported by Lucene are supported with only slight variations.

Note: Many of the descriptions below are copied (almost word for word) from the above reference, both to account for the similarities and to highlight the slight variations found between the two.

Searching in specific fields: By default, searches are performed against a long list of fields (this is different from Lucene which searches a single specific field by default). Sometimes you want results for a term within a particular field. This can be achieved using the field search syntax:

<field>:<term>

title:liferay

Searching for a phrase within a field requires surrounding the term with double quotation marks:

content:"Create categories"

Note: The field is only valid for the term that it directly precedes, so the query

content:Create categories

will search for the term "Create" in the content field, while the term "categories" will be searched in "all" the default fields.

Wildcard Searches: The Search Portlet supports single and multiple character wildcard searches within single terms, but not within phrase queries.

To perform a single character wildcard search use the "?" symbol.

To perform a multiple character wildcard search use the "*" symbol.

The single character wildcard search looks for terms that match with the single character replaced. For example, to search for "text" or "test" you can use the search:

te?t

Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search:

test*

You can also use wildcard searches in the middle of a term:

te*t

Note: You cannot use a "*" or "?" symbol as the first character of a search.

Fuzzy Searches: Search supports fuzzy searches based on the Levenshtein Distance, or Edit Distance, algorithm. To do a fuzzy search use the tilde symbol, "~", at the end of a single word term.

For example to search for a term similar in spelling to "roam" use the fuzzy search:

roam~

This search will find terms like foam and roams.

An additional (optional) parameter can specify the required similarity. The value is between 0 and 1; with a value closer to 1, only terms with a higher similarity will be matched. For example:

roam~0.8

The default used if the parameter is not given is 0.5.
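For the curious, the Edit Distance that the fuzzy operator is built on is easy to sketch. This is an illustrative implementation, not the search engine's optimized one:

```java
// Levenshtein (edit) distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn one string
// into another. Fuzzy search matches terms within a small distance.
public class Levenshtein {

    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(
                    Math.min(d[i - 1][j] + 1,   // deletion
                             d[i][j - 1] + 1),  // insertion
                    d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // "roam" ~ "foam": one substitution, so distance 1
        System.out.println(distance("roam", "foam"));  // 1
        // "roam" ~ "roams": one insertion, so distance 1
        System.out.println(distance("roam", "roams")); // 1
    }
}
```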

Range Searches: Ranges allow one to match documents whose field(s) values are between the lower and upper bound specified by the range. Ranges can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically.

modified:[20020101000000 TO 20030101000000]

This will find documents whose modified fields have values between 2002/01/01 and 2003/01/01, inclusive.

Note: Liferay's date fields are always formatted according to the value of the property index.date.format.pattern. The format used should be a sortable pattern. The default date format pattern used is yyyyMMddHHmmss. So, when comparing or searching by dates, this format must be used.
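The reason a pattern like yyyyMMddHHmmss qualifies as "sortable" is that zero-padded, most-significant-unit-first strings compare lexicographically in the same order as the dates they represent. A quick sketch:

```java
// Why yyyyMMddHHmmss works for range queries: string order
// agrees with chronological order when the most significant
// units come first and every field is zero-padded.
public class SortableDates {
    public static void main(String[] args) {
        String jan2002 = "20020101000000";
        String dec2002 = "20021231235959";
        String jan2003 = "20030101000000";

        // Lexicographic comparison matches chronological order
        System.out.println(jan2002.compareTo(dec2002) < 0); // true
        System.out.println(dec2002.compareTo(jan2003) < 0); // true

        // A non-sortable pattern like MMddyyyy breaks this:
        // "12312002" (Dec 2002) sorts after "01012003" (Jan 2003)
        System.out.println("12312002".compareTo("01012003") > 0); // true
    }
}
```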

You can also use ranges with non-date fields:

title:{Aida TO Carmen}

This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen.

The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exist in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

To search for documents that contain either "liferay portal" or just "liferay" use the query:

"liferay portal" liferay or

"liferay portal" OR liferay

The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.

To search for documents that contain "liferay portal" and "Apache Lucene" use the query:

"liferay portal" AND "Apache Lucene"

The "+" or required operator requires that the term after the "+" symbol exist somewhere in a field of a single document.

To search for documents that must contain "liferay" and may contain "lucene" use the query:

+liferay lucene

The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.

To search for documents that contain "liferay portal" but not "Apache Lucene" use the query:

"liferay portal" NOT "Apache Lucene"

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:

NOT "liferay portal"

The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.

To search for documents that contain "liferay portal" but not "Apache Lucene" use the query:

"liferay portal" -"Apache Lucene"

Grouping: Search supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query.

To search for either "liferay" or "apache" and "website" use the query:

(liferay OR apache) AND website

This eliminates any confusion and makes sure that website must exist and either term liferay or apache may exist.

Field Grouping: Search supports using parentheses to group multiple clauses to a single field.

To search for a title that contains both the word "return" and the phrase "pink panther" use the query:

title:(+return +"pink panther")

Proximity Searches and Term Boosting are not supported.

Portlet Configuration

[Updated: Oct 16 2012] Search Portlet configurations are currently scoped to the site, which means that all Search Portlets used in the same site will share the same settings, regardless of their location or position on different pages; this also includes any instances of the portlet embedded in themes or other templates.

Display Settings:

Basic: This represents the most basic way of controlling the visible facets.

Advanced: This mode gives ultimate control over the display of facets and is where the true power of the Search Portlet lies. However, it is not for the faint of heart and requires creating a configuration in JSON format. (Future versions of Liferay will include a user-friendly interface for configuring facets.)

In its default configuration, the Search Portlet configuration would equate to the following JSON text:
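The JSON listing itself is missing from this copy of the article. As a placeholder, here is a minimal sketch of what a facet configuration of this shape looks like, using the fields described in the remainder of this section; the specific class name, values, and weights are illustrative assumptions, not the portlet's exact defaults:

```json
{
  "facets": [
    {
      "className": "com.liferay.portal.kernel.search.facet.AssetEntriesFacet",
      "data": {
        "frequencyThreshold": 1,
        "values": []
      },
      "displayStyle": "asset_entries",
      "fieldName": "entryClassName",
      "label": "asset-type",
      "order": "OrderHitsDesc",
      "static": false,
      "weight": 1.5
    }
  ]
}
```

Each of the fields used above is described below.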

"className": This field must contain a string value which is the FQCN (fully qualified class name) of a java implementation class implementing the Facet interface. Liferay provides the following implementations by default:

"data": This field takes an arbitrary JSON "Object" (a.k.a. {}) for use by a specific facet implementation. As such, there is no fixed definition of the data field. Each implementation is free to structure it as needed.

"displayStyle": This field takes a value of type string and represents a particular template implementation which is used to render the facet. These templates are normally JSP pages (but can also be implemented as Velocity or Freemarker templates provided by a theme if the portal property theme.jsp.override.enabled is set to true). The method of matching the string to a JSP is simply done by prefixing the string with /html/portlet/search/facets/ and appending the .jsp extension.

e.g. "displayStyle": "asset_tags"

maps to the JSP

/html/portlet/search/facets/asset_tags.jsp

Armed with this knowledge, a crafty developer could create custom display styles by deploying custom (new or overriding) JSPs using a JSP hook.

"fieldName": This field takes a string value and indicates the indexed field on which the facet will operate.

e.g. "fieldName": "entryClassName"

indicates that the specified facet implementation will operate on the entryClassName indexed field.

Note: You can identify available indexed fields by checking the Search Portlet's Display Results in Document Form configuration setting and then expanding individual results by clicking the [+] to the left of their titles.

"label": This field takes a string value and represents the language key that will be used for localizing the title of the facet when rendered.

"order": This field takes a string value. There are two possible values:

"OrderValueAsc" This tells the facet to sort it's results by the term values, in ascending order.

"OrderHitsDesc" This tells the facet to sort it's results by the term frequency, in descending order.

"static": This field takes a boolean value (true or false). A value of true means that the facet should not actually be rendered in the UI. It also means that, rather than using inputs dynamically applied by the end user, it should use pre-set values (stored in it's "data" field). This allows for the creation of pre-configured result domain. The default value is false.

Image Search Example: Imagine you would like to create a pre-configured search that returns only images (i.e. the indexed field "entryClassName" would be com.liferay.portlet.documentlibrary.model.DLFileEntry and the indexed field "extension" should contain one of bmp, gif, jpeg, jpg, odg, png, or svg). We would need two static facets, one with "fieldName": "entryClassName" and another with "fieldName": "extension". This could be represented using the following facet configuration:
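The configuration referred to here is missing from this copy. A hedged sketch of the two static facets might look as follows; the class name and data layout are illustrative assumptions, so verify them against your portal version:

```json
{
  "facets": [
    {
      "className": "com.liferay.portal.kernel.search.facet.MultiValueFacet",
      "data": {
        "values": ["com.liferay.portlet.documentlibrary.model.DLFileEntry"]
      },
      "fieldName": "entryClassName",
      "label": "asset-type",
      "static": true,
      "weight": 1.0
    },
    {
      "className": "com.liferay.portal.kernel.search.facet.MultiValueFacet",
      "data": {
        "values": ["bmp", "gif", "jpeg", "jpg", "odg", "png", "svg"]
      },
      "fieldName": "extension",
      "label": "extension",
      "static": true,
      "weight": 1.0
    }
  ]
}
```

Because both facets are static, they silently narrow every search to image files without rendering any UI.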

"weight": This field takes a floating point (or double) value and is used to determine the ordering of the facets in the facet column of the search portlet. Facets are positioned with the largest values at the top (yes it's counter intuitive and perhaps should be reversed in future versions).

Other Settings

Display Results in Document Form: This configuration, if checked, will display each result with an expandable section you can reach by clicking the [+] to the left of the result's title. In Document Form, all of the result's indexed fields will be shown in the expandable section. This is for use in testing search behavior.

Note: Even if enabled, for security reasons this ability is only available to the portal Administrator role because the raw contents of the index may expose protected information.

View in Context: This configuration, if checked, will produce results which have links that target the first identifiable application to which the result is native.

For example, a Blog entry title will link (or attempt to link) to a Blogs Admin, Blogs, or Blogs Aggregator portlet somewhere in the current site. The exact method of location is defined by the result type's AssetRenderer implementation.

Display Main Query: This configuration, if checked, will output the complete query that was used to perform the search. This will appear directly below the result area, like this:

Display Open Search Results: In previous versions of the portal, the Search Portlet was implemented as a collection of com.liferay.portal.kernel.search.OpenSearch implementation classes which were executed in series. Due to the subsequent re-design of the Search Portlet, the portal itself no longer relies on these implementations for its primary search. However, third party plugin developers may yet have Open Search implementations which they would like to continue to use. This configuration, if checked, will enable the execution of these third party Open Search implementations, and their results will appear below the primary portal search.

Note: It is highly recommended that third parties re-design their search code to implement com.liferay.portal.kernel.search.Indexer or more simply to extend com.liferay.portal.kernel.search.BaseIndexer. Thus it will be possible to aggregate custom assets with native portal assets.

For Developers

Key Classes

When implementing a customized search, many of the following API classes are important:

We'll briefly go through the general organization of the above to understand where each class fits into the greater scheme.

SearchContext

The first thing required is to set up a context within which to perform a search. The context defines things like the company instance to search, the current user invoking the search, etc. This task is handled by the com.liferay.portal.kernel.search.SearchContext class. Since this class has a wide variety of context properties to deal with, the most effective way to get one is to call the getInstance(HttpServletRequest request) method of the com.liferay.portal.kernel.search.SearchContextFactory class.
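For example, a minimal sketch (the property setters shown are common SearchContext accessors; the specific values are illustrative):

```java
// Build a SearchContext from the current request; the factory copies the
// company instance, user, locale, and other context properties for us.
SearchContext searchContext = SearchContextFactory.getInstance(request);

// Narrow the search as needed (illustrative values).
searchContext.setKeywords("liferay");
searchContext.setStart(0);
searchContext.setEnd(20);
```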

There are a number of other SearchContext properties that can be set. See the javadocs for a complete list.

Setting up Facets

After we have set up all the appropriate SearchContext properties, we are ready to add the Facets for which we want to collect information. We can add Facets either programmatically or through configuration. Programmatically adding facets allows the developer to tightly control how the search is used. The following example shows how to add two facets using some provided Facet classes:
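A sketch of what that can look like, assuming the AssetEntriesFacet and ScopeFacet implementations from the com.liferay.portal.kernel.search.facet package (verify the class names against your portal version):

```java
// Facet over the entryClassName field (asset type).
Facet assetEntriesFacet = new AssetEntriesFacet(searchContext);
assetEntriesFacet.setStatic(true);
searchContext.addFacet(assetEntriesFacet);

// Facet over the groupId/scopeGroupId fields (site scope).
Facet scopeFacet = new ScopeFacet(searchContext);
scopeFacet.setStatic(true);
searchContext.addFacet(scopeFacet);
```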

Note: The above two Facet implementations are not re-usable in that they always operate on specific indexed fields: entryClassName, and groupId (and scopeGroupId), respectively. Other implementations can be re-used with any indexed fields, as demonstrated previously in the Image Search Example.

As shown previously, facets can also be set up using a JSON definition. Using a JSON definition allows for the highest level of flexibility since the configuration can be changed at run-time. These definitions are parsed by the static method load(String configuration) on the com.liferay.portal.kernel.search.facet.config.FacetConfigurationUtil class. This method reads the JSON text and returns a list of com.liferay.portal.kernel.search.facet.config.FacetConfiguration instances.
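In code, that amounts to something like the following (configurationJSON is a hypothetical variable holding the JSON text, e.g. read from portlet preferences):

```java
// Parse the JSON facet definition into configuration objects at run-time.
List<FacetConfiguration> facetConfigurations =
    FacetConfigurationUtil.load(configurationJSON);
```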

Facets as Filters

It should be noted that Facets are always created with reference to the SearchContext. Since facets also behave as the dynamic filter mechanism for narrowing search results, having the SearchContext allows a Facet implementation to observe and react to context changes, such as looking for specific parameters which affect its behavior.

Indexer Implementations

The next step involves obtaining a reference to an indexer implementation. The implementation obtained determines the type of results returned from the search.

Asset Specific Searchers

As the name implies, Asset Specific Searchers always deal with only one specific type of asset. These are the implementations that are provided by developers when creating/designing custom Asset types. Liferay provides the following Asset Specific Searchers:

Aggregate Searchers

Obtaining a reference to an Asset Specific Indexer requires calling either the getIndexer(Class<?> clazz) or getIndexer(String className) methods on the com.liferay.portal.kernel.search.IndexerRegistryUtil class.
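For example (BlogsEntry is just an illustrative asset type):

```java
// Obtain the asset-specific Indexer; the Class and String variants
// are equivalent.
Indexer indexer = IndexerRegistryUtil.getIndexer(BlogsEntry.class);

// or, by class name:
Indexer sameIndexer = IndexerRegistryUtil.getIndexer(
    "com.liferay.portlet.blogs.model.BlogsEntry");
```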

Aggregate Searchers can return any of the asset types in the index according to the SearchContext and/or facet configuration. Liferay only provides a single aggregate searcher implementation:

com.liferay.portal.kernel.search.FacetedSearcher

Obtaining a reference to this searcher simply involves calling the static getInstance() method of the same class.

Indexer indexer = FacetedSearcher.getInstance();

Note: When implementing Indexers it is highly recommended to extend the com.liferay.portal.kernel.search.BaseIndexer class.

SearchEngineUtil

Internally each Indexer will make calls to the SearchEngineUtil which handles all the intricacies of the engine implementation. For the purpose of this document, we won't delve into the internals of SearchEngineUtil. But suffice it to say that all traffic to and from the search engine implementation passes through this class, so when debugging problems it is often beneficial to enable debug-level logging on this class.

Performing the Search

Once an Indexer instance has been obtained, searches are performed by calling its search(SearchContext searchContext) method.

Hits & Documents

The result of the search method is an instance of the com.liferay.portal.kernel.search.Hits class.

Hits hits = indexer.search(searchContext);

This object contains any search results in the form of an array (or list) of com.liferay.portal.kernel.search.Document instances.

Document[] docs = hits.getDocs();

OR

List<Document> docs = hits.toList();

Displaying the results typically involves iterating over this array. Each Document is effectively a hash map of the indexed fields and values.
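A sketch of a typical result loop (the field names shown are common indexed fields; treat them as illustrative):

```java
// Each Document behaves like a map of indexed field names to values.
for (Document doc : hits.getDocs()) {
    String entryClassName = doc.get("entryClassName");
    String title = doc.get("title");

    // ... render the result row ...
}
```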

Facet Rendering

Facet rendering is done by getting Facets from the SearchContext after the search has completed and passing each to a template as defined by the FacetConfiguration:
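The original code listing is missing here; the following is a hedged sketch of the loop described (the getFacets(), isStatic(), and getFacetConfiguration() accessors are assumptions to verify against your portal version):

```java
// Iterate the facets registered on the SearchContext after the search.
for (Facet facet : searchContext.getFacets().values()) {
    if (facet.isStatic()) {
        continue; // static facets are pre-set filters, not rendered
    }

    FacetConfiguration facetConfiguration = facet.getFacetConfiguration();

    // displayStyle maps to /html/portlet/search/facets/<displayStyle>.jsp
    String jspPath =
        "/html/portlet/search/facets/" +
            facetConfiguration.getDisplayStyle() + ".jsp";

    // ... include jspPath, exposing the facet to the template ...
}
```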

Facet Details (Terms and Frequencies)

A Facet's details are obtained by calling its getFacetCollector() method, which returns an instance of the com.liferay.portal.kernel.search.facet.collector.FacetCollector class.

FacetCollector facetCollector = facet.getFacetCollector();

The primary responsibility of this class is to in turn provide access to TermCollector instances primarily by calling the getTermCollectors() method, but also by getting a TermCollector by term value using the getTermCollector(String term) method. There will be a TermCollector for each term that matches the search criteria, as well as the facet configuration.
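A sketch of reading the terms and frequencies (the TermCollector accessor names are assumptions to verify against the javadocs):

```java
FacetCollector facetCollector = facet.getFacetCollector();

// One TermCollector per term matching both the search criteria and the
// facet configuration; each carries a term value and its frequency.
for (TermCollector termCollector : facetCollector.getTermCollectors()) {
    String term = termCollector.getTerm();
    int frequency = termCollector.getFrequency();

    // ... render "term (frequency)" as a clickable filter link ...
}
```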

Rendered facet views (i.e. non-static facets) should result in UI code which allows dynamically passing facet parameters for interpretation by the implementation (see Facets as Filters). There are a number of examples in the /html/portlet/search/facets folder of the Search Portlet.

============================================================

Well, I hope that was useful information.

At the core of all content management lies search and so I'm really excited about the potential of this new search API. As we work on 6.2 and introduce even more innovative new search features, I hope to see Liferay become the most feature rich and extensible search integration platform available in the market.

Liferay has long had the ability to embed portlets in themes. This is convenient for implementors to get highly functional design into the look and feel of a site. In my years at Liferay I've seen and heard many different attempts at doing this with various levels of success. There are a number of things to consider when embedding portlets in the theme and the same method does not apply in all cases.

The original motive behind embedded portlets was for integrating WCM content or simple functional features such as Search or Language selection. As such the complexity of these portlets was understood to be low and without significant performance costs either due to not having any direct service calls (search & language selection portlets don't have any initial service calls), or having service calls which were highly cached (such as is the case with web content).

There is more than one technique for embedding portlets into the theme, and several different issues to consider when choosing any of those.

Method One: Using $theme.runtime()

This is by far the most common method. There are of course a few gotchas with this method.

1) It renders the portlet synchronously in the theme regardless of the ajaxable setting on the portlet (via liferay-portlet.xml).

This means if the portlet is expensive to render because it a) has lots of data to process on render, b) uses synchronous web service calls, or c) calculates the fibonacci sequence up to a million, DON'T EMBED IT IN THE THEME! You're just killing the performance of your portal whenever that theme is on any page.

Rule of Themes #1: This is a good rule of thumb to follow regardless of embedded portlets or not: "Themes should be SUPER FAST! They should be the fastest component of your portal view at any given time."

If you don't follow this rule, how can you expect the experience to be once you start actually putting more things on the page? It's a different topic entirely, but you should generally performance test your theme with no portlets on the page. Once you know it's blazing fast, then proceed to test your portlets' performance.

But I digress! Back to portlets embedded in the theme.

2) Portlets rendered in the theme don't have their js/css resources loaded at the top of the page with all the rest of the portlets. Liferay doesn't yet implement "streaming portlet" mode (this is optional in the spec; it doesn't mean Liferay is not fast, it's just the name chosen for the feature of calling the portlet's header-altering methods separately from the rendering methods). The issue with this is that if you embed a portlet that uses something like JSF's dynamic javascript features, and there happens to be more than one portlet of this type on the page, the ordering of these dynamic resources may get fouled up and cause general mayhem in their behavior.

On the other hand, I would argue that such portlets are already, by their very nature, TOO complex and expensive to be embedded in the theme. They are quickly diverging from the rule above that the theme should be SUPER FAST!

Rule of Themes #2: "Only use $theme.runtime() to embed portlets that are extremely fast. If they make expensive service calls, MAKE SURE those are not calls that happen all the time, and MAKE SURE they use great caching."

Method Two: Ajax / Iframe!

This is a "currently" seldom used method, but one that I would HIGHLY recommend using! Why?

1) It's Asynchronous with the rendering of the theme. This is HUGE! It means that regardless how slow your portlet(s) is(are), the general experience of the portal will remain fast!

2) It means that you can still put that fairly complex portlet into the theme without killing the overall performance.

3) It doesn't suffer from the limitation of the resource loading that the $theme.runtime() method suffers from.

So how do I do it?

Ok, so generally what you do to embed a portlet via ajax or iframe is to create a portlet URL (either server side or client side) and then request it, placing the result somewhere in the page. With an iframe you can of course interact with the portlet in place. If you use an ajax call and embed the output of the portlet directly, interacting with it will of course cause a full page refresh.

Here is the code for doing it with the iframe:
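The original snippet is missing from this copy of the post. As an illustrative reconstruction for a Velocity theme (the portlet ID my_WAR_myportlet is a made-up placeholder; p_p_state=exclusive renders the portlet markup without the surrounding page):

```html
#* Embed a portlet asynchronously via an iframe, building a plain render
   URL from Liferay's standard p_p_* request parameters. *#
<iframe
  frameborder="0"
  src="$currentURL?p_p_id=my_WAR_myportlet&p_p_lifecycle=0&p_p_state=exclusive"
  width="100%">
</iframe>
```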

One thing you'll note with this code is that it assumes the portlet is on the current page. But it's not on the current page yet!

There are a couple of ways of handling this.

1) The portlet is specifically designed to be in the theme and should be visible to anyone seeing the page.

In this case I recommend setting these attributes in your liferay-portlet.xml file:
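The attribute listing is missing from this copy. The setting I believe is intended here is add-default-resource (hedged; verify against the liferay-portlet.xml DTD for your portal version):

```xml
<!-- Grant the default "view" resource when the portlet is added to a
     page dynamically, so any user viewing the page can see it. -->
<add-default-resource>true</add-default-resource>
```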

These will allow your portlet to be safely added to any page in the portal automatically (otherwise you may get permission issues, because the user viewing the portlet may not have permission, and all that jazz).

2) You can't do as above because you still want to control permissions of the portlet using roles.

In this case, you'll have to add the portlet to something like a hidden page, from which you can manage its permissions. When creating the URL in the code above, you'll then use the "plid" of that target page instead of the current one.

Well, I hope that helps!

I hope that I hear far less talk about $theme.runtime() issues and Liferay performance problems that end up being directly related to expensive operations embedded in the theme.

Spring's dependency injection framework is the picture of ubiquity in the Java world. As such it aims to drive developers toward declarative, component-driven design and loose coupling. But it is largely a static container: the class space is flat, and once started there is little means for dynamic change.

OSGi is "the" modularity framework for Java. It supports dynamic change via strict classloader isolation and packaging metadata. It also aims to drive developers toward modular design. But although it is very powerful, it is also complex, and learning its intricacies can be daunting.

These two paradigms have recently undergone a merger of sorts, in that within OSGi frameworks one can use the declarative nature of Spring (via Spring DM, more recently standardised as Blueprint) to configure and wire together the interdependencies of components as they come and go from the OSGi framework.

So how does this project, Arkadiko, come into play?

First allow me to explain briefly the current state of the world as I understand it (this is my personal assessment).

Given the vast number of Spring-based projects and the recent up-surge in the desire to benefit from OSGi's dynamic modularity, it has become clear that there is an understandable difficulty in moving projects of any complexity from a pure Spring architecture to OSGi. The issue is that some design changes must be made in moving traditional Java applications, including Spring-based ones, to OSGi. OSGi has a puritanical solution to a vast number of problems caused by traditional architectures, but in order to gain from those solutions a considerable amount of redesign has to be done.

What to do?

There are several well known projects and individuals promoting methodologies and best practices to help undertake the considerable amount of work that such a migration could potentially involve.

A recent presentation by BJ Hargrave (IBM) and Peter Kriens (aQute) (http://www.slideshare.net/bjhargrave/servicesfirst-migration-to-osgi) defines the concept of a "services first" migration methodology. This methodology suggests that the first step in the migration is to re-design the existing architecture so that it is based on a services-oriented design. They offer some insight into how that is accomplished, paraphrased over several different articles about μservices (micro-services), and I won't go into detail here about all that. Suffice it to say that once this has been accomplished, it becomes far simpler to isolate portions/groupings of code that form logical components into modules which use or publish services in a well defined manner, to subsequently be turned into OSGi bundles.

Also, a core developer of the Apache Felix project, Karl Pauls, has recently released a library called PojoSR (http://code.google.com/p/pojosr/), based on some of the Apache Felix project code, that does not implement a full OSGi framework container but essentially provides an active registry which scans the class space for recognizable "services" and effectively tries to wire those together, or at the very least provide a central access point for those services.

I have no in-depth knowledge of either of those two topics, but I highly suggest taking some time to review both, because they present options for anyone entertaining the notion of undertaking such a migration to OSGi, and several options are always welcome.

Finally we come to Arkadiko!

Arkadiko is a small bridge between Spring and OSGi.

Arkadiko is an attempt to provide a new migration option. It borrows ideas from both of the above and tries to marry into those the concepts of "quick wins" (such as immediate access to OSGi features) and "time to evolve".

Quick Wins: Often the light at the end of the tunnel seems awfully far away. When reviewing the scope of the migration, it may seem like an eternity before the benefits will pay off. So, Arkadiko gives you OSGi right now! How does it do that? It simply provides a means to wire an OSGi framework into your current Spring container. But that in itself doesn't help much, and so it also dynamically registers all your beans as services, and at the same time registers an OSGi ServiceTracker for each one. It does this so that if matching services are published into the OSGi framework, they are automatically wired in place of those original beans. You get OSGi right away! It also means that the OSGi framework has access to all your beans and can use those as regular services.

Time to Evolve: The other benefit is that you now have time to evolve your platform into OSGi as you see fit and as time allows, moving components from outside the OSGi framework into it as the re-design of each component is completed. This way you gain the benefits wherever you can get to them quickly. Also, those nasty libraries which have yet to be ported, or are still known to not live happily inside an OSGi framework, can remain outside the framework, consumed and wired in from within the container, until such a time as they evolve their own OSGi solutions.

Arkadiko is very small and very simple! It comprises only 5 classes in total (one is an exception, one holds constants, one is a utility class, and the real work is done by the remaining two classes).

Adding Arkadiko to your existing Spring configurations is as simple as adding a single bean:

<bean class="com.liferay.arkadiko.BridgeBeanPostProcessor">
<property name="framework">
<!-- some factory to get an instance of org.osgi.framework.launch.Framework -->
</property>
</bean>