Saturday, January 21, 2012

Many Google App Engine developers have been waiting for a Full Text Search feature, especially since it would come from Google, the biggest search engine on the Web. I was quite happy to see that the Google team is working on it, as you can see in the Google I/O 2011 session "Full Text Search" by Bo Majewski and Ged Ellis. As far as I know, this very promising indexing service is not yet available.

In this article I will explain how you can provide some kind of full text search in your application using services that are already available on App Engine.

My specific use case does not require many features: I just need to search for a string in various attributes of my entities, independently of case and of special characters (such as è, é, ...). I am far from being an expert in the Google Datastore API, but I did not find any simple way to achieve this directly with the Java API. To solve this issue, I duplicate part of my data into Google Cloud SQL and use the MySQL full-text search capabilities.

If you look at the Datastore API, or even JDO or JPA, you have no simple way to look for all the articles that are related to Triathlon, or Database, or Entities. The Google Datastore does not support a where clause with an "OR" between different fields, not to mention that it is not possible to ignore case in a simple way.

This is why we need some full text features. Some of you are surely thinking about using Apache Lucene to do the trick, and yes, it is possible. You can use, for example, the GAELucene project: http://code.google.com/p/gaelucene/. I use another approach, maybe less advanced in terms of indexing/searching options, but sufficient for my use case:

I store the text values that I want to search in Google Cloud SQL and use the full-text features of MySQL.

On Google App Engine, Cloud SQL instances are accessed through a specific driver and configuration that we will see later. For now we are still in the development environment, where you use your local MySQL instance.

In this specific use case we copy the two fields into a table and add a unique key based on the entity key. So here is the SQL to create this:
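As a minimal sketch of what such a table could look like, the DDL below is kept in a Java constant so it can be run from the application. The table name `article_search`, the columns `entity_key`, `title` and `body`, and the index names are assumptions made for illustration; they are not taken from the original sample code.

```java
// Hypothetical schema for the search table; all names are illustrative.
final class SearchSchema {

    static final String CREATE_TABLE =
        "CREATE TABLE IF NOT EXISTS article_search ("
        + " id INT NOT NULL AUTO_INCREMENT,"
        // String form of the Datastore entity key, used to find the entity again.
        + " entity_key VARCHAR(255) NOT NULL,"
        // The two text fields copied from the entity (assumed to be a title and a body).
        + " title VARCHAR(255),"
        + " body TEXT,"
        + " PRIMARY KEY (id),"
        + " UNIQUE KEY uk_entity_key (entity_key),"
        // One FULLTEXT index covering both searchable columns.
        + " FULLTEXT KEY ft_title_body (title, body)"
        // MyISAM because FULLTEXT indexes require it before MySQL 5.6.
        + ") ENGINE=MyISAM DEFAULT CHARSET=utf8";

    private SearchSchema() {}
}
```

The unique key on `entity_key` is what makes it possible to update a row in place when the corresponding entity changes.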

I manage the connection directly in my code (I have not yet checked whether I can use data sources or a connection pool in the context of Google App Engine).

Line #3: registers the App Engine driver, which is responsible for managing the connection, in particular for working in development mode (local MySQL) or in production mode (Cloud SQL).

Line #4: gets the connection. It is interesting to mention that in development the connection URL is taken from the -Drdbms.url JVM flag you have set previously. We will see later how we move this to the cloud. This is the magical part of the AppEngineDriver: it manages the different connection types, local MySQL or Cloud SQL, depending on the context.
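The two steps described above (register the driver, then get the connection) might look like the following sketch. It is not the original listing, so its line numbers do not match the "Line #3" and "Line #4" notes. The class and method names are hypothetical, and the driver is loaded by name only so that the sketch compiles without the App Engine SDK on the classpath; with the SDK you would normally write `new AppEngineDriver()` directly.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper around the App Engine JDBC driver.
final class SearchConnection {

    // Driver class provided by the App Engine SDK.
    static final String DRIVER_CLASS = "com.google.appengine.api.rdbms.AppEngineDriver";

    // Builds the JDBC URL for a Cloud SQL instance and database.
    static String jdbcUrl(String instance, String database) {
        return "jdbc:google:rdbms://" + instance + "/" + database;
    }

    // Registers the driver, then opens a connection. In development the driver
    // redirects to the local MySQL database configured for the dev server.
    static Connection open(String instance, String database)
            throws ReflectiveOperationException, SQLException {
        Driver driver = (Driver) Class.forName(DRIVER_CLASS)
                .getDeclaredConstructor()
                .newInstance();
        DriverManager.registerDriver(driver);
        return DriverManager.getConnection(jdbcUrl(instance, database));
    }

    private SearchConnection() {}
}
```

The caller remains responsible for closing the connection once the request is done.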

If you want to test this code, just put these lines in a servlet to "synchronize" the data from the datastore into the MySQL table. Obviously this code is here for learning purposes only and should be integrated in a better way in a real application, starting with pushing the data into the database when entities are created or updated (and deleted ;) ). The sample code available on GitHub contains these methods.
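One possible way to keep the MySQL table in step with the datastore is an "insert or update" statement keyed on the entity key, plus a matching delete. The sketch below assumes a hypothetical `article_search` table with a unique `entity_key` column and `title` and `body` columns; it is not the code from the GitHub sample.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper that mirrors one entity into the search table.
final class SearchIndexSync {

    // Relies on a UNIQUE key on entity_key: a second insert for the same
    // entity updates the existing row instead of creating a duplicate.
    static final String UPSERT_SQL =
        "INSERT INTO article_search (entity_key, title, body) VALUES (?, ?, ?) "
        + "ON DUPLICATE KEY UPDATE title = VALUES(title), body = VALUES(body)";

    static final String DELETE_SQL =
        "DELETE FROM article_search WHERE entity_key = ?";

    // Call this when an entity is created or updated.
    static void upsert(Connection c, String entityKey, String title, String body)
            throws SQLException {
        try (PreparedStatement ps = c.prepareStatement(UPSERT_SQL)) {
            ps.setString(1, entityKey);
            ps.setString(2, title);
            ps.setString(3, body);
            ps.executeUpdate();
        }
    }

    // Call this when an entity is deleted.
    static void delete(Connection c, String entityKey) throws SQLException {
        try (PreparedStatement ps = c.prepareStatement(DELETE_SQL)) {
            ps.setString(1, entityKey);
            ps.executeUpdate();
        }
    }

    private SearchIndexSync() {}
}
```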

5. Implement a search method

The goal is simple: return the list of entities that match a simple search criterion:

In this method, the system connects to the database and then executes a query to search the data, using any type of SQL/MySQL query. In this example I am using the full-text function with the "WITH QUERY EXPANSION" modifier. You can obviously use any type of SQL query, for example a simple LIKE statement if this is enough for your application.
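A search method along these lines could be sketched as follows. The table and column names (`article_search`, `entity_key`, `title`, `body`) are assumptions, and the method returns the stored entity keys as strings, leaving the datastore lookup of the matching entities to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical full-text search over the mirrored table.
final class ArticleSearch {

    // The column list in MATCH must correspond to a FULLTEXT index on the table.
    static final String SEARCH_SQL =
        "SELECT entity_key FROM article_search "
        + "WHERE MATCH (title, body) AGAINST (? WITH QUERY EXPANSION)";

    // Returns the entity keys of the rows that match the search text.
    static List<String> search(Connection c, String text) throws SQLException {
        List<String> keys = new ArrayList<>();
        try (PreparedStatement ps = c.prepareStatement(SEARCH_SQL)) {
            ps.setString(1, text);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    keys.add(rs.getString("entity_key"));
                }
            }
        }
        return keys;
    }

    private ArticleSearch() {}
}
```

Binding the search text as a parameter, rather than concatenating it into the SQL string, avoids SQL injection from user input.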

With this approach, when I search for:

"database": the method returns all the articles concerning database, mysql or RDBMS, independently of case.

"index": the method returns all the articles talking about indexing/indexes or search.

6. Deploy to GAE

Once you have created your application, and activated and configured your Cloud SQL instance (here), you can deploy your application and enjoy an easy way of using full text search with GAE.

Conclusion

In this article I explained how you can use Google Cloud SQL to easily support full text search queries, based on the full-text support of MySQL.

The code snippets that I have shared in this article are really basic and not ready for real-life usage, but they are still a good starting point. For example, I have been using this approach in my application with GAE Queues to manage my indexes on larger volumes of data.

Thursday, January 19, 2012

eXo Platform 3.5 provides many extension points and APIs for developers, allowing them to create very cool stuff.

I have developed a small extension that allows any user to associate their Twitter account with their eXo Platform account. This extension simply posts to your Twitter account when you write a message with a special hashtag (#tw).

This is a very quick development that I did while waiting for my kids, so I still have things to integrate to provide a complete feature, but it is a good example of how you can extend the platform.

You can view it in action in this video:

You can download the source code and the binaries from this GitHub project. I hope to find some time to complete the feature, and document this use case.