MidCOM search engine in production

M-P.org needs 242 MB (Apache+PHP+data) to do a complete reindex, which means
that PHP used around 220 MB of memory for this. The indexer itself was not
bothered much; it stayed at around 20 MB of memory.

The complete reindex ran for around 10-15 minutes, during which we indexed
1,583 articles and perhaps one or two hundred attachments.

Not great, but it could have been worse, especially given that a complete
reindex should be necessary only very rarely.

Once the initial reindexing has been run, all new changes in the Midgard database will be indexed automatically. This means that the search tool will always have fresh results, even when new documents have been added or old ones have been changed or deleted.
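To illustrate why a full reindex should rarely be needed afterwards, here is a minimal Python sketch of incremental indexing. It is not the MidCOM indexer: the class name TinyIndex and the hooks on_saved and on_deleted are hypothetical names made up for this example, and the real system hands documents to a Lucene-based indexer rather than keeping an in-memory dictionary.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index kept in sync with document changes (illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # document id -> terms indexed for it

    def on_saved(self, doc_id, text):
        """Hypothetical hook: call after a document is created or updated."""
        self.on_deleted(doc_id)  # drop terms left over from an earlier version
        terms = set(text.lower().split())
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def on_deleted(self, doc_id):
        """Hypothetical hook: call after a document is deleted."""
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)

    def search(self, term):
        """Return the ids of documents containing the term, in ascending order."""
        return sorted(self.postings.get(term.lower(), set()))
```

Each change touches only the affected document, so the cost of keeping the index fresh is proportional to the size of the change rather than to the size of the whole site.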

Other advantages of this new search system include the fact that the search results are component-aware, enabling us to display, for example, thumbnails for image results. The searches are also aware of the Midgard permission system, ensuring that users will get results from all documents they’re allowed to see, and only from those. The possibilities of the Lucene search syntax are also promising.
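As a rough picture of what permission-aware and component-aware results mean, here is a small Python sketch. It assumes nothing about the actual MidCOM API: Hit, visible_hits, render and the can_read callback are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: int
    title: str
    component: str  # hypothetical label for the component that owns the document


def visible_hits(hits, can_read):
    """Keep only the hits the current user may read.

    can_read is a callback taking a document id and returning True or False;
    in a real system it would consult the permission system.
    """
    return [hit for hit in hits if can_read(hit.doc_id)]


def render(hit):
    """Component-aware rendering: image hits get a thumbnail marker."""
    if hit.component == "image":
        return "[thumbnail] " + hit.title
    return hit.title
```

Filtering after the search is the simplest way to show the idea; whether the real indexer filters at query time or at display time is not stated here.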

Today we secured funding from two clients for integrating a search engine into the Midgard CMS. The project will be undertaken by Torben Nehmer and will also improve Midgard's metadata capabilities.

About Midgard

Midgard2 is a content repository library that can be used in both web and desktop applications. It is built by the Midgard Project, an international free software community. I've been an active part of the group since its beginnings in the late 90s.

Thanks to GObject Introspection, the Midgard2 content repository can be used from almost any programming language, including PHP, Python, and JavaScript.