Pentaho Data Integration Mongo Steps

Another new update for all you Big Data fans out there. We have just completed some additional steps for Pentaho Data Integration to work with your data stored in MongoDB.

So what are these steps and what do they do?

MongoDB Map Reduce

The MongoDB Map Reduce step does exactly what you’d think. It allows you to execute Map Reduce functions against your MongoDB Collection to extract data. This step is a lot more flexible than the MongoDB Aggregation Framework, although it might not be the best choice for performance. It’s possible to output the result as individual fields or as a single JSON document.

A sample Reduce Function

Outputting the result as individual fields
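
For readers new to Map Reduce, here is a rough sketch in plain Python of what a map function and a reduce function do to a set of documents, and of the two output styles mentioned above. It is not the step's implementation and does not talk to MongoDB; the `orders` data, the field names and the `map_reduce` helper are all made up for illustration. In MongoDB itself the functions are written in JavaScript, and the reduce function may be called more than once per key.

```python
import json
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Hypothetical documents standing in for a MongoDB collection.
orders = [
    {"customer": "alice", "amount": 20},
    {"customer": "bob", "amount": 5},
    {"customer": "alice", "amount": 15},
]


def map_order(doc):
    # Emit one (key, value) pair per document.
    yield doc["customer"], doc["amount"]


def reduce_total(key, values):
    # Collapse all values emitted for one key into a single result.
    return sum(values)


totals = map_reduce(orders, map_order, reduce_total)

# Output style 1: individual fields, one row per key.
rows = [{"customer": k, "total": v} for k, v in sorted(totals.items())]

# Output style 2: a single JSON document.
as_json = json.dumps(totals, sort_keys=True)
```

The first style suits downstream steps that expect ordinary fields, while the second keeps the whole result together for a later JSON-aware step.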

MongoDB Lookup

The MongoDB Lookup step acts much like the existing lookup steps in Pentaho Data Integration, with one small difference: you can look up data directly from collections in your MongoDB database! At the moment there is no easy way to do this in PDI unless you can write Java code, and it is even less feasible in MongoDB itself.

Defining the key for the lookup and fields to get from the collection.
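
As a rough illustration of the idea behind a lookup, the plain Python sketch below enriches a stream of rows with fields taken from a collection, matched on a key. It only mimics the behaviour in memory and is not how the step is implemented; the `lookup` helper, its parameters and the sample data are all hypothetical.

```python
def lookup(rows, collection, row_key, collection_key, fields, default=None):
    """Add the requested fields from the matching collection document to each row."""
    # Index the collection once so each row is matched with a dictionary lookup.
    index = {doc[collection_key]: doc for doc in collection if collection_key in doc}
    for row in rows:
        match = index.get(row.get(row_key), {})
        enriched = dict(row)  # copy so the incoming row is left untouched
        for field in fields:
            enriched[field] = match.get(field, default)
        yield enriched


# Hypothetical documents standing in for a MongoDB collection.
customers = [
    {"_id": 1, "name": "Alice", "country": "UK"},
    {"_id": 2, "name": "Bob", "country": "DE"},
]

# Hypothetical rows arriving from an earlier step in the transformation.
incoming = [
    {"order": "A-100", "customer_id": 2},
    {"order": "A-101", "customer_id": 9},  # no matching customer
]

result = list(lookup(incoming, customers, "customer_id", "_id", ["name", "country"]))
```

Rows without a match keep their original fields and receive the default value for each looked-up field, which is how most lookup-style steps behave.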

This is just a small contribution to our long list of Big Data steps planned for release in the near future.

About the author

Harris Ward is the Managing Director of Ivy Information Systems. He has spent 10 years working in Open Source Business Intelligence, developing solutions and providing training for a number of clients.