20081111

Today I'm going to demonstrate some of the functionality of MetaModel 1.1, which provides a datastore-transparent querying API for Java. What I'll show you isn't actually all that new: we've been able to do most of the querying stuff for CSV files (and other datastores) roughly since MetaModel 1.0. But it seems to me that too few people realize what a powerful tool we have in MetaModel, so I'm just going to repeat myself a bit here ;) I'll also demonstrate one of the cool new features of MetaModel 1.1: column type detection, narrowing and value transformation.

For this example I'll use some real data: I've extracted this CSV file from the eobjects.org trac system. It contains a list of all the tickets (issues, bugs, tasks, whatever) that are active at the time of writing. If you take a look at the file, you'll notice that it's not exactly a simple CSV file: a lot of text spans multiple lines and there are quite a lot of different data types.

Ok, so let's get started. I've saved the data to a file: tickets.csv. Now I want to read the file and let MetaModel generate a metadata model based on it. I will also let MetaModel try to detect the column types (since CSV natively contains only text types), which will automatically transform all values to the narrowest fitting type that MetaModel can find (this is indicated by the 'true' parameter in the code below). Here's how we get hold of the datastore model for the file:
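To make the idea of column type narrowing concrete, independent of MetaModel's actual API, here is a minimal plain-Java sketch. It is not MetaModel's implementation: the class ColumnTypeSketch, the DetectedType enum and the rule "pick the first type that every non-missing value fits" are all assumptions made for illustration.

```java
import java.util.List;

// Hypothetical helper, NOT part of MetaModel: picks the narrowest type
// that every non-missing value of a text column can be converted to.
final class ColumnTypeSketch {

    enum DetectedType { INTEGER, DOUBLE, BOOLEAN, VARCHAR }

    static DetectedType detect(List<String> values) {
        boolean allInt = true;
        boolean allDouble = true;
        boolean allBool = true;
        boolean sawValue = false;

        for (String raw : values) {
            if (raw == null || raw.trim().isEmpty()) {
                continue; // missing values do not influence the detected type
            }
            sawValue = true;
            String v = raw.trim();
            allInt &= isInteger(v);
            allDouble &= isDouble(v);
            allBool &= v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false");
        }

        if (!sawValue) {
            return DetectedType.VARCHAR; // nothing to go on, so stay with text
        }
        if (allInt) {
            return DetectedType.INTEGER;
        }
        if (allDouble) {
            return DetectedType.DOUBLE;
        }
        if (allBool) {
            return DetectedType.BOOLEAN;
        }
        return DetectedType.VARCHAR;
    }

    private static boolean isInteger(String v) {
        try {
            Integer.parseInt(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    private static boolean isDouble(String v) {
        try {
            // Note: Double.parseDouble also accepts forms such as "1d" or "NaN".
            Double.parseDouble(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

With a rule like this, a column containing only values such as "1", "42" and "7" would be narrowed to INTEGER, while a single value like "defect" keeps the whole column as text.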

Once we have a DataContext object we are ready to go for our datastore-transparent way of querying. First we get hold of the schema of our CSV file and locate the table of interest. Since CSV is a single-table datastore type, getting the table of interest can be done in two ways:

Now we can go ahead and investigate the structure of the CSV file. Since we turned on automatic column type narrowing, you will see that the 'ticket' column has been converted from a text-based column to an INTEGER type. Also, since MetaModel can verify that 'ticket' values are never missing, it asserts that the column is not nullable:
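The nullability check can be pictured with a small plain-Java sketch. This is an assumption about the general idea, not MetaModel's code: the class NullabilitySketch is hypothetical, and it treats both null and blank strings as missing.

```java
import java.util.List;

// Hypothetical helper, NOT part of MetaModel: infers nullability from observed values.
final class NullabilitySketch {

    // A column is treated as nullable as soon as one value is missing (null or blank).
    static boolean isNullable(List<String> values) {
        for (String v : values) {
            if (v == null || v.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }
}
```

Under this rule a column like 'ticket', where every row carries a value, comes out as not nullable.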

// SELECT _reporter AS rep_name, COUNT(*) AS num_tickets FROM tickets GROUP BY rep_name
Column reporterColumn = table.getColumnByName("_reporter");
q = new Query().select(reporterColumn, "rep_name").selectCount().from(table).groupBy(reporterColumn);

Executing the queries is very simple: just ask your DataContext object to execute the query object. MetaModel will then analyze the query and process the result behind the scenes:
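For intuition about what the GROUP BY query on the reporter column computes, here is the same aggregation written with plain Java streams. This is only an illustration of the result, not how MetaModel processes queries internally; the class GroupByCountSketch and the representation of rows as String arrays are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, NOT part of MetaModel: counts rows per reporter.
final class GroupByCountSketch {

    // Rough equivalent of: SELECT reporter, COUNT(*) FROM tickets GROUP BY reporter
    // Each row is a String[]; reporterIndex is the position of the reporter column.
    static Map<String, Long> countByReporter(List<String[]> rows, int reporterIndex) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[reporterIndex], // grouping key: the reporter name
                        TreeMap::new,              // sorted keys, for stable output
                        Collectors.counting()));   // COUNT(*) per group
    }
}
```

The point of a querying API like MetaModel's is that you describe this aggregation once as a query object instead of hand-writing loops like this for each datastore type.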

Thank you, Kasper, for your update, but when I use CsvDataContext in version 4.6 to execute a query, it doesn't offer the same feature. Maybe I am missing something here. Do you have a code snippet that illustrates this in detail?

Hmm I don't think the CSV adaptor in MetaModel ever did auto-conversion to numbers. Not sure why you have that impression?

The SUM function is capable of parsing the strings as numbers, but I believe it will throw if a non-number occurs in the data set. A CSV gives no guarantees like that so our choice so far was to not try and assume anything.
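The trade-off described here can be sketched in plain Java. This is not MetaModel's SUM implementation, just an assumed illustration of "parse strings as numbers and fail on the first non-number"; the class StringSumSketch is hypothetical, and skipping null values is an assumption borrowed from how SQL SUM usually behaves.

```java
import java.util.List;

// Hypothetical helper, NOT part of MetaModel: sums text values as numbers.
final class StringSumSketch {

    // Parses every non-null value as a double and adds it up.
    // Throws NumberFormatException on the first value that is not a number.
    static double sumStrict(List<String> values) {
        double sum = 0;
        for (String v : values) {
            if (v == null) {
                continue; // assumption: nulls are ignored, as in SQL SUM
            }
            sum += Double.parseDouble(v.trim());
        }
        return sum;
    }
}
```

A single stray value such as "n/a" anywhere in the column makes the whole aggregation fail, which is why a CSV adaptor may prefer not to assume numeric content by default.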

You may however decorate your CsvDataContext explicitly using the conversions available in org.apache.metamodel.convert.Converts.

Do we support any logical/mathematical operator other than "=" in joins? We have the query below, which includes the "<>" operator, and it fails in the parser: the private FromItem parseJoinItem() function in FromItemParser.java fails when looking up the indexOf "=". So it seems the existing implementation doesn't support any operator other than "=".