Google has announced a major update to BigQuery, adding real-time support to its service for analyzing large amounts of data.

BigQuery has already been updated this year to support large-scale JOINs and GROUP BYs and unlimited result sizes. The new improvements add the option of querying subsets of the latest data, more functions, and browser tool improvements, alongside the ability to run queries in real time. The browser tool now has buttons for common tasks in the query history panel and shows more information about queries.

The real-time support comes through a single API call, tabledata().insertAll(), that lets you store data as it arrives and query it immediately. Writing about the improvements on the Google Developers Blog, Felipe Hoffa, Developer Programs Engineer at Google, said the feature is ideal for time-sensitive use cases such as log analysis and alert generation, and that you use it simply by calling the new endpoint with your data in a JSON object containing a single row or multiple rows.
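As a rough illustration of what "your data in a JSON object" can look like, the sketch below builds a request body for tabledata().insertAll() in Python. The record fields, project, dataset and table names are hypothetical, and the commented-out call assumes the google-api-python-client library with an already authorized `service` object; treat it as a sketch rather than a reference implementation.

```python
import json
import uuid


def build_insert_all_body(records):
    """Wrap plain dicts in the row envelope expected by tabledata().insertAll()."""
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [
            # insertId is an optional per-row ID that helps the service
            # de-duplicate rows when a request is retried.
            {"insertId": str(uuid.uuid4()), "json": record}
            for record in records
        ],
    }


# Hypothetical log records; the field names are assumptions for illustration.
records = [
    {"host": "web-1", "status": 200, "latency_ms": 12},
    {"host": "web-2", "status": 500, "latency_ms": 340},
]

body = build_insert_all_body(records)
print(json.dumps(body, indent=2))

# Assumed call shape with google-api-python-client (not executed here):
# service.tabledata().insertAll(
#     projectId="my-project", datasetId="logs", tableId="events", body=body
# ).execute()
```

The same body shape covers both the single-row and the multi-row case, since a single row is just a list with one entry.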

Streaming data into BigQuery is free until January 1st, 2014, after which it will be billed at a flat rate of 1 cent per 10,000 rows inserted. The traditional jobs().insert() method will continue to be free.
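To get a feel for what the flat rate means in practice, here is a small back-of-the-envelope calculation in Python. It assumes the charge scales linearly with the number of rows; how partial blocks of 10,000 rows are actually rounded is not stated, so the figures are only an estimate.

```python
from decimal import Decimal

PRICE_PER_BLOCK_USD = Decimal("0.01")  # 1 cent, the announced rate
ROWS_PER_BLOCK = 10000                 # rows covered by that 1 cent


def streaming_cost_usd(rows_inserted):
    """Estimated streaming charge, assuming simple linear proration."""
    return Decimal(rows_inserted) * PRICE_PER_BLOCK_USD / ROWS_PER_BLOCK


# Hypothetical workload: one million rows a day for thirty days.
monthly_cost = streaming_cost_usd(1000000 * 30)
print("Estimated monthly cost: $%s" % monthly_cost)
```

On these assumptions, a million rows costs about one dollar to stream.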

The second improvement is the ability to define queries that scan only a range of, or a point in, the previous 24 hours. Until now BigQuery has always done a "full column scan" when querying data; the new syntax lets you focus on a specific subset of the latest data, which lowers the cost of queries. You can query only the last hour of inserted data, or only what was inserted before that hour, or get a snapshot of the table at a specific time.
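The sketch below shows how such time-limited queries might be put together in Python. It assumes the "table decorator" notation from BigQuery's documentation, in which a table name is followed by @ and a time in milliseconds, with negative values meaning "relative to now"; the helper names and the table name are hypothetical, so check the exact syntax against the official reference before relying on it.

```python
HOUR_MS = 60 * 60 * 1000  # one hour in milliseconds


def since(table, millis_ago):
    """Range decorator: only data added in the last `millis_ago` ms."""
    return "[{0}@-{1}-]".format(table, millis_ago)


def snapshot(table, millis_ago):
    """Snapshot decorator: the table as it was `millis_ago` ms ago."""
    return "[{0}@-{1}]".format(table, millis_ago)


table = "mydataset.mytable"  # hypothetical table name

last_hour_query = "SELECT COUNT(*) FROM " + since(table, HOUR_MS)
hour_old_snapshot_query = "SELECT COUNT(*) FROM " + snapshot(table, HOUR_MS)

print(last_hour_query)
print(hour_old_snapshot_query)
```

Because the decorator is part of the table reference, the rest of the query stays unchanged, which makes it easy to switch an existing query between the full table and a recent slice.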

Google has also added new window functions, namely SUM(), COUNT(), AVG(), MIN(), MAX(), FIRST_VALUE() and LAST_VALUE(), along with the statistical functions COVAR_POP(), COVAR_SAMP(), STDDEV_POP(), STDDEV_SAMP(), VAR_POP() and VAR_SAMP(). Window functions let you carry out calculations on a specific partition, or "window", of a result set.
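To make the idea of a window concrete, the following pure-Python sketch mimics what a windowed SUM() does: it keeps every row and attaches a running total computed separately within each partition. This is only an illustration of the concept, not BigQuery code, and the row fields and function name are made up for the example.

```python
from collections import defaultdict


def running_sum_by_partition(rows, key, order, value):
    """Annotate each row with a running total within its partition."""
    totals = defaultdict(int)
    out = []
    # Sort by partition key, then by the ordering column inside each partition.
    for row in sorted(rows, key=lambda r: (r[key], r[order])):
        totals[row[key]] += row[value]
        annotated = dict(row)
        annotated["running_total"] = totals[row[key]]
        out.append(annotated)
    return out


# Hypothetical log rows: bytes served per host over time.
rows = [
    {"host": "a", "ts": 1, "bytes": 10},
    {"host": "b", "ts": 1, "bytes": 5},
    {"host": "a", "ts": 2, "bytes": 20},
]

result = running_sum_by_partition(rows, "host", "ts", "bytes")
for row in result:
    print(row)
```

Unlike a GROUP BY, which collapses each group into a single row, the windowed version returns one output row per input row, with the aggregate alongside it.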