Lean & Maintainable Reports with Heroku Dataclips

Many organizations are hungry for data to help the team understand how customers
are using their services. On a recent project the business owners needed a
report but intended to hold off building an admin-y report-packed hub until core
customer-facing functionality was complete.

All reports take effort: Writing the database query, maintaining it as data
structures change, and allowing business users to customize the report all cost
time. Layer on supporting different output formats & security and one can see
the gulf between cobbling together a casual database query and building a
reporting tool with ongoing value.

The ideal solution for the client also needed to be self-service — let’s face
it — no one wants to be the guardian of a jumble of SQL scripts and field
ad-hoc requests to run them.

Although the Dataclips interface features a handy built-in editor, one doesn’t
have to immediately jump to writing SQL. One point to bear in mind is keeping
future maintenance costs small. If a database column is renamed during a
migration this will break your report and a CSV-starved colleague will seek you
out.

One approach to protect ourselves from brittle reports is to generate the SQL
query with an ORM and test it at the same time. This has a double benefit.
First, testing helps us verify a complex query returns the correct data. Second,
a breaking migration will cause the test to fail, in turn prompting our diligent
selves to update the query and the dataclip.

# app/reports/reports/bookings.rbmoduleReportsclassBookingsdefto_sqlquery.to_sqlendprivatedefqueryBooking.joins(:user).select("bookings.id").select("users.name AS client_name").select("users.email AS client_email").select("bookings.price AS price_paid")endendend

The public interface of Reports::Bookings deliberately exposes only #to_sql,
rather than the ActiveRecord Relation object from #query. That’s because the
raw SQL is all that’s required to satisfy the Rake task.

There are benefits to the spec using the raw SQL too. It helps us verify the
query can be executed standalone. Also the expectations are then written against
an array of hashes, which are closer to the rows of strings in a CSV output,
without the hassle of executing the Rake task and reading a CSV file in the
tests.

What do you think would make Dataclips even more useful? Top of my wishlist
would be updating dataclip queries via the Heroku CLI.

Picture your deployment script: Pushing changes to Heroku, running migrations,
and updating dataclips could all be tidily bundled together. The upside is that
reports and the database schema should remain in sync.

Preferably the CLI would allow upserts — providing one combined command to
create dataclips for new queries, and update dataclips for modified queries.

A minor quandary would be associating SQL files with dataclips. Perhaps the
dataclip ID could be derived from the account name + SQL filename, or a YAML
manifest could map the two sides. Hopefully this is a stumbling block worth
navigating!

Want to level up your testing game?
Learn about testing Rails applications and TDD
in our new book
Testing Rails.
The book covers each type of test in depth,
intermediate testing concepts,
and anti-patterns that trip up even intermediate developers.