Setting up a Data Warehouse with AWS Redshift and Ruby

Most startups eventually need a robust solution for storing large amounts of data for analytics. Perhaps you’re running a video app trying to understand user drop-off or you’re studying user behavior on your website like we do at Credible.

You might start with a few tables in your primary database. Soon you may create a separate web app with a nightly cron job to sync data. Before you know it, you have more data than you can handle, jobs are taking way too long, and you’re being asked to integrate data from more sources. This is where a data warehouse comes in handy. It allows your team to store and query terabytes or even petabytes of data from many sources without writing a bunch of custom code.

In the past, only big companies like Amazon had data warehouses because they were expensive, hard to setup, and time-consuming to maintain. With AWS Redshift and Ruby, we’ll show you how to setup your own simple, inexpensive, and scalable data warehouse. We’ll provide sample code that will show you to how to extract, transform, and load (ETL) data into Redshift as well as how to access the data from a Rails app.

I created a tutorial for Redshift and Ruby here and wrote about it on our new engineering blog.