Presented for your enjoyment, here are our total SQL ConstantCare® hosting costs (including both production and development environments) from AWS Cost Explorer for a rolling 30-day period as of when I’m writing this post:

Relational Database Service: $1,418.59 – even using relatively small AWS Aurora PostgreSQL clusters, the database layer is still the most expensive. Production is a pair of db.r4.xlarges (4 cores, 30GB RAM) and dev is a pair of db.r4.larges (2 cores, 15GB RAM).

DynamoDB: $6.80 – key/value database where we store the list of incoming files and what we’ve done with them. Databases don’t get a whole lot cheaper than that.
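For flavor, the kind of file-tracking record that workload likely boils down to is tiny – a key plus a processing status – which is part of why the bill is tiny too. This sketch is hypothetical (the attribute names and table name are made up, not taken from the post):

```python
# Hypothetical sketch of a DynamoDB file-tracking item: one small record
# per incoming file, recording what we've done with it so far.
def file_tracking_item(s3_key: str, status: str = "received") -> dict:
    """Build the key/value record for one incoming file."""
    return {
        "file_key": s3_key,  # partition key: which incoming file
        "status": status,    # e.g. "received", "processed", "emailed"
    }

# Writing it to DynamoDB would be a single call, roughly:
#   boto3.resource("dynamodb").Table("incoming-files").put_item(Item=item)
```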

I joke with clients that managers are magnetically drawn to the largest number to ask, “How can we make that number smaller?” In this case, roughly 90% of our costs come from the relational databases, so that’s the obvious focus for cost reductions.

Side note for DBAs: if you want your new server to have monitoring software, just include it as a line item in the purchase, alongside the SQL Server licensing costs. The manager will focus on that $50,000 licensing cost and start asking questions there instead of asking about your $2,000 Idera SQL DM, Quest Spotlight, or SentryOne license. (The same trick works with our own AWS bill for this rolling 30-day period: I rarely look at the SQL ConstantCare hosting costs, because the top line item was the $4,454.93 cost of running VMs for last month’s Mastering Index Tuning class, hahaha. But back to the focus of this post: SQL ConstantCare’s costs.)

Cutting RDS costs with Reserved Instances

By default, RDS instances are paid per-hour with no up-front costs and no commitment. That made sense when we first started the project because we had no idea what our performance needs would look like (and indeed, we changed instance sizes a couple of times along the way).

As your project matures and you can start to predict hosting horsepower requirements, you can switch to Reserved Instances: making a commitment as to how much you’ll use, and locking in a regular (discounted) price for that resource. Our production servers are a pair of db.r4.xlarges – each with 4 cores and 30GB RAM, priced at a whopping $0.58 per hour. We can commit to reserving them and get a discount:

Reserved Instance pricing for db.r4.xlarges

I think a 3-year term doesn’t make sense here because of what’s coming with Aurora Serverless – more on that in a second. A 1-year term is a no-brainer, though: we can cut our costs by 34% right out of the gate just by committing, or get even bigger savings if we prepay the year’s bill outright.

Keep in mind that the costs shown in that screenshot are for just one instance. If I prepaid a 1-year term for our production and development clusters, we’d be looking at a credit card charge of around $8,310 – doable, just something you want to think about when you’re a small business. (Without getting too finance-y about the time value of money, the zero-upfront method will cost me about $1,676 extra over the span of a year.)
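To make the discount concrete, here’s the back-of-the-envelope math as a quick sketch. The $0.58/hour on-demand rate and the 34% no-upfront discount are the figures quoted above; everything else is just multiplication, and it’s per instance – double it for the pair:

```python
# Reserved Instance savings math for one db.r4.xlarge, using the rates
# quoted in the post: $0.58/hr on-demand, 34% off for 1-year no-upfront.
HOURS_PER_YEAR = 24 * 365  # 8,760

on_demand_hourly = 0.58        # db.r4.xlarge on-demand rate
no_upfront_discount = 0.34     # 1-year, no-upfront Reserved Instance

on_demand_annual = on_demand_hourly * HOURS_PER_YEAR
reserved_annual = on_demand_annual * (1 - no_upfront_discount)
annual_savings = on_demand_annual - reserved_annual

print(f"On-demand:          ${on_demand_annual:,.2f}/yr")
print(f"Reserved (no up):   ${reserved_annual:,.2f}/yr")
print(f"Savings, 1 instance: ${annual_savings:,.2f}/yr")
```

That works out to roughly $1,700 saved per instance per year – times two instances, that’s the “thousands of bucks” for the production cluster.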

For the production cluster, a 1-year commitment with no upfront costs was a no-brainer given our current project status – saving us thousands of bucks over the course of a year. I hit Purchase on that while writing this post.

Other ways to cut database costs

RDS metrics from production

Here are the less-common ways of doing it, arranged in rough order from easiest to most complex:

Switch to AWS Aurora Serverless – where the database doesn’t even run unless you’re querying it. Unfortunately, that’s only available in MySQL flavor right now, but the instant it’s available for PostgreSQL, we’ll probably switch the development environment over to this for starters and whack a couple hundred bucks out of our bill right away. (That’s why I didn’t reserve instances for the dev cluster.)

Downgrade production to smaller RDS VMs – right now we’re running 4 cores, and based on our CPU workloads, I’m pretty confident we could drop even further to 2 cores and still churn out customer emails rapidly. However, when I do ad-hoc queries for analysis, I still want fairly quick response times, and my ad-hoc queries are terribad. I’ve become that manager I’ve always complained about. (Richie makes sure I only have read access to production though.)

Host PostgreSQL ourselves on even smaller VMs – given that our workload is really bursty, we might even be able to get by on $70/mo t2.larges (2 cores, 8GB). Hell, we could prepay for a year and get it down to $40/mo. However, then we’d have to start managing our own PostgreSQL, and there’s a labor cost to that.

Store data in even cheaper databases – I tell you what, when you look at that $7 bill for DynamoDB and realize that it includes both production and development, you start wanting to put a whole lot more data in there. For example, as we start analyzing query plans, it’s obvious that the XML plans don’t belong in a (relatively) expensive database: we knew that already from PasteThePlan, where we store metadata in DynamoDB and the plans in S3. Furthermore, the business logic of analyzing them doesn’t belong in the data layer either.
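As a sketch of that PasteThePlan-style split – small metadata item in DynamoDB, bulky XML plan in S3 – here’s roughly what it looks like in code. All the names (bucket, table, attributes) are made up for illustration, and boto3 is imported lazily so the record-building logic runs without AWS credentials:

```python
# Hypothetical sketch of the metadata-in-DynamoDB, plan-in-S3 pattern.
import hashlib
from datetime import datetime, timezone

def make_plan_record(plan_xml: str) -> dict:
    """Build the small metadata item; the plan itself lives in S3 by key."""
    plan_id = hashlib.sha256(plan_xml.encode()).hexdigest()[:16]
    return {
        "plan_id": plan_id,
        "s3_key": f"plans/{plan_id}.sqlplan",
        "size_bytes": len(plan_xml),
        "received_at": datetime.now(timezone.utc).isoformat(),
    }

def store_plan(plan_xml: str,
               bucket: str = "example-plans",
               table: str = "example-plan-metadata") -> dict:
    """Cheap storage for the big blob, key/value lookup for the rest."""
    import boto3  # only needed when actually writing to AWS
    record = make_plan_record(plan_xml)
    boto3.client("s3").put_object(
        Bucket=bucket, Key=record["s3_key"], Body=plan_xml.encode())
    boto3.resource("dynamodb").Table(table).put_item(Item=record)
    return record
```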

I find it hilarious that the relational database portion of our bill is the most expensive, while simultaneously being the least flexible. It’s just like every other project I’ve ever worked on: relational databases are expensive, and that’s why we’re all here, dear readers. Thanks to the cloud, relational databases are getting less expensive – but for production-scale deployments, they’re still not getting truly cheap in the way serverless code is.

Forgive me asking a question out of ignorance, but I haven’t played a lot in the cloud space and know only a bit about Azure – I just wanted to posit a scenario and get some expert feedback, even if it’s not particularly relevant once you’ve committed to a contract 😉

Is it possible to scale the compute side of the databases up and down, so you’re paying for lower-tier compute while you’re not working on it, scaling up when you need it, and scaling back down once you’re done? I know this sounds like what Aurora will eventually do automatically – but with the product as it currently is?

The other thing I’d have thought about is using one of the analytics packages to perform compute functions on a service outside the database – although, given there’s a cost in time and probably traffic in getting data in and out, I don’t know how feasible that would be.

“Is it possible to scale up the compute side of the databases?” Yes, but not rapidly – for example, in our business, we have large changes a couple of times per day, and in today’s PaaS databases, you don’t really wanna shift up/down that often.
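For scale, here’s a rough sketch of what twice-a-day resizing could save if it were practical. The xlarge rate is the $0.58/hour quoted earlier in the post; the smaller-class rate and the 8-hour busy window are assumptions (RDS instance classes roughly halve in price per size step down):

```python
# Back-of-the-envelope cost of scaling down during idle hours.
xlarge_hourly = 0.58   # db.r4.xlarge rate from the post
large_hourly = 0.29    # assumed half-rate for one size class down

busy_hours_per_day = 8                  # assumed working window
idle_hours_per_day = 24 - busy_hours_per_day

always_big = xlarge_hourly * 24
scaled = (xlarge_hourly * busy_hours_per_day
          + large_hourly * idle_hours_per_day)

print(f"Always xlarge: ${always_big:.2f}/day")
print(f"Scaled:        ${scaled:.2f}/day (saves ${always_big - scaled:.2f})")
```

The catch is that each RDS instance-class change involves a restart (or a failover on a Multi-AZ cluster), which is a big part of why bouncing between sizes a couple of times a day isn’t attractive today.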

About using one of the analytics packages – yep, absolutely, that’s what the last section is about (storing data in cheaper forms of databases).

It’s really fun to dig into this stuff – you’re asking great questions, which means you’d have a lot of fun watching videos from Amazon’s re:Invent, Google’s Cloud Next conference, or Microsoft Build. All those conferences make their videos freely available online, too.