Using Kinesis and Lambda: A Review of the Cost of Big Data With Amazon Web Services

Written by John Oxford

At SOLTECH, we’re always thinking of ways to be more efficient with our clients’ time and money. A recent topic of discussion has been Kinesis and Lambda: is it better to use both than to keep several servers running 24/7? We understand wanting to know what is best for your project, so in this article, let’s review the costs and benefits of big data with Amazon Web Services (AWS), specifically using Kinesis and Lambda.

Quickly becoming one of the most common approaches to processing big data, Amazon Web Services’ Kinesis and Lambda products offer a quick and customizable solution to many companies’ needs. The ability to scale both vertically and horizontally in this environment, either automatically or with a couple of clicks, is something that Big Data developers love.

Using Kinesis and Lambda

Kinesis lets many different applications and sources consume your data, each working independently of the others and in completely different ways if you so choose.

On the processing side of things, Lambda allows you to write serverless bits of logic, in a few different languages, that can consume either these Kinesis streams or other event sources such as Amazon’s SNS for messaging. Chances are you have already heard of someone or some company you know using these services, but you may be concerned about the price tag.

The Benefits of Amazon Web Services

The great thing about Amazon Web Services (AWS) is that you pay for what you use. You are never forced into contracts or agreements where you may not be using every penny you’re spending. This makes Kinesis and Lambda among the more cost-effective tools available.

Estimating Kinesis Costs

Kinesis is no different: you pay for the number of shards you use and for how long you use them. Shards represent throughput units; you calculate the number of shards you need from how much data you expect your Kinesis stream to handle and how many consumers of the stream you need.

If you need to increase or decrease the number of shards, you can now easily do so in the AWS console. One shard supports up to 1MB/sec of input data, 2MB/sec of output data, and 1,000 records per second for writes.
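The per-shard limits above translate into a simple sizing rule: take whichever limit demands the most shards. The following is a minimal sketch, not an official AWS calculator. The function name `shards_needed` is hypothetical, 1MB is assumed to mean 1,000,000 bytes, and each consumer is assumed to read the whole stream once.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


# Per-shard limits quoted in the article.
INPUT_BYTES_PER_SEC = 1_000_000   # 1MB/sec in (assumption: 1MB = 1,000,000 bytes)
OUTPUT_BYTES_PER_SEC = 2_000_000  # 2MB/sec out
RECORDS_PER_SEC = 1_000           # records written per second


def shards_needed(records_per_sec: int, record_bytes: int, consumers: int = 1) -> int:
    """Estimate the shard count as the largest of the three per-shard limits."""
    bytes_in = records_per_sec * record_bytes
    # Assumption: every consumer reads the full stream once.
    bytes_out = bytes_in * consumers
    return max(
        1,
        ceil_div(bytes_in, INPUT_BYTES_PER_SEC),
        ceil_div(bytes_out, OUTPUT_BYTES_PER_SEC),
        ceil_div(records_per_sec, RECORDS_PER_SEC),
    )


# Hypothetical workload: 500 records/sec of 4,000 bytes each, read by 2 consumers.
print(shards_needed(records_per_sec=500, record_bytes=4_000, consumers=2))
```

Note that many small records can hit the records-per-second limit long before they hit the byte limits, which is why packaging data into fewer, larger records can reduce the shard count.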

The costs are $0.015 per shard hour and $0.014 per 1 million PUT payload units. A PUT payload unit is a 25KB chunk of a record, and each record is always rounded up to the nearest 25KB.
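To make the rounding rule concrete, here is a small sketch of how PUT payload units could be counted. The helper names are hypothetical, and 25KB is assumed to mean 25 × 1024 bytes.

```python
PUT_UNIT_BYTES = 25 * 1024          # assumption: 25KB = 25 * 1024 bytes
USD_PER_MILLION_PUT_UNITS = 0.014   # price quoted in the article


def put_payload_units(record_bytes: int) -> int:
    """Number of 25KB units billed for one record, rounded up (minimum 1)."""
    return max(1, -(-record_bytes // PUT_UNIT_BYTES))


def put_cost_usd(records: int, record_bytes: int) -> float:
    """PUT charge for a number of equally sized records."""
    units = records * put_payload_units(record_bytes)
    return units / 1_000_000 * USD_PER_MILLION_PUT_UNITS


# A 250-byte record and a 20,000-byte record are both billed as one unit.
print(put_payload_units(250), put_payload_units(20_000))
```

Because a tiny record costs the same single unit as one that nearly fills 25KB, small records are the most expensive way to send a given volume of data.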

At the lowest possible configuration, that is, 1 shard with fewer than 1 million PUT payload units per month, you would be charged $11.16 per month for the shard (assuming a month with 31 days, or 744 shard hours) plus a negligible amount for the PUT units. In an actual scenario where you may collect data from thousands of devices every few seconds, this cost will quickly increase.

A Real Example of Kinesis Costs

Let’s break this down to better understand the costs. Say you have 200 devices that in aggregate send an average of 10,000 records per second, each record is 250 bytes, and the stream has one consumer (remember the 2MB/sec of output per shard). At 1,000 records per second per shard, you would need 10 shards. Over a 31-day month, this leads to a cost of about $111.60 for the shards (10 shards × 744 hours × $0.015) plus about $374.98 for roughly 26.8 billion PUT payload units, or about $486.58 in total.

If the devices instead aggregate their records before sending, so that the stream receives far fewer, larger records and needs only 3 shards, your new monthly total would be just $40.98, roughly an order of magnitude less. Understandably, aggregating the data in this way may not always be possible. However, it is well worth your time to package up your data as much as possible.
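The Kinesis arithmetic for this example can be sketched in Python. The prices and the 31-day month come from the article; the function name is hypothetical. The aggregated case assumes each of the 200 devices batches 50 records (12,500 bytes) into a single PUT once per second. That batching factor is not stated in the article; it is chosen here because it fits in 3 shards and reproduces the $40.98 figure.

```python
SHARD_HOUR_USD = 0.015              # per shard hour, from the article
USD_PER_MILLION_PUT_UNITS = 0.014   # per 1 million PUT payload units
PUT_UNIT_BYTES = 25 * 1024          # assumption: 25KB = 25 * 1024 bytes


def kinesis_monthly_cost(shards: int, puts_per_sec: int, bytes_per_put: int,
                         days: int = 31) -> float:
    """Estimated monthly Kinesis cost: shard hours plus PUT payload units."""
    hours = days * 24
    shard_cost = shards * hours * SHARD_HOUR_USD
    units_per_put = max(1, -(-bytes_per_put // PUT_UNIT_BYTES))  # round up to 25KB
    put_units = puts_per_sec * units_per_put * hours * 3600
    put_cost = put_units / 1_000_000 * USD_PER_MILLION_PUT_UNITS
    return shard_cost + put_cost


# 10,000 individual 250-byte records per second on 10 shards.
unaggregated = kinesis_monthly_cost(shards=10, puts_per_sec=10_000, bytes_per_put=250)

# Assumed batching: 200 PUTs per second of 50 records each, on 3 shards.
aggregated = kinesis_monthly_cost(shards=3, puts_per_sec=200, bytes_per_put=50 * 250)

print(f"unaggregated: ${unaggregated:.2f}")
print(f"aggregated:   ${aggregated:.2f}")
```

Under these assumptions the two cases come to about $486.58 and $40.98 per month. Most of the saving comes from the PUT payload units, since each batched PUT still fits inside a single 25KB unit.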

Estimating Lambda Costs

The costs associated with Lambda depend on the amount of memory you dedicate to it, the time it takes to run (charged in increments of 100ms), and the number of invocations per month.

Generally, you want your Lambdas to run for no more than a couple of seconds, but there is a cost tradeoff: the more memory you allocate (which also increases CPU power proportionally), the higher the cost per 100ms of run time.

This time-based cost is charged in GB-seconds. The good news is that every AWS account gets a number of free seconds of Lambda execution time per month; how many depends on the memory you configure. Every account also gets 1 million free requests per month.
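As a rough illustration of how GB-seconds relate to memory size, the sketch below converts a free allowance into free seconds. The 400,000 GB-second monthly allowance is an assumption not stated in the article, though it is consistent with the 1,600,000 free seconds the article quotes for a 256MB Lambda. The function names are hypothetical.

```python
FREE_GB_SECONDS = 400_000  # assumption: monthly free compute allowance in GB-seconds


def gb_seconds(memory_mb: int, duration_s: float) -> float:
    """GB-seconds consumed by one invocation."""
    return memory_mb / 1024 * duration_s


def free_seconds(memory_mb: int) -> float:
    """Free execution seconds per month at a given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


for memory_mb in (128, 256, 512):
    print(memory_mb, "MB ->", int(free_seconds(memory_mb)), "free seconds")
```

Doubling the memory halves the free seconds, so a larger Lambda uses up the free allowance faster even before the higher per-100ms price is applied.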

A Real Example of Lambda Costs

Here’s an example. Let’s say your Lambda takes about 2 seconds on average. With our 3 shards from above, this equates to 3 Lambdas running concurrently (one per shard). The request rate is therefore 1.5 requests per second (3 shards / 2 seconds), which comes to 4,017,600 requests in a 31-day month. The price for a 256MB Lambda is $0.000000417 per 100ms, with 1,600,000 free seconds per month. Requests are charged at $0.20 per 1 million.
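The inputs above can be turned into a monthly estimate as follows. This is a sketch under the article's stated assumptions (3 Lambdas running back to back for a 31-day month, free seconds and free requests deducted first). The function name is hypothetical, and the resulting dollar figures are the output of this sketch rather than totals given in the article.

```python
SECONDS_PER_MONTH = 31 * 24 * 3600  # 2,678,400 seconds in a 31-day month


def lambda_monthly_cost(concurrency: int, duration_ms: int, usd_per_100ms: float,
                        free_seconds: int, free_requests: int = 1_000_000,
                        usd_per_million_requests: float = 0.20):
    """Return (requests, compute_cost, request_cost) for one month."""
    # Each concurrent Lambda starts a new invocation as soon as the last one ends.
    requests = concurrency * SECONDS_PER_MONTH * 1000 / duration_ms
    units_per_request = -(-duration_ms // 100)  # billed in 100ms increments, rounded up
    billable_units = max(0, requests * units_per_request - free_seconds * 10)
    compute_cost = billable_units * usd_per_100ms
    request_cost = max(0, requests - free_requests) / 1_000_000 * usd_per_million_requests
    return requests, compute_cost, request_cost


requests, compute_cost, request_cost = lambda_monthly_cost(
    concurrency=3,                 # one Lambda per shard
    duration_ms=2000,              # about 2 seconds per invocation
    usd_per_100ms=0.000000417,     # 256MB price from the article
    free_seconds=1_600_000,        # free seconds at 256MB from the article
)
print(f"requests: {requests:,.0f}")
print(f"compute:  ${compute_cost:.2f}")
print(f"requests: ${request_cost:.2f}")
print(f"total:    ${compute_cost + request_cost:.2f}")
```

Changing `duration_ms`, `usd_per_100ms`, and `free_seconds` together lets you compare memory sizes, since a larger Lambda typically finishes sooner but costs more per 100ms.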

Conclusion

Thus, for half a second of savings in processing time, you spend more than 10x the amount. Such a decrease in time may actually be worth the extra cost, as many projects that solve Big Data problems need to report in as close to real time as possible.

Whatever solution you’re designing, writing, or even maintaining, it’s well worth your time to estimate your costs. Remember, minor changes can lead to a significant difference in the price tag.
