During initialization, the scheduler requires the bucket and folder in which to keep the scheduling details. Remember that each event is a separate file, so the files have to be stored somewhere. The time to schedule is passed as a plain datetime object.

Using S3 as a scheduler

S3 is a powerful tool, and it can be used for more than an elastic persistence layer. You can read more about it on hackernoon.com

In the following post I’m going to demonstrate how to use S3 as a scheduling mechanism to execute various tasks.

Overview

S3 alongside a Lambda function creates a simple event-based flow: attach a Lambda to the S3 PUT event, create a new file, and the Lambda function is called. To create a scheduled event, all you have to do is write the file you want to act upon at the designated time. However, AWS only lets you create recurring events using a cron or rate expression. What happens when you want to schedule a one-time event? You are stuck.
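To make the flow concrete, here is a minimal sketch of a Lambda handler subscribed to S3 object-created events. The function name `handler` and the returned structure are illustrative choices, not part of the library. The sketch relies only on the standard layout of S3 event notifications.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Entry point for a Lambda subscribed to S3 object-created events."""
    created = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications, so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        print(f"New object: s3://{bucket}/{key}")
        created.append({"bucket": bucket, "key": key})
    return created
```

Whatever the task is, it would start from the bucket and key collected here.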

The S3-Scheduler library enables you to do just that: it uses S3 as a scheduling mechanism that lets you schedule a one-time event.

How it works

Each event is a separate file. Behind the scenes, the library uses the recurring mechanism to wake up every minute and scan for the relevant files using S3’s filter capabilities. If a file’s scheduled time has passed, the library moves it to the relevant bucket + key.
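The wake-up step could look roughly like the sketch below. It is not the library's actual code: the function name `move_due_files`, the staging bucket and prefix parameters, and the `date|bucket|key` name layout (taken from the Encoding details section) are assumptions made for illustration. The `s3` argument is expected to behave like a `boto3.client("s3")`, and only its `list_objects_v2`, `copy_object` and `delete_object` calls are used.

```python
from datetime import datetime


def move_due_files(s3, staging_bucket, staging_prefix, now):
    """Move every staged object whose scheduled date has passed to its target."""
    moved = []
    # A single call returns at most 1000 keys; a fuller version would paginate.
    response = s3.list_objects_v2(Bucket=staging_bucket, Prefix=staging_prefix)
    for obj in response.get("Contents", []):
        staged_key = obj["Key"]
        name = staged_key[len(staging_prefix):]
        # Assumed layout: <YYYY-MM-DD>|<target bucket>|<target key>
        when_text, target_bucket, target_key = name.split("|", 2)
        if datetime.strptime(when_text, "%Y-%m-%d") > now:
            continue  # not due yet, leave it in place
        # S3 has no native "move", so copy to the target and delete the original.
        s3.copy_object(
            Bucket=target_bucket,
            Key=target_key,
            CopySource={"Bucket": staging_bucket, "Key": staged_key},
        )
        s3.delete_object(Bucket=staging_bucket, Key=staged_key)
        moved.append(staged_key)
    return moved
```

A Lambda triggered by a `rate(1 minute)` rule could call it as `move_due_files(boto3.client("s3"), "my-staging-bucket", "schedule/", datetime.utcnow())`, where the bucket and prefix names are placeholders. The copy into the target bucket is what fires the PUT-triggered Lambda attached there.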

To function properly, the library has to know three things:

1. The content to save.

2. Where to save it (bucket + key) → will trigger the appropriate Lambda function.

3. When to move it to the appropriate bucket.

Encoding details

The content to save is left unchanged. Points 2 and 3 above are encoded in the key’s name, with | as the separator between the parts. For example, to copy the content on the 5th of August 2018 to a bucket called s3-bucket and a folder named s3_important_files, the scheduler produces the following file: 2018-08-05|s3-bucket|s3_important_files. Keeping the metadata outside the actual content has several benefits:

It speeds up the process: there is no need to read the entire content in order to decide when and where to copy.

It allows the content to be binary, not only text based.

It reduces the cost of fetching the correct files, because S3’s filter capabilities work on the key name.

It makes debugging easier: just view the file name to understand when and where the content will be copied.
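The encoding described above can be sketched in a few lines. The helper names `encode_key` and `decode_key` are hypothetical, and the date-only format follows the example in this post. The library itself may encode more detail, such as the time of day.

```python
from datetime import datetime

SEPARATOR = "|"
DATE_FORMAT = "%Y-%m-%d"


def encode_key(when, target_bucket, target_key):
    """Build the staged file name: <date>|<target bucket>|<target key>."""
    return SEPARATOR.join([when.strftime(DATE_FORMAT), target_bucket, target_key])


def decode_key(name):
    """Recover (when, target bucket, target key) from a staged file name."""
    # maxsplit=2 keeps any "|" inside the target key intact.
    when_text, target_bucket, target_key = name.split(SEPARATOR, 2)
    return datetime.strptime(when_text, DATE_FORMAT), target_bucket, target_key


name = encode_key(datetime(2018, 8, 5), "s3-bucket", "s3_important_files")
print(name)  # 2018-08-05|s3-bucket|s3_important_files
```

Because the date comes first in the name, an S3 listing restricted to a prefix such as `2018-08-05` returns only the files scheduled for that day, without reading any content.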

Fin

Scheduling in the AWS serverless world is a bit tricky: right now AWS only provides cron-like capabilities. This post demonstrated a technique that can be used to create a more robust scheduling capability.