Main menu

Post navigation

Introduction

In today’s post, we’re going to be doing the fun part of putting everything together to get our project into actions. We’ll be uploading our code to S3, creating our Lambda function, and then creating an event subscription that will trigger the function when we’ve uploaded an mp4 file to the bucket being used. As well already are aware, this should then begin processing of the file and create a transcription using Transcribe.

Uploading our Code to S3

When creating a Lambda function that runs Go code, we need to provide a zipped file of the Go code. This can either be carried out via an upload of the file on your development system, or alternatively by uploading the zip file to S3 and providing the location information.

Of the two options, the latter is the most flexible for us since we can update our code and upload a new zip file to this location without needing to change any configuration.

Note that I’m doing these steps on OSX and using Bash. For Windows systems you may need to use slightly different context.

Now, if you login to the AWS console, and take a look at the S3 bucket, main.zip will be there. Click on the file to get its properties and copy the URL under Link into the clipboard. We’re going to be using this in the next step.

Creating the Lambda Function

From the AWS console:

Click Services on the black bar, and then Lambda

Click Create Function

Now, in the Author from scratch section, enter the following values.

Name : transcribe

Runtime: Go 1.x

Role: Create new role from template(s)

Rolename: transcribe_role

Click Create function

The Designer window appears, featuring the transcribe function, and with a role already defined to allow access to Cloudwatch.

Next, we tell Lambda where to get the code.

In the Function code section, select:

Code entry type : Upload a file from Amazon S3

Runtime: Go 1.x

S3 link URL : <paste the link that you copied into the clipboard in the previous steps>

Hander : main

Create an S3 Trigger

Now that the basic function is in place, we need to configure it for our specific needs. its As previously mentioned, we want it to run when a new .mp4 file is created in our S3 bucket.

On the left hand side, click S3

This will add it onto the console, and a Configure triggers dialog will appear.

Change Suffix to .mp4 and select Add

Click Save

Give the Lambda function access to Transcribe

We need to give transcribe_role permissions to access amazon transcribe in addition to S3 and Cloudwatch

Click Services, IAM

Click Roles

Click transcribe_role

Click Attach policies

Find and put a check next to AmazonTranscribeFullAccess

Attach Policy

With this complete, the policy is now attached to the role.

A return to the Lambda function will also now show Amazon Transcribe on the right hand side, indicating it has permission to access this service.

Upload our movie

We’re ready to test the functionality out! Let’s find an mp4 video file and upload it. In my case, I’ve a file, movie.mp4, which is going to be used.

As before, you can choose either to manually upload the file via the AWS console, via the AWS CLI, or using the Upload program we created earlier.

The Results

The function should kick in pretty much as soon as the file has finished copying to S3. Let’s have a look from the console by going to Machine Learning, Amazon Transcribe.

And it’s there. We’ve now got an end-to-end mp4 to transcript file.

Conclusion

In this post, we’ve created our Go package, uploaded it to S3, setup the Lambda function, made an event subscription for when an MP4 file is uploaded to our bucket, configured the role associated with the function to allow it to use Transcribe, and verified its operation.

At this point there are further steps we could think of.

During the time of putting this series of blogs together, AWS added CloudWatch events for Transcribe. We could write another Lambda function, which ran once either a Transcribe completed or failed event occurred. Using this, we could do things like notify us when a job has completed, or even do something like converting the output to .srt format.

Add an endpoint and some code to allow us to query the status of one of the jobs.

We could even look into using a completely different way of getting a file into S3, such as passing a link to an MP4 file on a website and getting the Lambda function to download the file and store directly in S3 prior to creating the job.

Via the above, we could also look at adding in additional event sources, such as via API Gateway.

There’s lots and lots of possibilities, and a forthcoming blog series will cover one or more of these.