How to generate PDF in AWS Lambda

One advantage of serverless architecture is that each function execution gets its own environment, so it scales well at the function level. That makes Lambda a good fit for resource-heavy computing tasks, such as generating PDFs, as long as they finish within the function's time limit.

In our AWS development environment, which runs on a t2.micro instance, the server kept crashing with out-of-memory errors whenever it ran wkhtmltopdf to generate a PDF. As we are moving towards serverless anyway, PDF generation was a good candidate to dress in the new fashion.

There are a few posts online about how to do this, but none of them seems complete. Perhaps because serverless evolves so fast, they are out of date. I figured I should write this up to help people who are as lost as I was.

We use the Serverless Framework to manage the stack. The event source is API Gateway, i.e. the function fires when a request hits the API endpoint. This is the function part of serverless.yml.

And the function code

In the function code, there are a couple of places you need to pay attention to.

isBase64Encoded: true tells API Gateway that the response body is Base64-encoded. Obviously, you need to make sure the body you return really is a Base64 string.
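To make the response shape concrete, here is a minimal sketch of how a handler could wrap the PDF bytes. The helper name buildPdfResponse is my own and hypothetical; only the isBase64Encoded flag and the application/pdf content type come from the setup described in this post.

```javascript
// Hypothetical helper: wraps raw PDF bytes in the response object
// returned to API Gateway from a Lambda proxy integration.
function buildPdfResponse(pdfBuffer) {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/pdf' },
    // Tells API Gateway that the body below is Base64 text, not raw bytes.
    isBase64Encoded: true,
    body: pdfBuffer.toString('base64'),
  };
}
```

The handler would call this with the Buffer produced by wkhtmltopdf and return the result.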

process.env['PATH'] is the key to success. As we all know, to run wkhtmltopdf the runtime environment needs the wkhtmltopdf executable on its PATH. This becomes tricky with an AWS Lambda setup. You need to 1. package the correct binary, 2. set the correct permissions and upload it to the correct location, 3. configure API Gateway.
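The PATH handling can be sketched roughly as follows. This is an illustration under assumptions, not the exact code from this post: the helper names withBinaryOnPath and renderPdf are hypothetical, and so is the bin/ folder inside the deployment package. LAMBDA_TASK_ROOT is the environment variable Lambda sets to the directory where your code is unpacked.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: returns a PATH string with binDir appended.
function withBinaryOnPath(currentPath, binDir) {
  return currentPath ? `${currentPath}:${binDir}` : binDir;
}

// Hypothetical helper: pipes HTML into wkhtmltopdf via stdin ("-")
// and resolves with the PDF bytes it writes to stdout ("-").
function renderPdf(html) {
  return new Promise((resolve, reject) => {
    const child = spawn('wkhtmltopdf', ['-', '-']);
    const chunks = [];
    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.on('error', reject); // e.g. binary not found on PATH
    child.stdin.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(Buffer.concat(chunks));
      else reject(new Error(`wkhtmltopdf exited with code ${code}`));
    });
    child.stdin.end(html);
  });
}

// In the handler module, before the first render (assumes the binary
// was packaged, with execute permission, under bin/ in the bundle):
// process.env['PATH'] = withBinaryOnPath(
//   process.env['PATH'], `${process.env['LAMBDA_TASK_ROOT']}/bin`);
```

If the binary is missing or not executable, the promise rejects, which is usually the first sign that the packaging or permission step went wrong.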

Configure API Gateway

This one also cost me quite a few hours. Because the response is application/pdf, API Gateway must be configured to support that response type.

I had to add application/pdf manually as a binary media type in the AWS API Gateway dashboard. There is a plugin, https://www.npmjs.com/package/serverless-apigw-binary, but it did not work for me for some reason. After adding the type you must save the changes, and you MUST deploy the API changes.

Without this change, the generated PDF will most likely be unreadable.
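I made this change by hand in the dashboard, but the same setting can in principle be scripted. The sketch below only builds the patch operation that API Gateway's UpdateRestApi call expects for a binary media type, where the slash is escaped as ~1. The function name is hypothetical, and the SDK calls in the comments are an untested assumption about how you might wire it up.

```javascript
// Hypothetical helper: builds the patch operation that registers a
// binary media type on a REST API. "/" must be escaped as "~1".
function binaryMediaTypePatch(mediaType) {
  const escaped = mediaType.replace(/\//g, '~1');
  return [{ op: 'add', path: `/binaryMediaTypes/${escaped}` }];
}

// Assumed usage with the AWS SDK for JavaScript (v2), not run here:
//   const apigw = new AWS.APIGateway();
//   await apigw.updateRestApi({
//     restApiId: 'YOUR_API_ID', // placeholder
//     patchOperations: binaryMediaTypePatch('application/pdf'),
//   }).promise();
//   // The change only takes effect after a new deployment:
//   await apigw.createDeployment({
//     restApiId: 'YOUR_API_ID', stageName: 'dev', // placeholders
//   }).promise();
```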

Conclusion

This proves once again that the AWS Lambda learning curve is steep and community support is still not great. But hey, isn't that why we need coders like you and me? Let's make the serverless community stronger!

UPDATE 26/09/2018:

Originally I had */* as the binary type. It worked, but later on I had a problem with API Gateway converting a PUT request body from JSON to a Base64-encoded string. You don't want API Gateway to convert to Base64 no matter what the content type is; you only want it when the type is application/pdf. One very important thing: API Gateway relies on the request header Accept: application/pdf to decide whether to convert or not. So make sure you have the correct header set.
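On the client side, that means every request for the PDF has to carry the Accept header. Here is a minimal sketch that builds request options for Node's https module. The function name and the example URL are hypothetical; only the Accept: application/pdf header comes from the behaviour described above.

```javascript
// Hypothetical helper: builds options for https.request() that ask
// API Gateway for the binary PDF rather than a Base64 string.
function pdfRequestOptions(endpoint) {
  const url = new URL(endpoint);
  return {
    method: 'GET',
    hostname: url.hostname,
    path: url.pathname + url.search,
    headers: { Accept: 'application/pdf' },
  };
}

// Assumed usage (placeholder URL, not run here):
//   const https = require('https');
//   https.request(pdfRequestOptions('https://example.com/dev/pdf'), (res) => {
//     res.pipe(require('fs').createWriteStream('out.pdf'));
//   }).end();
```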