Streaming File Uploads to Storage Server with Node.js

In this tutorial I will show how to stream a file uploaded by a user to a file storage server (such as Amazon S3 or a custom server) without first storing the file in a temporary directory. The advantage is that a large number of parallel uploads of large files will not fill up or stall your filesystem. The file is never held in memory in full either; only the chunk currently in transit is buffered, so memory use stays bounded as well.

We don’t let users upload files directly to the storage server when we need to verify user authentication first, or when we want to modify or scan the files.

At the time of writing this tutorial, the latest stable version of the Express framework is 4.13.3. Since version 4, Express no longer depends on the Connect module. In earlier versions, Express used Connect for middleware and route definitions, and was itself responsible only for rendering views and a few other features.

Express 4 has its own middleware and route definition functionality and builds directly on the built-in HTTP module of Node.js. Earlier, Express was built on Connect, and Connect was built on the HTTP module.

The good news is that you can still use middleware and callback functions written for Connect, because the signature of these functions remains the same in Express.
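The shared pattern is a function that receives `(req, res, next)` and either answers the request or calls `next()`. As a minimal sketch, here is a middleware that checks a token before an upload is accepted. The function name `requireToken` and the `x-upload-token` header are hypothetical, chosen only for illustration.

```javascript
// Hypothetical Connect-style middleware factory.
// The header name "x-upload-token" is an assumption made for this sketch.
function requireToken(expectedToken) {
  return function (req, res, next) {
    if (req.headers["x-upload-token"] === expectedToken) {
      return next(); // authenticated: hand over to the next handler
    }
    res.statusCode = 401; // otherwise answer here and stop the chain
    res.end("Unauthorized");
  };
}
```

Because it relies only on the `(req, res, next)` signature, a function like this could be registered with `app.use(requireToken("..."))` in either Connect or Express 4.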

Parsing “multipart/form-data” Content-Type

Express doesn’t parse the body of POST requests. Earlier versions of Express used the body-parser middleware that was bundled with Connect to parse the body of POST requests. Now body-parser needs to be installed separately, because Express doesn’t use Connect anymore.

The problem is that body-parser doesn’t parse an HTTP POST request body whose type is multipart/form-data. Therefore body-parser is of no use when files are uploaded as multipart/form-data.
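A multipart/form-data body is a sequence of parts separated by a boundary string, and that boundary is announced in the Content-Type header of the request. The following sketch shows how such a request can be recognised and its boundary extracted. The helper name `getMultipartBoundary` is hypothetical, and a real parser handles many more edge cases than this.

```javascript
// Returns the boundary of a multipart/form-data Content-Type header,
// or null when the header describes some other body type.
function getMultipartBoundary(contentType) {
  if (typeof contentType !== "string") return null;
  // The boundary value may be quoted or unquoted.
  const pattern = /^multipart\/form-data\s*;.*?\bboundary=(?:"([^"]+)"|([^;\s]+))/i;
  const match = pattern.exec(contentType);
  return match ? (match[1] || match[2]) : null;
}
```

For a header such as `multipart/form-data; boundary=----abc123` the function returns `----abc123`; for `application/json` it returns `null`.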

Two modules that do parse multipart/form-data are connect-multiparty and multiparty. connect-multiparty stores each uploaded file in a temporary directory, whereas multiparty can also provide a readable stream object for every uploaded file so that its content can be read as it arrives. connect-multiparty is in fact built on top of multiparty. So for our use we need the multiparty module.

Streaming Uploaded File

Let’s start building the server to stream uploaded files to a storage server.
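Here is a minimal sketch of such a server, assuming the express, multiparty and request modules are installed from npm. The `/submit` route, the `thumbnail` field name, the `createApp` wrapper and the port used to start it are hypothetical choices made for illustration, and the sketch has not been verified against specific versions of these modules.

```javascript
// Sketch only: assumes `npm install express multiparty request`.
// The modules are loaded inside createApp, so nothing runs until it is called.
function createApp() {
  const express = require("express");
  const multiparty = require("multiparty");
  const request = require("request");

  const STORAGE_URL = "http://localhost:7070/store";
  const app = express();

  app.post("/submit", function (httpRequest, httpResponse) {
    const upload = new multiparty.Form();

    upload.on("part", function (part) {
      if (!part.filename) {
        part.resume(); // not a file: discard its data so parsing can continue
        return;
      }
      // The file size is unknown, so ask for chunked transfer encoding.
      const outgoing = request.post(
        STORAGE_URL,
        { headers: { "transfer-encoding": "chunked" } },
        function (err, storageResponse, body) {
          if (httpResponse.headersSent) return;
          if (err) return httpResponse.status(502).send("Storage server error");
          httpResponse.status(storageResponse.statusCode).send(body);
        }
      );
      // The second parameter of form.append is the Readable Stream of the file.
      const form = outgoing.form();
      form.append("thumbnail", part, {
        filename: part.filename,
        contentType: part.headers["content-type"],
      });
    });

    upload.on("error", function (err) {
      console.error(err);
      if (!httpResponse.headersSent) httpResponse.status(400).send("Upload failed");
    });

    upload.parse(httpRequest);
  });

  return app;
}
```

To try it, call `createApp().listen(9090)` (the port is arbitrary) and submit a multipart/form-data form with a file field to `/submit`.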

In the above code the part event is triggered when a file is encountered in the body of the request. The part parameter of the callback is a Readable Stream that reads the content of the particular file for which the part event was triggered.

We then construct an HTTP POST request using the request module to send the file to the storage server. The second parameter of the form.append method takes a Readable Stream, whose data is piped into the writable stream of the outgoing request, so it is sent on to the storage server continuously without being saved anywhere. Whenever a chunk of the file arrives from the client, the data event of the readable stream fires. The buffer that holds the chunk becomes eligible for garbage collection once the data event’s callback has finished executing.
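To make the chunk-by-chunk flow concrete, the following sketch uses only Node's built-in stream module. It is a pass-through stream that looks at each chunk as it goes by, which is also the natural place to hook in scanning or modification. The name `createByteCounter` and the `bytesSeen` property are hypothetical.

```javascript
const { Transform } = require("stream");

// Pass-through stream that inspects every chunk on its way to the destination.
function createByteCounter() {
  const counter = new Transform({
    transform(chunk, encoding, callback) {
      counter.bytesSeen += chunk.length; // only the current chunk is examined here
      callback(null, chunk);             // forward the chunk unchanged
    },
  });
  counter.bytesSeen = 0;
  return counter;
}
```

A stream like this can sit between a source and a destination, for example `source.pipe(createByteCounter()).pipe(destination)`, without the whole file ever being collected in one place.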

Since we don’t know the size of the files in advance, we set the transfer-encoding:'chunked' header instead of the content-length header.
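The same idea can be sketched with Node's built-in http module alone: when a request body is written piece by piece and no content-length header is set, Node sends it with chunked transfer encoding. The helper name `streamToServer` is hypothetical, and unlike the request-based approach this sketch sends the raw stream as the body rather than wrapping it in a multipart form.

```javascript
const http = require("http");

// Streams `source` (a Readable) as the body of a POST request to `targetUrl`.
// No content-length is set, so Node falls back to chunked transfer encoding.
function streamToServer(source, targetUrl) {
  return new Promise(function (resolve, reject) {
    const req = http.request(
      targetUrl,
      { method: "POST", headers: { "Content-Type": "application/octet-stream" } },
      function (res) {
        const chunks = [];
        res.on("data", function (chunk) { chunks.push(chunk); });
        res.on("end", function () {
          resolve({ status: res.statusCode, body: Buffer.concat(chunks).toString() });
        });
        res.on("error", reject);
      }
    );
    req.on("error", reject);
    // If reading the source fails, tear down the outgoing request as well.
    source.on("error", function (err) { req.destroy(err); });
    source.pipe(req); // pipe() ends the request when the source ends
  });
}
```

It could be called as `streamToServer(part, "http://localhost:7070/store")`, where `part` is the readable stream of an uploaded file.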

To run this code, make sure a storage server is running at the URL http://localhost:7070/store.
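If no storage server is at hand, a stand-in can be sketched with Node's built-in http module. This hypothetical `createStorageServer` does not store anything; it only counts the bytes it receives on `POST /store` and reports the total, which is enough to confirm that data is arriving.

```javascript
const http = require("http");

// Hypothetical stand-in for a storage server: counts received bytes, stores nothing.
function createStorageServer() {
  return http.createServer(function (req, res) {
    if (req.method !== "POST" || req.url !== "/store") {
      req.resume(); // drain and ignore the body of unexpected requests
      res.statusCode = 404;
      return res.end("Not found");
    }
    let received = 0;
    req.on("data", function (chunk) { received += chunk.length; });
    req.on("end", function () {
      res.setHeader("Content-Type", "application/json");
      res.end(JSON.stringify({ received: received }));
    });
  });
}
```

Starting it with `createStorageServer().listen(7070)` makes it reachable at the URL mentioned above.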

If you are streaming the file to Amazon S3 then you will need to use `x-amz-decoded-content-length` header. More on it here.
