Block-Level File Copying and the Cloud

One of the more consistent remarks we hear from our readers here at Cloudwards.net is how much faster Dropbox syncs files than some of the other best cloud storage services, including Box, Google Drive and OneDrive. There’s a reason for that, and it’s called “block-level file copying,” also sometimes called “differential sync” or “delta sync.”

Sync speeds are a big deal, particularly for those who use cloud storage to improve work productivity. That’s because faster sync speeds mean users can inch closer to near real-time collaboration with their coworkers and produce content more quickly, while not stepping on each other’s toes.

In this article, we’ll discuss the basic mechanics behind block-level file copying, then talk about some of the cloud storage and online backup services that use this technique. We’ll also talk about why some services don’t use it.

Basics of Block-Level File Copying

The concept of block-level file copying is a pretty simple one to understand. When you make a change to a file, rather than copying the entire file from your hard drive to the cloud server again, only the parts of the file that changed (called the delta) get sent. That’s it.

When scanning your files, the cloud storage service evaluates the file in terms of blocks. For example, Dropbox partitions every single file it stores into 4MB blocks. Each block is hashed with SHA-256 and a list of these hashes gets stored in what’s called a “blocklist” for reference.
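To make the blocklist idea concrete, here is a minimal Python sketch of splitting a file's contents into fixed-size blocks and hashing each one with SHA-256. The function name `build_blocklist` and the reading of "4MB" as 4 × 1024 × 1024 bytes are our assumptions for illustration; this is not Dropbox's actual client code.

```python
import hashlib

# Assumption: "4MB" is read here as 4 * 1024 * 1024 bytes.
BLOCK_SIZE = 4 * 1024 * 1024


def build_blocklist(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return each block's SHA-256 hex digest.

    The final block may be shorter than block_size.
    """
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


if __name__ == "__main__":
    # A tiny block size keeps the demo readable; a real client would use 4MB.
    sample = b"a" * 10
    for index, digest in enumerate(build_blocklist(sample, block_size=4)):
        print(index, digest)
```

Ten bytes with a block size of four gives three entries: two full blocks with identical hashes (their contents are the same) and one shorter final block.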

Prior to uploading a file, the Dropbox sync client checks whether or not hashed blocks are already stored in the server containing the blocklist (which is a separate server from where your content is stored). If it finds a block or blocks missing, it returns a call to go ahead and upload just those blocks that are missing.
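The check described above can be sketched in a few lines of Python. Here the server's record of stored blocks is modeled as a plain set of hashes, and `find_missing_blocks` is a hypothetical helper name; the real exchange between the Dropbox client and its blocklist server is not documented in this article.

```python
import hashlib


def find_missing_blocks(
    data: bytes, known_hashes: set[str], block_size: int
) -> list[tuple[int, bytes]]:
    """Return (block index, block bytes) for every block the server lacks.

    known_hashes stands in for the server-side blocklist (an assumption).
    """
    missing = []
    for index, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        if hashlib.sha256(block).hexdigest() not in known_hashes:
            missing.append((index, block))
    return missing


if __name__ == "__main__":
    block_size = 4  # tiny blocks for the demo only
    original = b"AAAABBBBCCCC"
    # Pretend the server already holds every block of the original file.
    server_hashes = {
        hashlib.sha256(original[i:i + block_size]).hexdigest()
        for i in range(0, len(original), block_size)
    }
    edited = b"AAAAXXXXCCCC"  # only the middle block differs
    print(find_missing_blocks(edited, server_hashes, block_size))
```

In this demo only the middle block needs to be uploaded; the first and last blocks are already known to the server by their hashes.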

Of course, if the file isn’t yet stored on the Dropbox servers, it’ll have to upload the entire file. Initial file uploading and downloading is where you might see competing services claim a speed advantage over Dropbox. Such claims are specious at best.

The below table shows some file transfer tests we ran a while back for our Dropbox vs Google Drive vs OneDrive comparison article. These tests were performed using a 250MB compressed folder over an Internet connection with up- and download speeds of 12Mbps and 160Mbps, respectively.

Service | Upload Time | Download Time
1. Dropbox | 4:32 | 0:20
2. Google Drive | 3:07 | 0:21
3. OneDrive | 3:45 | 0:15

Google Drive and OneDrive both outperformed Dropbox when it came to uploading our test object, while download times were more even.

That’s only part of the story, though. With any subsequent changes made to that object, both Google Drive and OneDrive would just copy the entire object all over again, taking over three minutes to do so. Dropbox, however, would only transfer the parts that changed.

To see how much of a difference that really makes, we made a small change to our compressed file by deleting one of the files inside of it. Then, we checked to see how quickly that change was reflected in the Dropbox cloud. The answer was 13 seconds.

The reduction from minutes to seconds is a huge advantage for those who need to work side-by-side on the same file with coworkers in different locations. It also reduces the bandwidth consumed by the sync process.
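A rough back-of-the-envelope calculation shows why the gap is so large. The sketch below computes ideal transfer times for the 250MB test object and for a single 4MB block over the 12Mbps upload link used in our tests. It assumes 1MB is 1,000,000 bytes and ignores protocol overhead, hashing time and server processing, so real numbers will be higher. The helper names are ours.

```python
def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Ideal transfer time in seconds: megabytes * 8 bits per byte / megabits per second."""
    return size_mb * 8 / link_mbps


def as_min_sec(seconds: float) -> str:
    """Format a duration as m:ss, rounded to the nearest second."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    upload_mbps = 12  # upload speed of the test connection
    print("Full 250MB object:", as_min_sec(transfer_seconds(250, upload_mbps)))
    print("Single 4MB block: ", as_min_sec(transfer_seconds(4, upload_mbps)))
```

Under these assumptions, the full object needs about 2:47 at best, which is in line with the three to four and a half minutes we measured, while a single block needs only about three seconds. We don't know how many blocks our edit touched, but a 13-second sync is consistent with only a handful of blocks being sent.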

Cloud Services That Use Block-Level Copying

Given such advantages, you might expect that many other cloud storage services would be following in Dropbox’s footsteps. Surprisingly, though, almost none of the major cloud storage services have.

There are only two traditional cloud storage services we’ve identified that incorporate block-level copying for all file types: Egnyte Connect and Memopal.

Egnyte Connect ranks as one of our favorite cloud storage platforms for SMB users, along with Dropbox Business, and its speedy sync capabilities are one of the many reasons why. We’ve also compared the service to another competitor in our Egnyte vs Box article.

We should also point out that OneDrive, while it doesn’t support block-level copying for all file types, does support it for Microsoft Office Documents. Previously, it had been noted that Microsoft had differential sync on its roadmap for Q2 2017. However, Q2 has come and gone, the feature was never rolled out and it no longer shows up on the roadmap.

Online backup is a different story. All of the best cloud backup services that we cover process and move files block-by-block. The list includes Backblaze, CrashPlan, Carbonite and IDrive. Doing so greatly reduces the bandwidth needed and the amount of time it takes for backups to run.

Of these backup services, only IDrive also includes file-syncing capabilities. Curious whether or not IDrive’s developers had incorporated block-level copying into the backup tool’s sync processes, we reached out and were told they had.

It’s interesting to note that, while not a traditional cloud storage service, IDrive has a more advanced approach to sync than most of the cloud storage field, something you’ll see reflected in our IDrive review. Some free cloud storage services do support sync as well, but their sync isn’t as sophisticated as IDrive’s.

Block-Level Copying and Zero-Knowledge Encryption

While block-level copying is certainly advantageous, some services simply can’t add it. The reason stems from the fact that in order to perform block-level analysis on a file, the service must be able to read it, which it can’t without knowing the file’s encryption key.

In particular, we’re talking about zero-knowledge services. With a zero-knowledge provider, only you, the account holder, hold the encryption key. That doesn’t excuse Box, Google Drive or OneDrive, which aren’t zero-knowledge services, but it does let several of the best zero-knowledge cloud services off the hook. That list includes Sync.com, SpiderOak, MEGA and, to a degree, pCloud, which offers optional zero-knowledge encryption with its Crypto feature.

For the same reason, if you use a client-side encryption service like Boxcryptor to encrypt your files before sending them to Dropbox or Egnyte, neither service will be able to perform block-level copying and will have to resort to copying the whole file.
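One way to see the effect is to compare block hashes before and after encryption. The Python sketch below uses a toy keystream built from SHA-256, purely for illustration: it is not a real cipher and not a description of how Boxcryptor or any service named here works. It assumes the encryption tool re-encrypts the whole file under a fresh random nonce on every save, which is a common design but an assumption on our part. All function names are hypothetical.

```python
import hashlib
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + nonce + counter. NOT a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """XOR the whole file with a keystream under a fresh random nonce (prepended)."""
    nonce = os.urandom(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def block_hashes(data: bytes, block_size: int) -> list[str]:
    """SHA-256 hex digest of each fixed-size block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int) -> int:
    """Count block positions whose hashes differ, plus any added or removed blocks."""
    old_h, new_h = block_hashes(old, block_size), block_hashes(new, block_size)
    differing = sum(1 for a, b in zip(old_h, new_h) if a != b)
    return differing + abs(len(old_h) - len(new_h))


if __name__ == "__main__":
    key = b"demo key, illustration only"
    version1 = b"A" * 4096
    version2 = version1[:2000] + b"B" + version1[2001:]  # one byte edited

    print("Plaintext blocks changed:", changed_blocks(version1, version2, 1024))
    print(
        "Ciphertext blocks changed:",
        changed_blocks(toy_encrypt(version1, key), toy_encrypt(version2, key), 1024),
    )
```

A one-byte edit touches a single plaintext block, but after encryption under a fresh nonce every ciphertext block differs, so a block-level sync client sees an entirely new file and has nothing to skip.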

Final Thoughts: Speed or Security

All the above means that you as a user have one of two choices open to you: either you go with block-level copying or with zero-knowledge encryption. Do you want to keep a cloud storage service from being able to read your files or do you want to be able to sync content as quickly as possible and with minimum bandwidth used?

That’s a decision that’s going to depend on your own priorities, but we would recommend a bit of both. Take advantage of block-level sync for active projects to improve collaboration, while protecting confidential and closed projects with a zero-knowledge solution. That way, you can get the best of both worlds.

To get an overview of what’s available to you, we recommend you take a look at our best cloud storage comparison chart; you can then narrow your search from there. Thanks for reading, and be sure to let us know your own thoughts about this feature and the services that use it (or don’t use it) in the comments below.

2 thoughts on “Block-Level File Copying and the Cloud”

Maybe worth checking: Spideroak should combine zero-knowledge and block-level sync/backup. They do block-level deduplication on the server, so it would make sense to also use it for speeding up the transfer. (When I tried to test this however, I ran once again into stalled uploads…).

Basically, in order to combine zero-knowledge and block-level sync all you need is block-wise encryption with a constant or block-derived key.

does combine block level copying and end-to-end encryption. The downside is that the result is fragile… not too much has to go wrong before the software gets lost and it can be a struggle to figure out what to do next.

It appears that the maintainers never intend to call it anything but “Beta” software. Yet, there’s nothing else like it for making backups that are both efficient and secure. It’s a brilliant design, surely worth using as one (not one’s only) backup strategy.

Joseph Gildred

A technophile with a love for words, Joseph Gildred utilizes his degree in comparative literature and background as an information technology analyst to ponder the future of human ingenuity. Not one to sit still for too long, Joseph joined the team because cloud technology and hopping from place to place go hand in hand. He has roots in Belgrade, Maine.