Secure Storage with Tahoe-LAFS

Brian Warner, one of the original founders of Tahoe-LAFS, describes it with several different terms. "Host-proof security" is one; "Can't be evil" software is another. But it is the second half of the name, "Least Authority Filesystem," that best encapsulates the project's purpose: to store files on cloud servers so that they are private even from the service providers, and, secondarily, to offer a RAID-like resistance to accidents.

Development on Tahoe-LAFS was begun in 2006 by All My Data, a company that offered online backup services. When All My Data closed in 2009, Tahoe-LAFS became a free software project, with its code available under either the second version of the GNU General Public License or The Transitive Grace License, which allows owners of the code twelve months to profit from their work before releasing it.

The original design of Tahoe-LAFS was inspired by Mojo Nation, a peer-to-peer file-sharer from the turn of the millennium that envisioned users paying for its services by offering storage space to others. The idea proved impractical because the variations in availability and transmission speed were too high, but such problems did lead Tahoe-LAFS to allow storage among multiple servers, which lessens the reliance on a single provider.

Such concerns reflect Tahoe-LAFS's emphasis on architectural security -- security based on software design -- rather than on reactive security, which includes techniques such as running anti-virus software. In this case, this concept of architectural security is best reflected in Tahoe-LAFS's focus on the Principle of Least Authority, which is sometimes called the Principle of Least Privilege.

Under either name, Warner defines the principle as the idea "that any component of the system should have as little power of authority as it needs to get its job done. The motivation is that software components get compromised, have bugs, and can be tricked into doing the wrong thing. But if the component is as minimally powered as possible, then the attacker doesn't get enough power to do anything interesting. Or at least you're minimizing the damage they can do." He likens Least Authority to giving a valet a car key that doesn't allow them to open the trunk, or allowing a teenager a credit card with a small, fixed spending limit, adding, "it's about giving a component what it needs to do its job, and not giving it anything else."

Warner admits, "You sound paranoid when you say you shouldn't trust the server. A lot of people hear that and say, 'Just how crazy are you? You're telling us to use this server, but not trust the server. Make up your mind -- do you want us to use the server, or not?' But I think it's useful to think just what you are relying on the server for."

Besides, when you use cloud storage, you are making yourself dependent not only upon the company selling space, but upon everyone who has access to the servers. "It's not just [the company] you're trusting," Warner points out. "There's the technician hired to work on the machine, the janitor that has access to the office at night, and the people who have discovered how to get access."

By contrast, with Tahoe-LAFS operating under the Principle of Least Authority, you only need to trust the provider to give you space. "You're not constrained to only servers you can trust your private data with," Warner says. You could, for example, choose to store files redundantly across Amazon, Dropbox, Google Drive, OpenStack and every other cloud provider. "You're not dependent on any one server for retrieving your data. That gives you more flexibility." In this sense, Tahoe-LAFS is a modern application of a long-standing principle of architectural security.

Putting Least Authority to Use

Using Tahoe-LAFS involves setting up a grid, or a collection of servers with client gateway nodes to access them. Once this initial setup is complete, implementing Least Authority is only slightly more complicated than the theory. Before each file is uploaded to a file storage server, Tahoe-LAFS encrypts it with its own 2000-bit RSA public key. This step is not optional, but compulsory; as Warner puts it, "there is no insecure mode."

When each file is uploaded, users can set how many shares the file is divided into -- the erasure coding, in Tahoe-LAFS's jargon. The default is ten shares, any three of which are sufficient to retrieve the file -- which also means that up to seven servers can fail and you still have access to the file. However, you can adjust the number of shares to as many as 256, the tradeoff being that fewer shares mean less security and convenience, while more shares mean greater reliability and accessibility, but require more storage space.
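The arithmetic behind the share settings can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function name share_tradeoffs and its fields are hypothetical, and the figures ignore any per-share overhead that a real grid would add.

```python
def share_tradeoffs(needed: int, total: int, file_size_bytes: int) -> dict:
    """Rough figures for a `needed`-of-`total` erasure-coding setup.

    Hypothetical helper for illustration; not part of Tahoe-LAFS.
    """
    if not 1 <= needed <= total <= 256:
        raise ValueError("expected 1 <= needed <= total <= 256")
    # Each share holds roughly 1/needed of the file (ceiling division).
    share_size = -(-file_size_bytes // needed)
    return {
        # Shares that can be lost while the file stays retrievable.
        "tolerated_failures": total - needed,
        # How much more space the grid uses than the original file.
        "expansion_factor": total / needed,
        "share_size_bytes": share_size,
        "total_stored_bytes": share_size * total,
    }


# The default described above: ten shares, any three of which suffice.
default = share_tradeoffs(needed=3, total=10, file_size_bytes=3_000_000)
print(default["tolerated_failures"])  # 7
print(default["total_stored_bytes"])  # 10000000
```

With the default settings, a 3 MB file therefore occupies about 10 MB across the grid, which is the price paid for surviving seven lost shares.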

After a file is successfully uploaded, Tahoe-LAFS sends the uploader a "file cap" -- a file that is used for encryption key integrity checking, as well as for locating shares for use. Anyone with the file cap can use it to download the file. Unlike standard filesystems, Tahoe-LAFS contains no concept of users or permissions, which might give the storage provider some sense of how uploaded files are being used. Instead, the file cap is needed for any retrievals. This, Warner says, "isn't always what people want. It isn't what people expect from other filesystems. But it's necessary to have a system without any centralized points of control."
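The idea that possession of the file cap is the only credential can be illustrated with a toy model. Everything here is an assumption made for the sake of the example: the names FileCap, storage_index, upload, and download are hypothetical, the dictionary stands in for a storage server, and the keystream cipher is a deliberately simple placeholder rather than the encryption Tahoe-LAFS actually uses. It should not be used to protect real data.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class FileCap:
    """Hypothetical stand-in for a file cap: whoever holds it can read the file."""
    key: bytes              # secret used to decrypt the file
    ciphertext_hash: bytes  # lets the holder detect tampering by the server


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 over key + counter). Illustration only, not secure.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def storage_index(cap: FileCap) -> str:
    # The server sees only this lookup value, never the key itself.
    return hashlib.sha256(b"index:" + cap.key).hexdigest()[:32]


def upload(plaintext: bytes, server: dict) -> FileCap:
    key = secrets.token_bytes(16)
    ciphertext = _xor(plaintext, key)  # encrypted before it leaves the client
    cap = FileCap(key, hashlib.sha256(ciphertext).digest())
    server[storage_index(cap)] = ciphertext
    return cap


def download(cap: FileCap, server: dict) -> bytes:
    ciphertext = server[storage_index(cap)]
    if hashlib.sha256(ciphertext).digest() != cap.ciphertext_hash:
        raise ValueError("integrity check failed")
    return _xor(ciphertext, cap.key)


server = {}  # stands in for an untrusted storage provider
cap = upload(b"meeting notes", server)
print(download(cap, server))  # b'meeting notes'
```

In this model the server holds no user accounts or permissions at all. It stores opaque bytes under an index, and anyone who is handed the cap can locate, verify, and decrypt them.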

Tahoe-LAFS also includes a repair program that informs users when the shares fall below the number specified. However, stored files are immutable, meaning that they cannot be changed. Should you change a file, you must upload it again.
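The decision a repair tool has to make can be sketched as a simple comparison between the shares still reachable and the configured numbers. The function check_health below is a hypothetical illustration of that logic, not the repair program's actual interface; its defaults assume the three-of-ten configuration.

```python
def check_health(shares_found: int, needed: int = 3, total: int = 10) -> str:
    """Classify a stored file by how many of its shares are still reachable.

    Hypothetical helper for illustration; not part of Tahoe-LAFS.
    """
    if shares_found < 0:
        raise ValueError("shares_found cannot be negative")
    if shares_found >= total:
        return "healthy"        # every share is where it should be
    if shares_found >= needed:
        return "needs repair"   # still retrievable, but redundancy is reduced
    return "unrecoverable"      # too few shares remain to rebuild the file


for found in (10, 6, 2):
    print(found, check_health(found))
```

The middle state is the one that matters in practice: the file can still be retrieved, so the missing shares can be regenerated before further failures push it below the minimum.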

Off-shoots and Future Directions

With security concerns in the news, Tahoe-LAFS is thriving as a project. Today, over two dozen spinoff projects exist, many of them written in Python, like Tahoe-LAFS itself. These spinoffs include extensions and utilities, as well as tools to integrate Tahoe-LAFS into other applications and to port it to other programming languages.

Another spinoff from the project is Least Authority, a company founded by long-time Tahoe-LAFS developer Zooko Wilcox-O'Hearn. The company's products include S4, a cloud backup service, and RAIC ("Redundant Array of Independent Clouds"), which Wilcox-O'Hearn describes as "putting Amazon S3, Rackspace Cloudfiles, Google Storage, and Microsoft Azure as four backends for a Tahoe-LAFS storage service." In addition, the company is improving user interfaces to give what Wilcox-O'Hearn describes as "a Dropbox-like user experience."

Warner himself is working on adding one small bit of user-awareness, so that server providers can monitor how much space each user has filled. "At the moment, access to a Tahoe server is kind of an all-or-nothing proposition. The server has no idea who's putting data on it, or who's fetching data from it. The feature we're working on reintroduces a bit the notion of who is doing an upload and setting up payment arrangement."

Other future plans for Tahoe-LAFS include improving the repair utility and installation and providing a GUI, as well as using multiple encryption, and altering the protocol between client and server to make it easier for languages other than Python to work with it.

Tahoe-LAFS has come a winding way, from home-user to commercial tool, and from backup provider to cloud security. As Warner says, "We needed a backup tool, but we ended up building a filesystem." Yet the times seem to be catching up to the project, and Tahoe-LAFS seems well poised to provide the security and privacy that people are starting to expect in the cloud.