My mission: Find technology for Early Adopters. Follow me on Twitter @danwoodsearly, on LinkedIn at www.linkedin.com/in/danwoodsearly/, and on my blog at http://www.CITOResearch.com. I am a CTO, writer, and consultant. For tech vendors, I help explain their technology; for users, I help find, select, and deploy new solutions with explosive business value. I love to speak and share ideas.

Can Symform's P2P Cloud Storage Platform Really Work?

The explosion of electronically stored data is the fastest-growing segment of IT. Storage media and disk drives keep getting cheaper, yet many backup services charge thousands of dollars a year to back up an $80 disk drive, and Amazon costs close to $250 a month to back up 2 TB, notes Praerit Garg, president of Symform, a new cloud network that uses peer-to-peer sharing to offer low-cost storage and backup services.

This disconnect didn’t sit well with Garg and Bassam Tabbara, who came up against it after leaving Microsoft to build out a new application. On a start-up budget, they didn’t want to build a data center and were hoping to use cloud storage. But even the cloud storage costs were prohibitive, so they abandoned the project. Recognizing that other people were surely having the same issue, they changed their development objective to lowering the cost of storage. Soon, they were joined by Matt Schiltz, the former CEO of DocuSign, who was flummoxed by the storage industry’s high cost and fixation on centralization.

But is the world ready for this type of P2P cloud storage? Based on the initial adoption of Symform by certain segments of eager users, it seems that many people are ready. But for the rest of the world to follow, Symform must clearly establish a sense of its sweet spot and of the risks and responsibilities involved. In this article, I take a look at the key questions that will determine how big the market for P2P storage will be.

Virtual Data Centers at a Discount

In essence, Symform is out to create the world’s largest data center for storage, without actually building one. Using a patented distributed computing model and supporting algorithms, Symform’s network strings together the spare capacity of its members’ disk drives into a unified cloud storage facility. The appeal of such a solution is clear in economic terms — why pay thousands per month for cloud storage on someone else’s dedicated hardware, when you can pay with the spare capacity you already have? It also has green appeal, particularly in light of the negative attention in the press given to the energy consumption of some of the world’s largest data-center operators.

For the first 10 GB, the service is free. For each additional 1 GB, customers can pay $0.15 per month, or they can contribute 2 GB of their own excess local drive capacity to get 1 GB of new cloud capacity in return. Support plans run from $15 to $200 a month for 300 GB to 4 TB, depending on the level of service required.
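The pricing rules above can be captured in a small calculator. This is a sketch of the arithmetic as described in the article, not Symform's actual billing logic; the function name and the assumption that contributed capacity offsets the paid tier linearly are mine.

```python
def monthly_cost_usd(stored_gb: float, contributed_gb: float = 0.0) -> float:
    """Estimate a monthly bill under the pricing described above.

    The first 10 GB are free. Each GB beyond that costs $0.15/month,
    but contributing 2 GB of local capacity earns 1 GB of cloud
    capacity, which offsets the billable portion.
    """
    FREE_GB = 10.0
    RATE_PER_GB = 0.15
    CONTRIBUTION_RATIO = 2.0  # 2 GB contributed -> 1 GB of cloud capacity

    billable_gb = max(stored_gb - FREE_GB, 0.0)
    earned_gb = contributed_gb / CONTRIBUTION_RATIO
    return max(billable_gb - earned_gb, 0.0) * RATE_PER_GB

# 100 GB stored, nothing contributed: 90 billable GB at $0.15
print(monthly_cost_usd(100))        # 13.5
# 100 GB stored, 180 GB contributed: the bill is fully offset
print(monthly_cost_usd(100, 180))   # 0.0
```

Under this model, an SMB with a few hundred spare gigabytes on existing hardware can trade its way to a near-zero bill, which is exactly the sharing/trading customer profile Garg describes.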

Symform is aiming at the “mid market” of small-to-medium-sized businesses (SMBs) that have more data than free services will support, but which are beginning to hit cost thresholds with services like Box, Dropbox, and Amazon S3.

“Once you approach hundreds of gigabytes of data, you’re in the market to buy a lot of disks or you’re looking for a cloud solution,” Garg says. “The groups that have growing data are likely to hit the cost threshold sooner.”

In simple terms, that means SMBs are likely to be the sharing/trading customers, while individual consumers are the paying customers.

Naturally, the first objections to a sharing scheme concern security. Symform claims to be safer than traditional data-center storage. It installs a small agent on each member's network, and the member uses a Web-based interface to select files for backup; those files then travel to the cloud over a connection protected with 256-bit encryption (the federal security standard is 128-bit).

Symform uses what it calls RAID 96 (RAID stands for Redundant Array of Independent Disks): each 64 MB block of a customer's file is encrypted with its own key, shredded into 96 fragments, and geographically scattered across 96 different devices. Ordinary data centers use RAID 5 or RAID 6 storage, which stripes data with parity across a set of disks in the same server; data can be lost if just two (RAID 5) or three (RAID 6) disks fail simultaneously. With RAID 96, a block survives up to 32 unrelated, geographically separate disk failures, because only 64 of the 96 fragments are required to restore it. Symform can identify the "fastest 64," which reassemble the data in usable form on the source device, Schiltz says.
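The 64-of-96 scheme described above is a form of erasure coding, and its key numbers follow directly from the two parameters. A back-of-the-envelope sketch (plain arithmetic, not Symform's implementation):

```python
BLOCK_MB = 64          # each block of customer data
DATA_FRAGMENTS = 64    # fragments needed to reconstruct a block
TOTAL_FRAGMENTS = 96   # fragments actually stored on the network

fragment_mb = BLOCK_MB / DATA_FRAGMENTS                # 1 MB per fragment
stored_mb = TOTAL_FRAGMENTS * fragment_mb              # 96 MB stored per 64 MB block
overhead = stored_mb / BLOCK_MB - 1                    # 0.5 -> 50% storage overhead
tolerated_failures = TOTAL_FRAGMENTS - DATA_FRAGMENTS  # 32 fragments can be lost

print(fragment_mb, stored_mb, overhead, tolerated_failures)
```

Note how the 50 percent overhead figure that Garg cites later falls out of the same two numbers: storing 96 one-megabyte fragments for every 64 MB block is exactly 1.5x the raw data.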

Each user who participates in the byte-contribution program can see a "contribution folder" in their dashboard, which Symform calls "Cloud View." The folder contains their peers' encrypted fragments, so members can see how much they are contributing and how much space Symform is using on their drives, but they cannot read the fragments themselves.

The Symform “low overhead” model seems to work well as a basic concept, but enterprise users will also be naturally concerned about reliability in a network of nodes ultimately owned by many parties. Garg says he believes Symform has found the sweet spot of “achieving three nines of availability with only 50 percent overhead.”

To do this, Symform requires that "contributor" devices supplying storage have 80 percent or more uptime, which typically means desktops, servers, and network-attached storage (NAS) devices. During downtime, Symform's management platform reassigns data fragments to other running machines.
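Garg's "three nines with 50 percent overhead" claim can be sanity-checked with a simple model: treat each of the 96 fragment-holding nodes as independently online with the 80 percent minimum uptime, and ask how often at least 64 fragments are reachable. This is my simplified model, not Symform's published math:

```python
from math import comb

def availability(n: int = 96, k: int = 64, uptime: float = 0.8) -> float:
    """Probability that at least k of n fragments are online, assuming
    each contributor node is up independently with the given uptime
    (80% is Symform's stated minimum for contributors)."""
    return sum(comb(n, i) * uptime**i * (1 - uptime)**(n - i)
               for i in range(k, n + 1))

print(f"{availability():.5f}")  # comes out just above three nines
```

With a mean of about 77 nodes online and a requirement of only 64, the binomial tail below 64 is tiny, which is consistent with the sweet spot Garg describes.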

To rebuild redundancy, Symform takes steps to reduce bandwidth overhead. If a node goes down, Symform's system reloads the missing fragments from the original source device, which consumes far less bandwidth than reconstructing them from the surviving fragments elsewhere on the network. For this approach to work, it's critical that contributors keep up their end of the deal by keeping their machines running.
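The bandwidth saving is easy to quantify. With erasure coding, regenerating one lost fragment from peers means downloading enough fragments (64 here) to decode the whole block first, whereas the source device can simply re-upload the one fragment it already has. A rough sketch under the fragment sizes implied above:

```python
BLOCK_MB = 64
FRAGMENTS_NEEDED = 64
fragment_mb = BLOCK_MB / FRAGMENTS_NEEDED  # 1 MB per fragment

# Option 1: reload the lost fragment directly from the source device.
reseed_mb = fragment_mb                      # 1 MB uploaded

# Option 2: rebuild it from peers by fetching any 64 surviving
# fragments, decoding the block, and re-encoding the missing piece.
rebuild_mb = FRAGMENTS_NEEDED * fragment_mb  # 64 MB downloaded

print(rebuild_mb / reseed_mb)  # 64x more network traffic
```

A 64x difference per repaired fragment explains why the design leans on source devices staying online rather than on network-wide reconstruction.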


Comments

I am a fan of P2P technology, and I am sure that distributed storage can work, technically. My only concern about P2P storage is whether people will be willing to contribute their bandwidth and storage in sufficient numbers to make the network work. That is, if I want to store 1 TB, there have to be enough people in the network to store 1.5 TB who are willing to leave their computers on, contributing storage and bandwidth.

I assume all the data must be encrypted and checksummed to guarantee my data is private and so I won’t get garbage back when restoring, but I would not trust this system without knowing a lot more about it. Who holds the decryption keys, and how many redundant copies of my data are stored in order to guard against data loss if contributors’ storage devices fail or go offline? I could see many people finding new life for faulty drives as storage for this system.