What are some of the architectural details that make the two systems reliable? I'm not sure whether S3's design stores data in more than one location.

Data is "safe" when it's backed up on a regular basis and the backups are stored in a separate location from the online systems. That's it.
– RobM, Feb 26 '12 at 10:40

Even if the data at S3 won't be lost, what do you do if you can't get to it? Having an off-site backup is a good idea in every scenario.
– Niko S P, Feb 26 '12 at 10:48

I'm aware, obviously, that a data backup is necessary for both! What I'm trying to understand is which one is built with this in mind; S3's design is not documented in much detail.
– Akshat, Feb 26 '12 at 13:38

The question was getting close votes, so I edited it to try to make it match the good answer Steffan gave.
– Ward, Feb 26 '12 at 16:55

2 Answers

As you said, this isn't exactly an apples-to-apples comparison. (There is also agreement already that decent data backup procedures must be in place for both, so I'm not going to address that.) The question therefore cannot be answered as such; rather, one should be aware of the architectural details of each offering and apply them to the particular use case at hand.

In particular, the ZFS-based storage system from Joyent is a local storage system designed to deliver carrier-grade storage and data reliability; see Data Resiliency and Reliability:

We put ZFS on top of a high performance local storage subsystem to
ensure that your data is safe, consistent, and always accessible and
recoverable.
ZFS is a combined file system and logical volume manager designed for
pooled local storage. Unlike other file systems deployed for cloud
storage, ZFS’ copy-on-write capability guarantees that your image will
not be lost. [emphasis mine]

In contrast, EBS is a network block storage system designed to provide highly available, highly reliable storage volumes that can be attached to a running Amazon EC2 instance and exposed as a device within the instance, see section Features of Amazon EBS volumes within Amazon Elastic Block Store (EBS) for details, e.g.:

Amazon EBS volumes are placed in a specific Availability Zone, and
can then be attached to instances also in that same Availability Zone.

Each storage volume is automatically replicated within the same Availability Zone.
This prevents data loss due to failure of any single hardware component.

Amazon EBS also provides the ability to
create point-in-time snapshots of volumes, which are persisted to
Amazon S3. These snapshots can be used as the starting point for new
Amazon EBS volumes, and protect data for long-term durability. [...]

[emphasis mine]

The last point highlights that EBS does not itself store its data on S3; rather, it provides an easy-to-use backup mechanism for long-term durability via S3. This implies that you need to assess the two scenarios (a live EBS volume versus its snapshots on S3) separately in terms of durability and availability.
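To make the snapshot mechanism concrete, here is a minimal sketch. It assumes the boto3 SDK's EC2 client, which is newer than this answer; the function name `snapshot_volume` and the default description are my own placeholders, not something taken from the EBS documentation quoted here.

```python
def snapshot_volume(ec2_client, volume_id, description="manual backup"):
    """Request a point-in-time snapshot of one EBS volume.

    ec2_client is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    volume_id is the ID of an existing volume (hypothetical in any example).
    Returns the snapshot ID reported by the API.
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    # The call returns as soon as the snapshot has been requested;
    # the snapshot itself is completed asynchronously.
    return response["SnapshotId"]
```

With credentials configured you would call it as `snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0")`, where the volume ID is a made-up example. Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.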

[...] Amazon EBS volume data is replicated across multiple servers in an
Availability Zone to prevent the loss of data from the failure of any
single component. The durability of your volume depends both on the
size of your volume and the percentage of the data that has changed
since your last snapshot. [...]

Because Amazon EBS servers are replicated within a single Availability
Zone, mirroring data across multiple Amazon EBS volumes in the same
Availability Zone will not significantly improve volume durability.
However, for those interested in even more durability, Amazon EBS
provides the ability to create point-in-time consistent snapshots of
your volumes that are then stored in Amazon S3, and automatically
replicated across multiple Availability Zones. [...]

Each availability zone runs on its own physically distinct,
independent infrastructure [...].
Common points of failures like generators and cooling equipment are
not shared across Availability Zones. Additionally, they are
physically separate, such that even extremely uncommon disasters such
as fires, tornados or flooding would only affect a single Availability
Zone. [emphasis mine]

Please note that an availability zone is still constrained to a single region (see Using Regions and Availability Zones for details on this architecture), and there have been related incidents already, triggering discussions on whether region and/or provider redundancy is the way to go for utmost reliability (see Outages below).

Outages

Both services have had at least one major outage in the past. The respective post-mortem analyses provide additional insight into the design of each system and allow you to account for it in your backup and availability strategies:

section Overview of EBS System features an insightful summary of the EBS architecture

The latter outage sparked quite some discussion about the reliability of cloud computing in general, which interestingly triggered the article Magical Block Store: When Abstractions Fail Us on Joyent's blog. It explores the differences between the two approaches and explains Joyent's architectural choices (including self-criticism of earlier failed attempts). While the article might obviously be considered biased, it should still allow you to draw your own conclusions.

You don't have the data unless you have it in triplicate at two geographically different locations.

Depending on a single RAID instance, a single virtual block device, a single supplier, etc. to reliably store your data is careless at best.

That being said, unless something has changed in the 2-3 years since I last checked, Amazon doesn't give any guarantee that S3 data will be there the next time you look. They have been reliable over the past few years as far as storage is concerned, so it's not like the data regularly disappears.
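As a rough illustration of the "triplicate at two locations" rule at the top of this answer, the check can be written down in a few lines. This is only a sketch: the function name `meets_backup_rule` and the location labels are hypothetical, and it says nothing about whether the copies are actually restorable.

```python
def meets_backup_rule(copy_locations, min_copies=3, min_locations=2):
    """Return True if there are enough copies spread over enough sites.

    copy_locations holds one location label per copy of the data,
    e.g. ["office", "office", "offsite"].
    """
    locations = list(copy_locations)
    enough_copies = len(locations) >= min_copies
    enough_sites = len(set(locations)) >= min_locations
    return enough_copies and enough_sites
```

Three copies in one building fail the check, as do two copies in two buildings; two copies on site plus one off site pass it.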