New storage time (next year anyway) - anyone unusual to look at?

Well, did my first non-benchmark test today: 8 simultaneous svmotions from a few datastores on the EVA targeting a single datastore on the 3Par. 950MB/s peak, 700MB/s average; read latency averaged 2-3ms with two isolated spikes at 15.5ms and 45ms; write latency averaged 1ms with a peak at 1.4ms. This is targeting the 96x 10k disks. So far you can color me impressed.

Can I ask how you've configured that array?

What kind of information are you looking for?

Is it writing to all 96 disks or a subset? What RAID level are you using, etc.? Guessing you're also using FC as opposed to 10GbE? I'm basically trying to understand how it's getting that performance.

4x 8Gb FC per controller (we have 6 ports, but 2 are in use for Peer Motion right now until we get our physical boxes migrated from the EVA). The vdisk was set up as RAID5 7+1 on all the 10k disks (all disks of the same class are in a CPG by default; you have to do work to not use them all, and it's not really recommended). There are 5 hosts in this cluster and each has 2x 4Gb HBAs. I'm not sure how many were involved in the test since I just selected all my sandbox servers to move, but most likely all 5. The one thing I'm not sure about right now is how the multipathing was configured, since we haven't installed the 3Par plugin, so it was using whatever VMware selected by default.

The 7400 is an active/active controller array, and as such the default path policy in VMware should be round robin. There are decent real-time monitoring charts you can set up in InForm.

Compression is algorithm-based. No external data is required to compress or decompress, just the algorithm.
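
To make "just the algorithm" concrete, here's a minimal Python sketch using the standard-library zlib module; the compressed bytes are self-contained, and decompression needs no dictionary or side table shipped alongside the data.

```python
import zlib

# The compressed output carries everything needed to restore the
# original; no external data is required, just the algorithm.
original = b"the quick brown fox jumps over the lazy dog " * 100
compressed = zlib.compress(original)

assert zlib.decompress(compressed) == original
print(f"{len(original)} bytes -> {len(compressed)} bytes")
```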

De-dup is based on comparing data to other data. It removes identical chunks of data and replaces them with a reference/pointer to a single retained copy. Dedup can/does involve more I/O, and places different demands on the back-end storage.
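
To illustrate the pointer idea (a toy sketch only, not how any particular array implements it), here's block-level dedup in a few lines of Python: identical chunks are stored once, and each occurrence is replaced by a reference, here its hash.

```python
import hashlib

CHUNK = 4096   # arbitrary chunk size for the sketch
store = {}     # hash -> the single stored copy of a chunk

def dedup(data: bytes) -> list[str]:
    """Return the list of pointers that stands in for `data`."""
    pointers = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)   # identical chunks are stored only once
        pointers.append(key)           # the reference that replaces them
    return pointers

# Two "files" that share blocks end up sharing storage.
a = dedup(b"A" * 8192 + b"B" * 4096)
b = dedup(b"A" * 8192 + b"C" * 4096)
print(f"{len(store)} unique chunks stored for {len(a) + len(b)} references")
```

Note the extra work relative to plain compression: a hash per chunk and a lookup against the store on every write, which is where the additional I/O and the different back-end demands come from.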

If anyone is really interested, I should be able to answer any real deep-tech questions, since I do work for HP and am now back from a much-needed long vacation. As for the cache questions: since 3Par wide-stripes far more broadly than the EVA, cache amounts aren't as critical as in other arrays. However, with 4 nodes and above it is nice to be able to take a controller down and not have to worry about dropping into write-through mode.

I see dedup as a specialized form of compression, or a specialized form of source coding. Doesn't compression require that you build a dictionary/metadata table? Isn't that similar to the metadata/hash table that dedup systems typically use? Compression looks at bit patterns, dedup looks at block or file patterns; is that about right?

Not all compression algorithms require a dictionary/index. The oldest compression tactics include seeing either a single byte repeat more than "N" times, or a string of bytes repeat more than "N" times, and converting it to one occurrence of the byte/string plus an encoding of how many equal occurrences were omitted. The result is smaller than the original and self-contained.
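
That byte/string-run tactic is run-length encoding. A minimal sketch of the single-byte variant (my own illustration, not any product's implementation):

```python
def rle_encode(data: bytes) -> bytes:
    """Each run of equal bytes becomes a (count, byte) pair. The output
    is self-contained: decoding needs only the algorithm, no dictionary."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    out = bytearray()
    for count, value in zip(encoded[::2], encoded[1::2]):
        out += bytes([value]) * count
    return bytes(out)

data = b"AAAAAAAABBBCCCCCCCCCCCCD"
packed = rle_encode(data)
assert rle_decode(packed) == data
print(len(data), "->", len(packed))   # 24 -> 8 bytes
```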

There are a number of compression options available for streaming data. They essentially give you a black box that accepts a never-ending stream of bytes on one side and outputs a (usually) smaller stream of bytes on the other. This is the same type of technology used by LTO tape drives, for instance.
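
As a sketch of that black-box shape (using Python's zlib here; LTO drives run their own algorithms in hardware, but the streaming pattern is the same):

```python
import zlib

def compress_stream(chunks):
    """Bytes in, (usually) fewer bytes out, for an arbitrarily long stream.
    Nothing is buffered beyond the compressor's internal window."""
    comp = zlib.compressobj()
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:                # the compressor emits output as it goes
            yield out
    yield comp.flush()         # drain whatever is still buffered

# Feed the "never-ending" stream in pieces; 1,000 chunks here.
stream = (b"sensor reading: 42\n" * 10 for _ in range(1000))
compressed_size = sum(len(block) for block in compress_stream(stream))
print("compressed stream size:", compressed_size, "bytes")
```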

There are very good, highly capable software-based solutions, as well as dedicated hardware offload cards and ASICs, that can be applied to perform inline compression of essentially unlimited amounts of data at wire speed.

Deduplication is a completely different animal. Deduplicating data necessarily involves keeping track of metadata so the original data streams can be reconstructed from pointers to blocks, in addition to performing hashing (and usually compression) operations against the data and handling lookups against the metadata. Hardware offloading can help speed up the process to some extent, but scaling deduplication performance is orders of magnitude harder than simple compression.
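
Extending the toy dedup sketch from earlier: reconstruction walks the metadata, and every write pays for hashing plus an index lookup; that index is the part that's hard to scale. (Again, an illustration only; real arrays handle this very differently.)

```python
import hashlib
import zlib

CHUNK = 4096
index = {}   # hash -> unique chunk, compressed, since dedup usually compresses too

def write(data: bytes) -> list[str]:
    """Store `data`, returning the recipe of pointers needed to rebuild it."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        key = hashlib.sha256(chunk).hexdigest()   # hashing cost on every write
        if key not in index:                      # metadata lookup on every write
            index[key] = zlib.compress(chunk)
        recipe.append(key)
    return recipe

def read(recipe: list[str]) -> bytes:
    """Reconstruct the original stream by following each pointer."""
    return b"".join(zlib.decompress(index[key]) for key in recipe)

recipe = write(b"X" * 16384)
assert read(recipe) == b"X" * 16384
# The index grows with the amount of *unique* data; keeping those lookups
# fast at scale is what makes dedup so much harder than plain compression.
```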

So this is an 8-shelf-or-more system?

Yeah: internal 2.5" + 7 external 2.5" + one external 3.5". If the tiering works as well as we hope, and we can get D2D2T with incremental-forever backups working, we'll end up with another shelf of 3.5" before we add the next two controllers and rebalance things.