Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.

Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.

Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.

Inside System Storage -- by Tony Pearson

Tony Pearson is a Master Inventor and Senior IT Specialist for the IBM System Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2011, Tony celebrated his 25th year anniversary with IBM Storage on the same day as the IBM's Centennial. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson
)

The old adage applies "You can't please everyone. Presidents can't. Prostitutes can't. Nobody can." I am reminded of that as I fielded a variety of interesting comments and emails about, of all things, my choice of order of things in recent blog posts.

Certainly, there are times when the order of things matters greatly. In my now-infamous blog post [Sock Sock Shoe Shoe], I use a scene from a popular 1970's television show to explain why compression should be done before encryption.

In my case, I put things in the order that I felt made sense to me, but not everyone agrees. Here are three recent examples:

Example 1:

In my blog post [Two IBMers Earn Their Retirement], I congratulated two of my colleagues on their retirement. Since their retirement happened on the same day, I decided to mention Mark Doumas first, and Jim Rymarczyk second.

However, one of my readers, who I will assume is a member of the unofficial "Jim Rymarczyk fan club", felt that I should have listed Jim first, as Jim served IBM for 44 years, and Mark only 32 years.

Really? I realize that movie stars insist on having their name listed first on the poster, but neither of these guys would be confused with George Clooney!

So, to Jim and all his fans out there, I assure you I did not mean this as a slight in any way. I have updated the post to indicate that the ordering was strictly alphabetical by last name.

Example 2:

In my blog post [IBM Announcements for February 2012], I presented tape products first, and disk second. Normally, I cover them alphabetically, disk first, then tape. However, I was asked to promote tape this year in preparation for the upcoming 60th anniversary of tape, so I mentioned the tape announcements first, and the disk second.

The feedback from the XIV community was swift. Many felt that I [buried the lede] in not mentioning the XIV Gen3 SSD caching first.

(Note: For those not familiar with the phrase used in journalism, 'burying the lede' refers to the failure to mention the most interesting or attention grabbing elements of a story in the first paragraph. In American news journalism, it is spelled "lede" and elsewhere it is spelled "lead". Major US dictionaries apparently accept both spellings for this phrase.)

Technically, my lead paragraph stated clearly that: "This week we have announcements for both disk and tape, but since 2012 is the 60th Diamond Anniversary for tape, I will start with tape systems first."

So, while I don't claim to be a journalist by any means, I think the lead paragraph accurately reflected that I would talk about both disk and tape products in the rest of the blog post, and if a reader didn't care to learn more about tape could bypass those sections and go directly to the section on disk instead.

Example 3:

I have had my head handed to me on a platter so many times here at IBM that I am considering installing a zipper around my neck. My friends in XIV land insisted that I write a secondary post about XIV Gen3 SSD caching that had no mention of tape whatsoever. One suggestion was to compare and contrast XIV Gen3 SSD caching with EMC's announcement for VFCache. The result was my blog post [IBM XIV Gen3 SSD Caching versus EMC VFCache].

What could go wrong with an apples-to-orange comparison of two different storage products sprinkled with a small amount of FUD against a major competitor?

I had two complaints on this one. First, is the order of products in my side-by-side table of comparisons. I put EMC VFCache in the left column, and IBM XIV Gen3 SSD caching in the right. I meant nothing sinister by this. Alphabetically, EMC comes before IBM, and VFCache comes before XIV. Chronologically, EMC's announcement came out on Monday, and IBM's announcement came out the following day.

(Note: The term [sinster] comes from the Latin word sinistra meaning "left hand". In the Middle Ages it was believed that when a person was writing with their left hand they were possessed by the Devil. Left-handed people were therefore considered to be evil. My poor mother was born left-handed and was forced as a child to write with her right hand to be accepted by society.)

Apparently, an unwritten convention within IBM is that comparison tables always have the newer product on the left column, followed by one or more older products to the right, or the IBM product on the left column, with one or more competitive alternatives to the right.

The second complaint came from a reader in the comments section: "... I think [what] you're doing is trying to ride EMC's release for your own marketing, did you really need to? XIV is an excellent array; adding SSD Cache to the Gen3 takes it further, Moshe would be fuming (which I think is a good thing), can you just stick to that and not ride someone else's wave?"

Seriously?

Both announcements relate to reducing latency of read IOPS through the use of Solid State Drives. That both companies would announce these were no surprise to any employee at either company, as both IBM and EMC have been talking about their intent to do so last year. IBM's announcement of XIV SSD Gen3 caching was certainly not in response to EMC's VFCache announcement, and I doubt EMC rushed out their VFCache announcement the day before as a pre-emptive strike against IBM's announcement of the XIV Gen3 SSD Caching feature.

(Note: I don't know her personally, but she has thousands of followers!)

There you have it. I will gladly fix false or misleading information, but I am not going to re-arrange the order of things just to please some readers, only to have other readers complain that they liked it better in the original order. As always, feel free to comment on any of this in the section below.

I can't believe we got snow this week on Valentine's Day! It didn't last long on the ground here in Tucson, but there are still some white caps in our mountains. For those of you "trapped" by snow, or too much work, here are two upcoming events you can attend from your desk and computer!

IBM Oracle Virtual University 2012

Please join us for the fourth annual IBM Oracle Virtual University that runs "live" for 24 hours, then continues 'on-demand' replay through the remainder of 2012.

This is a great educational event for IBM and Business Partner sales & technical teams who sell IBM Oracle solutions or have Oracle solutions installed in their account. It is for anyone who is new to or interested in the IBM Oracle Alliance as well as experienced sales & technical people who need all the latest on the IBM/Oracle co-opetition relationship for 2012 and beyond.

This VIRTUAL on-line event will cover key topics around the IBM Oracle Alliance. I am one of the speakers and will cover IBM System Storage offerings as they relate to Oracle software.

This is a chance for sellers to hear an update on what's new, unique and available to sell in 2012. The goal of this session is to help enable you to sell more IBM products and services with Oracle solutions in 2012! Learn where to go for help to better understand these solutions, close more deals and reach your targets.

Even through economic challenges, storage requirements have continued to grow along with the information explosion.
Join us for this informative webcast and hear from Jon Toigo, CEO and Managing Principal of Toigo Partners, as he discusses six cutting-edge storage technologies that are ready for prime time and can help transform your data center.

The featured speaker is fellow blogger Jon Toigo, CEO and Managing Principal, Toigo Partners, an outspoken technology consumer advocate and vendor watchdog whose articles, columns, and blog posts on [DrunkenData.com] are enjoyed by over a million readers per month.

Raj hails from Toronto, Canada and will be able to provide the Canadian perspective on all things Storage. I had the pleasure to meet Raj in person here in Tucson when him and dozens of his cohorts came down for a multi-customer briefing at the [IBM Executive Briefing Center] where I work.

It takes me 20-30 minutes to complete a crossword or Sudoku puzzle. I am in no hurry, and I find the process relaxing. But what if you were paid to complete a puzzle? In that case, finishing the puzzle sooner, in fewer minutes, means more money in your paycheck per hour worked! However, getting paid would mean that doing these puzzles may no longer be fun or relaxing.

The idea of converting a hobby into a revenue-generating activity is not new. Who wouldn't want to earn money doing something you were planning to do already? The television is full of commercial advertisements for credit cards where you can earn Double Miles or Cash Rewards just for spending money on things you were going to spend on anyways.

But is "earn" the right word? The merchants pay a percentage fee every time a patron uses a credit card, and the bank is just providing a marketing incentive in the form of a portion of those fees back to the consumer, to encourage more usage of their card versus other forms of payment. Sort of like "profit sharing".

(FTC Disclosure: I am a full-time employee and shareholder of the IBM Corporation. This blog post should not be considered an endorsement for anything. My opinions and writings are based on publicly available information and my own experiences doing freelance work prior to my employment at IBM. I have no hands-on experience with Amazon Mechanical Turk, neither as a worker nor requester, have not participated in TopCoder contests, nor have I used the Viggle app. I do not have any financial interest in Amazon, TopCoder, Viggle or any other third-party company mentioned on this blog post, nor has anyone paid me to mention their company names, brands or offerings.)

Here's how it works. You get the app on your phone, and register each television show as you watch it. You can watch the show live, or much later recorded on your Tivo. You watch the shows you were going to watch anyways, and just provide your demographics, all in the name of market research. You get two points per minute of watching, and after 7,500 points, you get a $5 gift card from retailers such as from retailers such as Burger King, Starbucks, Best Buy, Sephora, Fandango, and CVS drugstores. For the typical American, it would take about three weeks to watch that much television!

Of course, this is not the only way to earn money working from home. A reader asked me for my opinions of [Amazon Mechanical Turk]. While the other examples above are done for marketing purposes, Mechanical Turk can be used for a variety of other things. Up to now, the IT industry has regarded the Cloud as the delivery of computing as a service, with the infrastructure, hardware and software existing on internationally networked servers, effectively invisible to the end user. This model is now to being applied broadly to people.

Basically, Mechanical Turk acts as a marketplace, where employers post Human Intelligent Tasks (HITs) that workers can do. Most can be completed in minutes and you are paid pennies to do so. Some examples might help illustrate what a HIT looks like:

Call a business and get the email address of the manager in charge.

Review a photograph and describe its style or content in three words or less

Select among multiple choices to categorize a job listing or company position

As a Mechanical Turk worker, you only work on the HITs you choose to work on, presumably those that interest you, and that you can do well and quickly. Workers can do this anytime, anywhere, such as 2:00am in the morning, at home, when you can't sleep or taking care of children. You can choose to work as much or as little as you like.

The employers--referred to as Mechanical Turk requesters--put money into their payroll accounts, load up their tasks, and hit publish. This gives them immediate access to a global, on-demand 24-by-7 workforce that can help complete thousands of HITs in minutes. These employers won't have to put an advertisement in the want ads and interview potential candidates, just to let them go later when the project is over.

Just like any other job, Mechanical Turk wages are reported to the IRS, and each person's work is evaluated for quality. In doing these tasks, you build up your "digital reputation" that will either prevent you or allow you to work on certain HITs. You can also take tests to reach Qualification levels to be eligible to work on HITs not available to everyone else.

Software engineers would have a hard time writing an Artificial Intelligence [AI] program to do these simple tasks, so being able to generate a HIT for something in the middle of a computer program might be the easiest way to get past a difficult part of an algorithm. Amusingly, Amazon describes this form of [crowdsourcing] as an artificial form of Artificial Intelligence!

While this approach may work for small, easily defined tasks, what about works that require a high amount of Human Intelligence, like storage software or hardware development?

When I was working for IBM as a software engineer in the 1980s and 1990s, it took us years to get a project done, using the traditional [Waterfall Model]. My job as a software architect was to estimate the thousands of lines of code (KLOC) a project would require, estimate the number of Person-Years (PY) it would take, and recommend the appropriate sized team. Back then, each engineer averaged only about 1,000 lines of software code per year, so KLOC and PY were often used interchangeably. Fellow IBM author Fred Brooks wrote an excellent book on the process called [The Mythical Man-Month].

The Waterfall model has the advantage that people only have to work a portion of the cycle on the project. In between, there was plenty of downtime to attend training, improve your skills, or take vacation. As our director Lynn Yates would often complain, "if they are only writing two lines of code in the morning, and two in the afternoon, why do they need time to rest?"

The Waterfall model was not perfect, and had its share of critics. One downside was that the clients didn't see anything until General Availability (GA), with a few getting a glimpse a few months earlier during our Early Support Program (ESP). By the time clients could tell us it was not what they wanted or expected, it was too late to change until the next release.

To address this concern, 17 software engineers wrote the now famous [Agile Manifesto]. The authors felt that collaboration, between the developers and with the clients, is critical to success. Business people and developers must work together daily throughout the project. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation. The best architectures, requirements, and designs emerge from self-organizing teams. The result is an iterative approach that allows the client to see working prototypes early in the process, allowing last-minute changes to requirements to influence the final product.

Combining the Mechanical Turk concept with Agile programming methodology gives you what IBM calls an "Outcomes Model" approach. In the IBM research paper [Software Economies] (PDF, 5 pages), the authors argue that there are four fundamental principles needed for an "Outcomes Model" approach:

Autonomy. All of the actions necessary to bring jobs to completion should be driven by market forces; the process is
never gated by an entity outside of the market.

Inclusiveness. Everyone who provides information or performs work that leads to improvements should share in the
rewards.

Transparency. The system should be transparent with respect to both the flow of money in the market and the tasks
performed by workers in the market.

Reliability. The system should be immune to manipulation, robust against attack (e.g., via insertion of untrusted code),
and prevent "shallow" work which would have to be re-done later.

I was surprised to see that [the TopCoder Community is 390,593 strong], nearly the size of the entire IBM company. TopCoder is focused on computer programming and digital creation using the Outcomes Model approach. Rather than paying everyone for their work, however, the platform is designed around challenges and competitions, and the top players or contributors are rewarded with cash prizes.

As an innovative company, IBM constantly explores a variety of means and approaches to offer value to its clients and customers. These new approaches may have some distinct advantages not just for IBM and its shareholders, but also for its clients and the freelancers hired to work on these projects. The global marketplace is getting flatter, smaller and smarter. It will be interesting how this plays out. If the discussion above encourages you to hone your technical skills, perhaps that is motivation enough to get off the couch and stop watching so much television!

Have you ever noticed that sometimes two movies come out that seem eerily similar to each other, released by different studios within months or weeks of each other? My sister used to review film scripts for a living, she would read ten of them and have to pick her top three favorites, and tells me that scripts for nearly identical concepts came all the time. Here are a few of my favorite examples:

1994: [Wyatt Earp] and [Tombstone] were Westerns recounting the famed gunfight at the O.K. Corral. Tombstone, Arizona is near Tucson, and the gunfight is recreated fairly often for tourists.

1998: [Armageddon] and [Deep Impact] were a pair of disaster movies dealing with a large rock heading to destroy all life on earth. I was in Mazatlan, Mexico to see the latter, dubbed in Spanish as "Impacto Profundo".

1998: [A Bug's Life] and [Antz] were computer-animated tales of the struggle of one individual ant in an ant colony.

This is different than copy-cat movies that are re-made or re-imagined many years later based on the previous successes of an original. Ever since my blog post [VPLEX: EMC's Latest Wheel is Round] in 2010 comparing EMC's copy-cat product that came our seven years after IBM's SAN Volume Controller (SVC), I've noticed EMC doesn't talk about VPLEX that much anymore.

This week, IBM announced [XIV Gen3 Solid-State Drive support] and our friends over at EMC announced [VFCache SSD-based PCIe cards]. Neither of these should be a surprise to anyone who follows the IT industry, as IBM had announced its XIV Gen3 as "SSD-Ready" last year specifically for this purpose, and EMC has been touting its "Project Lightning" since last May.

Fellow blogger Chuck Hollis from EMC has a blog post [VFCache means Very Fast Cache indeed] that provides additional detail. Chuck claims the VFCache is faster than popular [Fusion-IO PCIe cards] available for IBM servers. I haven't seen the performance spec sheets, but typically SSD is four to five times slower than the DRAM cache used in the XIV Gen3. The VFCache's SSD is probably similar in performance to the SSD supported in the IBM XIV Gen3, DS8000, DS5000, SVC, N series, and Storwize V7000 disk systems.

Nonetheless, I've been asked my opinions on the comparison between these two announcements, as they both deal with improving application performance through the use of Solid-State Drives as an added layer of read cache.

(FTC Disclosure: I am both a full-time employee and stockholder of the IBM Corporation. The U.S. Federal Trade Commission may consider this blog post as a paid celebrity endorsement of IBM servers and storage systems. This blog post is based on my interpretation and opinions of publicly-available information, as I have no hands-on access to any of these third-party PCIe cards. I have no financial interest in EMC, Fusion-IO, Texas Memory Systems, or any other third party vendor of PCIe cards designed to fit inside IBM servers, and I have not been paid by anyone to mention their name, brands or products on this blog post.)

The solutions are different in that IBM XIV Gen3 the SSD is "storage-side" in the external storage device, and EMC VFCache is "server-side" as a PCI Express [PCIe] card. Aside from that, both implement SSD as an additional read cache layer in front of spinning disk to boost performance. Neither is an industry first, as IBM has offered server-side SSD since 2007, and IBM and EMC have offered storage-side SSD in many of their other external storage devices. The use of SSD as read cache has already been available in IBM N series using [Performance Accelerator Module (PAM)] cards.

IBM has offered cooperative caching synergy between its servers and its storage arrays for some time now. The predecessor to today's POWER7-based were the iSeries i5 servers that used PCI-X IOP cards with cache to connect i5/OS applications to IBM's external disk and tape systems. To compete in this space, EMC created their own PCI-X cards to attach their own disk systems. In 2006, IBM did the right thing for our clients and fostered competition by entering in a [Landmark agreement] with EMC to [license the i5 interfaces]. Today, VIOS on IBM POWER systems allows a much broader choice of disk options for IBM i clients, including the IBM SVC, Storwize V7000 and XIV storage systems.

Can a little SSD really help performance? Yes! An IBM client running a [DB2 Universal Database] cluster across eight System x servers was able to replace an 800-drive EMC Symmetrix by putting eight SSD Fusion-IO cards in each server, for a total of 64 Solid-State drives, saving money and improving performance. DB2 has the Data Partitioning Feature that has multi-system DB2 configurations using a Grid-like architecture similar to how XIV is designed. Most IBM System x and BladeCenter servers support internal SSD storage options, and many offer PCIe slots for third-party SSD cards. Sadly, you can't do this with a VFCache card, since you can have only one VFCache card in each server, the data is unprotected, and only for ephemeral data like transaction logs or other temporary data. With multiple Fusion-IO cards in an IBM server, you can configure a RAID rank across the SSD, and use it for persistent storage like DB2 databases.

An advantage of the XIV Gen3 SSD caching approach is that the cache can be dynamically allocated to the busiest data from any server or servers.

Support for active/active server clusters

No

Yes!

Aware of changes made to back-end disk

No, it appears the VFCache has no direct interaction with the back-end disk array, so any changes to the data on the box itself are not communicated back to the VFCache card itself to invalidate the cache contents.

Yes!

Sequential-access detection

None identified. However, VFCache only caches blocks 64KB or smaller, so any sequential processing with larger blocks will bypass the VFCache.

Yes! XIV algorithms detect sequential access and avoid polluting the SSD with these blocks of data.

Number of SSD supported

One, which seems odd as IBM supports multiple Fusion-IO cards for its servers. However, this is not really a single point of failure (SPOF) as an application experiencing a VFCache failure merely drops down to external disk array speed, no data is lost since it is only read cache.

6 to 15 (one per XIV module) for high availability.

Pin data in SSD cache

Yes, using split-card mode, you can designate a portion of the 300GB to serve as Direct-attached storage (DAS). All data written to the DAS portion will be kept in SSD. However, since only one card is supported per server and the data is unprotected, this should only be used for ephemeral data like logs and temp files.

No, there is no option to designate an XIV Gen3 volume to be SSD-only. Consider using Fusion-IO PCIe card as a DAS alternative, or another IBM storage system for that requirement.

Hot-pluggable/Hot-swappable

Not identified

Yes!

Pre-sales Estimating tools

None identified

Yes! CDF and Disk Magic tools are available to help cost-justify the purchase of SSD based on workload performance analysis.

IBM has the advantage that it designs and manufactures both servers and storage, and can design optimal solutions for our clients in that regard.