Blogs

Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )

My books are available on Lulu.com! Order your copies today!
Featured Redbooks and Redpapers:

Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.

Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.

Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.

Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.

I haven't even finished blogging about all the other stuff that got announced last week, and here we are with more announcements. Since IBM's big [Pulse 2010 Conference] is next week, I thought I would cover this week's announcement on Tivoli Storage Manager (TSM) v6.2 release. Here are the highlights:

Client-Side Data Deduplication

This is sometimes referred to as "source-side" deduplication, as storage admins can get confused on which servers are clients in a TSM client-server deployment. The idea is to identify duplicates at the TSM client node, before sending to the TSM server. This is done at the block level, so even files that are similar but not identical, such as slight variations from a master copy, can benefit. The dedupe process is based on a shared index across all clients, and the TSM server, so if you have a file that is similar to a file on a different node, the duplicate blocks that are identical in both would be deduplicated.

This feature is available for both backup and archive data, and can also be useful for archives using the IBM System Storage Archive Manager (SSAM) v6.2 interface.

Simplified management of Server virtualization

TSM 6.2 improves its support of VMware guests by adding auto-discovery. Now, when you spontaneously create a new virtual machine OS guest image, you won't have to tell TSM, it will discover this automatically! TSM's legendary support of VMware Consolidated Backup (VCB) now eliminates the manual process of keeping track of guest images. TSM also added support of the Vstorage API for file level backup and recovery.

While IBM is the #1 reseller of VMware, we also support other forms of server virtualization. In this release, IBM adds support for Microsoft Hyper-V, including support using Microsoft's Volume Shadow Copy Services (VSS).

Automated Client Deployment

Do you have clients at all different levels of TSM backup-archive client code deployed all over the place? TSM v6.2 can upgrade these clients up to the latest client level automatically, using push technology, from any client running v5.4 and above. This can be scheduled so that only certain clients are upgraded at a time.

Simultaneous Background Tasks

The TSM server has many background administrative tasks:

Migration of data from one storage pool to another, based on policies, such as moving backups and archives on a disk pool over to a tape pools to make room for new incoming data.

Storage pool backup, typically data on a disk pool is copied to a tape pool to be kept off-site.

Copy active data. In TSM terminology, if you have multiple backup versions, the most recent version is called the active version, and the older versions are called inactive. TSM can copy just the active versions to a separate, smaller disk pool.

In previous releases, these were done one at a time, so it could make for a long service window. With TSM v6.2, these three tasks are now run simultaneously, in parallel, so that they all get done in less time, greatly reducing the server maintenance window, and freeing up tape drives for incoming backup and archive data. Often, the same file on a disk pool is going to be processed by two or more of these scheduled tasks, so it makes sense to read it once and do all the copies and migrations at one time while the data is in buffer memory.

Enhanced Security during Data Transmission

Previous releases of TSM offered secure in-flight transmission of data for Windows and AIX clients. This security uses Secure Socket Layer (SSL) with 256-bit AES encryption. With TSM v6.2, this feature is expanded to support Linux, HP-UX and Solaris.

Improved support for Enterprise Resource Planning (ERP) applications

I remember back when we used to call these TDPs (Tivoli Data Protectors). TSM for ERP allows backup of ERP applications, seemlessly integrating with database-specific tools like IBM DB2, Oracle RMAN, and SAP BR*Tools. This allows one-to-many and many-to-one configurations between SAP servers and TSM servers. In other words, you can have one SAP server backup to several TSM servers, or several SAP servers backup to a single TSM server. This is done by splitting up data bases into "sub-database objects", and then process each object separately. This can be extremely helpful if you have databases over 1TB in size. In the event that backing up an object fails and has to be re-started, it does not impact the backup of the other objects.

The technology industry is full of trade-offs. Take for example solar cells that convert sunlight to electricity. Every hour, more energy hits the Earth in the form of sunlight than the entire planet consumes in an entire year. The general trade-off is between energy conversion efficiency versus abundance of materials:

A second trade-off is exemplified by EMC's recent GeoProtect announcement. This appears similar to the geographic dispersal method introduced by a company called [CleverSafe]. The trade-off is between the amount of space to store one or more copies of data and the protection of data in the event of disaster. Here's an excerpt from fellow blogger Chuck Hollis (EMC) titled ["Cloud Storage Evolves"]:

"Imagine a average-sized Atmos network of 9 nodes, all in different time zones around the world. And imagine that we were using, say, a 6+3 protection scheme.

The implication is clear: any 3 nodes could be completely lost: failed, destroyed, seized by the government, etc.
-- and the information could be completely recovered from the surviving nodes."

For organizations worried about their information falling into the wrong hands (whether criminal or government sponsored!), any subset of the nodes would yield nothing of value -- not only would the information be presumably encrypted, but only a few slices of a far bigger picture would be lost.

Seized by the government?falling into the wrong hands? Is EMC positioning ATMOS as "Storage for Terrorists"? I can certainly appreciate the value of being able to protect 6PB of data with only 9PB of storage capacity, instead of keeping two copies of 6PB each, the trade-off means that you will be accessing the majority of your data across your intranet, which could impact performance. But, if you are in an illicit or illegal business that could have a third of your facilities "seized by the government", then perhaps you shouldn't house your data centers there in the first place. Having two copies of 6PB each, in two "friendly nations", might make more sense.

(In reality, companies often keep way more than just two copies of data. It is not unheard of for companies to keep three to five copies scattered across two or three locations. Facebook keeps SIX copies of photographs you upload to their website.)

ChuckH argues that the governments that seize the three nodes won't have a complete copy of the data. However, merely having pieces of data is enough for governments to capture terrorists. Even if the striping is done at the smallest 512-byte block level, those 512 bytes of data might contain names, phone numbers, email addresses, credit cards or social security numbers. Hackers and computer forensics professionals take advantage of this.

You might ask yourself, "Why not just encrypt the data instead?" That brings me to the third trade-off, protection versus application performance. Over the past 30 years, companies had a choice, they could encrypt and decrypt the data as needed, using server CPU cycles, but this would slow down application processing. Every time you wanted to read or update a database record, more cycles would be consumed. This forced companies to be very selective on what data they encrypted, which columns or fields within a database, which email attachments, and other documents or spreadsheets.

An initial attempt to address this was to introduce an outboard appliance between the server and the storage device. For example, the server would write to the appliance with data in the clear, the appliance would encrypt the data, and pass it along to the tape drive. When retrieving data, the appliance would read the encrypted data from tape, decrypt it, and pass the data in the clear back to the server. However, this had the unintended consequences of using 2x to 3x more tape cartridges. Why? Because the encrypted data does not compress well, so tape drives with built-in compression capabilities would not be able to shrink down the data onto fewer tapes.

Like the trade-off between energy efficiency and abundant materials, IBM eliminated the trade-off by offering compression and encryption on the tape drive itself. This is standard 256-bit AES encryption implemented on a chip, able to process the data as it arrives at near line speed. So now, instead of having to choose between protecting your data or running your applications with acceptable performance, you can now do both, encrypt all of your data without having to be selective. This approach has been extended over to disk drives, so that disk systems like the IBM System Storage DS8000 and DS5000 can support full-disk-encryption [FDE] drives.

A long time ago, perhaps in the early 1990s, I was an architect on the component known today as DFSMShsm on z/OS mainframe operationg system. One of my job responsibilities was to attend the biannual [SHARE conference to listen to the requirements of the attendees on what they would like added or changed to the DFSMS, and ask enough questions so that I can accurately present the reasoning to the rest of the architects and software designers on my team. One person requested that the DFSMShsm RELEASE HARDCOPY should release "all" the hardcopy. This command sends all the activity logs to the designated SYSOUT printer. I asked what he meant by "all", and the entire audience of 120 some attendees nearly fell on the floor laughing. He complained that some clever programmer wrote code to test if the activity log contained only "Starting" and "Ending" message, but no error messages, and skip those from being sent to SYSOUT. I explained that this was done to save paper, good for the environment, and so on. Again, howls of laughter. Most customers reroute the SYSOUT from DFSMS from a physical printer to a logical one that saves the logs as data sets, with date and time stamps, so having any "skipped" leaves gaps in the sequence. The client wanted a complete set of data sets for his records. Fair enough.

When I returned to Tucson, I presented the list of requests, and the immediate reaction when I presented the one above was, "What did he mean by ALL? Doesn't it release ALL of the logs already?" I then had to recap our entire dialogue, and then it all made sense to the rest of the team. At the following SHARE conference six months later, I was presented with my own official "All" tee-shirt that listed, and I am not kidding, some 33 definitions for the word "all", in small font covering the front of the shirt.

I am reminded of this story because of the challenges explaining complicated IT concepts using the English language which is so full of overloaded words that have multiple meanings. Take for example the word "protect". What does it mean when a client asks for a solution or system to "protect my data" or "protect my information". Let's take a look at three different meanings:

Unethical Tampering

The first meaning is to protect the integrity of the data from within, especially from executives or accountants that might want to "fudge the numbers" to make quarterly results look better than they are, or to "change the terms of the contract" after agreements have been signed. Clients need to make sure that the people authorized to read/write data can be trusted to do so, and to store data in Non-Erasable, Non-Rewriteable (NENR) protected storage for added confidence. NENR storage includes Write-Once, Read-Many (WORM) tape and optical media, disk and disk-and-tape blended solutions such as the IBM Grid Medical Archive Solution (GMAS) and IBM Information Archive integrated system.

Unauthorized Access

The second meaning is to protect access from without, especially hackers or other criminals that might want to gather personally-identifiably information (PII) such as social security numbers, health records, or credit card numbers and use these for identity theft. This is why it is so important to encrypt your data. As I mentioned in my post [Eliminating Technology Trade-Offs], IBM supports hardware-based encryption FDE drives in its IBM System Storage DS8000 and DS5000 series. These FDE drives have an AES-128 bit encryption built-in to perform the encryption in real-time. Neither HDS or EMC support these drives (yet). Fellow blogger Hu Yoshida (HDS) indicates that their USP-V has implemented data-at-rest in their array differently, using backend directors instead. I am told EMC relies on the consumption of CPU-cycles on the host servers to perform software-based encryption, either as MIPS consumed on the mainframe, or using their Powerpath multi-pathing driver on distributed systems.

There is also concern about internal employees have the right "need-to-know" of various research projects or upcoming acquisitions. On SANs, this is normally handled with zoning, and on NAS with appropriate group/owner bits and access control lists. That's fine for LUNs and files, but what about databases? IBM's DB2 offers Label-Based Access Control [LBAC] that provides a finer level of granularity, down to the row or column level. For example, if a hospital database contained patient information, the doctors and nurses would not see the columns containing credit card details, the accountants would not see the columnts containing healthcare details, and the individual patients, if they had any access at all, would only be able to access the rows related to their own records, and possibly the records of their children or other family members.

Unexpected Loss

The third meaning is to protect against the unexpected. There are lots of ways to lose data: physical failure, theft or even incorrect application logic. Whatever the way, you can protect against this by having multiple copies of the data. You can either have multiple copies of the data in its entirety, or use RAID or similar encoding scheme to store parts of the data in multiple separate locations. For example, with RAID-5 rank containing 6+P+S configuration, you would have six parts of data and one part parity code scattered across seven drives. If you lost one of the disk drives, the data can be rebuilt from the remaining portions and written to the spare disk set aside for this purpose.

But what if the drive is stolen? Someone can walk up to a disk system, snap out the hot-swappable drive, and walk off with it. Since it contains only part of the data, the thief would not have the entire copy of the data, so no reason to encrypt it, right? Wrong! Even with part of the data, people can get enough information to cause your company or customers harm, lose business, or otherwise get you in hot water. Encryption of the data at rest can help protect against unauthorized access to the data, even in the case when the data is scattered in this manner across multiple drives.

To protect against site-wide loss, such as from a natural disaster, fire, flood, earthquake and so on, you might consider having data replicated to remote locations. For example, IBM's DS8000 offers two-site and three-site mirroring. Two-site options include Metro Mirror (synchronous) and Global Mirror (asynchronous). The three-site is cascaded Metro/Global Mirror with the second site nearby (within 300km) and the third site far away. For example, you can have two copies of your data at site 1, a third copy at nearby site 2, and two more copies at site 3. Five copies of data in three locations. IBM DS8000 can send this data over from one box to another with only a single round trip (sending the data out, and getting an acknowledgment back). By comparison, EMC SRDF/S (synchronous) takes one or two trips depending on blocksize, for example blocks larger than 32KB require two trips, and EMC SRDF/A (asynchronous) always takes two trips. This is important because for many companies, disk is cheap but long-distance bandwidth is quite expensive. Having five copies in three locations could be less expensive than four copies in four locations.

Fellow blogger BarryB (EMC Storage Anarchist) felt I was unfair pointing out that their EMC Atmos GeoProtect feature only protects against "unexpected loss" and does not eliminate the need for encryption or appropriate access control lists to protect against "unauthorized access" or "unethical tampering".

(It appears I stepped too far on to ChuckH's lawn, as his Rottweiler BarryB came out barking, both in the [comments on my own blog post], as well as his latest titled [IBM dumbs down IBM marketing (again)]. Before I get another rash of comments, I want to emphasize this is a metaphor only, and that I am not accusing BarryB of having any canine DNA running through his veins, nor that Chuck Hollis has a lawn.)

As far as I know, the EMC Atmos does not support FDE disks that do this encryption for you, so you might need to find another way to encrypt the data and set up the appropriate access control lists. I agree with BarryB that "erasure codes" have been around for a while and that there is nothing unsafe about using them in this manner. All forms of RAID-5, RAID-6 and even RAID-X on the IBM XIV storage system can be considered a form of such encoding as well. As for the amount of long-distance bandwidth that Atmos GeoProtect would consume to provide this protection against loss, you might question any cost savings from this space-efficient solution. As always, you should consider both space and bandwidth costs in your total cost of ownership calculations.

Of course, if saving money is your main concern, you should consider tape, which can be ten to twenty times cheaper than disk, affording you to keep a dozen or more copies, in as many time zones, at substantially lower cost. These can be encrypted and written to WORM media for even more thorough protection.

Continuing my coverage of the annual [2010 System Storage Technical University], I participated in the storage free-for-all, which is a long-time tradition, started at SHARE User Group conference, and carried forward to other IT conferences. The free-for-all is a Q&A Panel of experts to allow anyone to ask any question. These are sometimes called "Birds of a Feather" (BOF). Last year, they were called "Meet the Experts", one for mainframe storage, and the other for storage attached to distributed systems. This year, we had two: one focused on Tivoli Storage software, and the second to cover storage hardware. This post provides a recap of the Storage Hardware free-for-all.

What can I do to improve performance on my DS8100 disk system? It is running a mix of sequential batch processing and my medical application (EPIC). I have 16GB of cache and everything is formatted as RAID-5.

We are familiar with EPIC. It does not "play well with others", so IBM recommends you consider dedicating resources for just the EPIC data. Also consider RAID-10 instead for the EPIC data.

How do I evaluate IBM storage solutions in regards to [PCI-DSS] requirements.

Well, we are not lawyers, and some aspects of the PCI-DSS requirements are outside the storage realm. In March 2010, IBM was named ["Best Security Company"] by SC Magazine, and we have secure storage solutions for both disk and tape systems. IBM DS8000 and DS5000 series offer Full Disk Encryption (FDE) disk drives. IBM LTO-4/LTO-5 and TS1120/TS1130 tape drives meet FIPS requirements for encryption. We will provide you contact information on an encryption expert to address the other parts of your PCI-DSS specific concerns.

My telco will only offer FCIP routing for long-distance disk replication, but my CIO wants to use Fibre Channel routing over CWDM, what do I do?

IBM XIV, DS8000 and DS5000 all support FC-based long distance replication across CWDM. However, if you don't have dark fiber, and your telco won't provide this option, you may need to re-negotiate your options.

My DS4800 sometimes reboots repeatedly, what should I do.

This was a known problem with microcode level 760.28, it was detecting a failed drive. You need to replace the drive, and upgrade to the latest microcode.

Should I use VMware snapshots or DS5000 FlashCopy?

VMware snapshots are not free, you need to upgrade to the appropriate level of VMware to get this function, and it would be limited to your VMware data only. The advantage of DS5000 FlashCopy is that it applies to all of your operating systems and hypervisors in use, and eliminates the consumption of VMware overhead. It provides crash-consistent copies of your data. If your DS5000 disk system is dedicated to VMware, then you may want to compare costs versus trade-offs.

Any truth to the rumor that Fibre Channel protocol will be replaced by SAS?

SAS has some definite cost advantages, but is limited to 8 meters in length. Therefore, you will see more and more usage of SAS within storage devices, but outside the box, there will continue to be Fibre Channel, including FCP, FICON and FCoE. The Fibre Channel Industry Alliance [FCIA] has a healthy roadmap for 16 Gbps support and 20 Gbps interswitch link (ISL) connections.

What about Fibre Channel drives, are these going away?

We need to differentiate the connector from the drive itself. Manufacturers are able to produce 10K and 15K RPM drives with SAS instead of FC connectors. While many have suggested that a "Flash-and-Stash" approach of SSD+SATA would eliminate the need for high-speed drives, IBM predicts that there just won't be enough SSD produced to meet the performance needs of our clients over the next five years, so 15K RPM drives, more likely with SAS instead of FC connectors, will continue to be deployed for the next five years.

We'd like more advanced hands-on labs, and to have the certification exams be more product-specific rather than exams for midrange disk or enterprise disk that are too wide-ranging.

Ok, we will take that feedback to the conference organizers.

IBM Tivoli Storage Manager is focused on disaster recovery from tape, how do I incorporate remote disk replication.

This is IBM's Unified Recovery Management, based on the seven tiers of disaster recovery established in 1983 at GUIDE conference. You can combine local recovery with FastBack, data center server recovery with TSM and FlashCopy manager, and combine that with IBM Tivoli Storage Productivity Center for Replication (TPC-R), GDOC and GDPS to manage disk replication across business continuity/disaster recovery (BC/DR) locations.

IBM Tivoli Storage Productivity Center for Replication only manages the LUNs, what about server failover and mapping the new servers to the replicated LUNs?

There are seven tiers of disaster recovery. The sixth tier is to manage the storage replication only, as TPC-R does. The seventh tier adds full server and network failover. For that you need something like IBM GDPS or GDOC that adds this capability.

All of my other vendor kit has bold advertising, prominent lettering, neon lights, bright colors, but our IBM kit is just black, often not even identifying the specific make or model, just "IBM" or "IBM System Storage".

IBM has opted for simplified packaging and our sleek, signature "raven black" color, and pass these savings on to you.

Bring back the SHARK fins!

We will bring that feedback to our development team. ("Shark" was the codename for IBM's ESS 800 disk model. Fiberglass "fins" were made as promotional items and placed on top of ESS 800 disk systems to help "identify them" on the data center floor. Unfortunately, professional golfer [<a href="http://www.shark.com/">Greg Norman</a>] complained, so IBM discontinued the use of the codename back in 2005.)

Where is Infiniband?

Like SAS, Infiniband had limited distance, about 10 to 15 meters, which proved unusable for server-to-storage network connections across data center floorspace. However, there are now 150 meter optical cables available, and you will find Infiniband used in server-to-server communications and inside storage systems. IBM SONAS uses Infiniband today internally. IBM DCS9900 offers Infiniband host-attachment for HPC customers.

We need midrange storage for our mainframe please?

In addition to the IBM System Storage DS8000 series, the IBM SAN Volume Controller and IBM XIV are able to connect to Linux on System z mainframes.

We need "Do's and Don'ts" on which software to run with which hardware.

IBM [Redbooks] are a good source for that, and we prioritize our efforts based on all those cards and letters you send the IBM Redbooks team.

The new TPC v4 reporting tool requires a bit of a learning curve.

The new reporting tool, based on Eclipse's Business Intelligence Reporting Tool [BIRT], is now standardized across the most of the Tivoli portfolio. Check out the [Tivoli Common Reporting] community page for assistance.

An unfortunate side-effect of using server virtualization like VMware is that it worsens management and backup issues. We now have many guests on each blade server.

IBM is the leading reseller of VMware, and understands that VMware adds an added layer of complexity. Thankfully, IBM Tivoli Storage Manager backups uses a lightweight agent. IBM [System Director VMcontrol] can help you manage a variety of hypervisor environments.

This was a great interactive session. I am glad everyone stayed late Thursday evening to participate in this discussion.

Centralized and simplified encryption key management through Tivoli Key Lifecycle Manager's lifecycle of creation, storage, rotation, and protection of encryption keys and key serving through industry standards. TKLM is available to manage the encryption keys for LTO-4, LTO-5, TS1120 and TS1130 tape drives enabled for encryption, as well as DS8000 and DS5000 disk systems using Full Disk Encryption (FDE) disk drives.

Partitioning of Access Control for Multitenancy

Access control and partitioning of the key serving functions, including end-to-end authentication of encryption clients and security of exchange of encryption keys, such that groups of devices have different sets of encryption keys with different administrators. This enables [multitenancy] or multilayer security of a shared infrastructure using encryption as an enforcement mechanism for access control. As Information Technology shifts from on-premises to the cloud, multitenancy will become growingly more important.

Support for KMIP 1.0 Standard

Support for the new key management standard, Key Management Interoperability Protocol (KMIP), released through the Organization for the Advancement of Structured Information Standards [OASIS]. This new standard enables encryption key management for a wide variety of devices and endpoints. See the
[22-page KMIP whitepaper] for more information.

As much as I like to poke fun at Oracle, with hundreds of their Sun/StorageTek clients switching over to IBM tape solutions every quarter, I have to give them kudos for working cooperatively with IBM to come up with this KMIP standard that we can both support.

Support for non-IBM devices from Emulex, Brocade and LSI

Support for IBM self-encrypting storage offerings as well as suppliers of IT components which support KMIP, including a number of supported non-IBM devices announced by business partners such as Emulex, Brocade, and LSI. KMIP support permits you to deploy Tivoli Key Lifecycle Manager without having to worry about being locked into a proprietary key management solution. If you are a client with multiple "Encryption Key Management" software packages, now is a good time to consolidate onto IBM TKLM.

Role-based Authorization

Role-based access control for administrators that allows multiple administrators with different roles and permissions to be defined, helping increase the security of sensitive key management operations and better separation of duties. For example, that new-hire college kid might get a read-only authorization level, so that he can generate reports, and pack the right tapes into cardboard boxes. Meanwhile, for that storage admin who has been running the tape operations for the past ten years, she might get full access. The advantage of role-based authorization is that for large organizations, you can assign people to their appropriate roles, and you can designate primary and secondary roles in case one has to provide backup while the other is out of town, for example.

(FTC Disclosure: I do not work or have any financial investments in ENC Security Systems. ENC Security Systems did not paid me to mention them on this blog. Their mention in this blog is not an endorsement of either their company or any of their products. Information about EncryptStick was based solely on publicly available information and my own personal experiences. My friends at ENC Security Systems provided me a full-version pre-loaded stick for this review.)

The EncryptStick software comes in two flavors, a free/trial version, and the full/paid version. The free trial version has [limits on capacity and time] but provides enough glimpse of the product to decide before you buy the full version. You can download the software yourself and put in on your own USB device, or purchase the pre-loaded stick that comes with the full-version license.

Whichever you choose, the EncryptStick offers three nice protection features:

Encryption for data organized in "storage vaults", which can be either on the stick itself, or on any other machine the stick is connected to. That is a nice feature, because you are not limited to the capacity of the USB stick.

Encrypted password list for all your websites and programs.

A secure browser, that prevents any key-logging or malware that might be on the host Windows machine.

I have tried out all three functions and everything works as advertised. However, there is always room for improvement, so here are my suggestions.

Plausible Deniability

The first problem is that the pre-loaded stick looks like it is worth a million dollars. It is in a shiny bronze color with "EncryptStick" emblazoned on it. This is NOT subtle advertising! This 8GB capacity stick looks like it would be worth stealing solely on being a nice piece of jewelry, and then the added bonus that there might be "valuable secrets" just makes that possibility even more likely.

If you want to keep your information secure, it would help to have "plausible deniability" that there is nothing of value on a stick. Either have some corporate logo on it, of have the stick look like a cute animal, like these pig or chicken USB sticks.

It reminds me how the first Apple iPod's were in bright [Mug-me White]. I use black headphones with my black iPod to avoid this problem.

Of course, you can always install the downloadable version of EncryptStick software onto a less conspicuous stick if you are concerned about theft. The full/paid version of EncryptStick offers an option for "lost key recovery" which would allow you to backup the contents of the stick and be able to retrieve them on a newly purchased stick in the event your first one is lost or stolen.

.

The Cap

Imagine how "unlucky" I felt when I notice that I had lost my "rabbits feet" on this cute animal-themed USB stick.

I sense trouble for losing the cap on my EncryptStick as well. This might seem trivial, but is a pet-peeve of mine that USB sticks should plan for this. Not only is there nothing to keep the cap on (it slides on and off quite smoothly), but there is no loop to attach the cap to anything if you wanted to.

Since then, I got smart and try to look for ways to keep the cap connected. Some designs, like this IBM-logoed stick shown above, just rotate around an axle, giving you access when you need it, and protection when it is folded closed.

Alternatively, get a little chain that allows you to attach the cap to the main stick. In the case of the pig and chicken, the memory section had a hole pre-drilled and a chain to put through it. I drilled an extra hole in the cap section of each USB stick, and connected the chain through both pieces.

(Warning: Kids, be sure to ask for assistance from your parents before using any power tools on small plastic objects.)

.

Multi-OS Support

The EncryptStick can run on either Microsoft Windows or Mac OS. The instructions indicate that you can install both versions of download software onto a single stick, so why not do that for the pre-loaded full version? The stick I have had only the Windows version pre-loaded. I don't know if the Windows and Mac OS versions can unlock the same "storage vaults" on the stick.

Certainly, I have been to many companies where either everyone runs Windows or everyone runs Mac OS. If the primary target audience is to use this stick at work in one of those places, then no changes are required. However, at IBM, we have employees using Windows, Mac OS and Linux. In my case, I have all three! Ideally, I would like a version of EncryptStick that I could take on trips with me that would allow me to use it regardless of the Operating System I encountered.

Since there isn't a Linux-version of EncryptStick software, I decided to modify my stick to support booting Linux. I am finding more and more Linux kiosks when I travel, especially at airports and high-traffic locations, so having a stick that works both in Windows or Linux would be useful. Here are some suggestions if you want to try this at home:

Use fdisk to change the FAT32 partition type from "b" to "c". Apparently, Grub2 requires type "c", but the pre-loaded EncryptStick was set to "b". The Windows version of EncryptStick> seems to work fine in either mode, so this is a harmless change.

Install Grub2 with "grub-install" from a working Linux system.

Once Grub2 is installed, you can boot ISO images of various Linux Rescue CDs, like [PartedMagic] which includes the open-source [TrueCrypt] encryption software that you could use for Linux purposes.

This USB stick could also be used to help repair a damaged or compromised Windows system. Consider installing [Ophcrack] or [Avira].

Certainly, 8GB is big enough to run a full Linux distribution. The latest 32-bit version of [Ubuntu] could run on any 32-bit or 64-bit Intel or AMD x86 machine, and have enough room to store an [encrypted home directory].

Since the stick is formatted FAT32, you should be able to run your original Windows or Mac OS version of EncryptStick with these changes.

Depending on where you are, you may not have the luxury to reboot a system from the USB memory stick. Certainly, this may require changes to the boot sequence in the BIOS and/or hitting the right keys at the right time during the boot sequence. I have been to some "Internet Cafes" that frown on this, or have blocked this altogether, forcing you to boot only from the hard drive.

Well, those are my suggestions. Whether you go on a trip with or without your laptop, it can't hurt to take this EncryptStick along. If you get a virus on your laptop, or have your laptop stolen, then it could be handy to have around. If you don't bring your laptop, you can use this at Internet cafes, hotel business centers, libraries, or other places where public computers are available.

An exciting new addition to the IBM storage line, the Storwize V7000 is a very versatile and solid choice as a midrange storage device. This session will cover a technical overview of the controller as well as its positioning within the overall IBM storage line.

xST04 - XIV Implementation, Migration and Optimization

Attend this session to learn how to integrate the IBM XIV Storage System in your IT environment. After this session, you should understand where the IBM XIV Storage system fits, and understand how to take full advantage of the performance capabilities of XIV Storage by using the massive parallelism of its grid architecture. You will learn how to migrate data onto the XIV and hear about real world client experiences.

xST05 - IBM's Storage Strategy in the Smarter Computing Era

Want to understand IBM's storage strategy better? This session will cover the three key themes of IBM's Smarter Computing initiative: Big Data, Optimized Systems, and Cloud. IBM System Storage strategy has been aligned to meet the storage efficiency, data protection and retention required to meet these challenges.

xST06 - Understanding IBM's Storage Encryption Options

IBM offers encryption in a variety of ways. Data can be encrypted on the server, in the SAN switch, or on the disk or tape drive. This session will explain how encryption works, and explain the pros and cons with each encryption option.

IBM has focused on data protection and retention, and the IBM Information Archive is the ideal product to achieve it. Come to this session to discuss archive solutions, compliance regulations, and support for full-text indexing and eDiscovery to support litigation.

sGE04 - IBM's Storage Strategy in the Smarter Computing Era

Want to understand IBM's storage strategy better? This session will cover the three key themes of IBM's Smarter Computing initiative: Big Data, Optimized Systems, and Cloud. IBM System Storage strategy has been aligned to meet the storage efficiency, data protection and retention required to meet these challenges.

sSM03 - IBM Tivoli Storage Productivity Center – Overview and Update

IBM's latest release of IBM Tivoli Storage Productivity Center is v4.2.2, a storage resource management tool that manages both IBM and non-IBM storage devices, including disk systems, tape libraries, and SAN switches. This session will give an overview of the various components of Tivoli Storage Productivity Center and provide an update on what's new in this product.

sSN06 - SONAS and the Smart Business Storage Cloud (SBSC)

Confused over IBM's Cloud strategy? Trying to figure out how IBM Storage plays in private, hybrid or public cloud offerings? This session will cover both the SONAS integrated appliance and the Smart Business Storage Cloud customized solution, and will review available storage services on the IBM Cloud.

sTA01 - Tape Storage Reinvented: What's New and Exciting in the Tape World?

This very informative session will keep you up to date with the latest tape developments. These include the TS3500 tape library connector Model SC1 (Shuttle). The shuttle enables extreme scalability of over 300,000 tape cartridges in a single library image by interconnecting multiple tape libraries with a unique, high speed transport system. The world's fastest tape drive, the TS1140 3592-E07, will also be presented. The performance and functionality of the new TS1140 as well as the new 4TB tape media will be discussed. Also, the IBM System Storage Linear Tape File System (LTFS), including the Library Edition, will be presented. LTFS allows a disk-like, drag-and-drop interface for tape. This is a not-to-be-missed session for all you tape lovers out there!

In December, I will be going to Gartner's Data Center Conference in Las Vegas, but the agenda has not been finalized, so I will save that for another post.

With my colleague, Mike Griese, presenting TPC 5.1 and the IBM SmartCloud Virtual Storage Center earlier this week, you might wonder what is left to say. Mike's session was intended more for clients who already have TPC deployed, but my session is more of an introductory session.

I was the original architect of the product back in 2000-2003, so have some insight into the history, motivations and design principles applied to each version of the product. It has evolved nicely over the years, and while I am no longer working full-time on the product, I am still very much involved, and am consulted by the current architects and product managers for direction and opinion going forward.

I presented an overview of the overall product as it stands today in its current v4.2.2 version, and gave a few highlights of what to expect in the upcoming TPC 5.1 announced this week.

Encryption and Key Management in the Cloud: The Top 6 Concerns to Ensure a Secure and Reliable Solution

This was a split session with two speakers. The first speaker was Richard Moulds, VP of Strategy and Marketing from Thales, and the second speaker was Gordon Arnold, IBM Senior Technical Staff Member (STSM) and Software Architect for Tivoli Security Management.

Richard presented security issues in the cloud. He is an author of several books, including "Key Management for Dummies" and "Data Protection and PCI Compliance for Dummies". Thales is a large French companay of 70,000 people nobody in the USA has heard of, but is a major company in the area of IT Security. He presented survey results about people's perceptions and attitudes towards encryption and security issues in the cloud.

The security threats in the Cloud were presented as the "Seven Deadly Sins":

Data loss and leakage, including data that is not deleted with resources are re-used for other purposes

Shared technologies, especially in Cloud environments that do not have robust multi-tenancy

Malicious insiders, such as administrators being bribed to provide access to sensitive data

Account or service hijacking, including those that pretend to be someone else, asking for password resets

Insecure APIs for applications and services, many of these APIs were developed quickly, recently, and perhaps without the robust review from a security perspective

Abuse of the Cloud, such as using the Cloud itself to crack passwords or break decryption passwords through parallel processing

IBM's use of Key Encrypting Keys on disk and tape has proven to be quite useful. The only copy of the encryption key is on the media, and is then encrypted by an authorization key. If you need to defensibly delete the data for compliance reasons, you can simply destroy the encrption key.

At lunch, I spoke with Scott Laningham who was doing video interviews. For years, Scott was the #1 blogger on IBM developerWorks until I took over the title last year. We discussed working on a video in the future on this.

For the past three decades, IBM has offered security solutions to protect against unauthorized access. Let's take a look at three different approaches available today for the encryption of data.

Approach 1: Server-based

Server-based encryption has been around for a while. This can be implemented in the operating system itself, such as z/OS on the System z mainframe platform, or with an applicaiton, such as IBM Tivoli Storage Manager for backup and archive.

While this has the advantage that you can selectively encrypt individual files, data sets, or columns in databases, it has several drawbacks. First, you consume server resources to perform the encryption. Secondly, as I mention in the video above, if you only encrypt selected data, the data you forget to, or choose not to, encrypt may result in data exposure. Third, you have to manage your encryption keys on a server-by-server basis. Fourth, you need encryption capability in the operating system or application. And fifth, encrypting the data first will undermine any storage or network compression capability down-line.

Approach 2: Network-based

Network-based solutions perform the encryption between the server and the storage device. Last year, when I was in Auckland, New Zealand, I covered the IBM SAN32B-E4 switch in my presentation [Understanding IBM's Storage Encryption Options]. This switch receives data from the server, encrypts it, and sends it on down to the storage device.

This has several advantages over the server-based approach. First, we offload the server resources to the switch. Second, you can encrypt all the files on the volume. You can select which volumes get encrypted, so there is still the risk that you encrypt only some volumes, and not others, and accidently expose your data. Third, the SAN32B-E4 can centralized the encryption key management to the IBM Tivoli Key Lifecycle Manager (TKLM). This is also operating system and application agnostic. However, network-based encryption has the same problem of undermining any storage device compression capability, and often has a limit on the amount of data bandwidth it can process. The SAN32B-E4 can handle 48 GB/sec, with a turbo-mode option to double this to 96 GB/sec.

Approach 3: Device-based

Device-based solutions perform the encryption at the storage device itself. Back in 2006, IBM was the first to introduce this method on its [TS1120 tape drive]. Later, it was offered on Linear Tape Open (LTO-4) drives. IBM was also first to introduce Full Disk Encryption (FDE) on its IBM System Storage DS8000. See my blog post [1Q09 Disk Announcements] for details.

As with the network-based approach, the device-based method offloads server resources, allows you to encrypt all the files on each volume, can centrally manage all of your keys with TKLM, and is agnostic to operating system and application used. The device can compress the data first, then encrypt, resulting in fewer tape cartridges or less disk capacity consumed. IBM's device-based approach scales nicely. IBM has an encryption chip is placed in each tape drive or disk drive. No matter how many drives you have, you will have all the encryption horsepower you need to scale up.

Not all device-based solutions use an encryption chip per drive. Some of our competitors encrypt in the controller instead, which operates much like the network-based approach. As more and more disk drives are added to your storage system, the controller may get overwhelmed to perform the encryption.

The need for security grows every year. Enterprise Systems are Security-ready to protect your most mission critical application data.

Earlier this year, IBM mandated that every employee provided a laptop had to implement Full-Disk Encryption for their primary hard drive, and any other drive, internal or external, that contained sensitive information. An exception was granted to anyone who NEVER took their laptop out of the IBM building. At IBM Tucson, we have five buildings, so if you are in the habit of taking your laptop from one building to another, then encryption is required!

The need to secure the information on your laptop has existed ever since laptops were given to employees. In my blog post [Biggest Mistakes of 2006], I wrote the following:

"Laptops made the news this year in a variety of ways. #1 was exploding batteries, and #6 were the stolen laptops that exposed private personal information. Someone I know was listed in one of these stolen databases, so this last one hits close to home. Security is becoming a bigger issue now, and IBM was the first to deliver device-based encryption with the TS1120 enterprise tape drive."

Not surprisingly, IBM laptops are tracked and monitored. In my blog post [Using ILM to Save Trees], I wrote the following:

"Some assets might be declared a 'necessary evil' like laptops, but are tracked to the n'th degree to ensure they are not lost, stolen or taken out of the building. Other assets are declared "strategically important" but are readily discarded, or at least allowed to [walk out the door each evening]."

For those of us who may need access to both Operating Systems, we have to choose. Select one as the primary OS, and run the other as a guest virtual machine. I opted for Red Hat Enterprise Linux 6 as my primary, with LUKS encryption, and Linux KVM to run Windows as the guest.

I am not alone. While I chose the Linux method voluntarily, IBM has decided that 70,000 employees must also set up their systems this way, switching them from Windows to Linux by year end, but allowing them to run Windows as a KVM guest image if needed.

Let's take a look at the pros and cons:

Pros

Cons

LUKS allows for up to 8 passphrases, so you can give one to your boss, one to your admin assistant, and in the event they leave the company, you can disable their passphrase without impacting anyone else or having to memorize a new one. PGP on Windows supports only a single passphrase.

Linux is a rock-solid operating system. I found that Windows as a KVM guest runs better than running it natively in a dual-boot configuration.

Linux is more secure against viruses. Most viruses run only on Windows operating systems. The Windows guest is well isolated from the Linux operating system files. Recovering from an infected or corrupted Windows guest is merely re-cloning a new "raw" image file.

Linux has a vibrant community of support. I am very impressed that anytime I need help, I can find answers or assistance quickly from other Linux users. Linux is also supported by our help desk, although in my experience, not as well as the community offers.

Employees that work with multiple clients can have a separate Windows guest for each one, preventing any cross-contamination between systems.

Linux is different from Windows, and some learning curve may be required. Not everyone is happy with this change.

(I often joke that the only people who are comfortable with change are babies with soiled diapers and prisoners on death row!)

Implementation is a full re-install of Linux, followed by a fresh install of Windows.

Not all software required for our jobs at IBM runs on Linux, so a Windows guest VM is a necessity. If you thought Windows ran slowly on a fully-encrypted disk, imagine how much slower it runs as a VM guest with limited memory resources.

In theory, I could have tried the Windows/PGP method for a few weeks, then gone through the entire process to switch over to Linux/LUKS, and then draw my comparisons that way. Instead, I just chose the Linux/LUKS method, and am happy with my decision.

In my last blog post [Full Disk Encryption for Your Laptop] explained my decisions relating to Full-Disk Encryption (FDE) for my laptop. Wrapping up my week's theme of Full-Disk Encryption, I thought I would explain the steps involved to make it happen.

Last April, I switched from running Windows and Linux dual-boot, to one with Linux running as the primary operating system, and Windows running as a Linux KVM guest. I have Full Disk Encryption (FDE) implemented using Linux Unified Key Setup (LUKS).

Here were the steps involved for encrypting my Thinkpad T410:

Step 0: Backup my System

Long-time readers know how I feel about taking backups. In my blog post [Separating Programs from Data], I emphasized this by calling it "Step 0". I backed up my system three ways:

Backed up all of my documents and home user directory with IBM Tivoli Storage Manager.

Backed up all of my files, including programs, bookmarks and operating settings, to an external disk drive (I used rsync for this). If you have a lot of bookmarks on your browser, there are ways to dump these out to a file to load them back in the later step.

Clonezilla allows me to do a "Bare Machine Recovery" of my laptop back to its original dual-boot state in less than an hour, in case I need to start all over again.

Step 1: Re-Partition the Drive

"Full Disk Encryption" is a slight misnomer. For external drives, like the Maxtor BlackArmor from Seagate (Thank you Allen!), there is a small unencrypted portion that contains the encryption/decryption software to access the rest of the drive. Internal boot drives for laptops work the same way. I created two partitions:

A small unencrypted partition (2 GB) to hold the Master Boot Record [MBR], Grand Unified Bootlloader [GRUB], and the /boot directory. Even though there is no sensitive information on this partition, it is still protected the "old way" with the hard-drive password in the BIOS.

The rest of the drive (318GB) will be one big encrypted Logical Volume Manager [LVM] container, often referred to as a "Physical Volume" in LVM terminology.

Having one big encrypted partition means I only have to enter my ridiculously-long encryption password once during boot-up.

Step 2: Create Logical Volumes in the LVM container

I create three logical volumes on the encrypted physical container: swap, slash (/) directory, and home (/home). Some might question the logic behind putting swap space on an encrypted container. In theory, swap could contain sensitive information after a system [hybernation]. I separated /home from slash(/) so that in the event I completely fill up my home directory, I can still boot up my system.

Step 3: Install Linux

Ideally, I would have lifted my Linux partition "as is" for the primary OS, and a Physical-to-Virtual [P2V] conversion of my Windows image for the guest VM. Ha! To get the encryption, it was a lot simpler to just install Linux from scratch, so I did that.

Step 4: Install Windows guest KVM image

The folks in our "Open Client for Linux" team made this step super-easy. Select Windows XP or Windows 7, and press the "Install" button. This is a fresh install of the Windows operating system onto a 30GB "raw" image file.

(Note: Since my Thinkpad T410 is Intel-based, I had to turn on the 'Intel (R) Virtualization Technology' option in the BIOS!)

There are only a few programs that I need to run on Windows, so I installed them here in this step.

Step 5: Set up File Sharing between Linux and Windows

In my dual-boot set up, I had a separate "D:" drive that I could access from either Windows or Linux, so that I would only have to store each file once. For this new configuration, all of my files will be in my home directory on Linux, and then shared to the Windows guest via CIFS protocol using [samba].

In theory, I can share any of my Linux directories using this approach, but I decide to only share my home directory. This way, any Windows viruses will not be able to touch my Linux operating system kernels, programs or settings. This makes for a more secure platform.

Step 6: Transfer all of my files back

Here I used the external drive from "Step 0" to bring my data back to my home directory. This was a good time to re-organize my directory folders and do some [Spring cleaning].

Step 7: Re-establish my backup routine

Previously in my dual-boot configuration, I was using the TSM backup/archive client on the Windows partition to backup my C: and D: drives. Occasionally I would tar a few of my Linux directories and storage the tarball on D: so that it got included in the backup process. With my new Linux-based system, I switched over to the Linux version of TSM client. I had to re-work the include/exclude list, as the files are different on Linux than Windows.

One of my problems with the dual-boot configuration was that I had to manually boot up in Windows to do the TSM backup, which was disruptive if I was using Linux. With this new scheme, I am always running Linux, and so can run the TSM client any time, 24x7. I made this even better by automatically scheduling the backup every Monday and Thursday at lunch time.

There is no Linux support for my Maxtor BlackArmor external USB drive, but it is simple enough to LUKS-encrypt any regular external USB drive, and rsync files over. In fact, I have a fully running (and encrypted) version of my Linux system that I can boot directly from a 32GB USB memory stick. It has everyting I need except Windows (the "raw" image file didn't fit.)

I can still use Clonezilla to make a "Bare Machine Recovery" version to restore from. However, with the LVM container encrypted, this renders the compression capability worthless, and so takes a lot longer and consumes over 300GB of space on my external disk drive.

Backing up my Windows guest VM is just a matter of copying the "raw" image file to another file for safe keeping. I do this monthly, and keep two previous generations in case I get hit with viruses or "Patch Tuesday" destroys my working Windows image. Each is 30GB in size, so it was a trade-off between the number of versions and the amount of space on my hard drive. TSM backup puts these onto a system far away, for added protection.

Step 8: Protect your Encryption setup

In addition to backing up your data, there are a few extra things to do for added protection:

Add a second passphrase. The first one is the ridiculously-long one you memorize faithfully to boot the system every morning. The second one is a ridiculously-longer one that you give to your boss or admin assistant in case you get hit by a bus. In the event that your boss or admin assistant leaves the company, you can easily disable this second passprhase without affecting your original.

Backup the crypt-header. This is the small section in front that contains your passphrases, so if it gets corrupted, you would not be able to access the rest of your data. Create a backup image file and store it on an encrypted USB memory stick or external drive.

If you are one of the lucky 70,000 IBM employees switching from Windows to Linux this year, Welcome!