Panasas Rolls Out ActiveStor 18
https://www.hpcwire.com/2015/07/09/panasas-rolls-out-activestor-18/
Thu, 09 Jul 2015 12:51:06 +0000

Hybrid scale-out NAS specialist Panasas today introduced ActiveStor 18, the latest in its ActiveStor appliance line. According to the company, the new appliance features a 33 percent increase in density, is available in 4TB and 8TB hard drive configurations, and delivers 20PB and 200GB/s scalability. The product will ship in September.

“We’re now using Western Digital HGST He8 technology – a seven-platter, helium-filled design. This is also our first generation of ActiveStor with the file system support necessary to leverage 4KB native sector support,” said Geoffrey Noer, vice president of product management at Panasas. The amount of cache per storage blade was also doubled.

Panasas has mainly served HPC workflows, with strength in traditional technical computing markets such as oil and gas, government, and manufacturing. The new release continues that focus but also supports the company's new thrust into media and entertainment, which Panasas says is a market in transition and increasingly turning to HPC technologies. Company literature positions ActiveStor 18 as well suited for mixed workloads, combining large-file throughput and IOPS performance with a low cost per TB.

Of note, Panasas is one of the few storage vendors sticking with its proprietary storage operating system, PanFS, which the company considers an important differentiator. By closely integrating PanFS with the hardware, "you substantially improve reliability, manageability, and performance. PanFS provides a single global namespace for simplified storage management," said Noer. That said, three protocols are supported: Panasas DirectFlow, NFS (Linux), and SMB (Windows).

The most important differentiator may be the Panasas two-blade hybrid architecture and DirectFlow protocol. “We have a storage blade and a director blade. The storage blade scales the amount of storage capacity and throughput performance; the director blades are responsible for metadata, small files, and transactional type of performance. Both of those resources scale as you scale the system,” said Noer. Separating the two functions (main storage and metadata handling) reduces the need to frequently interrupt the storage blade and boosts throughput performance.
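
That split is easiest to picture as a two-step read path: a client asks a director blade only for a file's layout (metadata), then pulls the data stripes straight from the storage blades in parallel. The sketch below is a minimal, hypothetical illustration of that idea; the class and method names are invented for clarity and do not correspond to the actual DirectFlow protocol or PanFS internals.

```python
from concurrent.futures import ThreadPoolExecutor

class DirectorBlade:
    """Hypothetical metadata service: knows which storage blade holds each stripe."""
    def __init__(self, layout):
        self.layout = layout  # filename -> list of (blade, stripe_id)

    def get_layout(self, filename):
        # Only metadata is returned; no file data flows through the director.
        return self.layout[filename]

class StorageBlade:
    """Hypothetical object store holding data stripes."""
    def __init__(self, stripes):
        self.stripes = stripes  # stripe_id -> bytes

    def read_stripe(self, stripe_id):
        return self.stripes[stripe_id]

def parallel_read(director, filename):
    """Ask the director for the layout, then read stripes from blades in parallel."""
    layout = director.get_layout(filename)
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda bs: bs[0].read_stripe(bs[1]), layout)
    return b"".join(parts)

# Toy example: one file striped across two storage blades.
blade_a = StorageBlade({0: b"hello, "})
blade_b = StorageBlade({1: b"world"})
director = DirectorBlade({"results.dat": [(blade_a, 0), (blade_b, 1)]})
print(parallel_read(director, "results.dat"))  # b'hello, world'
```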

In typical HPC workflows, said Noer, 60 to 80 percent of all files by count are smaller than 64 KB, yet all of those small files together consume less than one percent of the file system's capacity.

Also included on the appliance is PanFS RAID 6+, first introduced last summer by Panasas. PanFS RAID6+ is an intelligent per-file distributed RAID architecture implemented with erasure codes in software instead of traditional hardware RAID controllers. “Rebuilding drives was starting to be a problem even at 1TB or 2TB hard drives. Now with 6TB and 8TB drives you can be looking at upwards of a week or certainly days to do a rebuild in that sort of an environment with hardware RAID.”

“If you approach things very differently [as in RAID 6+] and protect the files instead of the blocks of hard drives you can limit the rebuilds to only the work that absolutely has to be done. So if we have a clump of bad sectors for example we only have to rebuild the files that touched the bad sectors, we don’t have to rebuild the entire storage,” said Noer. “If the whole hard drive actually dies, again we don’t have to rebuild the unused capacity, we only rebuild the files that were affected by that drive and avoiding rebuilds means not having to bring the performance of the systems down unnecessarily.”
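
The contrast between whole-drive and per-file rebuilds can be sketched in a few lines. In the hypothetical model below, the system tracks which sectors each file occupies; when a clump of sectors goes bad, only the files that intersect that clump are queued for reconstruction, rather than every block on the drive. The data structures and function names are illustrative only, not the actual PanFS implementation.

```python
# Hypothetical map of file -> set of sectors it occupies on a given drive.
file_sectors = {
    "sim_output.h5": {100, 101, 102, 103},
    "mesh.vtk":      {200, 201},
    "notes.txt":     {300},
}

def files_to_rebuild(bad_sectors):
    """Per-file RAID: rebuild only files that touch the bad sectors."""
    return [f for f, sectors in file_sectors.items() if sectors & bad_sectors]

def blocks_to_rebuild_block_raid(drive_sector_count):
    """Traditional block RAID: every sector on the failed drive must be reconstructed."""
    return drive_sector_count

bad = {101, 102}
print(files_to_rebuild(bad))                # ['sim_output.h5'] -- one file, not the whole drive
print(blocks_to_rebuild_block_raid(10**9))  # all sectors, regardless of what is actually stored
```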

Interest in RAID 6+ is growing according to a recent survey of 90 Panasas ActiveStor users in which 62 percent reported RAID 6+ will be important and 16 percent characterized it as critical. “About the only place RAID 6+ is not so valuable is in a very pure HPC scratch environment where the data is temporary and can easily be regenerated,” Noer said.

Though at home in HPC, Panasas has begun looking toward broader enterprise opportunities, and media and entertainment is the first. "That segment has already gone from standard resolution to HD, and now a lot of filming is being done in 4K resolution and media is being distributed in more formats. There's mounting pressure on data storage, and they are starting to consider scale-out solutions over older SAN technology," said Noer.

Panasas Boosts ActiveStor with Fat Drives, RAID 6+
https://www.hpcwire.com/2014/06/12/panasas-boosts-activestor-fat-drives-raid-6/
Thu, 12 Jun 2014 15:20:16 +0000

Disk drives keep getting fatter, and the time to rebuild the data on them keeps taking longer and longer. And so the engineers at disk array maker Panasas have been working for the past several years to rework RAID data protection so it is more suitable to large scale parallel file systems. The culmination of this work is the new RAID 6+ feature of the PanFS 6.0 parallel file system, which launched this week concurrent with an update to the ActiveStor arrays that will see much more capacious disk drives added to the machines.

Garth Gibson, chief scientist at Panasas, was one of the co-authors of the original RAID research paper that came out of the University of California at Berkeley, suggesting that with the right data protection algorithms, an array of relatively cheap disk drives could be made more resilient than the very expensive, larger disk subsystems commonly attached to high-end systems. The RAID paper is a quintessential example of using a parallel architecture to boost throughput and increase capacity while lowering cost and power consumption compared to monolithic systems. (In this case, 14-inch IBM 3380 mainframe disk drives versus an array of 3.5-inch SCSI disks from Conner Peripherals.) It is fitting that Gibson, who is still very much active at Panasas, has taken another swipe at RAID data protection and has come up with a new RAID 6+ triple parity protection scheme for PanFS 6.0. The update also includes what Panasas calls per-file distributed RAID, which, as the name suggests, protects data at the file level and does not require a whole disk drive to be rebuilt when there is a failure in a RAID set.

With RAID data protection having been around for a long time, and with many variations that dice and slice parity data (used to recover lost disks) in different ways, it might seem strange that Panasas is talking about improving RAID algorithms. But in many cases, RAID controllers are bottlenecks in array performance, or RAID protection is not used at all and more brute-force data replication methods are used instead. Panasas would contend that this is wasteful, and it has invested in making RAID data protection better as it scales across larger and larger arrays.

"We know that unstructured data growth is driving requirements for next-generation storage arrays in the enterprise and in HPC," explains Faye Pairman, president and chief executive officer at Panasas, citing statistics that data growth is expected to increase by 800 percent over the next five years and that "four-fifths of that data is going to be unstructured."

"We think the explosion of data drives a different view on availability and reliability," notes Pairman. "HPC always leads the way, and there is almost an insatiable desire for more processing, and that always drives storage attachment rates. Whether it is traditional HPC or scale-out enterprise with unstructured data, we think that the size of the deployments and the size of the disk drives used today really dictate a need for a different approach to scalability and availability."

The ActiveStor arrays differ from many network-attached storage arrays in that the architecture of the hardware and the software is such that there are no filers or traffic managers in the datapath between the systems requesting data and the storage blades that are the building blocks of the ActiveStor machines. The file system is parallel and the data paths are parallel, so a storage blade in the ActiveStor array can pass data directly to a cluster node; there is no bottleneck.

The problem with traditional RAID arrays (whether they are based on disk or flash drives or a mix of the two) is that reliability worsens linearly as you scale up the array. The more devices you have, the higher the probability of a failure at any given time. Also, on RAID arrays, if you lose single sectors on a disk drive, you have to rebuild an entire drive. With RAID 5 and RAID 6 protection, the parity data that is used to rebuild missing files using the RAID algorithm is spread across multiple drives and is used to recreate the data when a disk crashes (you basically run the algorithm that spread a file across the drives backwards, adding in the parity data to calculate the missing bits). This is all well and good until you have 4 TB or 6 TB disk drives, which take forever to rebuild, and it is even less practical when you have hundreds to thousands of such fat disks in an array. At any given time, a disk is failing and recovering, and this impacts performance for a RAID group. In some arrays, losing a RAID group means the whole file system is down, and in a worst-case scenario, it can take weeks to restore an entire file system. While the file system is down, the system is down, even if only one file is actually corrupted.
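
The claim that reliability worsens as the array grows follows directly from basic probability: if each drive has some annual failure rate, the chance that at least one drive in the array fails in a given year climbs quickly with drive count. The snippet below works through that arithmetic with an assumed 2 percent annual failure rate, a figure chosen purely for illustration.

```python
# Probability that at least one drive fails in a year, for an assumed
# per-drive annual failure rate (AFR). The 2% AFR is illustrative only.
afr = 0.02

for n_drives in (10, 100, 1000, 2000):
    p_any_failure = 1 - (1 - afr) ** n_drives
    print(f"{n_drives:5d} drives: P(at least one failure/year) = {p_any_failure:.3f}")

# 10 drives:   ~0.183
# 2000 drives: ~1.000 -- at scale, some drive is essentially always failing or rebuilding.
```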

“We don’t want to rebuild an entire gigantic array just to recover a number of files,” explains Pairman. “And we are addressing this notion that the system is either all up or all down. Up until now, there was no process to be able to access unaffected files.”

Concurrent with the launch of the new PanFS 6.0 is a set of new hardware, called the ActiveStor 16 arrays. The new arrays employ the UltraStar He6 disk drives from HGST (formerly a unit of Hitachi and now owned by Western Digital). These are the first 6 TB drives on the market, and that 50 percent increase in density is made possible because helium gas is less turbulent than air. The lower turbulence also cuts energy use by the 3.5-inch disk drive by 23 percent.

The ActiveStor arrays have two types of blades: a storage blade and a director blade. As the name suggests, the director blade manages the system and also keeps metadata about where files are stored on the parallel file system. With the ActiveStor 16 update, Panasas is shifting to a faster quad-core 2.53 GHz "Jasper Forest" processor from Intel. (This is a chip made for embedded applications.) The director blade also has 48 GB of its own memory used as metadata cache, and this extra CPU and memory capacity helps improve RAID rebuilds as well as small file serving and metadata performance.

The storage blades on the ActiveStor 16 have their components right-sized for the fatter 6 TB disks, with a larger 240 GB solid state disk for serving up small files and metadata, and they are optimized to run the RAID 6+ protection scheme. The storage blade has a single-core version of the Jasper Forest Intel processor, and it has 8 GB of its own memory that is used as cache plus two 6 TB drives. A 4U shelf of the ActiveStor 16 arrays has 122.4 TB of capacity and 1.5 GB/sec of bandwidth across its ten storage blades. Up to 100 shelves, with a maximum of 2,000 disks and 1,000 SSDs, can be lashed together in a single global namespace that spans 12 PB of capacity and delivers 150 GB/sec of bandwidth out of the PanFS file system.
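
Those capacity figures hang together arithmetically: ten storage blades per shelf, each with two 6 TB drives and a 240 GB SSD, yield the quoted 122.4 TB per shelf, and 100 such shelves account for the 2,000-disk, 1,000-SSD, roughly 12 PB maximums. The quick check below simply reproduces that arithmetic.

```python
# Sanity check of the ActiveStor 16 capacity figures quoted above,
# assuming 10 storage blades per 4U shelf (2 x 6 TB HDD + 240 GB SSD each).
blades_per_shelf = 10
hdd_tb, ssd_tb = 6.0, 0.24

shelf_tb = blades_per_shelf * (2 * hdd_tb + ssd_tb)
print(f"Per shelf: {shelf_tb} TB")               # 122.4 TB

shelves = 100
print(f"{shelves} shelves: {shelf_tb * shelves / 1000:.2f} PB,",
      f"{shelves * blades_per_shelf * 2} disks,",
      f"{shelves * blades_per_shelf} SSDs")       # ~12.24 PB, 2000 disks, 1000 SSDs
```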

With RAID 6, two parity elements used to reconstruct failed disk drives are spread across the RAID group. With the RAID 6+ triple parity protection cooked up by Gibson and his colleagues at Panasas, three parity elements allow for protection against two simultaneous drive failures plus single-sector errors on multiple drives. This is about 150X more reliable than dual-parity approaches in RAID arrays, explains Geoffrey Noer, senior director of product marketing at Panasas. The RAID 6+ scheme carries about a 25 percent capacity overhead, compared to around 18 percent with most dual-parity RAID 6 controllers, according to Noer.
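
The overhead figures quoted for dual and triple parity follow from the ratio of parity elements to total stripe width. The short calculation below uses one plausible pair of stripe geometries that reproduces roughly the 18 and 25 percent figures; the exact stripe widths PanFS uses are not stated here, so these are assumptions for illustration.

```python
def parity_overhead(data_units, parity_units):
    """Fraction of raw capacity consumed by parity in one stripe."""
    return parity_units / (data_units + parity_units)

# Illustrative stripe widths only -- the real PanFS stripe geometry isn't given here.
print(f"Dual parity   (9+2): {parity_overhead(9, 2):.1%}")   # ~18.2%
print(f"Triple parity (9+3): {parity_overhead(9, 3):.1%}")   # 25.0%
```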

While the RAID 6+ triple parity is important, so is the per-file distributed RAID that is also part of the new PanFS 6.0. With this feature, the rebuilding process scales linearly with the number of directors in the whole parallel file system, and importantly, the more drives you have, the less dramatic the recovery measures have to be: recovering from three drive failures is far less arduous on an array with twenty drives than on one with only ten.

On a traditional RAID 6 array with ten fixed drives and a RAID controller, if you lose three drives, you have to restore all the files. On an ActiveStor array running the PanFS software, as you scale up the drives, the percent of files that need to be restored goes down because data is spread further apart on the increasing number of drives. So, for instance, Panasas says that on an ActiveStor with 40 drives, three disk failures could mean having to restore a few percent of the files, but as you scale up to 2,000 drives the share of files that needs to be restored gets very close to zero. On an array with 1,000 drives, about one in 200 million files will need to be restored after a three disk failure, according to Noer. And, thanks to the Extended File System Availability feature in PanFS, all of the files that are not affected by a three-drive failure event can be accessed normally. The dead files have to be restored from the RAID parity data or from an archive.

“A file system that is ten times larger rebuilds ten times faster,” says Noer, providing a rule of thumb. “This is important because if you have ten times the number of drives, but the rebuild is one tenth of the time, your risk has stayed the same.” When you add in the per-file distributed RAID, then scaling up the file system by a factor of ten actually increases the reliability of the data in the file system by a factor of a thousand.
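
Those rules of thumb can be expressed as a toy model: the work needed to rebuild one failed drive is roughly fixed, but it is spread across every drive and director in the system, so rebuild time shrinks as the system grows even though more drives are failing at any moment. The numbers below are illustrative only, not measurements.

```python
# Toy model of the scaling argument: rebuild work for one failed drive is fixed,
# but the rebuild is spread across all drives/directors, so rebuild time ~ 1/N.
# Baseline numbers are illustrative only.
base_drives = 200
base_rebuild_hours = 10.0

for drives in (200, 2000, 20000):
    scale = drives / base_drives
    rebuild_hours = base_rebuild_hours / scale     # 10x larger system rebuilds ~10x faster
    failure_rate_scale = scale                     # ...but ~10x as many drives are failing
    exposure = failure_rate_scale * rebuild_hours  # so the risk window stays roughly constant
    print(f"{drives:6d} drives: rebuild ~{rebuild_hours:5.1f} h, relative risk window {exposure:4.1f}")

# Per-file distributed RAID then shrinks the fraction of files touched by any given
# multi-drive failure, which is where the extra reliability per 10x of scale comes from.
```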


Panasas is taking orders for the ActiveStor 16 systems now and expects to start shipping the PanFS file system and the new arrays in September. PanFS 6.0 will be available to customers using ActiveStor 11, 12, and 14 systems (there was no 13 generation) who have their systems under current maintenance contracts. PanFS 6.0 ships by default on the ActiveStor 16 arrays.

Intelligent Application of SSDs to Accelerate HPC Workloads
https://www.hpcwire.com/2012/10/01/intelligent_application_of_ssds_to_accelerate_hpc_workloads/

In most industries today (whether financial services, manufacturing, academic research, healthcare and life sciences, or energy exploration), data analysis, modeling, and visualization efforts are critical to success.

To gain a competitive edge, most organizations are incorporating ever-larger data sets and more variable data formats into these computational workflows to help derive better information upon which to make smarter decisions.

These big data applications are placing new attention on the high performance computing (HPC) solutions used to run the algorithms and process the raw data. Due to the larger volumes and greater variety of data types, as well as the desire to use more robust analysis, modeling, and visualization routines, HPC solutions must provide high sustained I/O and throughput while being optimized to cost-effectively handle highly variable workflows.

The essential element in all of this work is a need for speed. Organizations need fast time-to-results so that they can make the right decisions (which well to drill, which new drug candidate to develop, which product design to produce, which customer to award a lower rate loan to) before their competitors.

Complications and challenges that can impede HPC workflows

When looking to accelerate HPC workloads, there are several factors that can play a major role in overall performance.

To start, today's analysis, modeling, and visualization efforts are carried out using much more sophisticated algorithms in order to derive more detailed and realistic results. The output from these routines offers finer spatial or temporal resolution and consequently results in much larger output data sets. In a typical workflow, those output files might be used as input to another analysis, modeling, or visualization application.

These operations can impact HPC workflows since the great volumes of data produced by the initial run must be written to disk and saved and then the data must be ingested by yet another routine. Both operations can generate high I/O and throughput demands on an infrastructure. And if the infrastructure is not capable of sustaining these data transfers, the computational workflows can slow significantly.
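
A quick back-of-the-envelope calculation makes that bottleneck concrete: writing out a multi-terabyte intermediate result and then re-ingesting it within a workflow stage translates directly into a minimum sustained-throughput requirement on the storage system. The dataset size and time budget below are assumptions chosen purely for illustration.

```python
# Illustrative only: how much sustained throughput a single workflow hand-off needs.
dataset_tb = 40          # assumed size of an intermediate result
time_budget_min = 30     # assumed window to write it out (and again to re-ingest it)

required_gbps = dataset_tb * 1000 / (time_budget_min * 60)   # GB/s, decimal units
print(f"Writing {dataset_tb} TB in {time_budget_min} min needs ~{required_gbps:.1f} GB/s sustained")
# ~22.2 GB/s -- and the same again when the next application reads it back in.
```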

Another factor has to do with the data that is being used in today's analysis, modeling, and visualization efforts. Nearly every industry is now making use of much larger data sets, richer data (such as that produced by newer seismic imaging tools or next-generation sequencers), and many more types of data. However, most users, even those who primarily have large data sets, also have large numbers of small files – even if they consume a relatively small percentage of the total capacity.

Big data and HPC solutions must therefore not only be capable of quickly accessing the large volumes of data required for the computations; they must also intelligently stage the different types of data, which come in varying file formats and sizes, on suitably high-performance storage.

Required storage solution characteristics

Organizations continually deploy new servers with more powerful CPUs to improve and speed up their analysis, modeling, and visualization efforts. To make the best use of such computing resources, an HPC solution must have a suitable storage solution to sustain HPC workflows.

A storage solution for today's big data and HPC environments must be able to scale easily. Some solutions help meet growing data volume demands but fall short when trying to keep CPUs satiated. To help accelerate HPC workflows, a storage solution must also scale in performance so that as data volumes grow, the system supports the higher I/O and throughput required to get faster results.

Finally, a storage solution must be optimized to handle today’s HPC big data workflows consisting of data sets of files of all sizes. If all data used were in the same format – a structured database, for example – or of the same relative file size, a solution could be highly optimized to handle the specific data. Working with the mixed data sets used today requires a storage solution that optimizes workflow performance for each data type.

Panasas introduces an integrated SSD/SATA approach

Panasas ActiveStor storage systems have a modular blade architecture integrated with its PanFS parallel file system. The design eliminates the bottleneck of a single RAID controller to deliver high-performance, scalable storage. Prior generations of ActiveStor have been based solely on SATA drives and were well-tuned for high throughput.

With the fifth-generation ActiveStor 14, Panasas has taken a unique approach, leveraging lightning fast SSDs integrated with high capacity SATA disk to improve storage performance while keeping costs down. Rather than use SSD for caching or for “most recent” file access as many other vendors have done, ActiveStor 14 stores all metadata and small files (less than 60KB) on the SSDs and larger files on SATA drives.

Metadata is accessed frequently so fast metadata access benefits all types of workloads. All file operations, including reads and writes, require access to metadata. In many cases, such as directory listings, access to the metadata is all that is required to satisfy an I/O request. Storing metadata on SSD boosts performance for all storage operations, especially for directory functions (listing, searches, etc.) and RAID rebuilds in the event of a drive error. Rebuild performance has been improved so that the new 4TB drives can be rebuilt in the same amount of time as the 3TB drives in the prior generation ActiveStor 12, maintaining a high level of data integrity and system reliability.

Small file access can be disproportionately slow when read from, or written to, standard hard disk drives. Accesses of less than a full sector are inefficient, particularly for random I/O. Furthermore, reads and writes of small files can conflict with streaming reads or writes of large files to the same disk. By maintaining small files on SSD, such conflicts are eliminated. In addition, ActiveStor 14 stores the first 12KB of all files inside the file system metadata, improving SSD efficiency while increasing small file performance. This efficient storage of small files on SSD dramatically improves response time and IOPS, as evidenced by the very impressive SPEC sfs2008 NFS IOPS results that Panasas has published.
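
The placement policy described above (metadata and files under 60KB on SSD, the first 12KB of every file kept with its metadata, and larger files on SATA) boils down to a simple decision rule. The sketch below is a conceptual illustration of that rule rather than the actual PanFS placement logic; the thresholds are taken from the figures quoted in the text.

```python
SMALL_FILE_LIMIT = 60 * 1024   # files below this are stored entirely on SSD
INLINE_LIMIT = 12 * 1024       # first 12 KB of every file kept with its metadata

def place_file(size_bytes):
    """Conceptual placement rule for ActiveStor 14-style SSD/SATA tiering."""
    inline_bytes = min(size_bytes, INLINE_LIMIT)    # always inlined with metadata on SSD
    if size_bytes < SMALL_FILE_LIMIT:
        return {"tier": "SSD", "inline_bytes": inline_bytes}
    return {"tier": "SATA", "inline_bytes": inline_bytes}

print(place_file(4 * 1024))        # {'tier': 'SSD',  'inline_bytes': 4096}
print(place_file(500 * 1024**2))   # {'tier': 'SATA', 'inline_bytes': 12288}
```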

ActiveStor 14 is available in three configurations with varying sizes of SSD, SATA and cache. The amount of SSD for acceleration ranges from 1.5 percent up to 10.7 percent of total storage capacity. The bulk of the storage capacity, however, is on cost-effective SATA drives, keeping the overall cost per terabyte lower than the prior generation, and very competitive in the market today.

The Importance of Ease of Use and Management

Equally important to the performance and reliability of any storage system is the ease of use and management of the product. With ActiveStor, organizations can simply add blade enclosures to non-disruptively increase capacity and performance of the global file system as storage requirements grow. Parallel access to data and automated load balancing ensure that performance is optimized. This makes it easy to linearly scale capacity to over eight petabytes and performance to 150GB/s or 1.4M IOPS.

Conclusion

The end result is a high-performance storage system that delivers high throughput and IOPS, is ideal for the most demanding HPC and big data workloads, and accelerates time-to-results. ActiveStor delivers unmatched scale-out NAS performance in addition to the manageability, reliability, and value required by demanding computing organizations in the biosciences, energy, finance, government, manufacturing, media, and other research sectors.

Panasas Backfills ActiveStor Lineup
https://www.hpcwire.com/2011/06/23/panasas_backfills_activestor_lineup/
Thu, 23 Jun 2011 07:00:00 +0000

High performance parallel storage vendor Panasas is once again eyeing the technical computing and big data markets with the release of its ActiveStor 11 parallel storage system appliance, this time with a keen emphasis on customers' need to weigh storage against budget constraints rather than simply chasing no-holds-barred performance.

Last year Panasas launched its PAS 12 high performance NAS storage line, the fourth generation of the flagship product, which the company dubbed the world's "fastest parallel storage system." Notice that the vendor is moving backwards in its numbering system with ActiveStor 11, which reflects a step back in pricing and compute-side features.

According to Geoffrey Noer, senior director of product management at Panasas, this backpedalling was intentional. He claims that when they built out to ActiveStor 12, they left some room in the middle to add value and create a line intended to offer more balanced storage capacity tailored to a wider range of storage budgets.

While ActiveStor 12 was made for extreme performance, Panasas has tried to find middle ground with 11. The company scaled back on the hardware end so that more of the budget could be spent on drives, allowing Panasas to offer 11 in what Noer described as "a more pleasing dollar per terabyte package."

The degree to which Panasas is offering something far below ActiveStor 12 can be debated, since both 11 and 12 are available with 3 terabyte drives. As Noer said, "whereas before we were talking about 40 terabytes per chassis scaling up to 4 petabytes, we're now talking about 60 terabytes per chassis, scaling up to 6 petabytes in a single file system." The real news is that there is no price premium for the bump: customers who bought 40 terabyte ActiveStor 12 systems can buy essentially the same system with the 3 terabyte drives and get 50 percent more capacity.

Noer says that moving "back" to ActiveStor 11 is not necessarily much of a downgrade from 12, given the system's ability to scale to 6 petabytes and 115 GB/s of throughput via a single global namespace. He claims that the fully parallel performance, driven by a blade design that meshes capacity and speed, can scale linearly thanks to the shedding of filer heads and hardware RAID controllers that can hamper performance. Dispensing with filer heads and controllers is a hallmark of the Panasas line of storage products and drives its reputation as a suitable HPC storage option, Noer explained.

Noer also touted the deployment, use, and manageability advantages that can come from a tightly integrated system versus software-only approaches. He said that Panasas is always being compared to Lustre or IBM GPFS running on SAN storage, but taking a software-driven angle means trying to marry the file system to a hardware architecture that wasn't explicitly designed to handle such a system. Noer says that doing this introduces manageability issues that Panasas is freed from thanks to its Object RAID model, in which objects are stored directly on the blades. Since these blades have been designed from the ground up to deliver maximum performance, all elements of the hardware architecture are driven to mirror the advantages on the storage OS side.

Panasas is pulling in 3 terabyte SATA drives with the introduction of ActiveStor 11, another density bump that Noer says stretches traditional hardware RAID controllers to the limit. He claims that with each density upgrade, rebuild speeds have not kept pace with drive capacity, which is a problem since it can take several days to rebuild a single drive after failure in many hardware environments. He says that under the object-oriented paradigm, it is possible to take advantage of the massive parallelism and throw a slew of processors at the problem to reconstruct in tens of minutes. Additionally, although it might seem a bit illogical, Noer argued that the larger the scale of the single file system, the faster a rebuild takes place.

He said that when all is said and done, "we still have the software and hardware redundancy on the hardware and software services side, but in many ways our object storage architecture enables the use of these high speed drives – and the high performance we're getting for disk allows us to take advantage of them without becoming an archival solution."

On a side note, if you're off looking for something that falls just below ActiveStor 11 and find yourself on a fruitless chase for ActiveStor 10, there isn't one. According to Noer, plans for 10 were on the table but were scrapped. He says the performance and price appeal of 11 provides a balance between previous NAS solutions and the more expensive, performance-driven ActiveStor 12.

PAS HC is just the latest offering in Panasas’ NAS storage line. PAS, by the way, is the company’s new shorthand for all the Panasas ActiveStor lines, although, in this case, the architecture of PAS HC is quite different from that of the existing PAS 7, 8, and 9 series products. The latter are blade-based servers that use SATA drives, while the HC is implemented as a rackmount Nehalem-based storage system filled with Fibre Channel JBODs — a first for Panasas.

The glue that ties all the PAS’es together is Panasas’ home-grown parallel file system, PanFS. The idea is to abstract file access from the underlying hardware, and, perhaps more importantly, from the function of the different storage pools. With PanFS, a single global namespace can be created that spans multiple Panasas storage silos. So, for example, you can have very high performance storage for scratch space, robust NFS storage for home directories, and low-cost, scalable storage for archives, all under one file system.

“It’s a single mount point,” explains Larry Jones, vice president of marketing at Panasas. “All you have to do to move between a scratch space where you’re running a particular experiment and the archives where you have last year’s experiments is to change directories.”

In the past, less active data (like the raw data for a simulation) often got dumped to tape after it was initially processed. But workflows are often circular rather than linear, so for many applications, the raw data need to be continually accessible. A lot of technical computing projects have gotten so large that it’s not feasible to restore from tape every time a user wants to re-crunch the numbers. For petascale-sized datasets, this could take days or even weeks. Dumping data to cheap disk-based RAID devices is no solution either. The capacity of such systems is too small to act as a unified pool, so users end up creating multiple archives as disks fill up.

For example, an oil and gas company might process 100-300 terabytes of seismic data, and turn it into 20 to 40 TB that is subsequently funneled into seismic interpreters. This process is repeated for multiple seismic datasets. Ideally, what these companies want to do is keep both the raw and intermediate data around so they have the flexibility to rerun the models with different parameters. In workflows like this, the advantage of having all the data under a global namespace becomes apparent.

Panasas says the PAS HC solves the too-slow and too-fragmented dilemmas by marrying PanFS and ActiveStor-level performance with lower-cost and denser capacity of JBODs. A single HC RAID controller can deliver 5 GB/sec while reading and 3 GB/sec while writing. (Those numbers reflect file I/O, not block I/O.) Each 4U JBOD enclosure has room for 60 drives. At 2 TB per drive, that works out to 120 TB per shelf and a maximum of 960 TB per rack.

Because of the storage density, Panasas says it can offer an HC setup at about a dollar per GB, or even less when discounted through a channel partner. That's certainly a lot more expensive than tape storage, but for a high-performance disk-based set-up, fronted by a parallel file system, it's relatively cheap.

Panasas has already sold a few PAS HC systems. One is destined for an oil and gas firm and another for an unnamed government agency. The only public deployment at this date is the system being deployed at Los Alamos National Laboratory, to support its nuclear security mission.

The PAS HC at Los Alamos will be used in conjunction with the Cray “Cielo” Baker-class super being installed in the second half of 2010. According to Jones, though, 6 PB of HC storage is already in place, with plans to expand to 12 PB. Performance will top out at 160 GB/sec of aggregate throughput. The new system is intended for active archiving, primary storage, as well as a scratch file area for checkpointing Cielo runs. The Panasas gear joins 2 PB of other ActiveStor storage (mostly PAS 8) at the lab.

The PAS HC release comes on the heels of a good year, revenue-wise, for Panasas. Despite the recession, the company recorded 25 percent year-over-year growth in 2009, with Q4 being the best quarter in the company's history. The first quarter of 2010 started with 46 percent year-over-year growth. With storage demand seemingly insatiable, Panasas is hoping PAS HC keeps that momentum going.