How To Win With Mixed Workloads in Storage: A Case Study

Have you ever wondered what happens inside your stomach when you eat a cheeseburger? If you have, then you are in luck, thanks to the Department of Embryology at the Carnegie Institution for Science. New work from the organization for scientific discovery sheds light on how intestinal cells respond to high-fat foods. Turns out those cells have to do some shape shifting in order to digest that cheeseburger you had for lunch! But that is only one of the many projects happening at Carnegie Science.

The Department of Embryology operates with the goal of understanding fundamental developmental mechanisms at the cellular and molecular level. To achieve this goal, researchers not only look closely at intestinal cells, but also explore topics like amphibian metamorphosis and brain development. For these projects, researchers typically work with two kinds of data: images collected from microscopes and other imaging systems, and sequencing data from next-generation sequencers. Add the usual variety of common documents used to report on findings, and you have multiple workloads all competing for storage performance. This mixed-workload storage environment presented technical challenges for the Carnegie Science team. Keep reading to find out how they set up a winning storage strategy, or download the full case study here.

Encountering the mixed-workload storage challenge

The Department’s microscopy systems often create image sets that can be as large as 1 TB per experiment. But just down the hall, another researcher could be working with sequencing data that comprises millions of kilobyte-sized files. For Bill Kupiec, the IT Manager for the Department of Embryology, it became critical to find the right solution for his mixed-workload storage needs. “One of our major criteria was finding a storage system that could bridge that file volume and variety. It had to handle both the streaming needed for very large data sets and the fast processing required for millions of small files,” he said.
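To make the "volume and variety" problem concrete, here is a minimal Python sketch of how one might profile a directory tree to see how its files split between many small files and a few very large ones. It is only an illustration and not something described in the case study: the function names (`bucket`, `profile_sizes`, `walk_sizes`) and the size cutoffs are hypothetical choices, and it does not use any Qumulo API.

```python
import os
from collections import Counter

# Hypothetical cutoffs chosen for illustration only.
SMALL_MAX = 64 * 1024      # files up to 64 KiB count as "small"
LARGE_MIN = 1024 ** 3      # files of 1 GiB or more count as "large"


def bucket(size):
    """Classify a file size in bytes as small, medium, or large."""
    if size <= SMALL_MAX:
        return "small"
    if size >= LARGE_MIN:
        return "large"
    return "medium"


def profile_sizes(sizes):
    """Count files and total bytes per size bucket."""
    counts = Counter()
    total_bytes = Counter()
    for size in sizes:
        name = bucket(size)
        counts[name] += 1
        total_bytes[name] += size
    return {"counts": dict(counts), "bytes": dict(total_bytes)}


def walk_sizes(root):
    """Yield the size of every file under root, skipping unreadable entries."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                yield os.lstat(path).st_size
            except OSError:
                continue


if __name__ == "__main__":
    # Profile the current directory as an example.
    print(profile_sizes(walk_sizes(".")))
```

A tree dominated by "small" counts but "large" bytes is the kind of mix Kupiec describes: metadata-heavy access to millions of files alongside streaming access to a handful of very large ones.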

Finding a system that could reasonably handle all of the complex workflows was going to be difficult. There was the possibility that the system he chose would be optimized for one kind of workflow over the other. If that were the case, half of the researchers would see a performance hit in their applications, slowing down their productivity. Or worse, two separate storage systems would have to be purchased and managed separately. Luckily for Kupiec and his team, they had looked at QF2 for a previous project and remembered its ability to support mixed workloads in storage. “We started our search looking for an appliance: an all-in-one solution we can just turn on, that handles anything we throw at it,” Kupiec said. “From the get-go Qumulo delivered with the right capacity and performance, really checking all the boxes with a single cluster that just works.”

Life after the challenge of mixed workloads in storage

With the Qumulo cluster in place, the Department’s challenge of maintaining system performance across file types and sizes is a thing of the past. The team finds that the new system can quickly traverse large directories, serve high file volumes, and easily discover or ingest large streaming files. Considering the importance of the developmental science happening at the Department of Embryology at the Carnegie Institution for Science, it is critical that Qumulo’s data-aware scale-out storage system feeds researchers the data they want when they need it.

The Department of Embryology is one of six departments within the Carnegie Institution for Science. Researchers there have uncovered the role played by genes during embryogenesis, developed widely used experimental methodologies, and trained generations of biologists. You can download our full Qumulo case study here.
