Ocarina's software provides general file deduplication and more efficient coding of many image file formats, such as TIFF and JPEG, which reduces file size without any loss of image quality. Files to be optimised are read in and processed by an Ocarina appliance. The original data is later regenerated (rehydrated) from deduplicated storage by Ocarina reader software.
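
As a rough illustration of the deduplicate-then-rehydrate cycle described above, here is a minimal Python sketch. It is not Ocarina's code or API: the `DedupeStore` class, the fixed 4KB chunk size, and the use of SHA-256 and zlib are all assumptions made for the example, and commercial products typically use more sophisticated (often variable-size) chunking.

```python
import hashlib
import zlib

# Assumed fixed chunk size for illustration only.
CHUNK_SIZE = 4096


class DedupeStore:
    """Hypothetical chunk store: each unique chunk is kept once, compressed."""

    def __init__(self):
        # Maps a chunk's SHA-256 hex digest to its compressed bytes.
        self.chunks = {}

    def write(self, data: bytes) -> list:
        """Split data into chunks, store unseen ones, return the file's recipe."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks not seen before consume new storage.
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def rehydrate(self, recipe) -> bytes:
        """Rebuild the original bytes from the stored chunks."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)
```

In this sketch a file made of three identical 4KB chunks followed by one different chunk is stored as two unique chunks plus a four-entry recipe, and `rehydrate` returns the original bytes unchanged.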

The embedding plan is to have an OEM, such as a system or storage array vendor, include Ocarina's software in its own software and so provide the data reduction in-line, rather than by using a separate appliance. Last week Permabit, an archiving vendor, announced its Albireo deduplication software with the same aim in mind.

Mike Davis, Ocarina's senior director for marketing, said: "Deduplication and compression will be part of array manufacturer OEMs' storage product in a year or so."

The embedded Ocarina software "is independent from the existing product. We developed a new code base." The current, appliance-based product is focused on large-scale, unstructured data stores. The new one is "focussed on a single API and architectural simplicity; it's got to be as simple as possible and as flexible as possible".

Why should OEMs and customers be interested? "Data should stay compressed and deduplicated all the way through its lifecycle and only get rehydrated when it is accessed." This approach is better than the point products seen today with separate dedupe code bases and excess use of network bandwidth: "That's a waste of resources."

Ocarina has talked about this to its existing OEM partners BlueArc, HDS and HP, and to other potential customers. It has already signed a deal with one OEM and has another six in its pipeline. We may see OEM announcements this year or next, but they might not identify Ocarina as the source of their dedupe code.

Ocarina's appliance is content-aware, which it has to be in order to deal separately with different image file formats. In a block array the data is opaque, so a content-aware mechanism cannot be used. Davis said that the dedupe process needs to be capable of being throttled and run in the background. The situation is different again in an object store, where you know what and where objects are, and there are generally more spare CPU cycles available.

Ocarina is not including its content-aware deduplication or optimisation code in the embedded product. OEM customers are being given the choice of running the code in-line, in a post-process fashion or in a hybrid way.

Davis said the Permabit Albireo product makes claims around its logic for fast look-up - its IP is really in that area: "We don't think there's any special IP there though; we're at least as fast as them. You really have to have more of an end-to-end view.

"We know we have a better chunking algorithm. Permabit has no compression; we do."

The picture being painted here is that Permabit is a storage company whose object store is not making good progress and faces a lot of competition. "It's striking out … We respect Permabit but we don't think they're going to be significant competition," Davis said.

One aspect of the entry of Ocarina and Permabit into the embedded dedupe space is that other companies seeking OEM deduplication deals, such as Quantum with its DXi technology, will find it more difficult to make progress. Has Quantum been looking to supply DXi deduplication in software-only form to potential OEM partners? ®