The title is a little messy, but think of your unstructured data in the same way you might look at the third drawer down in your kitchen. There’s a bunch of stuff in there and no-one knows what it all does, but you know it has some value. Aparavi describes File Protect and Insight (FPI), as “[f]ile by file data protection and archive for servers, endpoints and storage devices featuring data classification, content level search, and hybrid cloud retention and versioning”. It takes the data you’re not necessarily sure about, and makes it useful. Potentially.

It comes with a range of features out of the box, including:

Data Awareness

Data classification

Metadata aggregation

Policy driven workflows

Global Security

Role-based permissions

Encryption (in-flight and at rest)

File versioning

Data Search and Access

Anywhere / anytime file access

Seamless cloud integration

Full-content search

How Does It Work?

The solution is fairly simple to deploy. There’s a software appliance installed on-premises (this is known as the aggregator). There’s a web-accessible management console, and you configure your sources to be protected via network access.

[image courtesy of Aparavi]

You get the ability to mount backup data from any point in time, and you can provide a path that can be shared via the network to users to access that data. Regardless of where you end up storing the data, you leave the index on-premises, and search against the index, not the source. This saves you in terms of performance and speed. There’s also a good story to be had in terms of cloud provider compatibility. And if you’re looking to work with an on-premises / generic S3 provider, chances are high that the solution won’t have too many issues with that either.

Thoughts

Data protection is hard to do well at the best of times, and data management is even harder to get right. Enterprises are busy generating terabytes of data and are struggling to a) protect it successfully, and b) make use of that protected data in an intelligent fashion. It seems that it’s no longer enough to have a good story around periodic data protection – most of the vendors have proven themselves capable in this regard. What differentiates companies is the ability to make use of that protected data in new and innovative ways that can increase the value to that data to the business that’s generating it.

Companies like Aparavi are doing a pretty good job of taking the madness that is your third drawer down and providing you with some semblance of order in the chaos. This can be a real advantage in the enterprise, not only for day to day data protection activities, but also for extended retention and compliance challenges, as well as storage optimisation challenges that you may face. You still need to understand what the data is, but something like FPI can help you to declutter what that data is, making it easier to understand.

I also like some of the ransomware detection capabilities being built into the product. It’s relatively rudimentary for the moment, but keeping a close eye on the percentage of changed data is a good indicator of wether or not something is going badly wrong with the data sources you’re trying to protect. And if you find yourself the victim of a ransomware attack, the theory is that Aparavi has been storing a secondary, immutable copy of your data that you can recover from.

People want a lot of different things from their data protection solutions, and sometimes it’s easy to expect more than is reasonable from these products without really considering some of the complexity that can arise from that increased level of expectation. That said, it’s not unreasonable that your data protection vendors should be talking to you about data management challenges and deriving extra value from your secondary data. A number of people have a number of ways to do this, and not every way will be right for you. But if you’ve started noticing a data sprawl problem, or you’re looking to get a bit more from your data protection solution, particularly for unstructured data, Aparavi might be of some interest. You can read the announcement here.

Data is archived to cloud or on-premises based on policies for long-term lifecycle management;

Data is organised for easy access and retrieval; and

Data is accessible via Contextual Search.

Sounds pretty neat. So what’s new?

What’s New?

Direct-to-cloud

Direct-to-cloud provides the ability to archive data directly from source systems to the cloud destination of choice, with minimal local storage requirements. Instead of having to sotre archive data locally, you can now send bits of it straight to cloud, minimising your on-premises footprint.

Trickle or bulk data migration – Adding bulk migration of data from one storage destination to another; and

Dynamic translation from cloud to cloud.

[image courtesy of Aparavi]

Data Classification

The Active Archive solution can now index, classify, and tag archived data. This makes it simple to classify data based on individual words, phrases, dates, file types, and patterns. Users can easily identify and tag data for future retrieval purposes such as compliance, reference, or analysis.

I was pretty enthusiastic about Aparavi when they came out of stealth, and I’m excited about some of the new features they’ve added to the solution. Data management is a hard nut to crack. Primarily because a lot of different organisations have a lot of different requirements for storing data long term. And there are a lot of different types of data that need to be stored. Aparavi isn’t a silver bullet for data management by any stretch, but it certainly seems to meet a lot of the foundational requirements for a solid archive strategy. There are some excellent options in terms of storage by location, search, and organisation.

The cool thing isn’t just that they’ve developed a solid multi-cloud story. Rather, it’s that there are options when it comes to the type of data mobility the user might require. They can choose to do bulk migrations, or take it slower by trickling data to the destination. This provides for some neat flexibility in terms of infrastructure requirements and windows of opportunity. It strikes me that it’s the sort of solution that can be tailored to work with a business’s requirements, rather than pushing it in a certain direction.

I’m also a big fan of Aparavi’s “Open Data” access approach, with an open API that “enables access to archived data for use outside of Aparavi”, along with a published data format for independent data access. It’s a nice change from platforms that feel the need to lock data into proprietary formats in order to store them long term. There’s a good chance the type of data you want to archive in the long term will be around longer than some of these archive solutions, so it’s nice to know you’ve got a chance of getting the data back if something doesn’t work out for the archive software vendor. I think it’s worth keeping an eye on Aparavi, they seem to be taking a fresh approach to what has become a vexing problem for many.

Santa Monica-based (I love that place) SaaS data protection outfit, Aparavi, recently came out of stealth, and I thought it was worthwhile covering their initial offering.

So Latin Then?

What’s an Aparavi? It’s apparently Latin and means “[t]o prepare, make ready, and equip”. The way we consume infrastructure has changed, but a lot of data protection products haven’t changed to accommodate this. Aparavi are keen to change that, and tell me that their product is “designed to work seamlessly alongside your business continuity plan to ease the burden of compliance and data protection for mid market companies”. Sounds pretty neat, so how does it work?

Architecture

Aparavi uses a three tiered architecture written in Node.js and C++. It consists of:

The Aparavi hosted platform;

An on-premises software appliance; and

A source client.

[image courtesy of Aparavi]

The platform is available as a separate module if required, otherwise it’s hosted on Aparavi’s infrastructure. The software appliance is the relationship manager in the solution. It performs in-line deduplication and compression. The source client can be used as a temporary recovery location if required. AES-256 encryption is done at the source, and the metadata is also encrypted. Key storage is all handled via keyring-style encryption mechanisms. There is communication between the web platform and the appliance, but the appliance can operate when the platform is off-line if required.

Cool Features

There are a number of cool features of the Aparavi solution, including:

Patented point-in-time recovery – you can recover data from any combination of local and cloud storage (you don’t need the backup set to live in one place);

Aparavi runs on all Microsoft-supported Windows platforms as well as most major Linux distributions (including Ubuntu and RedHat). They use the Amazon S3 API, and support GCP and are working on OpenStack and Azure. They’ve also got some good working relationships with Cloudian and Scality, amongst others.

[image courtesy of Aparavi]

Availability?

Aparavi are having a “soft launch” on October 25th. The product is licensed on the amount of source data protected. From a pricing perspective, the first TB is always free. Expect to pay US $999/year for 3TB.

Conclusion

Aparavi are looking to focus on the mid-market to begin with, and stressed to me that it isn’t really a tool that will replace your day to day business continuity tool. That said, they recognize that customers may end up using the tool in ways that they hadn’t anticipated.

Aparavi’s founding team of Adrian Knapp, Rod Christensen, Jonathan Calmes and Jay Hill have a whole lot of experience with data protection engines and a bunch of use cases. Speaking to Jonathan it feels like they’ve certainly thought about a lot the issues facing folks leveraging cloud for data protection. I like the open approach to storing the data, and the multi-cloud friendliness takes the story well beyond the hybrid slideware I’m accustomed to seeing from some companies.

Cloud has opened up a lot of possibilities for companies that were traditionally constrained by their own ability to deliver functional, scalable and efficient infrastructure internally. It’s since come to people’s attention that, much like the days of internal-only deployments, a whole lot of people who should know better still don’t understand what they’re doing with data protection, and there’s crap scattered everywhere. Products like Aparavi are a positive step towards taking control of data protection in fluid environments, potentially helping companies to get it together in an effective manner. I’m looking forward to diving further into the solution, and am interested to see how the industry reacts to Aparavi over the coming months.

working for minimum rage

taking the social out of social networking

buy me a pony

photos of food

disclaimer

The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by my employer and does not necessarily reflect the views and opinions of my employers, previous or current. This is my blog.

Search

Search

Subscribe to PenguinPunk.net by email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.