Tag Archives: Cloudtenna

I’ve covered Cloudtenna in the past and had the good fortune to chat with Aaron Ganek about the general availability of Cloudtenna’s universal search product – DirectSearch. I thought I’d share some of my thoughts here.

About Cloudtenna

Cloudtenna are focussed on delivering “[t]urn-key search infrastructure designed specifically for files”. If you think of Elasticsearch as being synonymous with log search, then you might also like to think of Cloudtenna as delivering an equivalent capability for file search.

The Challenge

According to Cloudtenna, the problem is that “[e]nterprises can’t keep track of files that are scattered across on-premises, cloud, and SaaS apps” and traditional search is a one-size-fits-all solution. In Cloudtenna’s opinion though, file search needs to be personalised and to reflect things such as ACLs. That kind of search is expensive and difficult to scale.

Cloudtenna’s Solution

So what do Cloudtenna do then? The key features are the ability to:

Efficiently ingress massive amounts of data

Understand and adhere to user permissions

Return queries in near real-time

Reduce index storage and compute costs

“DirectSearch” is now generally available, and allows for cross-silo search across services such as Dropbox, Gmail, Slack, Confluence, and so on. It seems reasonably priced at US $10 per user per month. Note that users who sign up before December 1st 2018 can get a 3-month free trial with no credit card details required.

DirectSearch CORE

In parallel to the release of DirectSearch, Cloudtenna are also announcing DirectSearch CORE – delivered via an OEM Model. I asked Ganek where he thought this kind of solution was a good fit. He told me that he saw it falling into three main categories:

Digital workspace category – eg. VMware, Citrix. Companies that want to be able to connect files into virtual digital workspaces;

Storage space – large storage vendors with SMB and NFS solutions – they might want to provide a global namespace over those transports; and

One of the big challenges with delivering a solution like DirectSearch is ACL enforcement, because every data source has its own permissions model. Keep in mind that all of these different applications have their own authentication mechanisms, with some using open directory standards and others doing proprietary stuff. And once you have authentication sorted out, you still need to ensure that users only get access to what they’re allowed to see. Cloudtenna tackle this challenge by ingesting “native ACLs” and normalising those ACLs with metadata.
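To make the idea of normalising ACLs a bit more concrete, here’s a rough sketch of how I picture it. This is my own illustration and not Cloudtenna’s implementation: the identity map, the `CatalogueEntry` type, and the `normalise_acl` and `permitted_search` functions are all hypothetical names I’ve made up. The assumption is that each source’s native principals get mapped to one canonical user id, and that search results are then filtered against that normalised list.

```python
from dataclasses import dataclass

# Hypothetical identity map: (source, native principal) -> canonical user id.
IDENTITY_MAP = {
    ("dropbox", "dan@example.com"): "dan",
    ("confluence", "dan.p"): "dan",
    ("slack", "U0042"): "erin",
}


@dataclass(frozen=True)
class CatalogueEntry:
    source: str
    name: str
    allowed: frozenset  # canonical user ids allowed to see this entry


def normalise_acl(source, native_principals):
    """Translate one source's native principals into canonical user ids.

    Principals with no known mapping are dropped, so access fails closed.
    """
    return frozenset(
        IDENTITY_MAP[(source, principal)]
        for principal in native_principals
        if (source, principal) in IDENTITY_MAP
    )


def permitted_search(catalogue, user, term):
    """Return names that match the term and that the user may see."""
    term = term.lower()
    return [
        entry.name
        for entry in catalogue
        if term in entry.name.lower() and user in entry.allowed
    ]


# Made-up catalogue entries from three different sources.
CATALOGUE = [
    CatalogueEntry("dropbox", "/plans/budget-2019.xlsx",
                   normalise_acl("dropbox", ["dan@example.com"])),
    CatalogueEntry("confluence", "Budget review notes",
                   normalise_acl("confluence", ["dan.p", "unknown.user"])),
    CatalogueEntry("slack", "#finance/budget-draft.pdf",
                   normalise_acl("slack", ["U0042"])),
]

print(permitted_search(CATALOGUE, "dan", "budget"))
print(permitted_search(CATALOGUE, "erin", "budget"))
```

The point of the sketch is that the permission check happens against one normalised list rather than three different native formats. A real system would also need to deal with groups, inheritance, and ACLs that change after ingest.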

Thoughts

Search is hard to do well. You want it to be quick, accurate, and easy to use. You also generally want it to be able to find stuff in all kinds of places. One of the problems with modern infrastructure is that we have access to a whole bunch of content repositories as part of our everyday corporate endeavours. I work with Slack, Dropbox, Box, OneDrive, SharePoint, file servers, Microsoft Teams, iMessage, email, and all kinds of systems as part of my job. I’m the first to admit that I don’t always have a good handle on where some stuff is. And sometimes I use the wrong system because it’s more convenient to access than the correct one is. Now multiply this problem out by the thousands of users in a decent-sized enterprise and you’ve got a recipe for disaster in terms of finding corporate knowledge in a timely fashion. Combine that with billions of files and you’re a passenger on Terry Tate’s pain train. Cloudtenna has quite a job on its hands in terms of delivering on the promise of “[b]ringing order to file chaos”, but if they can do that, it’ll be pretty cool. I’ll be signing up for a trial in the very near future and, if chaotic files aren’t your bag, then maybe you should give it a spin too.

Ganek told me that there are three major issues with file management and the plethora of collaboration tools used in the modern enterprise:

Search is too much effort

Security tends to fall through the cracks

Enterprise IT is dangerously non-compliant

Search

Most of these collaboration tools are geared up for search, because people don’t tend to remember where they put files, or what they’ve called them. So you might have some files in your corporate Box account, and some in Dropbox, and then some sitting in Confluence. The problem with trying to find something is that you need to search each application individually. According to Cloudtenna, this:

Wastes time;

Leads to frustration; and

Often yields poor results.

Security

Security also becomes a problem when you have multiple storage repositories for corporate files.

There are too many apps to manage

It’s difficult to track users across applications

There’s no consolidated audit trail

Exposure

As a result of this, enterprises find themselves facing exposure to litigation, primarily because they can’t answer these questions:

Who accessed what?

When and from where?

What changed?

As some of my friends like to say, “people die from exposure”.
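If you did have a consolidated audit trail, those three questions become fairly simple queries. Here’s a minimal sketch, assuming every application’s activity could be flattened into one common event record. The `AuditEvent` fields, the sample events, and the function names are my own invention for illustration and aren’t taken from any product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    user: str
    source: str    # which application the event came from
    path: str
    action: str    # "read", "write" or "delete"
    when: datetime
    origin: str    # IP address or device label


# Made-up events from three different applications.
EVENTS = [
    AuditEvent("alice", "box", "contract.docx", "read",
               datetime(2018, 6, 1, 9, 0), "10.0.0.5"),
    AuditEvent("bob", "dropbox", "contract.docx", "write",
               datetime(2018, 6, 1, 9, 30), "203.0.113.7"),
    AuditEvent("alice", "confluence", "Pricing FAQ", "read",
               datetime(2018, 6, 2, 14, 0), "10.0.0.5"),
]


def who_accessed(events, path):
    """Who accessed what? Users that touched a given file, in any app."""
    return sorted({e.user for e in events if e.path == path})


def when_and_where(events, user):
    """When and from where? One user's activity in time order."""
    hits = sorted((e for e in events if e.user == user), key=lambda e: e.when)
    return [(e.when.isoformat(), e.origin, e.path) for e in hits]


def what_changed(events):
    """What changed? Files with at least one write or delete."""
    return sorted({e.path for e in events if e.action in {"write", "delete"}})


print(who_accessed(EVENTS, "contract.docx"))
print(when_and_where(EVENTS, "alice"))
print(what_changed(EVENTS))
```

The hard part in practice is getting every application to emit events in a comparable shape in the first place. The queries themselves are the easy bit.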

Cloudtenna – The DirectSearch Solution

Enter DirectSearch. At its core it’s a SaaS offering that:

Catalogues file activity across disparate data silos; and

Delivers machine learning services to mitigate the “chaos”.

Basically you point it at all of your data repositories and you can then search across all of those from one screen. The cool thing about the catalogue is not just that it tracks metadata and leverages full-text indexing, but that it also tracks user activity. It supports a variety of on-premises, cloud and SaaS applications (6 at the moment, 16 by September). You only need to log in once and there’s full ACL support – so users can only see what they’re meant to see.

According to Ganek, it also delivers some pretty fast search results, in the order of 400 to 600 ms.
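For anyone wondering what a single catalogue with full-text indexing across silos might look like in its simplest form, here’s a toy version. It’s a plain inverted index keyed by (source, name), which is my assumption about the general shape and not a description of how DirectSearch is built. The documents and the `build_index` and `search_catalogue` names are made up.

```python
from collections import defaultdict

# Made-up documents from three silos, keyed by (source, name).
DOCUMENTS = {
    ("box", "contracts/acme.txt"): "renewal terms for the acme contract",
    ("dropbox", "notes/acme-call.txt"): "call notes acme pricing discussion",
    ("confluence", "Pricing FAQ"): "pricing tiers and discount policy",
}


def build_index(documents):
    """Build an inverted index: token -> set of (source, name) ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_catalogue(index, query):
    """Return ids of documents containing every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


INDEX = build_index(DOCUMENTS)
print(search_catalogue(INDEX, "acme"))
print(search_catalogue(INDEX, "acme pricing"))
```

One query goes to one index and comes back with hits from every silo, which is the whole appeal. This toy version ignores permissions, ranking, and incremental updates.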

[image courtesy of Cloudtenna]

I was interested to know a little more about how the machine learning could identify files that were being worked on by people in the same workgroup. Ganek said they didn’t rely on Active Directory group membership, as these were often outdated. Instead, they tracked file activity to create a “Shadow IT organisational chart” that could be used to identify who was collaborating on what, and tailor the search results accordingly.
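I don’t know what Cloudtenna’s machine learning actually does under the hood, but the general idea of inferring collaborators from file activity can be sketched without any ML at all. The version below is a deliberately simple stand-in: it counts how many files two users have both touched and uses that count to order search results. The activity data and all of the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Made-up (user, file) activity records.
ACTIVITY = [
    ("alice", "q3-forecast.xlsx"),
    ("bob", "q3-forecast.xlsx"),
    ("alice", "roadmap.docx"),
    ("bob", "roadmap.docx"),
    ("bob", "forecast-notes.txt"),
    ("carol", "forecast-template.xlsx"),
]


def users_by_file(activity):
    """Group activity into file name -> set of users who touched it."""
    touched = defaultdict(set)
    for user, name in activity:
        touched[name].add(user)
    return touched


def collaboration_weights(touched):
    """Count, for each pair of users, how many files both have touched."""
    weights = defaultdict(int)
    for users in touched.values():
        for pair in combinations(sorted(users), 2):
            weights[pair] += 1
    return weights


def rank_for(user, candidates, activity):
    """Order candidate files so those touched by close collaborators come first."""
    touched = users_by_file(activity)
    weights = collaboration_weights(touched)

    def score(name):
        others = touched.get(name, set()) - {user}
        return sum(weights.get(tuple(sorted((user, other))), 0) for other in others)

    # Highest score first; ties are broken alphabetically by file name.
    return sorted(candidates, key=lambda name: (-score(name), name))


matches = ["forecast-template.xlsx", "forecast-notes.txt", "q3-forecast.xlsx"]
print(rank_for("alice", matches, ACTIVITY))
```

In this example alice and bob share two files, so anything bob has touched floats to the top of alice’s results, while carol’s template sinks to the bottom. Note that the sketch only re-orders results and doesn’t check whether alice is allowed to see them, so it would have to sit behind the ACL filtering.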

Thoughts and Further Reading

I’ve spent a good part of my career in the data centre providing storage solutions for enterprises to host their critical data on. I talk a lot about data and how important it is to the business. I’ve worked at some established companies where thousands of files are created every day and terabytes of data are moved around. Almost without fail, file management has been a pain in the rear. Whether I’ve been using Box to collaborate, or sending links to files with Dropbox, or been stuck using Microsoft Teams (great for collaboration but hopeless from a management perspective), invariably files get misplaced or I find myself firing up a search window to try and track down this file or that one. It’s a mess because we don’t just work from a single desktop and carefully curated filesystem any more. We’re creating files on mobile devices, emailing them about, and gathering data from systems that don’t necessarily play well on some platforms. It’s a mess, but we need access to the data to get our jobs done. That’s why something like Cloudtenna has my attention. I’m looking forward to seeing them progress with the beta of DirectSearch, and I have a feeling they’re on to something pretty cool with their product. You can also read Rich’s thoughts on Cloudtenna over at the Gestalt IT website.
