Top Story

As a foundational technology, text analytics has a lot of versatility, but its broad potential use can make it difficult to explain to prospective users. Whatever a company's pain point, chances are that text analytics can be part of the solution, and there is plenty of room for growth, both across different industry verticals and horizontally within an organization.

Unifi Software, a provider of self-service data tools, has added new features to its data platform to make the process of cataloging data and discovering datasets faster and easier for business users, data stewards, and data analysts. The new capabilities leverage artificial intelligence (AI) and natural language processing (NLP).

According to the vendor, the more users who engage with the Unifi Data Platform, the more intuitive its data insights become, as the AI engine, OneMind, learns to predict patterns and recommend datasets to users.

The new release includes the ability to find similar datasets through the platform’s Dataset Explorer. When viewing a dataset there, users can display other datasets that are the same or similar by clicking Similar Datasets. The result shows a percentage of similarity based on sample statistics, such as a comparison against the properties of the primary dataset. The AI engine parses for these similarities to build the recommendation.

For data stewards or data engineers, finding similar datasets makes it easy to discover and cleanse duplicates. Datasets are often generated on a recurring basis and, in most instances, the latest version provides the highest value. In other instances, governance rules may be applied to one dataset while another is left open, or datasets may be assigned to users under varying usage policies, such as masking PII data. A data steward or data engineer can easily find those datasets and then combine their information, or delete or archive them based on lack of use over time.

The Unifi Data Platform also includes a new feature called Tag Recommendations, which allows users to add a tag to a dataset to indicate what type of information it contains, such as “Sales” or “Finance.” The tag then becomes searchable by other users.

Unifi has also expanded its use of NLP in its search feature to auto-complete a query with information that is frequently requested by that user or team. An extension of that capability shows the relationships between those aspects of a dataset. For example, the query ‘Show me the permissions’ would indicate the governance rules applied to that dataset.