Talend wants to clean your data

Talend, developer of the Open Studio and Open Profiler data integration tools, has announced that in September it will release Talend Data Quality, a tool for identifying "dirty data" in databases.

Talend Data Quality, a GPL2-licensed application, is designed to look for nicknames, duplicate records and truncated street addresses. It checks and cleans data by comparing it to a suitable postal database.
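To make the idea of nickname and duplicate detection concrete, here is a minimal sketch in Python. It is not Talend's code or API: the `NICKNAMES` table, the function names and the sample records are all hypothetical, and a real tool would rely on far larger dictionaries and fuzzier matching.

```python
# Hypothetical nickname table (assumption); real tools use much larger dictionaries.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}


def normalize_name(name):
    """Lower-case a full name and expand a known nickname in the first token."""
    parts = name.strip().lower().split()
    if parts:
        parts[0] = NICKNAMES.get(parts[0], parts[0])
    return " ".join(parts)


def find_duplicates(records):
    """Group record indices by normalized name and street; keep groups of 2+."""
    groups = {}
    for i, rec in enumerate(records):
        street = " ".join(rec["street"].lower().split())
        key = (normalize_name(rec["name"]), street)
        groups.setdefault(key, []).append(i)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


# Made-up sample records for illustration only.
records = [
    {"name": "Bill Smith", "street": "12 High Street"},
    {"name": "William Smith", "street": "12  high street"},
    {"name": "Liz Jones", "street": "4 Mill Lane"},
]
duplicates = find_duplicates(records)
```

In this sketch the first two records fall into the same group because "Bill" is expanded to "William" and the street is compared after whitespace and case are normalized.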

Data Quality is an Eclipse-based graphical application which lets the user drag, drop and connect components in a cleansing workflow. These components can, for example, reformat an address, check it in an official mail address database or pull data from other repositories. An SDK allows users to create their own components.
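The notion of connecting components into a cleansing workflow can be sketched as a chain of small functions, each taking a record and returning a cleaned or annotated copy. This is only an illustration in Python, not the Talend SDK: `POSTAL_DB`, `reformat_address`, `check_address` and `run_workflow` are hypothetical names, and the reference set merely stands in for an official address database.

```python
# Hypothetical reference data (assumption) standing in for an official address database.
POSTAL_DB = {"12 High Street", "4 Mill Lane"}


def reformat_address(record):
    """Component: collapse whitespace and title-case the street field."""
    street = " ".join(record["street"].split()).title()
    return {**record, "street": street}


def check_address(record):
    """Component: flag whether the street appears in the reference set."""
    return {**record, "valid": record["street"] in POSTAL_DB}


def run_workflow(record, components):
    """Pass a record through each component in order."""
    for component in components:
        record = component(record)
    return record


# Components are connected simply by listing them in order.
workflow = [reformat_address, check_address]
result = run_workflow({"street": "  12 high   street "}, workflow)
```

Because every component shares the same record-in, record-out shape, a user-written component could be added to the list without changing the others.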

As well as being used as a standalone tool to clean data, Data Quality can generate Java or Perl code that can be integrated into an existing data management infrastructure. Talend Data Quality will be available as a standalone application or as part of Talend's Data Integration suite, which is also available under the GPL2.