Data quality

With Anatella, you can easily perform any data quality or data cleaning task.

For example, you can (non-exhaustive list):

Check the validity of character fields
For example, check for the right formats using powerful regular expressions.
You can use the following Anatella operator to perform this task:

Check the validity of numeric fields
For example, you can compute means, count the number of unique values, look for the highest and lowest values, count the number of missing values, etc. You can use the following Anatella operator to perform this task:

Check for missing values
You can use the following Anatella operator to perform this task:

Check dates
For example: is the date in the right format, and is it within the expected range?
You can use the following Anatella operators to perform this task:

Look for duplicates
Anatella contains a “box” (operator) dedicated to removing duplicates.
You can use the following Anatella operator to perform this task:

Check the consistency of a set of (primary) keys across different data sources
You can use the following Anatella operators to perform this task:

Compare 2 datasets
You can compare a selection of the character fields and numeric fields in the 2 datasets.
You can use the following Anatella operator to perform this task:
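
The duplicate and key-consistency checks listed above can be illustrated with a minimal sketch in plain JavaScript. It does not use Anatella's operators or its scripting API: the helper names (`findDuplicates`, `compareKeys`), the `id` field and the sample rows are all hypothetical, and rows are assumed to be plain objects.

```javascript
// Hypothetical helpers in plain JavaScript (not Anatella's API).

// Return the key values that appear more than once in `rows`.
function findDuplicates(rows, keyField) {
  const seen = new Set();
  const duplicates = new Set();
  for (const row of rows) {
    const key = row[keyField];
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  }
  return [...duplicates];
}

// Compare the key sets of two data sources and report
// the keys that are present in only one of them.
function compareKeys(rowsA, rowsB, keyField) {
  const keysA = new Set(rowsA.map((r) => r[keyField]));
  const keysB = new Set(rowsB.map((r) => r[keyField]));
  return {
    onlyInA: [...keysA].filter((k) => !keysB.has(k)),
    onlyInB: [...keysB].filter((k) => !keysA.has(k)),
  };
}

// Made-up sample data.
const customers = [{ id: 1 }, { id: 2 }, { id: 2 }, { id: 3 }];
const orders = [{ id: 2 }, { id: 3 }, { id: 4 }];

console.log(findDuplicates(customers, "id")); // [ 2 ]
console.log(compareKeys(customers, orders, "id")); // { onlyInA: [ 1 ], onlyInB: [ 4 ] }
```

Using sets keeps both checks linear in the number of rows, which matters once the data sources grow beyond a few thousand records.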

You can also design any complex test you want using the powerful JavaScript scripting engine included in Anatella.
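
To give an idea of what such a custom test might look like, here is a minimal sketch in plain JavaScript. It is not written against Anatella's scripting API: the function names, the 4-digit postal-code pattern and the sample values are assumptions chosen purely for illustration.

```javascript
// Hypothetical field-level checks in plain JavaScript (not Anatella's API).

// A value is "missing" if it is null, undefined, or a blank string.
function isMissing(value) {
  return value === null || value === undefined || String(value).trim() === "";
}

// Character field: does the value match the expected format?
function matchesFormat(value, pattern) {
  return !isMissing(value) && pattern.test(String(value));
}

// Numeric field: basic summary statistics, ignoring missing values.
function summarizeNumeric(values) {
  const present = values.filter((v) => !isMissing(v)).map(Number);
  const sum = present.reduce((a, b) => a + b, 0);
  return {
    missing: values.length - present.length,
    unique: new Set(present).size,
    min: Math.min(...present),
    max: Math.max(...present),
    mean: present.length ? sum / present.length : NaN,
  };
}

// Date field: is it a real calendar date in YYYY-MM-DD form within [min, max]?
function isDateInRange(value, min, max) {
  const text = String(value);
  if (!/^\d{4}-\d{2}-\d{2}$/.test(text)) return false;
  const date = new Date(text + "T00:00:00Z");
  if (Number.isNaN(date.getTime())) return false;
  // Reject impossible dates that some engines roll over (e.g. 2023-02-30).
  if (date.toISOString().slice(0, 10) !== text) return false;
  return date >= min && date <= max;
}

// Made-up sample values.
const postalCode = /^\d{4}$/; // assumed 4-digit format
const from = new Date("2000-01-01T00:00:00Z");
const to = new Date("2030-12-31T00:00:00Z");

console.log(matchesFormat("1050", postalCode)); // true
console.log(matchesFormat("10A0", postalCode)); // false
console.log(summarizeNumeric([10, 20, null, 20, 30]));
// { missing: 1, unique: 3, min: 10, max: 30, mean: 20 }
console.log(isDateInRange("2023-02-15", from, to)); // true
console.log(isDateInRange("2023-02-30", from, to)); // false
```

Each check returns a plain value rather than throwing, so the results can be collected per row and reviewed before anything is corrected.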

Once an error has been detected, you can easily correct it, for example using a dedicated Anatella operator.

Automated text-spelling correction

Anatella also includes a unique operator that checks and corrects spelling mistakes in any text field. For example, let’s assume that your database contains a field named “City of Birth”. This field will usually contain many different spellings of the same city. For example, the city "RIO DE JANEIRO" can be misspelled in a number of different ways (this is a real-world example):